In what cases will building aggregates not assist in reducing query run time i.e. if DB stats shows time is spent in cache etc and if so can someone explain the reasoning. Thanks
John Skrabak replied
As you mentioned, if the query is able to retrieve the data from the OLAP cache, the aggregate won't be used.
If the aggregate does not summarize the data very much, if may not help much at all - the tables associated with the base cube, if queried frequently, may reside in the DB's cache, and retrieving the base cube data from the DB cache would be faster than retrieving minimally summarized aggregate data form disk.
If indexes are missing or DB table statistics not current on the aggregate tables, poor execution plan could be used.
Sometimes having too may aggregates on a cube can cause problems when the query has multiple RKFs in it and each RKF results in separate SQL queries against different aggregates, with each query having to read all the required dimension and master data tables, rather than having a single query on the base cube that just accesses the dim and MD tables once. When you have large dim or MD tables involved, it is not unusual to have a situation where most of the query cost is accessing the dim and MD tables rather the fact tables.
If you run the query in RSRT in debug mode, you would see if mulitple aggregates are being hit. Then you jsut need to review the explain plans to see if the use of multiple quereis against different aggregates is more expensive than a single query hitting the base cube, or a larger more comprehensive aggregate, e.g.
Query cost against base cube = 1000 (200 fact table, 800 dim and MD tables)
Query cost on aggregate 1 = 800 (50 on aggr fact, 750 dim and MD tables)
Query cost on aggregate 2 = 650 (20 on aggr fact, 630 dim and MD tables)
Total cost of quereis using aggregates = 1450 vs 1000 when run against base cube.