on 04-25-2009 4:21 PM
The following query is taking more than 4hrs to execute.
select l_extendedprice , count(l_extendedprice) from dbo.lineitem group by l_extendedprice
Cardinality of table : 6001215 ( > 6 million)
Index on l_extendedprice is there
ReadAheadLobThreshold = 500
Database version 7.7.06.09
I need to optimize this query. Kindly suggest way out.
Thanks
Priyank
Platform: Windows XP
And yes we can see the maximum CPU usage while query executes.
As I am using MAXDB for my experimentation purpose, it wasn't difficult for me to setup a new Database with more Data CACHE and single Data Volume as data ~ 1GB.
Performance improved drastically as that query executed in 20 secs!!
But I have preserved the previous configuration would definitely like to solve the issue with that configuration.
I would return back with the Database Analyzer o/p and Command monitor o/p.
Thanks Lars for the help
Priyank
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
>
> Platform: Windows XP
>
> And yes we can see the maximum CPU usage while query executes.
>
> As I am using MAXDB for my experimentation purpose, it wasn't difficult for me to setup a new Database with more Data CACHE and single Data Volume as data ~ 1GB.
> Performance improved drastically as that query executed in 20 secs!!
Hmm.. that's really odd - as all the data necessary to run this query do already fit into the cache, more cache memory shouldn't change anything for this query.
And even more odd: usually we expect a much better paralell I/O by using multiple data volumes instead a single one.
Can you give more details on the storage configurations? How many physical spindels/disks are there?
> But I have preserved the previous configuration would definitely like to solve the issue with that configuration.
>
> I would return back with the Database Analyzer o/p and Command monitor o/p.
We're eager to see the data!
regards,
Lars
OK. Actually the query mentioned above is already running since last 6 hrs and still the execution is not complete.
So is there any way to get the information as required by you (as mentioned in the last post) while the query is in execution. Or is it required to stop this execution and start all over again?
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
I/O Data Cache : 81920 KB
Data Cache : 80296 KB
6 Data Volumes
first 3 of size 51.200 KB
other 3 of size 1,048,576 KB
Yes looking at the cardinality of the table lots of I/O required but still more than 4 hrs is quite unrealistic. Some tuning is required.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Data Cache : 80296 KB
Ok, that's 8 Gigs for cache.
The index takes 16335 pages à 8 KB = 130.680 KB = 128 MB.
Fits completely into RAM - the same is true for the additional temp resultset.
So once the index has been read to cache I assume the query is a lot quicker than 4 hours.
6 Data Volumes
first 3 of size 51.200 KB
other 3 of size 1,048,576 KB
Well, that's not the smartest thing to do.
That way the larger volumes will get double the I/O requests which eventually saturates the I/O channel.
Yes looking at the cardinality of the table lots of I/O required but still more than 4 hrs is quite unrealistic. Some tuning is required.
We're not talking about the cardinality here - you want all the data.
We talk pages then.
And as we've seen, the table is not touched for this query.
Instead the smaller index is completely read in a index only strategy.
Loading 128MB from disk, creating temporary data in the same size and spilling out the information (and thereby reading the 128 MB temp size again) in 4 hours add up to ca. 384 MB/4 hours = 96 MB/hour = 1,6 MB/minute.
Not too good really - I suspect that the I/O system here is not the quickest one.
You may want to activate the time measurement and set the DB Analyzer interval to 120 secs.
Then activate Command and Resourcemonitor and look for statements taking longer than 10 minutes.
Now run your statement again and let us know the information from Command/Resourcemonitor and check for warnings in the DBanalyzer output.
regards,
Lars
Execution Plan:
SCHEMANAME;TABLENAME;COLUMN_OR_INDEX;STRATEGY;PAGECOUNT
DBO LINEITEM INDEX_2 INDEX SCAN; 16335
ONLY INDEX ACCESSED;
JDBC_CURSOR_22 RESULT IS COPIED , COSTVALUE IS; 49005
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
>
SCHEMANAME;TABLENAME;COLUMN_OR_INDEX;STRATEGY;PAGECOUNT
> DBO LINEITEM INDEX_2 INDEX SCAN; 16335
> ONLY INDEX ACCESSED;
> JDBC_CURSOR_22 RESULT IS COPIED , COSTVALUE IS; 49005
Ok, so we see that the index is indeed used.
From execution strategy point of view, there's no way to have this better.
Anyhow, reading 16335 pages and building up a intermediate resultset of approx. the same size will cause quite some I/O if you're buffer cache is not big enough to take it all.
How big is your buffer cache?
How many data volumes are configured?
regards,
Lars
>
> The following query is taking more than 4hrs to execute.
>
>
select l_extendedprice , count(l_extendedprice) from dbo.lineitem group by l_extendedprice
>
> Cardinality of table : 6001215 ( > 6 million)
> Index on l_extendedprice is there
> ReadAheadLobThreshold = 500
Are you reading lob data ? Otherwise this setting is irrelevant for this query.
> Database version 7.7.06.09
> I need to optimize this query. Kindly suggest way out.
Ok, lets see.
We don't know yet, what the execution plan looks like.
Please post it.
regards,
Lars
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
90 | |
10 | |
10 | |
10 | |
7 | |
7 | |
6 | |
5 | |
4 | |
3 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.