on 07-07-2011 11:20 AM
Hi,
we're using MaxDB 7.7 in a large e-commerce project.
Recently we're having performances issues and I'm trying to understand what's the problem.
This is the query
select articolo2_.ID as col_0_0_, articolo2_.CODICE as col_1_0_, articolo2_.CODICEREPOSITORY as col_2_0_, articolo2_.DESCRIZIONE as col_3_0_, marchio5_.NOME as col_4_0_, articolo2_.MODELLO as col_5_0_, categoria6_.NOME as col_6_0_, unitamisur7_.DESCRIZIONE as col_7_0_, articoloco1_.ID as col_8_0_, articoloco1_.CODICE as col_9_0_, articoloco1_.STATOCOMMERCIALE as col_10_0_, articoloco1_.DATA_STATOCOMMERCIALE as col_11_0_, articolo2_.ALTRO1 as col_12_0_, articolo2_.ALTRO2 as col_13_0_, articolo2_.ALTRO3 as col_14_0_, articolo2_.ALTRO4 as col_15_0_, articolo2_.ALTRO5 as col_16_0_, articoloco1_.ATTIVO as col_17_0_, articoloco1_.LOTTOORDINE as col_18_0_, articoloco1_.MINIMOORDINE as col_19_0_, articoloco1_.SCORTAMINIMA as col_20_0_, MIN(prezzo0_.PREZZO) as col_21_0_, SUM(dispmagazz4_.DISPONIBILITA) as col_22_0_, SUM(dispmagazz4_.DISPFUTURA) as col_23_0_, iva8_.CODICE as col_24_0_, iva8_.PERC as col_25_0_, articolo2_.STIMAPESO as col_26_0_, MAX(prezzo0_.FLAG) as col_27_0_, articolo2_.COMPOSTO as col_28_0_, articoloco1_.ELEMENTO_ARTICOLOCOMPOSTO as col_29_0_, articoloco1_.DATA_CREAZIONE as col_30_0_, articoloco1_.DATA_ULTIMAMODIFICA as col_31_0_, MIN(prezzo0_.PREZZOPREC) as col_32_0_, MIN(prezzo0_.DATAPREZZOPREC) as col_33_0_, MIN(prezzo0_.PREZZOPRECNOFLAG) as col_34_0_, articolo2_.ALTEZ as col_35_0_, articolo2_.LARG as col_36_0_, articolo2_.PROF as col_37_0_, articolo2_.TAGLIA as col_38_0_, articolo2_.COLORE as col_39_0_, articolo2_.TIPOMISURE as col_40_0_, articolo2_.STIMACOLLI as col_41_0_, articolo2_.KEYWORDS as col_42_0_, MIN(prezzo0_.PREZZOIVATO) as col_43_0_, MIN(prezzo0_.PREZZOPRECIVATO) as col_44_0_, MIN(prezzo0_.PREZZOPRECNOFLAGIVATO) as col_45_0_, articolo2_.CODSTRUTTURAVARIANTE as col_46_0_, articolo2_.VARIANTE1 as col_47_0_, articolo2_.VARIANTE2 as col_48_0_, articolo2_.VARIANTE3 as col_49_0_, articolo2_.VARIANTE4 as col_50_0_, articolo2_.VARIANTE5 as col_51_0_, articolo2_.VARIANTE6 as col_52_0_, articolo2_.VARIANTE7 as col_53_0_, articolo2_.VARIANTE8 as col_54_0_, articolo2_.CARATTERISTICA1 as col_55_0_, articolo2_.CARATTERISTICA2 as col_56_0_, articolo2_.CARATTERISTICA3 as col_57_0_, articolo2_.CARATTERISTICA4 as col_58_0_, articolo2_.CARATTERISTICA5 as col_59_0_, MIN(articolifo3_.ID_PUNTOPARTENZA) as col_60_0_, iva10_.CODICE as col_61_0_, iva10_.PERC as col_62_0_, articolo2_.QTACONFEZIONE as col_63_0_, articolo2_.QTACARTONE as col_64_0_, articolo2_.QTABANCALE as col_65_0_
from ECF3.PREZZO prezzo0_, ECF3.ARTICOLO_COMMERCIALE articoloco1_, ECF3.ARTICOLO articolo2_, ECF3.ARTICOLO_FORNITORE articolifo3_, ECF3.MARCHIO marchio5_, ECF3.CATEGORIA categoria6_, ECF3.UNITAMISURA unitamisur7_, ECF3.IVA iva8_, ECF3.IVA iva10_, ECF3.DISPMAGAZZINO dispmagazz4_, ECF3.MACROCATEGORIA macrocateg13_, ECF3.REPARTO reparto14_
where prezzo0_.ID_ARTICOLOCOMM=articoloco1_.ID and articoloco1_.ID_ARTICOLO=articolo2_.ID and articolo2_.ID=articolifo3_.ID_ARTICOLO(+) and articolo2_.ID_MARCHIO=marchio5_.ID and articolo2_.ID_CATEGORIA=categoria6_.ID and articolo2_.ID_UM=unitamisur7_.ID and articolo2_.ID_IVA=iva8_.ID and articolo2_.ID_IVAINGROSSO=iva10_.ID and categoria6_.ID_MACROCATEGORIA=macrocateg13_.ID and macrocateg13_.ID_REPARTO=reparto14_.ID and articolifo3_.ABITUALE='S' and dispmagazz4_.ID_ARTICOLOCOMM=articoloco1_.ID and articoloco1_.ID_AZIENDA=1 and articoloco1_.ID_CANALE=9 and articoloco1_.ATTIVO='A' and prezzo0_.ID_LISTINO=47 and prezzo0_.PREZZO>0 and reparto14_.NOME='ELDOM' and (upper(articolo2_.DESCRIZIONE) like '%mp3%' or upper(articolo2_.MODELLO) like '%mp3%' or upper(articolo2_.KEYWORDS) like '%mp3%')
group by articolo2_.ID , articolo2_.CODICE , articolo2_.CODICEREPOSITORY , articolo2_.DESCRIZIONE , marchio5_.NOME , articolo2_.MODELLO , categoria6_.NOME , unitamisur7_.DESCRIZIONE , articoloco1_.ID , articoloco1_.CODICE , articoloco1_.STATOCOMMERCIALE , articoloco1_.DATA_STATOCOMMERCIALE , articolo2_.ALTRO1 , articolo2_.ALTRO2 , articolo2_.ALTRO3 , articolo2_.ALTRO4 , articolo2_.ALTRO5 , articoloco1_.ATTIVO , articoloco1_.LOTTOORDINE , articoloco1_.MINIMOORDINE , articoloco1_.SCORTAMINIMA , iva8_.CODICE , iva8_.PERC , articolo2_.STIMAPESO , articolo2_.COMPOSTO , articoloco1_.ELEMENTO_ARTICOLOCOMPOSTO , articoloco1_.DATA_CREAZIONE , articoloco1_.DATA_ULTIMAMODIFICA , articolo2_.ALTEZ , articolo2_.LARG , articolo2_.PROF , articolo2_.TAGLIA , articolo2_.COLORE , articolo2_.TIPOMISURE , articolo2_.STIMACOLLI , articolo2_.KEYWORDS , articolo2_.CODSTRUTTURAVARIANTE , articolo2_.VARIANTE1 , articolo2_.VARIANTE2 , articolo2_.VARIANTE3 , articolo2_.VARIANTE4 , articolo2_.VARIANTE5 , articolo2_.VARIANTE6 , articolo2_.VARIANTE7 , articolo2_.VARIANTE8 , articolo2_.CARATTERISTICA1 , articolo2_.CARATTERISTICA2 , articolo2_.CARATTERISTICA3 , articolo2_.CARATTERISTICA4 , articolo2_.CARATTERISTICA5 , iva10_.CODICE , iva10_.PERC , articolo2_.QTACONFEZIONE , articolo2_.QTACARTONE , articolo2_.QTABANCALE
order by MIN(prezzo0_.PREZZO) ASC
and this is the explain result
SCHEMANAME TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT
UNITAMISUR7_ TABLE SCAN 1
PREZZO0_ PREZZO_LISTINO_idx JOIN VIA INDEXED COLUMN 13655
ID_LISTINO (USED INDEX COLUMN)
ARTICOLOCO1_ ID JOIN VIA KEY COLUMN 5401
REPARTO14_ IDX_NOME_MONDO JOIN VIA INDEXED COLUMN 1
TABLE HASHED
NOME (USED INDEX COLUMN)
ARTICOLIFO3_ IDX_ARTICOLO_FORNITORE_ARTICOLO JOIN VIA INDEXED COLUMN 5478
ID_ARTICOLO (USED INDEX COLUMN)
ARTICOLO2_ ID JOIN VIA KEY COLUMN 8098
MARCHIO5_ ID JOIN VIA KEY COLUMN 5
TABLE HASHED
CATEGORIA6_ ID JOIN VIA KEY COLUMN 8
TABLE HASHED
IVA8_ ID JOIN VIA KEY COLUMN 1
TABLE HASHED
IVA10_ ID JOIN VIA KEY COLUMN 1
TABLE HASHED
DISPMAGAZZ4_ DISPMAGAZZINO_IDARTICOLOCOMM_IDX JOIN VIA INDEXED COLUMN 801
ID_ARTICOLOCOMM (USED INDEX COLUMN)
MACROCATEG13_ ID JOIN VIA KEY COLUMN 1
TABLE HASHED
NO TEMPORARY RESULTS CREATED
INTERNAL TEMPORARY RESULT TABLE SCAN 1
JDBC_CURSOR_54 RESULT IS COPIED , COSTVALUE IS 289903
The query takes an average of 25 seconds in working hours. It returns 611 rows. MaxDB 7.7 running on Linux, 2 processors, 4 cores each, 32GB Ram. Cache size is 4GB for this db and data size is about 5GB. Cache hit 100%.
What I'm not sure is if the query is already optimized and the problem is the workload on the server or the query can be optimized.
The query is generated by Hibernate so I really can't tweak the SQL.
Thank you for any suggestion !
Trying to make it more readable...
Hi, we're using MaxDB 7.7 in a large e-commerce project. Recently we're having performances issues and I'm trying to understand what's the problem. This is the query
select articolo2_.ID as col_0_0_, articolo2_.CODICE as col_1_0_, articolo2_.CODICEREPOSITORY as col_2_0_,
articolo2_.DESCRIZIONE as col_3_0_, marchio5_.NOME as col_4_0_, articolo2_.MODELLO as col_5_0_, categoria6_.NOME
as col_6_0_, unitamisur7_.DESCRIZIONE as col_7_0_, articoloco1_.ID as col_8_0_, articoloco1_.CODICE as col_9_0_,
articoloco1_.STATOCOMMERCIALE as col_10_0_, articoloco1_.DATA_STATOCOMMERCIALE as col_11_0_,
articolo2_.ALTRO1 as col_12_0_, articolo2_.ALTRO2 as col_13_0_, articolo2_.ALTRO3 as col_14_0_, articolo2_.ALTRO4 as
col_15_0_, articolo2_.ALTRO5 as col_16_0_, articoloco1_.ATTIVO as col_17_0_, articoloco1_.LOTTOORDINE as col_18_0_,
articoloco1_.MINIMOORDINE as col_19_0_, articoloco1_.SCORTAMINIMA as col_20_0_, MIN(prezzo0_.PREZZO) as
col_21_0_, SUM(dispmagazz4_.DISPONIBILITA) as col_22_0_, SUM(dispmagazz4_.DISPFUTURA) as col_23_0_,
iva8_.CODICE as col_24_0_, iva8_.PERC as col_25_0_, articolo2_.STIMAPESO as col_26_0_, MAX(prezzo0_.FLAG) as
col_27_0_, articolo2_.COMPOSTO as col_28_0_, articoloco1_.ELEMENTO_ARTICOLOCOMPOSTO as col_29_0_,
articoloco1_.DATA_CREAZIONE as col_30_0_, articoloco1_.DATA_ULTIMAMODIFICA as col_31_0_,
MIN(prezzo0_.PREZZOPREC) as col_32_0_, MIN(prezzo0_.DATAPREZZOPREC) as col_33_0_,
MIN(prezzo0_.PREZZOPRECNOFLAG) as col_34_0_, articolo2_.ALTEZ as col_35_0_, articolo2_.LARG as col_36_0_,
articolo2_.PROF as col_37_0_, articolo2_.TAGLIA as col_38_0_, articolo2_.COLORE as col_39_0_,
rticolo2_.TIPOMISURE as col_40_0_, articolo2_.STIMACOLLI as col_41_0_, articolo2_.KEYWORDS as col_42_0_,
MIN(prezzo0_.PREZZOIVATO) as col_43_0_, MIN(prezzo0_.PREZZOPRECIVATO) as col_44_0_,
MIN(prezzo0_.PREZZOPRECNOFLAGIVATO) as col_45_0_, articolo2_.CODSTRUTTURAVARIANTE as col_46_0_,
articolo2_.VARIANTE1 as col_47_0_, articolo2_.VARIANTE2 as col_48_0_, articolo2_.VARIANTE3 as col_49_0_,
articolo2_.VARIANTE4 as col_50_0_, articolo2_.VARIANTE5 as col_51_0_, articolo2_.VARIANTE6 as col_52_0_,
articolo2_.VARIANTE7 as col_53_0_, articolo2_.VARIANTE8 as col_54_0_, articolo2_.CARATTERISTICA1 as col_55_0_,
articolo2_.CARATTERISTICA2 as col_56_0_, articolo2_.CARATTERISTICA3 as col_57_0_,
articolo2_.CARATTERISTICA4 as col_58_0_, articolo2_.CARATTERISTICA5 as col_59_0_,
MIN(articolifo3_.ID_PUNTOPARTENZA) as col_60_0_, iva10_.CODICE as col_61_0_, iva10_.PERC as col_62_0_,
articolo2_.QTACONFEZIONE as col_63_0_, articolo2_.QTACARTONE as col_64_0_, articolo2_.QTABANCALE as col_65_0_
FROM ECF3.PREZZO prezzo0_, ECF3.ARTICOLO_COMMERCIALE articoloco1_, ECF3.ARTICOLO articolo2_,
ECF3.ARTICOLO_FORNITORE articolifo3_, ECF3.MARCHIO marchio5_, ECF3.CATEGORIA categoria6_,
ECF3.UNITAMISURA unitamisur7_, ECF3.IVA iva8_, ECF3.IVA iva10_, ECF3.DISPMAGAZZINO dispmagazz4_,
ECF3.MACROCATEGORIA macrocateg13_, ECF3.REPARTO reparto14_
WHERE prezzo0_.ID_ARTICOLOCOMM=articoloco1_.ID and articoloco1_.ID_ARTICOLO=articolo2_.ID and
articolo2_.ID=articolifo3_.ID_ARTICOLO(+) and articolo2_.ID_MARCHIO=marchio5_.ID and
articolo2_.ID_CATEGORIA=categoria6_.ID and articolo2_.ID_UM=unitamisur7_.ID and articolo2_.ID_IVA=iva8_.ID and
articolo2_.ID_IVAINGROSSO=iva10_.ID and categoria6_.ID_MACROCATEGORIA=macrocateg13_.ID and
macrocateg13_.ID_REPARTO=reparto14_.ID and articolifo3_.ABITUALE='S' and
dispmagazz4_.ID_ARTICOLOCOMM=articoloco1_.ID and articoloco1_.ID_AZIENDA=1 and articoloco1_.ID_CANALE=9 and
articoloco1_.ATTIVO='A' and prezzo0_.ID_LISTINO=47 and prezzo0_.PREZZO>0 and reparto14_.NOME='ELDOM' and
(upper(articolo2_.DESCRIZIONE) like '%mp3%' or upper(articolo2_.MODELLO) like '%mp3%' or
upper(articolo2_.KEYWORDS) like '%mp3%')
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
And this is the explain result
SCHEMANAME TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT
UNITAMISUR7_ TABLE SCAN 1
PREZZO0_ PREZZO_LISTINO_idx JOIN VIA INDEXED COLUMN 13655
ID_LISTINO (USED INDEX COLUMN)
ARTICOLOCO1_ ID JOIN VIA KEY COLUMN 5401
REPARTO14_ IDX_NOME_MONDO JOIN VIA INDEXED COLUMN 1
TABLE HASHED
NOME (USED INDEX COLUMN)
ARTICOLIFO3_ IDX_ARTICOLO_FORNITORE_ARTICOLO JOIN VIA INDEXED COLUMN 5478
ID_ARTICOLO (USED INDEX COLUMN)
ARTICOLO2_ ID JOIN VIA KEY COLUMN 8098
MARCHIO5_ ID JOIN VIA KEY COLUMN 5
TABLE HASHED
CATEGORIA6_ ID JOIN VIA KEY COLUMN 8
TABLE HASHED
IVA8_ ID JOIN VIA KEY COLUMN 1
TABLE HASHED
IVA10_ ID JOIN VIA KEY COLUMN 1
TABLE HASHED
DISPMAGAZZ4_ DISPMAGAZZINO_IDARTICOLOCOMM_IDX JOIN VIA INDEXED COLUMN 801
ID_ARTICOLOCOMM (USED INDEX COLUMN)
MACROCATEG13_ ID JOIN VIA KEY COLUMN 1
TABLE HASHED
NO TEMPORARY RESULTS CREATED
INTERNAL TEMPORARY RESULT TABLE SCAN 1
JDBC_CURSOR_54 RESULT IS COPIED , COSTVALUE IS 289903
The query takes an average of 25 seconds in working hours. It returns 611 rows. MaxDB 7.7 running on Linux, 2 processors, 4 cores each, 32GB Ram. Cache size is 4GB for this db and data size is about 5GB. Cache hit 100%.
What I'm not sure is if the query is already optimized and the problem is the workload on the server or the query can be optimized.
The query is generated by Hibernate so I really can't tweak the SQL.
Thank you for any suggestion !
After further testing I found that adding the CATEGORIA, MARCHIO and REPARTO fields to ARTICOLO2_ table instead of using an ID and join relative tables to take the value (so removing 4 tables from the joins), the query runs in just 2 seconds that is an acceptable result.
Any hint on why these tables (that are very small, the bigger has 1500 rows) affect so much the results ?
Thanks
Hi there,
without further analysis of your data it won't be possible to come up with a super good explanation, but apparently none of the selection criteria appeared interesting enough (or maybe they aren't accessible via a decent key or index access) to reduce the intermediate result set:
and articolifo3_.ABITUALE='S'
and articoloco1_.ID_AZIENDA=1
and articoloco1_.ID_CANALE=9
and articoloco1_.ATTIVO='A'
and prezzo0_.ID_LISTINO=47
and prezzo0_.PREZZO>0
and reparto14_.NOME='ELDOM'
and (upper(articolo2_.DESCRIZIONE) like '%mp3%'
or upper(articolo2_.MODELLO) like '%mp3%'
or upper(articolo2_.KEYWORDS) like '%mp3%')
The last three predicates generally won't be supported by any index lookup.
The others might be quite interesting - depending on the data distribution - so creating an appropriate index might have helped a bit here.
This inability to early filter out non wanted data is also the reason for the optimizer to go for HASH joins.
These imply to read all the tables fully (which is done by one-by-one page acesses) and to create intermediate HASH tables.
Moreover, there are two parameters in MaxDB that specify how many iterations the optimizer should perform to come up with the optimal execution plan.
See [JoinSearchLevel|http://wiki.sdn.sap.com/wiki/display/MaxDB/ExtendedOptimizerParameters+7.7#ExtendedOptimizerParameters7.7-JoinSearchTableThreshold9] on details for this.
As a general hint: try to keep the number of joined tables low for a statement and try to provide access structures that allow early data filtering and suitable join transitions.
best regards,
Lars
Dear Lars,
thanks for your help.
I think the problem is on some joins because removing them and using an added field in place of them (de-normalyzing information) the query is much faster.
First, all the joins are backed by indexes with integer keys.
These are the tables involved (i removed the fields not related to joins for clarity), query time are not average, they're a single run times in working hours :
CREATE TABLE ECF3.ARTICOLO (
ID INTEGER NOT NULL,
DESCRIZIONE VARCHAR() ASCII(512) NOT NULL,
MODELLO VARCHAR() ASCII(100) NOT NULL,
KEYWORDS VARCHAR() ASCII(512),
ID_UM INTEGER,
ID_IVA INTEGER,
ID_IVAINGROSSO INTEGER,
ID_CATEGORIA INTEGER,
ID_MARCHIO INTEGER,
PRIMARY KEY (ID)
);
The table contains 115.729 rows, 4.857 rows match the search condition on this table fields, like on '%MP3%' (query runs in 0.4 sec).
CREATE TABLE ECF3.ARTICOLO_COMMERCIALE (
ID INTEGER NOT NULL,
ID_AZIENDA INTEGER,
ID_CANALE INTEGER,
ID_ARTICOLO INTEGER,
ATTIVO CHAR() ASCII(1) NOT NULL,
PRIMARY KEY (ID)
);
The table contains 413.916 rows, 45.086 rows match the search condition on this table field, ID_AZIENDA=1, ID_CANALE=9, ATTIVO='A' (query runs in 0.48 sec).
CREATE TABLE ECF3.ARTICOLO_FORNITORE (
ID INTEGER NOT NULL,
ABITUALE CHAR() ASCII(1),
ID_ARTICOLO INTEGER,
PRIMARY KEY (ID)
);
The table contains 115.503 rows, all match the search condition on this table field, ABITUALE='S' (query runs in 0.7 sec).
CREATE TABLE ECF3.PREZZO (
ID INTEGER NOT NULL,
PREZZO FIXED(14,3),
ID_LISTINO INTEGER,
ID_ARTICOLOCOMM INTEGER,
PRIMARY KEY (ID)
);
The table contains 2.246.518 rows, 58.396 match the search condition on this table field, ID_LISTINO=47 AND PREZZO>0 (query runs in 0.4 sec)
CREATE TABLE ECF3.DISPMAGAZZINO (
ID_MAGAZZINO INTEGER NOT NULL,
ID_ARTICOLOCOMM INTEGER NOT NULL,
PRIMARY KEY (ID_MAGAZZINO,ID_ARTICOLOCOMM)
);
Table contains 404.664 rows, no conditions on query, only joins with other table ARTICOLO_COMMERCIALE.
There are then the smaller tables
CREATE TABLE ECF3.REPARTO (
ID INTEGER NOT NULL,
NOME VARCHAR() ASCII(100) NOT NULL,
PRIMARY KEY (ID)
);
CREATE TABLE ECF3.MACROCATEGORIA (
ID INTEGER NOT NULL,
NOME VARCHAR() ASCII(100) NOT NULL,
ID_REPARTO INTEGER DEFAULT 0 NOT NULL,
PRIMARY KEY (ID)
);
CREATE TABLE ECF3.CATEGORIA (
ID INTEGER NOT NULL,
ID_MACROCATEGORIA INTEGER NOT NULL,
NOME VARCHAR() ASCII(100) NOT NULL,
PRIMARY KEY (ID)
);
This is a 'tree', reparto has only 5 rows, macrocategoria has 29 rows, categoria has 1120 rows. The search condition on reparto gets to 765 categoria rows.
CREATE TABLE ECF3.MARCHIO (
ID INTEGER NOT NULL,
NOME VARCHAR() ASCII(100) NOT NULL,
PRIMARY KEY (ID)
);
The table contains 1189 rows
CREATE TABLE ECF3.IVA (
ID INTEGER NOT NULL,
PRIMARY KEY (ID)
);
This table contains 15 rows.
CREATE TABLE ECF3.UNITAMISURA (
ID INTEGER NOT NULL,
PRIMARY KEY (ID)
);
This table contains 6 rows.
The CATEGORIA, MARCHIO, MACROCATEGORIA, REPARTO was added later to the structure. Before this we had a field CATEGORIA, MARCHIO, REPARTO in the ARTICOLO table.
Using the fields instead of the joins the query is fast (2 sec.). Adding these tables has slowed down to 25 sec. (15 in non working hours).
This is what I don't understand, they're small tables, I understand the joins add complexity but the time difference between the two version of the model is very high !
Thanks for help and sorry for the long post.
> The others might be quite interesting - depending on the data distribution - so creating an appropriate index might have helped a bit here.
Are single indexes sufficient or is it better to provide combined indexes too ?
The tables have all "single field" indexes but most don't have combined indexes for common query fields combination.
> Moreover, there are two parameters in MaxDB that specify how many iterations the optimizer should perform to come up with the optimal execution plan.
>
I looked at these parameters before writing on the forum, they're set to standard values, i.e. 8 and 16.
The query we're analyzing has 12 tables, they was 8 before adding the new model tables that caused the problem.
If I understood how these parameters work, this changed the algorithm from 'Permutation' to 'Internal', correct ?
Anyway I've tried to set the smaller value to 12 (no restart) and query times hasn't changed.
> Are single indexes sufficient or is it better to provide combined indexes too ?
> The tables have all "single field" indexes but most don't have combined indexes for common query fields combination.
>
>
Well, during join processing indexes of a joined table aren't combined.
So, when the join transition is done via a KEY or an index, the other filter predicates will be evaluated via table lookup and not by using indexes.
So, carefully choosing combined indexes may very well help a great deal.
> I looked at these parameters before writing on the forum, they're set to standard values, i.e. 8 and 16.
> The query we're analyzing has 12 tables, they was 8 before adding the new model tables that caused the problem.
> If I understood how these parameters work, this changed the algorithm from 'Permutation' to 'Internal', correct ?
> Anyway I've tried to set the smaller value to 12 (no restart) and query times hasn't changed.
You understood correctly.
Sometimes, especially for BW like queries, it pays to raise the lower border for the permutation algorithm.
That way parsing and optimizing the statement will take longer, but eventually the optimal execution plan is found.
regards,
Lars
^^^
I'll try to create combined indexes where multiple fields are used in joins.
Do you have any other idea on why adding that 4 simple tables (joined with primary key, a single integer field) has slowed down so much the query ?
I'll try to put again the lower border to 12 (or 13). Does it takes some time to see the effects ? I haven't seen improvements when tested but I just run a couple queries after increasing and put it back to 8.
Thank you very much for your help !
Dear Lars,
> thanks for your help.
>
> I think the problem is on some joins because removing them and using an added field in place of them (de-normalyzing information) the query is much faster.
Hmm... hard to really tell right now - the guess sound plausible, but that doesn't mean it has to be true in this case.
> First, all the joins are backed by indexes with integer keys.
Yeah - I saw that.
You're really into these ID-key are you?
The point here is: if you've got other columns that may suit as a primary key (a.k.a. key candidates), then introducing artificial keys like ID won't improve the performance.
For example your table PREZZO: I have the impression that ID_LISTINO actually is the semantical primary key for this table.
But instead of directly joining and filtering on this, you join to ID and then have to check for the filter conditions.
Since I won't be able to completely tune your query here, I just try to give a general hint:
Think about which is the most important filter condition in your qurey (the one that is most specific and reduces most of the unwanted rows from the result).
Try to come up with an indexing for this condition (e.g. the '%mp3%' condition will never be handled efficiently by any index in MaxDB - so don't got for that!) so that the corresponding table becomes the start-table for the whole join.
From there on, try to have indexes that allow good join transitions and filtering.
Hope that helps a bit.
Best regards,
Lars
> You're really into these ID-key are you?
> The point here is: if you've got other columns that may suit as a primary key (a.k.a. key candidates), then introducing artificial keys like ID won't improve the performance.
>
> For example your table PREZZO: I have the impression that ID_LISTINO actually is the semantical primary key for this table.
> But instead of directly joining and filtering on this, you join to ID and then have to check for the filter conditions.
No, in this case ID_LISTINO can't be the primary key because there are many rows with the same ID_LISTINO.
The key could be formed by ID_LISTINO, QTA_DA, QTA_A, ID_MODPAG, ID_MODSPED, ID_ARTICOLOCOMM.
Since they're a lot of fields in this case we used an ID instead and indexed the other fields.
> Since I won't be able to completely tune your query here, I just try to give a general hint:
> Think about which is the most important filter condition in your qurey (the one that is most specific and reduces most of the unwanted rows from the result).
> Try to come up with an indexing for this condition (e.g. the '%mp3%' condition will never be handled efficiently by any index in MaxDB - so don't got for that!) so that the corresponding table becomes the start-table for the whole join.
>
> Lars
All search fields have indexes (not combined for multiple fields from the same table, yet).
Looking at the Explain result above, why do you think the database is using Hashing (that if I have understand it means that it reads all the table) instead of using indexes when reading the smaller tables ? Is it probably considering that since they're small it's better to read all them and keep them in memory ?
I'm still confused about why adding the 4 tables the query pass from 2 seconds to 25 seconds.. And these tables are joined with simple 1:1 integer index.
No, in this case ID_LISTINO can't be the primary key because there are many rows with the same ID_LISTINO.
The key could be formed by ID_LISTINO, QTA_DA, QTA_A, ID_MODPAG, ID_MODSPED, ID_ARTICOLOCOMM.
Since they're a lot of fields in this case we used an ID instead and indexed the other fields.
I see your point.
All search fields have indexes (not combined for multiple fields from the same table, yet).
Looking at the Explain result above, why do you think the database is using Hashing (that if I have understand it means that it reads all the table) instead of using indexes when reading the smaller tables ? Is it probably considering that since they're small it's better to read all them and keep them in memory ?
MaxDB will use hashing, when it assumes that most rows of the two tables will need to be joined. That's the case when nested loop join becomes very labor intensive ( O (n x m) => O (n²) !!) . The hash join in turn always operates on O (n + m) => O (2n) efficiency and thus is more efficient for larger n.
Also, the tables need to fit into the hash join thresholds (see parameters).
I'm still confused about why adding the 4 tables the query pass from 2 seconds to 25 seconds.. And these tables are joined with simple 1:1 integer index.
Hmm... likekly this is due to the rather inefficient handling of intermediate results (which cannot be avoided for hash joins) and the large scans.
regards,
Lars
Turned OFF hash joins but the query time hasn't changed. I've also raised to 13 the lower param for algorithm decision.
The new EXPLAIN output is
SCHEMANAME TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT
UNITAMISUR7_ TABLE SCAN 1
PREZZO0_ PREZZO_LISTINO_idx JOIN VIA INDEXED COLUMN 13655
ID_LISTINO (USED INDEX COLUMN)
ARTICOLOCO1_ ID JOIN VIA KEY COLUMN 5401
REPARTO14_ IDX_NOME_MONDO JOIN VIA INDEXED COLUMN 1
NOME (USED INDEX COLUMN)
ARTICOLIFO3_ IDX_ARTICOLO_FORNITORE_ARTICOLO JOIN VIA INDEXED COLUMN 5478
ID_ARTICOLO (USED INDEX COLUMN)
ARTICOLO2_ ID JOIN VIA KEY COLUMN 8223
MARCHIO5_ ID JOIN VIA KEY COLUMN 5
CATEGORIA6_ ID JOIN VIA KEY COLUMN 8
IVA8_ ID JOIN VIA KEY COLUMN 1
IVA10_ ID JOIN VIA KEY COLUMN 1
DISPMAGAZZ4_ DISPMAGAZZINO_IDARTICOLOCOMM_IDX JOIN VIA INDEXED COLUMN 801
ID_ARTICOLOCOMM (USED INDEX COLUMN)
MACROCATEG13_ ID JOIN VIA KEY COLUMN 1
NO TEMPORARY RESULTS CREATED
INTERNAL TEMPORARY RESULT TABLE SCAN 1
JDBC_CURSOR_7 RESULT IS COPIED , COSTVALUE IS 288206
Updating statistics for table PREZZO (the bigger table) and using combined indexes has a good result (still not great however).
This is the explain result
SCHEMANAME TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT
UNITAMISUR7_ TABLE SCAN 1
ARTICOLOCO1_ IDX_ARTICOLOCOMM_COMB1 JOIN VIA RANGE OF MULTIPLE INDEXED COL. 5401
ID_AZIENDA (USED INDEX COLUMN)
ID_CANALE (USED INDEX COLUMN)
ATTIVO (USED INDEX COLUMN)
REPARTO14_ IDX_NOME_MONDO JOIN VIA INDEXED COLUMN 1
TABLE HASHED
NOME (USED INDEX COLUMN)
ARTICOLO2_ ID JOIN VIA KEY COLUMN 8223
MARCHIO5_ ID JOIN VIA KEY COLUMN 5
TABLE HASHED
PREZZO0_ PREZZO_ARTICOLO_LISTINO_idx JOIN VIA MULTIPLE INDEXED COLUMNS 28813
ID_ARTICOLOCOMM (USED INDEX COLUMN)
ID_LISTINO (USED INDEX COLUMN)
ARTICOLIFO3_ IDX_ARTICOLO_FORNITORE_ARTICOLO JOIN VIA INDEXED COLUMN 5478
ID_ARTICOLO (USED INDEX COLUMN)
CATEGORIA6_ ID JOIN VIA KEY COLUMN 8
TABLE HASHED
IVA8_ ID JOIN VIA KEY COLUMN 1
TABLE HASHED
IVA10_ ID JOIN VIA KEY COLUMN 1
TABLE HASHED
DISPMAGAZZ4_ DISPMAGAZZINO_IDARTICOLOCOMM_IDX JOIN VIA INDEXED COLUMN 801
ID_ARTICOLOCOMM (USED INDEX COLUMN)
MACROCATEG13_ ID JOIN VIA KEY COLUMN 1
TABLE HASHED
NO TEMPORARY RESULTS CREATED
INTERNAL TEMPORARY RESULT TABLE SCAN 1
JDBC_CURSOR_22 RESULT IS COPIED , COSTVALUE IS 416260
I noticed that hash or not has has no impact (or it seems to have no impact).
Query time is now 6 seconds. A big improvement but still 3 times what it was before adding the last joins.
PREZZO table's statistics were two years old, but automatic statistic updates are ON for the db, shouldn't they be refreshed automatically ?
Finally, updating all tables statistics (with all rows information), the query time is now back to 2 seconds.
We're also testing 7.8 in another server and times are lower (about 1.3 seconds).
Now I have to understand why automatic database statistics update hasn't worked.
I also noticed another thing that may be interesting:
In 7.7 setting the lower value to select the algorithm for join optimization to 13 (default 😎 has no impact on the explain time (will test again execution time off working hours).
In 7.8 it has a big impact instead. It takes SECONDS only to do the explain.
Why the two version are so different ? I want to migrate to 7.8 later this summer so this is important to me.
Thank you
Finally, updating all tables statistics (with all rows information), the query time is now back to 2 seconds.
>
> We're also testing 7.8 in another server and times are lower (about 1.3 seconds).
>
> Now I have to understand why automatic database statistics update hasn't worked.
Well, the most likely reasons would be:
a) either the change of statistic values was within the threshold that defines when new statistics need to be calculated
or
b) for some reason the automatic update task was not running anymore.
You may check the docu [here|http://maxdb.sap.com/doc/7_6/84/d8d1fb570411d4aa82006094b92fad/frameset.htm] for more on that.
> I also noticed another thing that may be interesting:
>
> In 7.7 setting the lower value to select the algorithm for join optimization to 13 (default 😎 has no impact on the explain time (will test again execution time off working hours).
>
> In 7.8 it has a big impact instead. It takes SECONDS only to do the explain.
>
> Why the two version are so different ? I want to migrate to 7.8 later this summer so this is important to me.
>
> Thank you
There really shouldn't be a difference for this parameters with these two versions.
But for versions up to 7.7 it's always necessary to restart the database in order to activate the new parameter (although it appears to be online changeable).
regards,
Lars
Updating statistics for table PREZZO (the bigger table) and using combined indexes has a good result (still not great however).
>
> This is the explain result
> Query time is now 6 seconds. A big improvement but still 3 times what it was before adding the last joins.
Since adding a table or a predicate to a join always influences the overall join optimization it's fairly possible that the execution time gets worse. Nothing to wonder about here, as the problem the optimizer has to solve becomes way more complex.
> PREZZO table's statistics were two years old, but automatic statistic updates are ON for the db, shouldn't they be refreshed automatically ?
Depends on what threshold was setup for new statistics and how much of the data actually changed.
For joins cardinality estimation is a very important factor and for this, the number of distinct values per column is the relevant statistic. Now even if there's only a small volume of data that has been changed in total, this value might have changed a lot.
So, taking care of important statistics yourself might be a very good idea.
In NetWeaver the autostats feature is not even supported. We take care about the statistics ourself.
The automatic statistics feature is there to suit low profile no-DBA-at-all systems.
BW-like systems like the one you seem to use there never had been the kind of system for which this function was introduced.
regards,
Lars
Why isn't the
code tag
working ?
It works here but not on the above post...
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
94 | |
11 | |
11 | |
10 | |
9 | |
7 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.