on 04-29-2008 7:02 PM
Hi,
Many tools that make ORM (Object Relational Mapping) like Castor, Hibernate, Toplink, JPOX, etc.., have the one table per concrete class feature that maps objects to follow structure:
CREATE TABLE ABSTRACTPRODUCT (
ID VARCHAR(8) NOT NULL,
DESCRIPTION VARCHAR(60) NOT NULL,
PRIMARY KEY(ID)
)
//
CREATE TABLE PRODUCT (
ID VARCHAR(8) NOT NULL REFERENCES ABSTRACTPRODUCT(ID),
CODE VARCHAR(10) NOT NULL,
PRICE DECIMAL(12,2),
PRIMARY KEY(ID)
)
//
CREATE UNIQUE INDEX iProduct ON Product(code)
//
CREATE TABLE BOOK (
ID VARCHAR(8) NOT NULL REFERENCES PRODUCT(ID),
AUTHOR VARCHAR(60) NOT NULL,
PRIMARY KEY (ID)
)
//
CREATE TABLE COMPACTDISK (
ID VARCHAR(8) NOT NULL REFERENCES PRODUCT(ID),
ARTIST VARCHAR(60) NOT NULL,
PRIMARY KEY(ID)
)
//
there is a way to improve queries like
SELECT
pd.code CODE,
abpd.description DESCRIPTION,
DECODE(bk.id,NULL,cd.artist,bk.author) PERSON
FROM
ABSTRACTPRODUCT abpd,
PRODUCT pd,
BOOK bk,
COMPACTDISK cd
WHERE
pd.id = abpd.id AND
bk.id(+) = abpd.id AND
cd.id(+) = abpd.id AND
pd.code like '101%'
or like this:
SELECT
pd.code CODE,
abpd.description DESCRIPTION,
DECODE(bk.id,NULL,cd.artist,bk.author) PERSON
FROM
ABSTRACTPRODUCT abpd,
PRODUCT pd,
BOOK bk,
COMPACTDISK cd
WHERE
pd.id = abpd.id AND
bk.id(+) = abpd.id AND
cd.id(+) = abpd.id AND
abpd.description like '%STARS%' AND
pd.price BETWEEN 1 AND 10
think in a table with many rows, then exists something inside MaxDB to improve this type of queries? like some anotations on SQL? or declare tables that extends another by PK? on other databases i managed this using Materialized Views, but i think that this can be faster just using PK, i'm wrong? the better is to consolidate all tables in one table? what is the impact on database size with this consolidation?
note: with consolidation i will miss NOT NULL constraint at database side.
thanks for any insight.
Clóvis
Hi Clovis,
we don't know yet:
- database version
- execution plan for the queries
- statistics for the tables/indexes
- runtime actual / expected
- database setup
I would propose you check the MaxDB wiki on how to gather the required information [Poor Performance when executing Individual SQL Statements|https://wiki.sdn.sap.com/wiki/x/rmk].
For general information on how to check on performance issues with MaxDB see [Tuning MaxDB|https://wiki.sdn.sap.com/wiki/x/jRI].
Looking at the DDLs is nice, but does only let room for guesswork.
Maybe it would be a good idea to make CODE in the PRODUCT table the primary key (as ID is just a surrogate key without meaning).
Anyhow - we need more information.
Best regards,
Lars
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi, Lars
The Version is: 7.6.03.07
i dont post here the execution plan, statistics and runtime, because its just a sample, i dont have data for these tables, here in production the tables are more complex, but follow that structure with much more columns and rows.
i'm tunning my database and checked that when i have queries like the 2nd case explained in this post, the things goes too slow, the maxdb first walk, master table, then check child table, and the query dont have a good select ratio.
the idea that i think is, if would possible we have "fast join" for two tables with Indexes, since that tables have one primary key and one side have a foreign key to another, something like:
CREATE INDEX iJoinedIndex ON AbstractProduct(description) AND Product(price)
note, that command is just a "dream", but with this we can tell the optimizer how to execute that query better.
regards
Clóvis
Hi Clovis,
without any data in the tables - how will you know what execution plan the optimizer will choose?
This decision is totally dependend on the data and the statistics.
There is just no way to check the execution plan, without data and statistics (this is not a rule based optimizer ... ).
Only-Index-Accesses can and will be used for joins. They are used daily in huge SAP databases.
Why do you think, the optimizer will always first read all the "master table" (I believe you refer to the ABSTRACTPRODUCT table here, do you) ?.
Well - anyhow:
What's the reason here for ABSTRACTPRODUCT at all?
As far as I can see, there is no other table that is "derived" from the ABSTRACTPRODUCT.
So just put the DESCRIPTION field to the PRODUCT table and you've spared one join.
With that it's an easy three table join and for the queries you've given us all the selection criteria are based on the DESCRIPTION, PRICE or the CODE of the PRODUCT table.
PRICE and CODE will surely benefit from beeing indexed.
For DESCRIPTION you'll have to check if it really improves performance to do so.
KR Lars
BTW: usually it's not sensible to try to perform "general" performance tuning.
Better check what is the problem with a specific query - if this problem reoccurs, OK. Then you might have solved several issues at once. But most often it's the tiny differences that makes one query a showstopper and the other blazingly fast.
Hi, Lars,
Maybe you misunderstand what i wish with this post, the idea is to discuss or know how to deal with this type of table structure, the DDL posted here, is just for example, i'm sure you that in another scenario ABSTRACTPRODUCT have more derived tables, i put just one case for us to see the table hierarchy, and think that ABSTRACTPRODUCT have several rows, but to improve our discussion i will add more info below:
1) DDL to create INDEX for description:
CREATE INDEX iAbstractProduct ON AbstractProduct(description)
//
2) script to insert some data:
INSERT INTO ABSTRACTPRODUCT VALUES ('AAAAAAAA','BOOK A')
//
INSERT INTO PRODUCT VALUES('AAAAAAAA','0001',5)
//
INSERT INTO BOOK VALUES('AAAAAAAA','AUTHOR A')
//
INSERT INTO ABSTRACTPRODUCT VALUES ('AAAAAAAB','BOOK B')
//
INSERT INTO PRODUCT VALUES('AAAAAAAB','0002',10)
//
INSERT INTO BOOK VALUES('AAAAAAAB','AUTHOR A')
//
INSERT INTO ABSTRACTPRODUCT VALUES ('AAAAAAAC','BOOK c')
//
INSERT INTO PRODUCT VALUES('AAAAAAAC','0003',15)
//
INSERT INTO BOOK VALUES('AAAAAAAC','AUTHOR A')
//
INSERT INTO ABSTRACTPRODUCT VALUES ('AAAAAAAD','CD A')
//
INSERT INTO PRODUCT VALUES('AAAAAAAD','1001',5)
//
INSERT INTO COMPACTDISK VALUES('AAAAAAAD','BAND A')
//
INSERT INTO ABSTRACTPRODUCT VALUES ('AAAAAAAE','CD B')
//
INSERT INTO PRODUCT VALUES('AAAAAAAE','1002',10)
//
INSERT INTO COMPACTDISK VALUES('AAAAAAAE','BAND A')
//
INSERT INTO ABSTRACTPRODUCT VALUES ('AAAAAAAF','CD C')
//
INSERT INTO PRODUCT VALUES('AAAAAAAF','1003',15)
//
INSERT INTO COMPACTDISK VALUES('AAAAAAAF','BAND A')
//
3) SQL to explain:
SELECT
pd.code CODE,
abpd.description DESCRIPTION,
DECODE(bk.id,NULL,cd.artist,bk.author) PERSON
FROM
ABSTRACTPRODUCT abpd,
PRODUCT pd,
BOOK bk,
COMPACTDISK cd
WHERE
pd.id = abpd.id AND
bk.id(+) = abpd.id AND
cd.id(+) = abpd.id AND
abpd.description like 'BOOK%' AND
pd.price BETWEEN 1 AND 10
//
4) result of explain:
OWNER|TABLENAME |COLUMN_OR_INDEX|STRATEGY |PAGECOUNT
|ABPD | |TABLE SCAN | 1
|PD |ID |JOIN VIA KEY COLUMN | 1
|BK |ID |JOIN VIA KEY COLUMN | 1
|CD |ID |JOIN VIA KEY COLUMN | 1
| | | NO TEMPORARY RESULTS CREATED |
|JDBC_CURSOR_38| | RESULT IS COPIED , COSTVALUE IS| 5
5) result of explain of a production system here:
OWNER|TABLENAME |COLUMN_OR_INDEX |STRATEGY |PAGECOUNT
|TM |ITIPOMOVIMENTO1 |INDEX SCAN | 19
| | |ONLY INDEX ACCESSED |
|MF |IMOVIMENTOFINANCEIROX1|JOIN VIA RANGE OF MULTIPLE INDEXED COL.| 9270
| |MY_SACADO | (USED INDEX COLUMN) |
| |TIPO | (USED INDEX COLUMN) |
|PS |OID |JOIN VIA KEY COLUMN | 1339
|TC |OID |JOIN VIA KEY COLUMN | 5279
|BOL |OID |JOIN VIA KEY COLUMN | 490
| | | NO TEMPORARY RESULTS CREATED |
|JDBC_CURSOR_16| | RESULT IS COPIED , COSTVALUE IS | 1168
the item 1 was created just if the data on table increases, the optmizer can use the index to avoid TABLE SCAN at first line of item 4, i know that used TABLE SCAN because table have low data.
then as i understand, maybe i'm wrong, the MaxDB, (with index or no) will walk on all rows that have reached by expression "abpd.description like 'BOOK%'" and if matches then join with PD and then check the expression "pd.price BETWEEN 1 AND 10" if true, the row is returned, right?
if i'm right then think that table ABSTRACTPRODUCT have 1 million of rows, and the expression "abpd.description like 'BOOK%'" selects 50% of this rows, and expression "pd.price BETWEEN 1 AND 10" selects 50% of rows in table product, but the "join" only returns 100 qualified rows.
the only way to change this scenario, and speed queries that i know, is consolidate all hierarchy to one table, using the strategy, one table per class hierarchy, and create the right index.
see for example the item 5 the explain its like the item 4, but uses more pages, because its from a production system that have data, , this SQL is one that i get from diagnose monitor, and the selectivity is less than 1%
the ways to solve that, i think, without changing table structures, is if MaxDB support Materialized views, or a way to use a "Joined Index" that Optmizer can use to better Walk between joins.
if you have a better example of how hierarchical tables, with several data, and criteria to select few rows, is only achieved after join, please post here, i just used that sample, to have something to discuss this problem, the source that i'm based to make this sample DDL is from http://www.jpox.org/docs/1_1/inheritance.html, that have a graph that can explain better the table hierarchy.
best regards
Clóvis
HI Clovis,
thanks for the Explain plans.
They show exactly what I wanted to tell you: the chosen execution plan depends on the data in the tables.
In your testcase, the tables only have one page each. So the quickest access possible is: read whole tables/indexes.
In the example of the production system you see that there is no tablescan done at all.
Instead an Only-Index-Access is used.
After that the joins via OID are done via the KEY - which is also the fastest join option (perhabs you are aware of the fact, that in MaxDB tables are in fact B*Trees like the indexes).
Anyhow - let's check what happens to your queries, when we load some data:
insert into product
(Select id, 'x' & id, 99.90 from abstractproduct where id not like 'A%')
//
insert into abstractproduct
(Select rowno, 'MY DESC' from tables a, tables b where rowno <100000)
//
update stat *
//
explain SELECT
pd.code CODE,
abpd.description DESCRIPTION,
DECODE(bk.id,NULL,cd.artist,bk.author) PERSON
FROM
ABSTRACTPRODUCT abpd,
PRODUCT pd,
BOOK bk,
COMPACTDISK cd
WHERE
pd.id = abpd.id AND
bk.id(+) = abpd.id AND
cd.id(+) = abpd.id AND
abpd.description like '%STARS%' AND
pd.price BETWEEN 1 AND 10
TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT
ABPD IABSTRACTPRODUCT RANGE CONDITION FOR INDEX 354
ONLY INDEX ACCESSED
DESCRIPTION (USED INDEX COLUMN)
PD ID JOIN VIA KEY COLUMN 394
BK ID JOIN VIA KEY COLUMN 1
TABLE HASHED
CD ID JOIN VIA KEY COLUMN 1
NO TEMPORARY RESULTS CREATED
JDBC_CURSOR_46 RESULT IS COPIED , COSTVALUE IS 526
As you see - you're right At first the index (not the table!) IABSTRACTPRODUCT is scanned for the hits in DESCRIPTION.
Since in all secondary indexes the primary key is stored, the ID is known afterwards, so no table access is necessary here.
With that the product table has to be read for those matching rows to check the prices.
This - as well as the other joins - can be done via primary key access.
So with the current index-structure - this is the optimal access.
Anyhow - the query can be improved. Let's create an index for the price:
create index i_prodprice on product (price)
//
update stat product
//
TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT
PD I_PRODPRICE RANGE CONDITION FOR INDEX 394
PRICE (USED INDEX COLUMN)
ABPD ID JOIN VIA KEY COLUMN 354
BK ID JOIN VIA KEY COLUMN 1
CD ID JOIN VIA KEY COLUMN 1
NO TEMPORARY RESULTS CREATED
JDBC_CURSOR_44 RESULT IS COPIED , COSTVALUE IS 5
>the only way to change this scenario, and speed queries that i know, is consolidate all hierarchy to one table, using the strategy, one table per
> class hierarchy, and create the right index.
As you see - not true
> the ways to solve that, i think, without changing table structures, is if MaxDB support Materialized views, or a way to use a "Joined Index" that
> Optmizer can use to better Walk between joins.
Once you provide the right indexes, the join order will be adopted to get better results. For queries like this, neither MVs nor "Joined Indexes" are necessary.
What's important for MaxDB as well as for all other RDBMS is:
Tuning won't be done in a "general" fashion.
Design the tables/indexes and queries to perform. The data structure that is given to you by some modelling tool is not meant to be fast - it's meant to be "general". That usually leads to performance issues and bringing up database features is the wrong direction to solve them.
Better change the cause for the issue!
KR Lars
Hi, Lars, thanks for your explanations,
but with your solution you still have a low selectivity, walk many rows to return only few rows, the only way to really achieve a better selectivity ratio today in MaxDB is to consolidate all tables in one, i tried to find in another databases what i try to explain here about index join, i think that i found the ideia in:
http://publib.boulder.ibm.com/infocenter/idshelp/v111/topic/com.ibm.ddi.doc/ddi244.htm#sii111003305
the ideia is not to change the DDL for tables, but implement a new feature that where clauses that uses that type of structure, can act as a consolidate table.
best regards.
Clóvis
Hi Clovis,
true - the selectivity does not change here.
You still want to see only a few rows.
But the performance will be increased to a sufficient level.
The concept you wish for does exists in other datbases. E.g. Oracle has this - and for SAP systems this functionality has been disabled...
The point in query tuning cannot be to think up database features that may or may not help to bring up a optimal 100% selectivity for an outer join (which would render it ridiculous).
The goal has to be to bring the response time to the desired level.
What response time do you need? What do you get?
BTW: the prejoined indexes also need to be maintained! Think about it - everytime you insert, update, delete one row in one of the tables, the database has to actually perform the join to set up the index again.
So, I believe I got your point. New features are always nice to play with, but most often they are not the salvation for badly designed databases.
KR Lars
Hi Lars,
Today the query to return first row, reply after 382128ms its about 6 minutes, and the problem explained by the diagnose monitor is the selectivity, well, with our discusion, to continue using MaxDB and have future speed i must consolidate all table hierarchy to one table, as i dont understand the internal structures of MaxDB, how much space will be consumed if i change the DDL to:
CREATE TABLE PRODUCT (
ID VARCHAR(8) NOT NULL,
DESCRIPTION VARCHAR(60) NOT NULL,
CODE VARCHAR(10) NOT NULL,
PRICE DECIMAL(12,2),
AUTHOR VARCHAR(60),
ARTIST VARCHAR(60),
PRIMARY KEY(ID)
)
//
i'm right to think that colums dont used by a Row, dont consume space? then i dont have a great impact on storage?
best regards
Clóvis
> Hi Lars,
Hi again Clovis!
> Today the query to return first row, reply after 382128ms its about 6 minutes, and the problem explained by the diagnose monitor is the selectivity, well, with our discusion, to continue using MaxDB and have future speed i must consolidate all table hierarchy to one table, as i dont understand the internal structures of MaxDB, how much space will be consumed if i change the DDL to:
Hmm... basically if you really want to "consolidate" your table, go ahead. It's your table after all
But until now I don't see how you're approach is going to make anything faster.
You still will need the right indexes to support your queries. If you query for a certain price and that field has no index, the database will have to read all the rows. The same goes for the description.
6 Minutes is really a long time - but it does not tell us, where the time was spent. How many rows should have been returned for that query?
How did the query look like? What looked the involved tables/indexes like?
If you give us the necessary information, we will try to help you with the performance issue. Otherwise it's just guessing around...
You see, the major issue is not whether to denormalize or not. But to see where the time is spend and if it is necessary that the time is spend there.
That cannot be done - as already said - in a general manner.
With the tables and indexes you create the data structures for your application. These have to be fitted to your application.
The one size fits all approach simply does not work here.
>
CREATE TABLE PRODUCT ( > ID VARCHAR(8) NOT NULL, > DESCRIPTION VARCHAR(60) NOT NULL, > CODE VARCHAR(10) NOT NULL, > PRICE DECIMAL(12,2), > AUTHOR VARCHAR(60), > ARTIST VARCHAR(60), > PRIMARY KEY(ID) > ) > //
>
> i'm right to think that colums dont used by a Row, dont consume space? then i dont have a great impact on storage?
Nope, sorry, if you put NULL in a column, it has to be denoted - so it will use some space (but of course not the whole maximum possible size).
Ah - BTW - if I look at your proposal for "consolidation": Why not combine the fields AUTHOR/ARTIST? As far as I've seen these fields have the very same semantic meaning for any product: CREATOR.
If the product is a CD - then it's a ARTIST. If it's a book, then it's an AUTHOR. If you like you can add a PRODUCTTYPE to distinguish both types and create views for BOOKS and CDS and rename the CREATOR field appropriately.
> best regards
>
> Clóvis
KR Lars
Hi Lars,
The SQL statement of above explain is:
SELECT
MF.OID OID,
MF.MY_SACADO MY_SACADO,
MF.MY_CEDENTE MY_CEDENTE,
PS.ABREVIACAO NOME_CEDENTE,
MF.VALOR VALOR,
MF.DATA DATA,
MF.MY_TIPO_MOVIMENTO MY_TIPO_MOVIMENTO,
TM.ID_SYS ID_SYS_TIPO_MOVIMENTO,
MF.SITUACAO SITUACAO,
MF.OBSERVACAO OBSERVACAO,
TC.NUMERO NUMERO,
TC.SUBCLASS SUBCLASS,
TC.TIPO_COBRANCA TIPO_COBRANCA,
TC.DATA_EMISSAO DATA_EMISSAO,
TC.DATA_PAGAMENTO DATA_PAGAMENTO,
TC.DATA_VENCIMENTO DATA_VENCIMENTO,
TC.VLR_PAGO VLR_PAGO,
TC.VLR_MULTA VLR_MULTA,
TC.VLR_JUROS VLR_JUROS,
TC.VLR_DESCONTO VLR_DESCONTO,
BOL.MY_CONFIGURACAO_BOLETO CFG_BOL
FROM MOVIMENTOFINANCEIRO MF,
TITULOCOBRANCA TC,
BOLETO BOL,
PESSOA PS,
TIPOMOVIMENTO TM
WHERE TC.OID = MF.OID AND
BOL.OID(+) = TC.OID AND
MF.MY_SACADO = 'AAAAAZAT' AND
PS.OID = MF.MY_CEDENTE AND
TM.OID = MF.MY_TIPO_MOVIMENTO AND
MF.SITUACAO BETWEEN '0' AND '7' AND
MF.TIPO = 'CE' AND
TC.DATA_VENCIMENTO BETWEEN '2008-01-01 00:00:00.000' AND '2008-05-03 23:59:59.000'
ORDER BY TC.DATA_VENCIMENTO, TC.NUMERO
explanation
1) all OID columns is the primary key.
2) the table Boleto extends TituloCobranca that extends MovimentoFinanceiro. (joined by OID fields)
3) the execution time is 331140 ms and returns 2033 rows
4) the count of MovimentoFinanceiro with above where clause is 22662 rows.
5) the count of MovimentoFinanceiro without where is 821021 rows.
6) the count of TituloCobranca with above where clause is 52271 rows. (this table have a index on data_Vencimento field).
7) the count of TituloCobranca without where is 390493.
note: i post the Product idea just for example, and to have an English example, this table is the real table running in my application and i dont know how to translate to a English like, but this maybe the better translation:
MovimentoFinanceiro = Financial movement
TituloCobranca = Receivables for collection
Boleto = Bank payment slip
TipoMovimento = Type of Movement
Pessoa = Person
then what is your suggestion to make this execute fast?
KR
Clóvis
Hi Clovis,
Ok, let's see.
We don't have the exact statistics for all tables, we don't know about the index-structure and we don't have the execution plan.
Not too much to discuss here...
Anyhow, are there indexes for these conditions in place?
- MF.MY_SACADO = 'AAAAAZAT'
- MF.MY_CEDENTE
- MF.MY_TIPO_MOVIMENTO
- MF.SITUACAO BETWEEN '0' AND '7'
- MF.TIPO = 'CE' AND
Specifically an index on TC (DATA_VENCIMENTO, NUMERO) might be helpful.
How many different TIPO_MOVIMENTO entries are there?
It might be a good idea to make the TIPO the primary key of the table TipoMovimento. That way you can save a join to the table here.
The same question goes for CFG_BOL - could this be a prim key on BOLETO?
for that query a index on MF (OID, MY_SACADO, MY_TIPO_MOVIMENTO, SITUACAO) may be helpful
(if the TIPO is selectable via the primkey, include it into the index after OID).
That way, the join could be started at TC (52.271 rows), joined against the MF (22.662 rows left)
Did you update the statistics for the tables? Did you update statistics for the join columns?
(UPDATE STAT ) (UPDATE STAT COLUMN () FOR <tablename>)
What does the application exactly do with those 2033 rows? If some kind of aggregration should be done, than it might be worth to consider doing it in the database.
Is the ORDER BY really needed?
Due to the lack of further data I can only give you these general hints.
KR Lars
Hi Lars,
i will reply on every question.
> Hi Clovis,
>
> Ok, let's see.
> We don't have the exact statistics for all tables, we don't know about the index-structure and we don't have the execution plan.
execution plan
OWNER TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT TM ITIPOMOVIMENTO1 INDEX SCAN 22 ONLY INDEX ACCESSED MF IMOVIMENTOFINANCEIROX1 JOIN VIA RANGE OF MULTIPLE INDEXED COL. 9270 MY_SACADO (USED INDEX COLUMN) TIPO (USED INDEX COLUMN) PS OID JOIN VIA KEY COLUMN 1339 TC OID JOIN VIA KEY COLUMN 6509 BOL OID JOIN VIA KEY COLUMN 490 NO TEMPORARY RESULTS CREATED JDBC_CURSOR_17 RESULT IS COPIED , COSTVALUE IS 1234
indexes of MovimentoFinanceiro
INDEXNAME COLUMNNAME SORT COLUMNNO DATATYPE LEN INDEX_USED FILESTATE DISABLED IMOVIMENTOFINANCEIROX1 MY_SACADO ASC 1 CHAR 8 448025 OK NO IMOVIMENTOFINANCEIROX1 TIPO ASC 2 CHAR 2 448025 OK NO IMOVIMENTOFINANCEIROX1 DATA ASC 3 TIMESTAMP 20 448025 OK NO IMOVIMENTOFINANCEIROX2 MY_CEDENTE ASC 1 CHAR 8 154709 OK NO IMOVIMENTOFINANCEIROX2 TIPO ASC 2 CHAR 2 154709 OK NO IMOVIMENTOFINANCEIROX2 DATA ASC 3 TIMESTAMP 20 154709 OK NO IMOVIMENTOFINANCEIRO3 MY_CEDENTE ASC 1 CHAR 8 9770 OK NO IMOVIMENTOFINANCEIRO3 MY_SACADO ASC 2 CHAR 8 9770 OK NO IMOVIMENTOFINANCEIRO3 TIPO ASC 3 CHAR 2 9770 OK NO IMOVIMENTOFINANCEIRO3 SITUACAO ASC 4 CHAR 1 9770 OK NO
indexes of TituloCobranca
INDEXNAME COLUMNNAME SORT COLUMNNO DATATYPE LEN INDEX_USED FILESTATE DISABLED ITITULOCOBRANCA DATA_EMISSAO ASC 1 TIMESTAMP 20 1310 OK NO ITITULOCOBRANCA1 DATA_VENCIMENTO ASC 1 TIMESTAMP 20 28696 OK NO ITITULOCOBRANCA2 DATA_PAGAMENTO ASC 1 TIMESTAMP 20 128 OK NO ITITULOCOBRANCA3 NUMERO ASC 1 CHAR 12 35861 OK NO
OPTIMIZERSTATISTICS of MovimentoFinanceiro
TABLENAME INDEXNAME COLUMNNAME DISTINCTVALUES PAGECOUNT MOVIMENTOFINANCEIRO ? DATA 2666 ? MOVIMENTOFINANCEIRO ? MY_CEDENTE 12643 ? MOVIMENTOFINANCEIRO ? MY_SACADO 72828 ? MOVIMENTOFINANCEIRO ? MY_TIPO_MOVIMENTO 522 ? MOVIMENTOFINANCEIRO ? OID 675349 ? MOVIMENTOFINANCEIRO ? TIPO 8 ? MOVIMENTOFINANCEIRO IMOVIMENTOFINANCEIROX1 ? ? 3926 MOVIMENTOFINANCEIRO IMOVIMENTOFINANCEIROX2 ? ? 2751 MOVIMENTOFINANCEIRO IMOVIMENTOFINANCEIRO3 ? ? 2282 MOVIMENTOFINANCEIRO ? TABLE STATISTICS 675349 9270
OPTIMIZERSTATISTICS of TituloCobranca
TABLENAME INDEXNAME COLUMNNAME DISTINCTVALUES PAGECOUNT TITULOCOBRANCA ? DATA_EMISSAO 2454 ? TITULOCOBRANCA ? DATA_PAGAMENTO 2076 ? TITULOCOBRANCA ? DATA_VENCIMENTO 2933 ? TITULOCOBRANCA ? NUMERO 207798 ? TITULOCOBRANCA ? OID 390457 ? TITULOCOBRANCA ITITULOCOBRANCA2 ? ? 902 TITULOCOBRANCA ITITULOCOBRANCA ? ? 895 TITULOCOBRANCA ITITULOCOBRANCA1 ? ? 885 TITULOCOBRANCA ITITULOCOBRANCA3 ? ? 1440 TITULOCOBRANCA ? TABLE STATISTICS 390457 6509
files and roots of MovimentoFinanceiro
TYPE TABLENAME INDEXNAME ENTRYCOUNT TREEINDEXSIZE TREELEAVESSIZE LOBSIZE INDEX MOVIMENTOFINANCEIRO IMOVIMENTOFINANCEIRO3 144571 1008 18264 0 TABLE MOVIMENTOFINANCEIRO ? 821639 336 91832 0 INDEX MOVIMENTOFINANCEIRO IMOVIMENTOFINANCEIROX1 374533 160 31408 0 INDEX MOVIMENTOFINANCEIRO IMOVIMENTOFINANCEIROX2 204274 120 22008 0
files and roots of TituloCobranca
TYPE TABLENAME INDEXNAME ENTRYCOUNT TREEINDEXSIZE TREELEAVESSIZE LOBSIZE TABLE TITULOCOBRANCA ? 390857 92 52128 0 INDEX TITULOCOBRANCA ITITULOCOBRANCA2 2076 816 7216 0 INDEX TITULOCOBRANCA ITITULOCOBRANCA 2455 968 7160 0 INDEX TITULOCOBRANCA ITITULOCOBRANCA1 2933 608 7080 0 INDEX TITULOCOBRANCA ITITULOCOBRANCA3 208003 64 11520 0
if i miss something, please tell me.
> Not too much to discuss here...
>
> Anyhow, are there indexes for these conditions in place?
> - MF.MY_SACADO = 'AAAAAZAT'
> - MF.MY_CEDENTE
> - MF.MY_TIPO_MOVIMENTO
> - MF.SITUACAO BETWEEN '0' AND '7'
> - MF.TIPO = 'CE' AND
>
not in this exact sequence, see above
> Specifically an index on TC (DATA_VENCIMENTO, NUMERO) might be helpful.
>
today have a index for tc.data_vencimento
> * How many different TIPO_MOVIMENTO entries are there?
the table TipoMovimento have 1029 rows, and the SQL have just only 125 different references.
> It might be a good idea to make the TIPO the primary key of the table TipoMovimento. That way you can save a join to the table here.
Tipo, because? since the table have the primary key OID ? i dont know because what the primary key OID isnt used for table TipoMovimento, since the where clause is TM.OID = MF.MY_TIPO_MOVIMENTO and the TM.OID is the primary key.
>
> * The same question goes for CFG_BOL - could this be a prim key on BOLETO?
no, all my primary keys is OID.
>
> * for that query a index on MF (OID, MY_SACADO, MY_TIPO_MOVIMENTO, SITUACAO) may be helpful
> (if the TIPO is selectable via the primkey, include it into the index after OID).
i dont understand because what a index with primary key as the first column may increase the speed of query, since this dont create any sequential range, you can explain me?
>
> That way, the join could be started at TC (52.271 rows), joined against the MF (22.662 rows left)
>
> * Did you update the statistics for the tables? Did you update statistics for the join columns?
> (UPDATE STAT ) (UPDATE STAT COLUMN () FOR <tablename>)
>
i'm using auto update statistics this dont works?
> * What does the application exactly do with those 2033 rows? If some kind of aggregration should be done, than it might be worth to consider doing it in the database.
its used to show a report, detailed, and after the group of Data_Vencimento, have a sum of the values.
>
> * Is the ORDER BY really needed?
its used to group rows for report internal calculations.
>
> Due to the lack of further data I can only give you these general hints.
i think that now i post the required data, if i miss something, please tell me, and i post the required data.
>
> KR Lars
regards
Clóvis
HI Clovis,
not a easy one here, that's for sure ...
Anyhow...
The problem seems to be that the starting table is not a good choice here.
At the first step a index of TM is completely read to obtain a list of TM.OID and TM.ID_SYS. This list is then 1029 entries long.
Question: Is ID_SYS also UNIQUE?
If this would be the primary key of TM instead of OID, then you would have the values already in MF.MY_TIPO_MOVIMENTO and save a join.
(* sorry, I mixed up the field in my last reply *)
To join the intermediate list of 1029 rows, the MF table is accessed but only TIPO and MY_SACADO can be used for filtering the rows on this index access. For further processing the table has to be read as well.
So for this join step, all rows of MF that fit to MY_SACADO = 'AAAAAZAT' and MF.TIPO = 'CE' are read and compared to MF.MY_TIPO_MOVIMENTO= TM.OID.
If there would be an index including MY_TIPO_MOVIMENTO the join could be done more efficient here.
The next join is a kind of lookup to the persons table - should be ok like it is.
But then the join to the TC table is not too nice again, as the range on the time condition is not exploited.
Since the join condition is MF.OID = TC.OID, the optimizer uses the KEY to perform the join.
If there would be an index like (ID, DATA_VENCIMENTO) than it might be taken as it reduces the list of fitting rows earlier.
Concerning the aggregation of data in your application: I would try to leave this to the database as it might optimize the access further.
Since the tables are not that big I'm wondering how your instance is setup. Could you post some detaiils on the machine config (CPUs, Cache size, No. of Volumes, Size of Volums, OPTIM*-Parameters)? Somehow the execution time does not fit what I would expect from such a plan.
Did you use the DBAnalyzer already for bottleneck analysis?
Would it be possible for you to export the tables and send the data to me so that I can reproduce the statement in-house at SAP?
It's quite difficult to try to tune a statement just 'on paper'.
best Regards,
Lars
Hi Lars,
i dont understand because the optimizer get that Index for TM at execution plan, and because dont use the join via KEY column, note the WHERE clause is "TM.OID = MF.MY_TIPO_MOVIMENTO" by the key column, and the optimizer uses an INDEX that the indexed column is ID_SYS, that isnt and cant be a primary key, because its not UNIQUE, follow the index columns:
indexes of TipoMovimento
INDEXNAME COLUMNNAME SORT COLUMNNO DATATYPE LEN INDEX_USED FILESTATE DISABLED
ITIPOMOVIMENTO TIPO ASC 1 VARCHAR 2 220546 OK NO
ITIPOMOVIMENTO ID_SYS ASC 2 CHAR 6 220546 OK NO
ITIPOMOVIMENTO MY_CONTA_DEBITO ASC 3 CHAR 8 220546 OK NO
ITIPOMOVIMENTO MY_CONTA_CREDITO ASC 4 CHAR 8 220546 OK NO
ITIPOMOVIMENTO1 ID_SYS ASC 1 CHAR 6 567358 OK NO
ITIPOMOVIMENTO2 DESCRICAO ASC 1 VARCHAR 60 94692 OK NO
after i create the index iTituloCobrancaX7 on TituloCobranca(OID,DATA_VENCIMENTO) in a backup instance and get surprised with the follow explain:
OWNER TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT
TC ITITULOCOBRANCA1 RANGE CONDITION FOR INDEX 5368
DATA_VENCIMENTO (USED INDEX COLUMN)
MF OID JOIN VIA KEY COLUMN 9427
TM OID JOIN VIA KEY COLUMN 22
TABLE HASHED
PS OID JOIN VIA KEY COLUMN 1350
BOL OID JOIN VIA KEY COLUMN 497
NO TEMPORARY RESULTS CREATED
JDBC_CURSOR_19 RESULT IS COPIED , COSTVALUE IS 988
note that now the optimizer gets the index ITITULOCOBRANCA1 as i expected, if i drop the new index iTituloCobrancaX7 the optimizer still getting this execution plan, with this the query executes at 110 ms, with that great news i do same thing in the production system, but the execution plan dont changes, and i still getting a long execution time this time at 413516 ms. maybe the problem is how optimizer measure my tables.
i checked in DBAnalyser that the problem is catalog cache hit rate (we discussed this at [catalog cache hit rate, how to increase?|;
) and the low selectivity of this SQL command, then its because of this that to achieve a better selectivity i must have an index with, MF.MY_SACADO, MF.TIPO and TC.DATA_VENCIMENTO, as explained in previous posts, since this type of index inside MaxDB isnt possible, i have no choice to speed this type of query without changing tables structure.
MaxDB developers can develop this type of index? or a feature like this dont have any plans to be made?
if no, i must create another schema, to consolidate tables to speed queries on my system, but with this consolidation i will get more overhead, i must solve the less selectivity because i think if the data on tables increase, the query becomes impossible, i see that CREATE INDEX supports FUNCTION, maybe a FUNCTION that join data of two tables can solve this?
about instance configuration it is:
Machine:
Version: '64BIT Kernel'
Version: 'X64/LIX86 7.6.03 Build 007-123-157-515'
Version: 'FAST'
Machine: 'x86_64'
Processors: 2 ( logical: 8, cores: 8 )
data volumes:
ID MODE CONFIGUREDSIZE USABLESIZE USEDSIZE USEDSIZEPERCENTAGE DROPVOLUME TOTALCLUSTERAREASIZE RESERVEDCLUSTERAREASIZE USEDCLUSTERAREASIZE PATH
1 NORMAL 4194304 4194288 379464 9 NO 0 0 0 /db/SPDT/data/data01.dat
2 NORMAL 4194304 4194288 380432 9 NO 0 0 0 /db/SPDT/data/data02.dat
3 NORMAL 4194304 4194288 379184 9 NO 0 0 0 /db/SPDT/data/data03.dat
4 NORMAL 4194304 4194288 379624 9 NO 0 0 0 /db/SPDT/data/data04.dat
5 NORMAL 4194304 4194288 380024 9 NO 0 0 0 /db/SPDT/data/data05.dat
log volumes:
ID CONFIGUREDSIZE USABLESIZE PATH MIRRORPATH
1 51200 51176 /db/SPDT/log/log01.dat ?
parameters:
KERNELVERSION KERNEL 7.6.03 BUILD 007-123-157-515
INSTANCE_TYPE OLTP
MCOD NO
_SERVERDB_FOR_SAP YES
_UNICODE NO
DEFAULT_CODE ASCII
DATE_TIME_FORMAT ISO
CONTROLUSERID DBM
CONTROLPASSWORD
MAXLOGVOLUMES 2
MAXDATAVOLUMES 11
LOG_VOLUME_NAME_001 /db/SPDT/log/log01.dat
LOG_VOLUME_TYPE_001 F
LOG_VOLUME_SIZE_001 6400
DATA_VOLUME_NAME_0005 /db/SPDT/data/data05.dat
DATA_VOLUME_NAME_0004 /db/SPDT/data/data04.dat
DATA_VOLUME_NAME_0003 /db/SPDT/data/data03.dat
DATA_VOLUME_NAME_0002 /db/SPDT/data/data02.dat
DATA_VOLUME_NAME_0001 /db/SPDT/data/data01.dat
DATA_VOLUME_TYPE_0005 F
DATA_VOLUME_TYPE_0004 F
DATA_VOLUME_TYPE_0003 F
DATA_VOLUME_TYPE_0002 F
DATA_VOLUME_TYPE_0001 F
DATA_VOLUME_SIZE_0005 524288
DATA_VOLUME_SIZE_0004 524288
DATA_VOLUME_SIZE_0003 524288
DATA_VOLUME_SIZE_0002 524288
DATA_VOLUME_SIZE_0001 524288
DATA_VOLUME_MODE_0005 NORMAL
DATA_VOLUME_MODE_0004 NORMAL
DATA_VOLUME_MODE_0003 NORMAL
DATA_VOLUME_MODE_0002 NORMAL
DATA_VOLUME_MODE_0001 NORMAL
DATA_VOLUME_GROUPS 1
LOG_BACKUP_TO_PIPE NO
MAXBACKUPDEVS 2
LOG_MIRRORED NO
MAXVOLUMES 14
LOG_IO_BLOCK_COUNT 8
DATA_IO_BLOCK_COUNT 64
BACKUP_BLOCK_CNT 64
_DELAY_LOGWRITER 0
LOG_IO_QUEUE 50
_RESTART_TIME 600
MAXCPU 8
MAX_LOG_QUEUE_COUNT 0
USED_MAX_LOG_QUEUE_COUNT 8
LOG_QUEUE_COUNT 1
MAXUSERTASKS 500
_TRANS_RGNS 8
_TAB_RGNS 8
_OMS_REGIONS 0
_OMS_RGNS 7
OMS_HEAP_LIMIT 0
OMS_HEAP_COUNT 8
OMS_HEAP_BLOCKSIZE 10000
OMS_HEAP_THRESHOLD 100
OMS_VERS_THRESHOLD 2097152
HEAP_CHECK_LEVEL 0
_ROW_RGNS 8
RESERVEDSERVERTASKS 16
MINSERVERTASKS 28
MAXSERVERTASKS 28
_MAXGARBAGE_COLL 1
_MAXTRANS 4008
MAXLOCKS 120080
_LOCK_SUPPLY_BLOCK 100
DEADLOCK_DETECTION 4
SESSION_TIMEOUT 180
OMS_STREAM_TIMEOUT 30
REQUEST_TIMEOUT 5000
_IOPROCS_PER_DEV 2
_IOPROCS_FOR_PRIO 0
_IOPROCS_FOR_READER 0
_USE_IOPROCS_ONLY NO
_IOPROCS_SWITCH 2
LRU_FOR_SCAN NO
_PAGE_SIZE 8192
_PACKET_SIZE 131072
_MINREPLY_SIZE 4096
_MBLOCK_DATA_SIZE 32768
_MBLOCK_QUAL_SIZE 32768
_MBLOCK_STACK_SIZE 32768
_MBLOCK_STRAT_SIZE 16384
_WORKSTACK_SIZE 8192
_WORKDATA_SIZE 8192
_CAT_CACHE_MINSIZE 262144
CAT_CACHE_SUPPLY 131072
INIT_ALLOCATORSIZE 262144
ALLOW_MULTIPLE_SERVERTASK_UKTS NO
_TASKCLUSTER_01 tw;al;ut;2000*sv,100*bup;10*ev,10*gc;
_TASKCLUSTER_02 ti,100*dw;63*us;
_TASKCLUSTER_03 equalize
_DYN_TASK_STACK NO
_MP_RGN_QUEUE YES
_MP_RGN_DIRTY_READ DEFAULT
_MP_RGN_BUSY_WAIT DEFAULT
_MP_DISP_LOOPS 2
_MP_DISP_PRIO DEFAULT
MP_RGN_LOOP -1
_MP_RGN_PRIO DEFAULT
MAXRGN_REQUEST -1
_PRIO_BASE_U2U 100
_PRIO_BASE_IOC 80
_PRIO_BASE_RAV 80
_PRIO_BASE_REX 40
_PRIO_BASE_COM 10
_PRIO_FACTOR 80
_DELAY_COMMIT NO
_MAXTASK_STACK 512
MAX_SERVERTASK_STACK 500
MAX_SPECIALTASK_STACK 500
_DW_IO_AREA_SIZE 50
_DW_IO_AREA_FLUSH 50
FBM_VOLUME_COMPRESSION 50
FBM_VOLUME_BALANCE 10
_FBM_LOW_IO_RATE 10
CACHE_SIZE 262144
_DW_LRU_TAIL_FLUSH 25
XP_DATA_CACHE_RGNS 0
_DATA_CACHE_RGNS 64
XP_CONVERTER_REGIONS 0
CONVERTER_REGIONS 8
XP_MAXPAGER 0
MAXPAGER 64
SEQUENCE_CACHE 1
_IDXFILE_LIST_SIZE 2048
VOLUMENO_BIT_COUNT 8
OPTIM_MAX_MERGE 500
OPTIM_INV_ONLY YES
OPTIM_CACHE NO
OPTIM_JOIN_FETCH 0
JOIN_SEARCH_LEVEL 0
JOIN_MAXTAB_LEVEL4 16
JOIN_MAXTAB_LEVEL9 5
_READAHEAD_BLOBS 32
CLUSTER_WRITE_THRESHOLD 80
CLUSTERED_LOBS NO
RUNDIRECTORY /var/opt/sdb/data/wrk/SPDT
OPMSG1 /dev/console
OPMSG2 /dev/null
_KERNELDIAGFILE knldiag
KERNELDIAGSIZE 800
_EVENTFILE knldiag.evt
_EVENTSIZE 0
_MAXEVENTTASKS 2
_MAXEVENTS 100
_KERNELTRACEFILE knltrace
TRACE_PAGES_TI 2
TRACE_PAGES_GC 20
TRACE_PAGES_LW 5
TRACE_PAGES_PG 3
TRACE_PAGES_US 10
TRACE_PAGES_UT 5
TRACE_PAGES_SV 5
TRACE_PAGES_EV 2
TRACE_PAGES_BUP 0
KERNELTRACESIZE 5369
EXTERNAL_DUMP_REQUEST NO
_AK_DUMP_ALLOWED YES
_KERNELDUMPFILE knldump
_RTEDUMPFILE rtedump
_UTILITY_PROTFILE dbm.utl
UTILITY_PROTSIZE 100
_BACKUP_HISTFILE dbm.knl
_BACKUP_MED_DEF dbm.mdf
_MAX_MESSAGE_FILES 0
_SHMKERNEL 44601
__PARAM_CHANGED___ 0
__PARAM_VERIFIED__ 2008-05-03 23:12:55
DIAG_HISTORY_NUM 2
DIAG_HISTORY_PATH /var/opt/sdb/data/wrk/SPDT/DIAGHISTORY
_DIAG_SEM 1
SHOW_MAX_STACK_USE NO
SHOW_MAX_KB_STACK_USE NO
LOG_SEGMENT_SIZE 2133
_COMMENT
SUPPRESS_CORE YES
FORMATTING_MODE PARALLEL
FORMAT_DATAVOLUME YES
OFFICIAL_NODE
UKT_CPU_RELATIONSHIP NONE
HIRES_TIMER_TYPE CPU
LOAD_BALANCING_CHK 30
LOAD_BALANCING_DIF 10
LOAD_BALANCING_EQ 5
HS_STORAGE_DLL libhsscopy
HS_SYNC_INTERVAL 50
USE_OPEN_DIRECT YES
USE_OPEN_DIRECT_FOR_BACKUP NO
SYMBOL_DEMANGLING NO
EXPAND_COM_TRACE NO
JOIN_TABLEBUFFER 128
SET_VOLUME_LOCK YES
SHAREDSQL YES
SHAREDSQL_CLEANUPTHRESHOLD 25
SHAREDSQL_COMMANDCACHESIZE 262144
MEMORY_ALLOCATION_LIMIT 0
USE_SYSTEM_PAGE_CACHE YES
USE_COROUTINES YES
FORBID_LOAD_BALANCING YES
MIN_RETENTION_TIME 60
MAX_RETENTION_TIME 480
MAX_SINGLE_HASHTABLE_SIZE 512
MAX_HASHTABLE_MEMORY 5120
ENABLE_CHECK_INSTANCE YES
RTE_TEST_REGIONS 0
HASHED_RESULTSET YES
HASHED_RESULTSET_CACHESIZE 262144
CHECK_HASHED_RESULTSET 0
AUTO_RECREATE_BAD_INDEXES NO
AUTHENTICATION_ALLOW
AUTHENTICATION_DENY
TRACE_AK NO
TRACE_DEFAULT NO
TRACE_DELETE NO
TRACE_INDEX NO
TRACE_INSERT NO
TRACE_LOCK NO
TRACE_LONG NO
TRACE_OBJECT NO
TRACE_OBJECT_ADD NO
TRACE_OBJECT_ALTER NO
TRACE_OBJECT_FREE NO
TRACE_OBJECT_GET NO
TRACE_OPTIMIZE NO
TRACE_ORDER NO
TRACE_ORDER_STANDARD NO
TRACE_PAGES NO
TRACE_PRIMARY_TREE NO
TRACE_SELECT NO
TRACE_TIME NO
TRACE_UPDATE NO
TRACE_STOP_ERRORCODE 0
TRACE_ALLOCATOR 0
TRACE_CATALOG 0
TRACE_CLIENTKERNELCOM 0
TRACE_COMMON 0
TRACE_COMMUNICATION 0
TRACE_CONVERTER 0
TRACE_DATACHAIN 0
TRACE_DATACACHE 0
TRACE_DATAPAM 0
TRACE_DATATREE 0
TRACE_DATAINDEX 0
TRACE_DBPROC 0
TRACE_FBM 0
TRACE_FILEDIR 0
TRACE_FRAMECTRL 0
TRACE_IOMAN 0
TRACE_IPC 0
TRACE_JOIN 0
TRACE_KSQL 0
TRACE_LOGACTION 0
TRACE_LOGHISTORY 0
TRACE_LOGPAGE 0
TRACE_LOGTRANS 0
TRACE_LOGVOLUME 0
TRACE_MEMORY 0
TRACE_MESSAGES 0
TRACE_OBJECTCONTAINER 0
TRACE_OMS_CONTAINERDIR 0
TRACE_OMS_CONTEXT 0
TRACE_OMS_ERROR 0
TRACE_OMS_FLUSHCACHE 0
TRACE_OMS_INTERFACE 0
TRACE_OMS_KEY 0
TRACE_OMS_KEYRANGE 0
TRACE_OMS_LOCK 0
TRACE_OMS_MEMORY 0
TRACE_OMS_NEWOBJ 0
TRACE_OMS_SESSION 0
TRACE_OMS_STREAM 0
TRACE_OMS_VAROBJECT 0
TRACE_OMS_VERSION 0
TRACE_PAGER 0
TRACE_RUNTIME 0
TRACE_SHAREDSQL 0
TRACE_SQLMANAGER 0
TRACE_SRVTASKS 0
TRACE_SYNCHRONISATION 0
TRACE_SYSVIEW 0
TRACE_TABLE 0
TRACE_VOLUME 0
CHECK_BACKUP NO
CHECK_DATACACHE NO
CHECK_KB_REGIONS NO
CHECK_LOCK NO
CHECK_LOCK_SUPPLY NO
CHECK_REGIONS NO
CHECK_TASK_SPECIFIC_CATALOGCACHE NO
CHECK_TRANSLIST NO
CHECK_TREE NO
CHECK_TREE_LOCKS NO
CHECK_COMMON 0
CHECK_CONVERTER 0
CHECK_DATAPAGELOG 0
CHECK_DATAINDEX 0
CHECK_FBM 0
CHECK_IOMAN 0
CHECK_LOGHISTORY 0
CHECK_LOGPAGE 0
CHECK_LOGTRANS 0
CHECK_LOGVOLUME 0
CHECK_SRVTASKS 0
OPTIMIZE_AGGREGATION YES
OPTIMIZE_FETCH_REVERSE YES
OPTIMIZE_STAR_JOIN YES
OPTIMIZE_JOIN_ONEPHASE YES
OPTIMIZE_JOIN_OUTER YES
OPTIMIZE_MIN_MAX YES
OPTIMIZE_FIRST_ROWS YES
OPTIMIZE_OPERATOR_JOIN YES
OPTIMIZE_JOIN_HASHTABLE YES
OPTIMIZE_JOIN_HASH_MINIMAL_RATIO 1
OPTIMIZE_OPERATOR_JOIN_COSTFUNC YES
OPTIMIZE_JOIN_PARALLEL_MINSIZE 1000000
OPTIMIZE_JOIN_PARALLEL_SERVERS 0
OPTIMIZE_JOIN_OPERATOR_SORT YES
OPTIMIZE_QUAL_ON_INDEX YES
DDLTRIGGER YES
SUBTREE_LOCKS NO
MONITOR_READ 2147483647
MONITOR_TIME 2147483647
MONITOR_SELECTIVITY 0
MONITOR_ROWNO 0
CALLSTACKLEVEL 0
OMS_RUN_IN_UDE_SERVER NO
OPTIMIZE_QUERYREWRITE OPERATOR
TRACE_QUERYREWRITE 0
CHECK_QUERYREWRITE 0
PROTECT_DATACACHE_MEMORY NO
LOCAL_REDO_LOG_BUFFER_SIZE 0
FILEDIR_SPINLOCKPOOL_SIZE 10
TRANS_HISTORY_SIZE 0
TRANS_THRESHOLD_VALUE 60
ENABLE_SYSTEM_TRIGGERS YES
DBFILLINGABOVELIMIT 70L80M85M90H95H96H97H98H99H
DBFILLINGBELOWLIMIT 70L80L85L90L95L
LOGABOVELIMIT 50L75L90M95M96H97H98H99H
AUTOSAVE 1
BACKUPRESULT 1
CHECKDATA 1
EVENT 1
ADMIN 1
ONLINE 1
UPDSTATWANTED 1
OUTOFSESSIONS 3
ERROR 3
SYSTEMERROR 3
DATABASEFULL 1
LOGFULL 1
LOGSEGMENTFULL 1
STANDBY 1
USESELECTFETCH YES
USEVARIABLEINPUT NO
UPDATESTAT_PARALLEL_SERVERS 0
UPDATESTAT_SAMPLE_ALGO 1
SIMULATE_VECTORIO IF_OPEN_DIRECT_OR_RAW_DEVICE
COLUMNCOMPRESSION YES
TIME_MEASUREMENT NO
CHECK_TABLE_WIDTH NO
MAX_MESSAGE_LIST_LENGTH 100
SYMBOL_RESOLUTION YES
PREALLOCATE_IOWORKER NO
CACHE_IN_SHARED_MEMORY NO
INDEX_LEAF_CACHING 2
NO_SYNC_TO_DISK_WANTED NO
SPINLOCK_LOOP_COUNT 30000
SPINLOCK_BACKOFF_BASE 1
SPINLOCK_BACKOFF_FACTOR 2
SPINLOCK_BACKOFF_MAXIMUM 64
ROW_LOCKS_PER_TRANSACTION 50
USEUNICODECOLUMNCOMPRESSION NO
about send you the data from tables, i dont have permission to do that, since all data is in a production system, the customer dont give me the rights to send any information. sorry about that.
best regards
Clóvis
Hmmm... a hash join is quite Ok there.
But why the DATA_VENCIMENTO is not taken on the production system is weired.
Please execute this command for each of the tables for once:
UPDATE STAT COLUMN (*) FOR <tablename>
Does the plan change afterwards?
Some general other remarks:
If possible you should change the parameter SIMULATE_VECTORIO to NEVER.
For LOG_QUEUE_COUNT better set 0 - so MaxDB automatically creates the optimal number of LOG Queues.
Depending on the number of waits you get for writing into the log you may want to increase LOG_IO_QUEUE to say 1000.
HASHED_RESULTSET should be better set to NO.
Try to set the following parameters to larger values:
MAX_HASHTABLE_MEMORY = 24000 KB
MAX_SINGLE_HASHTABLE_SIZE = 4000 KB
BTW:
LOG_MIRRORED NO ...
Do you have a RAID 1+0 for your log volume?
If there is no mirroring for the log in place, than you should better activate the MaxDB log mirroring for a bit more backup security.
Concerning your development request:
As already explained I don't see that this would work well for the MaxDB data structure.
For queries like yours a better approach would be to use multiple single indexes and merge primary key lists from these indexes.
Anyhow - the problem with your query is more the data modell:
Since MF is the only connection between all tables all joins that are done after the visit to MF can not make use of the restrictions on the following tables.
Right now I would propose to check on why the access path on the production system does not choose TM as entry table.
KR Lars
Hi Lars,
UPDATE STATS dont work, still same plan, i dropped that index, then optimizer get the UNIQUE index for TM, iow, still getting the wrong path, the only way to avoid the use of TM table in the execution plan is to remove that table from SQL query, but if i remove, the TC table still dont enter as entry path, the table choose by optimizer is the MF, its a strange behavior, i cant understand what is the reason and dont know how to debug how optimizer choose this.
the parameters i will change today night, the database in production system is RAID 1 hardware, for log, data and operating system.
about a feature like Joined Index, what is the right place to make a proposal of this?
KR
Clóvis
Hi Clovis
UPDATE STATS dont work, still same plan, i dropped that index, then optimizer get the UNIQUE index for TM, iow, still getting the wrong path, the only way to avoid the use of TM table in the execution plan is to remove that table from SQL query, but if i remove, the TC table still dont enter as entry path, the table choose by optimizer is the MF, its a strange behavior, i cant understand what is the reason and dont know how to debug how optimizer choose this.
Hmm... debugging the optimizer is no option for anybody outside the optimizer development team - really, the trace is not comparable...
Anyhow - we're not done yet ...
I reviewed the query and found that BOL can actually also joined to MF directly - is this correct?
The connection MF : TC must be a 1:1..n connection for this - is this the case?
That given, it might pay off to join BOL directly to MF instead or in addition to the join TC:MF.
Moreover please create the index on MF (OID, MY_SACADO, MY_TIPO_MOVIMENTO, SITUACAO) I already proposed.
If the optimizer decides to start with TC it can directly join via this index and apply the restrictions from your where clause.
This should reduce the number of rows to be checked as well. If OID is not in the first place here - the index cannot be used for the join transition.
Also I'd like you to execute a "EXPLAIN JOIN" on the statement and post the output. Perhabs we can see the reason for the optimizer decision then.
> about a feature like Joined Index, what is the right place to make a proposal of this?
You're already in the very right place with that here.
The developers, the development support as well as the global sap support (like me) is reading this forum.
To stress it, you can post a seperate thread for it so that your wish does not get burried in this lengthly optimization thread.
regards,
Lars
Hi, Lars,
The conection MF:TF:BO is just 1:1 (one to one) the connection is by the primary key OID, because in design BO extends TF that extends MF, when Join is by OID column (ObjectID), then exists just one row that match the join.
i created the Index and changed the parameters with your suggestions, less the parameters SIMULATE_VECTORIO(i can change this to never on Linux? the note means something about a bug on linux kernel) and LOG_QUEUE_COUNT (i must stop the database to change).
after i created the index on MF (OID, MY_SACADO, MY_TIPO_MOVIMENTO, SITUACAO) and run UPDATE STAT for all tables, and changed the where clause of bol.oid(+) = tc.oid to bol.oid(+) = mf.oid
but the plan remains unchanged.
the explain join result
STRATEGY OWNER TABLENAME MAX ADDNL. ROWS MULTIPLIER REVERSE MULTIPLIER ADDITIONAL PAGES ADDITIONAL ROWS ACCUMULATED_COSTS
TM 1012 1 1 5.24352331606218 1012 9
JOIN VIA MORE THAN ONE FIELD MF 13632.0001524817 9 1 314.068965517241 9108 489.83516270034
JOIN VIA SINGLE KEY PS 92925 1 1 337.333333333333 9108 971.739924605102
JOIN VIA SINGLE KEY TC 54751.9977176785 1 1 535.764705882353 9108 1661.87751184339
JOIN VIA SINGLE KEY BOL 106656 1 1 535.764705882353 9108 2249.39221772574
i created the index iTituloCobrancaX4 ON TC(OID,DATA_VENCIMENTO) too and the execution plan is:
OWNER TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT
TM ITIPOMOVIMENTO INDEX SCAN 22
ONLY INDEX ACCESSED
MF IMOVIMENTOFINANCEIROX1 JOIN VIA RANGE OF MULTIPLE INDEXED COL. 11513
MY_SACADO (USED INDEX COLUMN)
TIPO (USED INDEX COLUMN)
PS OID OIN VIA KEY COLUMN 1475
TC ITITULOCOBRANCAX4 JOIN VIA RANGE OF MULTIPLE INDEXED COL. 6535
ADDNL. QUALIFICATION ON INDEX
OID (USED INDEX COLUMN)
BOL OID JOIN VIA KEY COLUMN 607
NO TEMPORARY RESULTS CREATED
JDBC_CURSOR_33 RESULT IS COPIED , COSTVALUE IS 2296
well in any case the query executes in 317860 ms, as you said i'm getting a weird thing here,
about the feature, i think that is a great way to improve performance of joins, another feature that can be made, maybe have a way to one table extends another, then have a way to create an index for that extended table, something like:
CREATE TABLE TituloCobranca EXTENDS MovimentoFinanceiro...
then we can create index like
CREATE INDEX ON TituloCobranca (MY_SACADO, TIPO, DATA_VENCIMENTO)
thanks so much for your attention
KR
Clóvis
> Hi, Lars,
>
> The conection MF:TF:BO is just 1:1 (one to one) the connection is by the primary key OID, because in design BO extends TF that extends MF, when Join is by OID column (ObjectID), then exists just one row that match the join.
Hi Clovis,
if the connection between these three tables is really always 1:1:1 (i doubt that from the data you send), then a) you don't need to outer join BO and b) it seems that the data belongs together. That is it should be stored into one table - if that's what you wanted to 'consolidate' - then I've to excuse as you're right then.
Anyhow - the statistics data showed 675349 OID values for MF but only 390457 values for OID on TC (I assume TF was a typo).
That is not a 1:1 relationship, but a 1:0..n (MF:TM).
> i created the Index and changed the parameters with your suggestions, less the parameters SIMULATE_VECTORIO(i can change this to never on Linux? the note means something about a bug on linux kernel)
It references to a bug in the 2.6 Linux Kernel.
See PTS 1132876 [http://maxdb.sap.com/webpts] for details on that.
If you're on a newer Linux Kernel you should set this parameter to NEVER.
> after i created the index on MF (OID, MY_SACADO, MY_TIPO_MOVIMENTO, SITUACAO) and run UPDATE STAT for all tables, and changed the where clause of bol.oid(+) = tc.oid to bol.oid(+) = mf.oid
>
> but the plan remains unchanged.
>
> the explain join result
>
STRATEGY OWNER TABLENAME MAX ADDNL. ROWS MULTIPLIER REVERSE MULTIPLIER ADDITIONAL PAGES ADDITIONAL ROWS ACCUMULATED_COSTS
> TM 1012 1 1 5.24352331606218 1012 9
> JOIN VIA MORE THAN ONE FIELD MF 13632.0001524817 9 1 314.068965517241 9108 489.83516270034
> JOIN VIA SINGLE KEY PS 92925 1 1 337.333333333333 9108 971.739924605102
> JOIN VIA SINGLE KEY TC 54751.9977176785 1 1 535.764705882353 9108 1661.87751184339
> JOIN VIA SINGLE KEY BOL 106656 1 1 535.764705882353 9108 2249.39221772574
Hmm... ok - this picture does not fit with the original explain plan you send:
OWNER TABLENAME COLUMN_OR_INDEX STRATEGY PAGECOUNT
TM ITIPOMOVIMENTO INDEX SCAN 22
ONLY INDEX ACCESSED
MF IMOVIMENTOFINANCEIROX1 JOIN VIA RANGE OF MULTIPLE INDEXED COL. 11513
MY_SACADO (USED INDEX COLUMN)
TIPO (USED INDEX COLUMN)
PS OID OIN VIA KEY COLUMN 1475
TC ITITULOCOBRANCAX4 JOIN VIA RANGE OF MULTIPLE INDEXED COL. 6535
ADDNL. QUALIFICATION ON INDEX
OID (USED INDEX COLUMN)
BOL OID JOIN VIA KEY COLUMN 607
NO TEMPORARY RESULTS CREATED
JDBC_CURSOR_33 RESULT IS COPIED , COSTVALUE IS 2296
Something is quite wrong here and I doubt that we can find it w/o beeing able to actually have a look into the system or get the data for in-house reproduction.
You wrote that the execution plan on your development machine has already changed. Did it change on your Q/A system as well?
Could you refresh your Q/A system with the current state of the production database, so that the dbs contain the same data?
Are there any differences in the table-definitions or the way you access the data from your dev-system to the prod-system?
> about the feature, i think that is a great way to improve performance of joins, another feature that can be made, maybe have a way to one table extends another, then have a way to create an index for that extended table, something like:
> CREATE TABLE TituloCobranca EXTENDS MovimentoFinanceiro...
> then we can create index like
> CREATE INDEX ON TituloCobranca (MY_SACADO, TIPO, DATA_VENCIMENTO)
Well - I've written enough on why I don't see that this feature won't help much.
It's just a patch for bad Object to Relation mapping and not a very good one though.
Perhabs one of the developers feels lucky and decides to implement such a thing, but I would not wait for it...
BTW: Clovis, it looks like you're developing quite intensively with MaxDB.
Perhabs you don't know yet, but there is the http://maxdb.sap.com site as well as the [MaxDB Wiki|https://wiki.sdn.sap.com/wiki/x/gjc] with lots of documentation, examples and internals.
Might be worth a look.
Best regards,
Lars
User | Count |
---|---|
81 | |
10 | |
10 | |
9 | |
7 | |
6 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.