on 11-04-2012 9:50 PM
Hi,
just want to share my experience with performance of select statements using FOR ALL ENTRIES and APPENDING TABLE.
once I had a situation at one worldwide insurance company where an FI consultant insisted and provided some docu trying to prove that APPENDING TABLE is faster, well I had doubts but didn't have time to do a real test and finally I found a time to this test.
So test case data are not fictional dummy data but rather real one:
approx 5 mil.(4.924.731) in mt_vtref_tab were used to test select statements:
Both statements were performed at the same server with same test data
"// Select
SELECT insobject insobjecttyp insobjectext FROM dimaiob
INTO TABLE lt_insobject
FOR ALL ENTRIES IN mt_vtref_tab
WHERE insobjecttyp IN me->mrt_insobjecttypc
AND insobjectext = mt_vtref_tab-vtref.
VERSUS
LOOP AT me->mt_vtref_tab ASSIGNING <vtref_line>.
SELECT insobject insobjecttyp insobjectext FROM dimaiob
APPENDING TABLE lt_insobject
WHERE insobjecttyp IN me->mrt_insobjecttypc
AND insobjectext = <vtref_line>-vtref.
ENDLOOP.
17 343 Seconds
although this version does removal of duplicates for us, we should always check if a table used in for all entries is not empty
SAP documentation says "Use FOR ALL ENTRIES only in exceptional cases", why?
29 392 Seconds
this version has its own disadvantages
After selection the resulting Internal table should be sorted and then duplicates must be deleted
if you have faced other results with same data on same server please share =),
BR,
Kairat
Hi Kairat,
It is good to see example measurement and real values
I think the topic is a bit misleading as actually you are comparing FOR ALL ENTRIES vs LOOP solution and not vs APPENDING TABLE. On the other hand you could compare INTO TABLE vs APPENDING TABLE, for same command like:
SELECT insobject insobjecttyp insobjectext FROM dimaiob
INTO TABLE lt_insobject
FOR ALL ENTRIES IN mt_vtref_tab
WHERE insobjecttyp IN me->mrt_insobjecttypc
AND insobjectext = mt_vtref_tab-vtref.
vs
SELECT insobject insobjecttyp insobjectext FROM dimaiob
APPENDING TABLE lt_insobject
FOR ALL ENTRIES IN mt_vtref_tab
WHERE insobjecttyp IN me->mrt_insobjecttypc
AND insobjectext = mt_vtref_tab-vtref.
But here I guess results would be similar.
In fact two factors are advantage for FOR ALL ENTRIES IN vs LOOP, already mentioned above:
- LOOP has worse performance because you ask as many times as entries count in mt_vtref_tab, while FOR ALL ENTRIES IN is optimized and split into packages with input values, so finally there are less ask-reply steps between program and database.
- FOR ALL ENTRIES removes duplicates automatically. Although it is not done on database side, as Randolf mentioned, we have simpler result automatically. Even if we add DISTINCT to SELECT in LOOP, duplicates will be removed many times for each single query, but we may have final duplicates appended from different loop iterations. So this will be still less efficient.
On the other hand if we need duplicates for some reason (like to count statistics), then we should use LOOP as FOR ALL ENTRIES IN looses this information.
Regards,
Adam
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Kairat,
in my opinion the LOOP is not an option. It does not matter how concretely the "FOR ALL ENTRIES" is translated for concrete database, becasue in any case the principle of "bulking" is applied.
With the LOOP we will always waste time in roundtrips to the database, as also clear from your measurement result.
If your times are that high - is it because of high number of records retrieved or from missing database indexes for the sql 'where' clause? In case you retrieve many records, this is what you need to optimize logically or pre-fetch in asynchronous task...
Best Regards,
Sylvia
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Sylvia,
The secondary indices exist, but the amount of selected data is huge as I mentioned nearly 5mil. rows.(although in reality program runs in 60 parallel work processes, but I did a test case just to see how it performs for huge data selecetion).
As for Where condition it is same for both statements FAE and LOOP same index is used in both cases, and are actually optimized for the best index.
Hi Kairat,
I fully support what Adam Kraqczuk explained to you (see reply 25.03.) and I have nothing more to add.
Only from perspective of general theory, in every concrete case few principles should be followed:
a) minimum number of calls to the database (bulking, mass calls)
b) minimum data retrieval (no identical records , minimal set of fields in the result set)
c) absolute match of statement and supporting database indexes
If a program is complient to it, it will have optimal possible performance.
Best regards,
Sylvia
Hi Sylvia,
thanks for your posts,
everything what you and Adam have mentioned were taken into account before the test was done, the goal of this post was to share this test outcomes and possibly discuss other experiences, maybe even on other DB and/or OS 😃
There are a lot of guys that think that Looping over is much faster 😃 and this post might help them(hopefully) when they do search =).
Thanks for collaboration
Hi Kairat,
The removal of duplicates in the FOR ALL ENTRIES is only done for the result set of the query, that is for the data that is returned from the database. Therefore it is essential to eliminate duplicates (with respect to the chosen WHERE-clause) already from the internal table on the ABAP side. In your example, there should be no duplicate entries with respect to vtref in mt_vtref_tab.
The response time of your SELECT-statements will depend on the underlying database platform. The FOR ALL ENTRIES is an ABAP OpenSQL specific statement that is translated into different SQL-statements depending on the database and some profile parameters.
Which SAP documentation says "Use FOR ALL ENTRIES only in exceptional cases"? This statement alone would be nonsense and should be corrected.
Best Regards,
Randolf
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hallo Randolf,
thanks for the reply,
Yes I am informed about the prerequisites of the usage of For all entries,
as well as the DBMS impact on it and how it is run during runtime on DB level
expected probles are also described in BC490 ABAP Performance tuning page 184
and here what is in course
If you want to read large data volumes,you should only use SELECT FOR ALL ENTRIES in exceptional cases.
page 185
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.