Solved: Performance: The optimal way to change the content of internal table

MikeB · ‎06-14-2013

In ABAP we can update the contents of an internal table in at least two ways:

1. Loop over the internal table with a LOOP AT statement, assign each row to a field symbol, and update the desired component of the structure.

2. Use the MODIFY statement for internal tables with the TRANSPORTING ... WHERE additions.

The question is which way performs better, for both large and small tables.

It would be great to get a theoretical explanation of why one way is better than the other.

I tried to check this in the Tips & Tricks of SE30, but I'm not sure that those results are statistically reliable.

Thanks.

MikeB · ‎06-14-2013

Thanks guys, let me make the question more specific with an example.

Let's say I have an internal table where one of the columns is a language, e.g. «EN-US», «EN-CA», «FR-CA» etc., and I want to replace every «EN-US» language value with «EN-GB».

I can do it with:

LOOP AT itab ASSIGNING <wa>.
  IF <wa>-lang = 'EN-US'.
    <wa>-lang = 'EN-GB'.
  ENDIF.
ENDLOOP.

or with:

wa-lang = 'EN-GB'.
MODIFY itab FROM wa TRANSPORTING lang WHERE lang = 'EN-US'.

My question is: which approach is optimal from a performance point of view for implementing such logic, for both long and short itabs?

I have heard in some places that the second one is better, but I would like a theoretically supported explanation.

Thanks.

Former Member · ‎06-14-2013

Hi Mike,

As per your question, we can write the following two pieces of code to loop over and modify an internal table:

1) LOOP AT itab INTO wa.
     MODIFY itab ..........
   ENDLOOP.

2) LOOP AT itab ASSIGNING <fs>.
     " change contents of <fs>
   ENDLOOP.

The main difference between them is that in the first case you transfer the itab data to the work area line by line, while in the second case you point to each line of the table and no data is transferred.

In the first case, using the work area "wa", one itab record is copied into it per loop iteration.

In the second case, the field symbol <fs> points to a row of itab in each iteration of the loop. It works as a pointer to the original row in memory; the values of the row are not copied.

So when we change the content of any column through <fs>, the value in itab changes directly. We don't need to write a MODIFY statement.

With "wa", which does not point to the original row in memory, we need the MODIFY statement to write the change back.

To conclude, when modifying an internal table, a field symbol is more efficient than a work area because it avoids the copy overhead.
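A minimal sketch of the two variants, for illustration only: the row type ty_row, its components and the sample values are assumptions, and the MODIFY inside the first loop is just one possible way to fill in the elided statement.

```abap
REPORT zdemo_wa_vs_fs.

* Hypothetical row type, not taken from the thread.
TYPES: BEGIN OF ty_row,
         id   TYPE i,
         lang TYPE c LENGTH 5,
       END OF ty_row.

DATA: itab TYPE STANDARD TABLE OF ty_row WITH DEFAULT KEY,
      wa   TYPE ty_row.
FIELD-SYMBOLS <fs> TYPE ty_row.

* Sample data (made up).
wa-id = 1. wa-lang = 'EN-US'. APPEND wa TO itab.
wa-id = 2. wa-lang = 'FR-CA'. APPEND wa TO itab.

* Variant 1: each row is copied into wa, changed there,
* and written back to the current row by MODIFY.
LOOP AT itab INTO wa.
  IF wa-lang = 'EN-US'.
    wa-lang = 'EN-GB'.
    MODIFY itab FROM wa INDEX sy-tabix TRANSPORTING lang.
  ENDIF.
ENDLOOP.

* Variant 2: <fs> refers to the table row itself, so the
* assignment changes the table directly; no MODIFY is needed.
LOOP AT itab ASSIGNING <fs>.
  IF <fs>-lang = 'FR-CA'.
    <fs>-lang = 'FR-FR'.
  ENDIF.
ENDLOOP.
```

After both loops the first row holds 'EN-GB' and the second 'FR-FR'; only the first variant moved whole rows in and out of a work area to get there.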

Please also see the link below for the advantages of field symbols:

http://help.sap.com/saphelp_nw70/helpdata/en/fc/eb3605358411d1829f0000e829fbfe/content.htm

Thanks,

Arnab

jens_becher · ‎06-14-2013

Hello Arnab,

I understand Mike's question as comparing

loop at itab assigning ... where ... .

with the code

modify itab from ... transporting ... where ...

Regards,

Jens

jens_becher · ‎06-14-2013

Hello Mike,

that is a good question.

As far as I know there are some aspects to be considered too:

- type of internal table (hash, ...)

- kind of where clause (key or non-key fields, type of the compared fields, size of the where clause ...)

- etc.

This sounds rather complex to me. All I could find were tips to do this or that, but no complete theoretical explanation.

Maybe someone who knows how these statements are compiled and executed can answer here.

Regards,

Jens

naimesh_patel · ‎06-14-2013

Hello Mike,

The two syntaxes serve different purposes: MODIFY ... WHERE modifies a bunch of records at once, whereas the MODIFY in LOOP ... MODIFY ... ENDLOOP modifies a single record at a time.

Now, if you have quite a few records in your LOOP and you want to modify them conditionally, say every alternate row, you need LOOP ... MODIFY ... ENDLOOP; you won't be able to use MODIFY ... WHERE. When you modify rows in a LOOP, you can use either a field symbol or a work area. Field symbols are far better when the table holds a huge number of records. Read "Use of Field-symbols vs Work area" on the ABAP Help Blog.

When you use MODIFY ... WHERE, the effect is somewhat similar to LOOP AT ... WHERE ... MODIFY ... ENDLOOP. Since 731 you can define secondary keys for your internal table. If you have done so, a MODIFY using the secondary key would be faster compared to a MODIFY without the key, because without it MODIFY ... WHERE needs to scan the entire standard table and the access is not optimal.
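A hedged sketch of what a MODIFY through a secondary key could look like. The row type, the key name by_lang and the component checked are hypothetical. The sketch deliberately changes a component that is not part of the key, to sidestep any restrictions on changing components of the key being used.

```abap
REPORT zdemo_modify_sec_key.

* Hypothetical row type; "checked" is an invented non-key component.
TYPES: BEGIN OF ty_row,
         id      TYPE i,
         lang    TYPE c LENGTH 5,
         checked TYPE c LENGTH 1,
       END OF ty_row.

* Standard table with an additional sorted secondary key on lang.
DATA: itab TYPE STANDARD TABLE OF ty_row
             WITH NON-UNIQUE KEY id
             WITH NON-UNIQUE SORTED KEY by_lang COMPONENTS lang,
      wa   TYPE ty_row.

* Sample data (made up).
wa-id = 1. wa-lang = 'EN-US'. APPEND wa TO itab.
wa-id = 2. wa-lang = 'FR-CA'. APPEND wa TO itab.
wa-id = 3. wa-lang = 'EN-US'. APPEND wa TO itab.

CLEAR wa.
wa-checked = 'X'.

* The rows to change are located through the secondary key by_lang
* instead of checking the WHERE condition against every row.
MODIFY itab FROM wa USING KEY by_lang
       TRANSPORTING checked
       WHERE lang = 'EN-US'.
```

Whether the key pays off depends on the table size and on how often the table changes, since the secondary index has to be built and maintained as well.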

Thanks,
Naimesh Patel

Former Member · ‎06-14-2013

In theory and reality, an assigned field symbol is going to be better from a performance point of view because there is less data movement. That is, moving an itab row from the internal table to a work-area and back to the internal table is going to take more cycle counts than simply adjusting the memory field of the addressed row of the internal table.

Herein, the term cycle counts refers to CPU cycles. You can take this description down to whether it is faster to directly address memory in assembly language or use a memory buffer with faster access speeds. Either way, at a level such as ABAP, use the field symbols whenever you can.

MikeB · ‎06-14-2013

I agree with your opinion, but there are two points that are important when answering my question.

1. Generally, a field symbol is a pointer, and in order to fill in its values you first have to allocate memory. As far as I know, memory allocation is a very expensive operation, so is it perhaps «cheaper» to read and write data through a work area than to allocate the memory? Or does it not really matter, because the memory is allocated only once, at the first iteration, and all other iterations reuse it?

2. Is executing the MODIFY command cheaper than assigning each row inside a LOOP via ASSIGNING and subsequently updating one of its attributes?

Thanks.

arindam_m · ‎06-15-2013

Hi,

Looking at the code, the statement

MODIFY itab FROM wa TRANSPORTING lang WHERE lang = 'EN-US'.

always performs better when the internal table is of a hashed type. But if it's not, I think it will be on par with the other one, as it will end up evaluating all the lines to check whether the field content = «EN-US». Basically, the MODIFY statement with a hashed itab will help you get to your entries faster compared to a line-by-line check.

Also, the difference will only be clearly visible when a very large dataset is involved, and it depends on the variety of the data. Say that in a pool of 100000 records you have just one entry with «EN-US»; that means 99999 false iterations in a LOOP..ENDLOOP construct, i.e. wasted CPU cycles. In such a case a hashed table would find the entries quite quickly using fewer CPU cycles. A hashed search is always quicker than a linear search.

Cheers,
Arindam

jens_becher · ‎06-17-2013

Hello Arindam,

that's exactly the point I would emphasize, because in my opinion the statements above should be divided into the part that identifies the rows to change and the part that changes them. For the latter it is absolutely true that field symbols are the best choice in 99.9% of cases, because they work as pointers instead of transferring data.

But when identifying the rows to change it is not obvious that

MODIFY itab FROM wa TRANSPORTING lang WHERE lang = 'EN-US'.

will be faster compared to

LOOP AT itab ASSIGNING ... WHERE lang = 'EN-US'.

...

ENDLOOP.

The difference, if any, should come into effect when a large itab is scanned with just a few rows fulfilling the where condition.
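To make the comparison concrete, here is a small sketch that applies both statements to two identically filled tables and checks that they end up equal. The row type and the sample data are assumptions for illustration.

```abap
REPORT zdemo_loop_where_vs_modify.

* Hypothetical row type, not taken from the thread.
TYPES: BEGIN OF ty_row,
         id   TYPE i,
         lang TYPE c LENGTH 5,
       END OF ty_row.
TYPES ty_tab TYPE STANDARD TABLE OF ty_row WITH DEFAULT KEY.

DATA: itab_loop   TYPE ty_tab,
      itab_modify TYPE ty_tab,
      wa          TYPE ty_row.
FIELD-SYMBOLS <row> TYPE ty_row.

* Sample data (made up), appended to both tables.
wa-id = 1. wa-lang = 'EN-US'. APPEND wa TO itab_loop. APPEND wa TO itab_modify.
wa-id = 2. wa-lang = 'FR-CA'. APPEND wa TO itab_loop. APPEND wa TO itab_modify.
wa-id = 3. wa-lang = 'EN-US'. APPEND wa TO itab_loop. APPEND wa TO itab_modify.

* Form 1: the WHERE condition selects the rows, the field symbol
* changes them in place.
LOOP AT itab_loop ASSIGNING <row> WHERE lang = 'EN-US'.
  <row>-lang = 'EN-GB'.
ENDLOOP.

* Form 2: selection and change in a single statement.
CLEAR wa.
wa-lang = 'EN-GB'.
MODIFY itab_modify FROM wa TRANSPORTING lang WHERE lang = 'EN-US'.

* Both forms are expected to produce the same table content.
ASSERT itab_loop = itab_modify.
```

The sketch only shows that the two forms are functionally equivalent here; it says nothing about which one identifies the rows faster.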

I think it would be interesting to see how the ABAP statements "modify" and "loop where" are compiled to assembly.

Additionally I would assume that both statements may produce the same assembly code (just for the part identifying the rows to change, not the change itself).

What do you think?

Regards,

Jens

MikeB · ‎06-17-2013

Everything you wrote here is really interesting. Unfortunately, there is no way to see the «real» code that runs behind ABAP and compare the performance…
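Even without access to the code behind the statements, the runtimes can at least be measured empirically. The following is a rough sketch using GET RUN TIME FIELD; the row type, the table size of 100000 rows and the share of matching rows are arbitrary assumptions, and a single run like this is not a statistically reliable benchmark.

```abap
REPORT zdemo_runtime_compare.

* Hypothetical row type, not taken from the thread.
TYPES: BEGIN OF ty_row,
         id   TYPE i,
         lang TYPE c LENGTH 5,
       END OF ty_row.
TYPES ty_tab TYPE STANDARD TABLE OF ty_row WITH DEFAULT KEY.

DATA: itab_loop   TYPE ty_tab,
      itab_modify TYPE ty_tab,
      wa          TYPE ty_row,
      new_val     TYPE ty_row,
      rest        TYPE i,
      t0          TYPE i,
      t1          TYPE i,
      t2          TYPE i,
      us_loop     TYPE i,
      us_modify   TYPE i.
FIELD-SYMBOLS <row> TYPE ty_row.

* Test data: every tenth row is 'EN-US' (arbitrary choice).
* Both tables are filled separately so that neither variant
* has to pay for copying a shared table on its first write.
DO 100000 TIMES.
  wa-id = sy-index.
  rest = sy-index MOD 10.
  IF rest = 0.
    wa-lang = 'EN-US'.
  ELSE.
    wa-lang = 'FR-CA'.
  ENDIF.
  APPEND wa TO itab_loop.
  APPEND wa TO itab_modify.
ENDDO.

new_val-lang = 'EN-GB'.

GET RUN TIME FIELD t0.

LOOP AT itab_loop ASSIGNING <row> WHERE lang = 'EN-US'.
  <row>-lang = 'EN-GB'.
ENDLOOP.

GET RUN TIME FIELD t1.

MODIFY itab_modify FROM new_val TRANSPORTING lang WHERE lang = 'EN-US'.

GET RUN TIME FIELD t2.

* GET RUN TIME returns microseconds by default.
us_loop   = t1 - t0.
us_modify = t2 - t1.

WRITE: / 'LOOP ... ASSIGNING ... WHERE (microseconds):', us_loop,
       / 'MODIFY ... WHERE (microseconds):', us_modify.
```

To get numbers worth comparing, the measurement would need to be repeated several times, with different table sizes, table types and shares of matching rows.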

jens_becher · ‎06-25-2013

Yes, unfortunately. Only an "insider" such as Horst Keller might be able to answer this...

former_member219762 · ‎06-14-2013

Hi,

Field symbols give better performance. But if you want to change the key fields of a table (sorted or hashed), you cannot use field symbols, because those key fields cannot be changed in a loop through a field symbol.

Regards,

Sreenivas.

Former Member · ‎06-25-2013

Hi Mike,

Field symbols refer directly to the memory locations of the values in the internal table.

So we can avoid the MODIFY statement that would otherwise be needed to change the internal table from a work area.

So field symbols will improve performance.

Let me know if any clarification or help is required.

Thanks & Best Regards.

Pavan Neerukonda.
