
STAD : Processing time when calling c program

Former Member
0 Kudos

Hi,

When I look at the central instance of our system, I see that much of the response time is recorded as "Processing time", much more than 2 * CPU time. According to the SAP performance optimisation guide, this implies a CPU or network bottleneck. We have no such bottleneck at this time.

I am wondering if the time spent in a C program when called from a function is recorded as processing time because the system has no way of calculating its resource usage?

Thank you

Thierry

Accepted Solutions (1)


Former Member
0 Kudos

Apart from me stupidly adding CPU to my list of causes, I feel my call was succinct and answers your question mostly but I forgot to add that you are correct, the system has no way of measuring that time apart from as "call time" as already mentioned.

Please be careful with Tilak's response. Comments like "Also, increase number of dialog workprocesses" should only be acted on when it is clear that your WP's are full because there are not enough of them, as opposed to another performance problem causing a queue... It is probably the most often suggested thing and the least required. Each WP uses 50-70 MB of non-swappable RAM!!!
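
To put a rough number on that warning, here is a minimal Python sketch of what extra work processes cost in memory. The 50-70 MB range is the figure quoted above; the function name and the example count of 10 extra WP's are purely illustrative.

```python
def wp_memory_mb(extra_wps, low_mb=50, high_mb=70):
    """Return a (low, high) estimate in MB for a number of work processes."""
    return extra_wps * low_mb, extra_wps * high_mb

# Hypothetical example: someone adds 10 dialog work processes.
low, high = wp_memory_mb(10)
print(f"10 extra WPs: roughly {low}-{high} MB of non-swappable RAM")
```

The same arithmetic scales linearly, so it is easy to see how a "just add more WP's" habit eats memory across several instances on one box.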

Answers (1)


Former Member
0 Kudos

Calling a C program from a function module will be roll+wait and CPIC/RFC time.

If processing time is more than CPU time then you are looking at a server or SAP-instance issue. Processing time is the time spent in the work process... and if that time is not CPU time then it is due to:

CPU time

Paging

possibly disk

Buffers like table and nametab

regards

Graham

Former Member
0 Kudos

Hi Thierry,

I feel I should approach this issue in this way, as you forgot to mention which kind of system is showing this problem, whether XI or any other!!

It would help to find the answers to these questions:

1. Have any changes been made to the system or to transactions/reports?

2. Does the performance degradation take place at certain periods or is it ongoing?

In ST03N, try to find out whether the response time is high for dialog work processes, for RFC/CPIC, or for both.

Check whether it is the customer-written or Z* transactions where the processing time is quite high. This could be due to long-running synchronous RFC calls, where the CPU on that particular application server might not be used though the work process is still holding the user context. In this situation we can see high processing time whereas CPU time will be low. Analyzing and tuning RFC calls should help in this case.

- High RFC load: investigate whether the RFC load can be spread over a longer period of time. Also, increase number of dialog workprocesses.

You can work with ABAP developers on tuning customer transactions/programs, and monitor transaction ST02 for buffer swaps.

Please let me know the breakdown of the processing time so that we can take this somewhat further!!

Hope it will help!!

Please provide your input

Thank you,

Tilak

Former Member
0 Kudos

Hi Tilak,

our system is IS-U 4.72. (R/3 4.70).

On our central instance, we always get the processing times higher than our cpu time... It is often more than 2 times greater.

We noticed that some functions always have high processing times, for example SALC_MT_GET_TID_BY_NAME. We noticed that this function calls a C program.

I don't see the processing time match with the RFC/CPIC time.

We do however have high RFC load most of the time (3 500 000 calls per day).

Here is one example I have in STAD:

Analysis of time in work process

CPU time                     10 ms    Number      Roll ins          1
RFC+CPIC time                 0 ms                Roll outs         1
                                                  Enqueues          0
Total time in workprocs   1 054 ms    Load time   Program           0 ms
Response time             1 054 ms                Screen            0 ms
                                                  CUA interf.       0 ms
Wait for work process         0 ms
Processing time           1 053 ms    Roll time   Out               1 ms
Load time                     0 ms                In                1 ms
Generating time               0 ms                Wait              0 ms
Roll (in) time                1 ms
Database request time         0 ms    Frontend    No.roundtrips     0
Enqueue time                  0 ms                GUI time          0 ms
                                                  Net time          0 ms

Also, as you can see, we should not need extra work processes as the wait time is 0.

We also checked in SM50 and some work processes are always available.
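
As a quick sanity check of the rule of thumb from the start of the thread (processing time much more than 2 * CPU time), here is a minimal Python sketch using the two figures from the STAD record above. The function name is made up for illustration; only the factor of 2 and the two timings come from the thread.

```python
def exceeds_rule_of_thumb(processing_ms, cpu_ms, factor=2):
    """True when processing time is more than `factor` times the CPU time."""
    return processing_ms > factor * cpu_ms

# Values from the STAD record above: processing 1 053 ms, CPU 10 ms.
processing_ms, cpu_ms = 1053, 10
print(exceeds_rule_of_thumb(processing_ms, cpu_ms))
print(f"processing / CPU = {processing_ms / cpu_ms:.1f}")
```

For this record the ratio is roughly 105, so it is far beyond the factor of 2.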

Thanks.

Thierry

Former Member
0 Kudos

In ST06 - Detailed Analysis - Memory, how many page ins / page outs are you having per hour?

If more than 600,000 page ins on Windows or page outs on Unix then it is quite possible this is where your missing processing time is.
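
A small Python sketch of that rule of thumb, for anyone who wants to script the check. The 600,000 per hour threshold and the Windows/Unix distinction are taken from the sentence above; the function and parameter names are hypothetical and the sample counts are made up.

```python
PAGING_THRESHOLD_PER_HOUR = 600_000  # figure quoted above

def paging_suspect(os_family, page_ins_per_hour, page_outs_per_hour):
    """Apply the rule of thumb: page-ins matter on Windows, page-outs on Unix."""
    if os_family.lower() == "windows":
        relevant = page_ins_per_hour
    else:
        relevant = page_outs_per_hour
    return relevant > PAGING_THRESHOLD_PER_HOUR

# Made-up sample counts, purely for illustration.
print(paging_suspect("unix", page_ins_per_hour=900_000, page_outs_per_hour=0))
print(paging_suspect("windows", page_ins_per_hour=900_000, page_outs_per_hour=0))
print(f"threshold is about {PAGING_THRESHOLD_PER_HOUR / 3600:.0f} pages per second")
```

The same counts give a different verdict depending on the platform, which is why the question asks for both values.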

Having thought about it again though, if you are not paging and do not have serious buffer problems then it could in fact be the C call. I had always imagined it was rolled out during a C call but perhaps it is the WP itself doing this, I'll chase someone up at SAP.

Basically processing time is just a bucket of what is not included in wait,roll-in,load,generation,enqueue, DB and roll+wait. It is an interesting one no-one has ever questioned me on before and I teach ADM315!
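
To make the "bucket" idea concrete, here is a minimal Python sketch that derives processing time as the remainder of response time after the other components are subtracted. The component list follows the sentence above and the figures are the ones from the STAD record posted earlier in the thread; the function and parameter names are invented for illustration and this is not SAP's actual code.

```python
def processing_time_ms(response, wait, roll_in, load, generating,
                       enqueue, database, roll_wait):
    """Processing time as what is left of response time after the other buckets."""
    return response - (wait + roll_in + load + generating
                       + enqueue + database + roll_wait)

# Figures from the STAD record posted earlier in the thread (all in ms).
print(processing_time_ms(response=1054, wait=0, roll_in=1, load=0,
                         generating=0, enqueue=0, database=0, roll_wait=0))
```

With those inputs the remainder is 1053 ms, which matches the processing time STAD reports for that record.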

Former Member
0 Kudos

Hi Thierry

"I am wondering if the time spent in a C program when called from a function is recorded as processing time because the system has no way of calculating its resource usage?"

The answer is that a C call is made from the work process and the work process goes into a CPIC WAIT status (normally very quickly so you are lucky to see it).

Any WAIT status is marked down as processing time
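
The effect is easy to reproduce outside SAP. The Python sketch below is only an analogy, not how the SAP kernel does its accounting: it measures elapsed time and CPU time for a task that just waits and for a task that computes. The names `measure`, `waiting` and `computing` are made up for the example.

```python
import time

def measure(task):
    """Return (elapsed seconds, CPU seconds) spent running `task`."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)

def waiting():
    time.sleep(0.2)  # blocked, like a WP sitting in a wait status

def computing():
    sum(i * i for i in range(200_000))  # actually using the CPU

for name, task in (("waiting", waiting), ("computing", computing)):
    wall, cpu = measure(task)
    print(f"{name}: elapsed {wall * 1000:.0f} ms, CPU {cpu * 1000:.0f} ms")
```

The waiting task shows plenty of elapsed time and almost no CPU time, which is the same pattern as a dialog step with high processing time and low CPU time.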

Former Member
0 Kudos

Thanks Graham,

from your experience, is there any parameter that could cause a bottleneck in CPIC calls ?

Thierry

Former Member
0 Kudos

Not directly, no. A WP calls a C program and it would be insane to slow it down with a limitation parameter... I have ploughed through the params many times in the past and certainly not seen one.

However, I am intrigued. The function module you are quoting is a good example, and I know it well as I have a fair amount of CCMS skill. It should not take as long as it does. So I am thinking along the lines of what could cause it.

I originally stated paging; this is the most common cause and would also show as high processing times. Can you confirm what your pages per second and your pages per hour (page outs for UNIX, page ins for Windows) are in ST06? Naturally any action requires data to be in memory, including a C program.

CPU is another one. STAD shows what the WP is using CPU wise but not what the CPU is. Perhaps there are many tens of thousands of interrupts indicating a hardware fault. This can drive a server performance down or even kill it. Equally other CPU hungry processes like virus checkers or vendor programs can chew all CPU up.

Poor disk would not be a cause unless directly related to some badly paged buffers so can you check ST02?

I am wondering why you suspect the C call; many, many programs do a quick C call buried deep in a function somewhere... I would only suspect it directly if many refreshes in SM50 during a poor period show programs that are not communicating via RFC spending over a second waiting on CPIC.

A work process trace at level 3 could give an indication of where, but would be a bugger to read. It takes a while to get used to them and they tend to be long.

Thinking more, I would be tempted to use SE30 with no aggregation (do a note search on SE30 and aggregation for instructions). This would show you which individual statements are taking the most time for something that has high processing time. From there you can examine code and make better assumptions.

good luck

regards

Graham

Former Member
0 Kudos

Hi Graham,

thanks for trying to help me with this problem.

As for paging, we have 0 page outs and we are on Unix.

CPU is between 40 and 50 % idle but we do have around 50 million context switches and 30 million interrupts / hour.

Disks seem fine.

We suspected C calls because when we looked at the central instance we saw those calls which had no CPU or DB time. We looked at the code and saw it made a C call, so we figured this might be where the time was lost.

Right now, we are thinking that although we are only using 35 % of the CPU or so, the OS has a problem handling the SAP configuration we have. This is strange because we have a 16 CPU HP (hp 8420) system that runs only the CI and database. The CI does not do much other than message serving and enqueue processing, and the DB performs quite well.
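
For scale, the hourly counters above can be converted to per-second rates. This is a minimal Python sketch using only the figures from this post (50 million context switches and 30 million interrupts per hour, 16 CPUs); it says nothing about whether these rates are too high for this hardware.

```python
SECONDS_PER_HOUR = 3600
CPUS = 16  # from the post above

def per_second(count_per_hour):
    """Convert an hourly counter into an average per-second rate."""
    return count_per_hour / SECONDS_PER_HOUR

for label, per_hour in (("context switches", 50_000_000),
                        ("interrupts", 30_000_000)):
    rate = per_second(per_hour)
    print(f"{label}: {rate:,.0f}/s overall, {rate / CPUS:,.0f}/s per CPU")
```

That works out to roughly 13,900 context switches and 8,300 interrupts per second as an hourly average, so short peaks could be considerably higher.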

For the CPIC wait, we don't see the processes in that state in SM50.

As for se30, I'll try to run a function on the CI and see what takes the most time.

Thanks again Graham.

Former Member
0 Kudos

Thierry,

it is no problem to help because I enjoy this stuff and there is always something for me to learn here as well.

Basically you are having an issue only on your central instance where your processing time is higher than CPU, and this means there is an issue on the server. It is now just a matter of narrowing it down.

I am not very strong on UNIX but know in Windows that high interrupts can mean a hardware issue. I believe this is also the case on UNIX so would recommend you get someone to check. At least it rules it out. Looking on the web it looks like you can identify by the type of interrupt so it should be fairly easy.... I just have not done this.... you can always do me a favour and find out how.

Now what else could it be?.... no disk, CPU medium.... that is a worry, and no paging. OK, you need to get frequent values out of your CPU rather than hourly. In RZ23N you use the SAP Minute collection with no aggregation against the CPU_UTILIZATION MTE to get a minute value, or run something simple at the OS level to collect this. It may show some significant short-term spikes that cause some of your processing to be high. Certainly with the RFC's and the amount of context switches the system looks busy. So how many CPU's are on this server and how many WP's (across all systems)?
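
To show why minute values matter, here is a minimal Python sketch with invented data: an hour that is mostly quiet but contains five very busy minutes. The sample values and the 90 % threshold are hypothetical and only there to illustrate how an hourly average can hide short spikes.

```python
from statistics import mean

# Hypothetical CPU utilisation per minute (%): 55 quiet minutes, 5 busy ones.
minute_samples = [40] * 55 + [95] * 5

def spike_minutes(samples, threshold=90):
    """Indices of the minutes whose utilisation exceeds the threshold."""
    return [i for i, value in enumerate(samples) if value > threshold]

print(f"hourly average: {mean(minute_samples):.1f} %")
print(f"minutes above 90 %: {len(spike_minutes(minute_samples))}")
```

The hourly average stays under 45 % even though the CPU was nearly saturated for five of those minutes.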

I would be very tempted to get an ABAPPer to whip up two simple programs. 1 calls the same C program until it has about a second of duration time and the other does some calculations in memory for a duration of a second. These durations should be values on a box that is not the CI in question. Then run them both on the CI and see who wins the race.

But to be simple, if you have made it this far I would have already checked the standard dialog response time and also logged this in RZ23N (r3dialogdeftime, if I remember) with minute aggregation. This is a program that runs every 5 minutes, 3 times in succession. It adds data to an empty table, changes it, reads it and deletes it, and then does some calculations in memory and finishes. The system then averages the 3 values and places the result in RZ20 memory. This is by far the best measure of performance and you should get a flat line if your system is performing well. This value should also be the same across all instances of the same platform type. I would reckon you should get a value between 17 & 20 milliseconds.

I think that's all I can think of. I would doubt that a C program would be slower than normal SAP as the WP's themselves are C, but you have to keep an open mind in this game.

good luck! and let me know

Graham

Former Member
0 Kudos

Hi,

we actually have the same problem on many servers, not only our central instance. I use the central instance as an example because it is the most obvious case where we waste time (processing time).

I think this rules out an actual hardware problem.

We ran a kitrace this week on one of the servers and we are waiting for results.

At this point, I have two theories: either the problem is because we stack too many processes on one server, or we actually have a parameter in the OS or the instance that is slowing us down. I tend towards the first possibility because on certain servers we have removed some processes and they perform better. Perhaps the answer is a combination of both: the parameter prevents us from stacking the number of processes we need. The funny thing is our system has a limit of around 40 000 active processes and we have around 800.

If this is the case, then I wish there would be recommendations from SAP and HP in the future on how much you can actually stack on one server, since the servers are becoming more and more powerful and we want to use them to their full potential. It is then natural to want to combine applications on them to optimize their usage.

For the frequent check of the CPU, I use HP Performance Manager, which can give you detail at 5 minute intervals. I also use Glance to look at more detailed resource usage. We never have peaks at 90 %...

Thierry

Former Member
0 Kudos

how about idle time near 0 :)... that number of context switches has to be an overhead

I must admit that the number of context switches seems massive. It certainly used to be a limiting factor that I think I entered earlier. I believe a context switch causes an interrupt.

So, come on and surprise me, how many of those 800 are WP's?

I have a couple of UNIX experts at work, one is our top Architect, and I can try to get them interested in this.... so any info on platform and virtual environments and number of physical and virtual CPU's would help. Also, any estimation of transaction throughput on the server as a whole?

This is an interesting thread

Former Member
0 Kudos

Hi Graham,

I'll get back to you with those specs. Just wondering about this standard dialog response time: does any special setup have to be done in CCMS to collect this? We have many application servers and I often do extractions from ST03 to display the difference in performance between our application servers. I would love to have this done automagically.

The graphs actually helped us to suspect that some of our servers are probably overloaded with processes that are sleeping. Actually, when you look at the processes in Unix, they are marked as stopped on semaphore.

As for the number of actual work processes on the server, a rough estimate since I'm not in front of the box now is around 450 (we have 4 instances), and that runs on a 16 CPU box.

The CI has around 2000 processes on it and most of them are for Oracle. It also is on a 16 CPU box. We have no virtualisation and run HP-UX 11i if I remember well.

Take care.

Former Member
0 Kudos

Hi Thierry,

For your CPU's.... if they are 16 cores in total then yes, you appear overloaded, and semaphore waits can occur with so many things sharing the OS memory areas.
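
A back-of-the-envelope version of that, as a minimal Python sketch. It uses the rough figures from the previous post (around 450 work processes on one 16 CPU box, around 2000 processes on the 16 CPU CI) and assumes "16 CPU" means 16 cores; the function name is made up.

```python
def per_cpu(process_count, cpus):
    """Average number of processes that have to share one CPU."""
    return process_count / cpus

CPUS = 16  # assumed to mean 16 cores in total
print(f"work processes per CPU on the 4-instance box: {per_cpu(450, CPUS):.1f}")
print(f"OS processes per CPU on the CI: {per_cpu(2000, CPUS):.1f}")
```

That is roughly 28 work processes per core on the application server and 125 OS processes per core on the CI, although most of them will be idle at any given moment.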

For monitoring STD Dialog in fact it is easier than you think. Here are 2 processes that you should find handy.

Creating a usable monitor in RZ20

1. Go to RZ20 and from the menu choose Extras - Activate Maintenance Function

2. Click on the Create button (blank page button) and in the popup press enter; in the next popup type Thierry and press enter.

3. Under My Favourites you should now see Thierry, single click this and click the Create button again.

4. On the next screen single click on the word New Monitor and click the Create button again, press enter on the popup and on the next popup type Standard Dialog Response Time (or similar) and press enter

5. Now single click on the line that says Standard Dialog Response Time and click the Create button again.

6. In the Popup change the radio button option to Rule Node and press enter

7. On the next Popup, in the Rule field do a drop down and double click on CCMS_GET_MTE_BY_CLASS and press enter

8. On the next popup in the R3System Field do a dropdown and double click on the word <CURRENT> (it is important to choose this option) and in the MTECLASS field paste the word R3DialogDefLoadTime and press enter.

9. Now you are back on the main screen, press the standard Save button, in the popup type the word Performance and press enter.

10. Now expand your monitor set and double click on the word Performance to get the monitor, expand Standard Dialog Response Time and you will be able to see the current performance of all instances.

Now that you have an RZ20 performance monitor you can add other nodes in here like dialog steps per minute, CPU, paging and even context switches. Have a look at the SAP CCMS Monitor Templates - Entire System - Application Server - <instance> - R3Services - Dialog for some really useful ones. To get the MTECLASS, single click on a node and click the Properties button.

Because you set up your collector system as <CURRENT> you can now export your defined monitor and import it into another system within seconds!... very very handy. It is just an XML you can have saved on your PC.

Saving Standard Dialog Response Time History in RZ23N

1. Go to transaction RZ23N and click on the button Assign Procedure

2. In the middle panel of the top three panels click on the binoculars and find R3DialogDefLoadTime and come out of the find popup with a cancel (not a tick) so that R3DialogDefLoadTime is highlighted.

3. In the left hand pane single click on your system (or if it is the only system and you do system copies, choose the * option)

4. In the right hand pane single click on the option SAP_MIN_COLLECTION__NO_AGGREGATION

5. Now you have the system, the MTECLASS and the collection method highlighted, click the button Assign Procedure

and that's it!.... you will now keep 8 days (or a few more) worth of data, but it reorgs itself... two jobs will now be running to collect and reorg the data. Go to the main RZ23N screen and press Collection/reorganisation jobs to see them without the hassle of SM37.

Click on the Maintain Schemas button to even create your own collection schemas. You may want to have one that aggregates values all the way up to yearly values. This is a lovely transaction. I recommend avoiding using weekly collection until you read the help.

Click on 'Overview of Available Data' to be able to see the data collected and even graph it or copy it out to Excel. You can run the reports in place also, but it's too much to mention here.

The Rolls Royce Reporter

BI now has standard cubes to collect CCMS data from RZ23N and graph it. We do this for all our SLA data for customers and we in fact do it ourselves without BI help directly in the Central monitoring system so we have control and can design our own graphs that are far nicer than the horrible Solution Manager ones.

feedback? your opinion?

regards

G

Former Member
0 Kudos

I just realised, a little belatedly that you probably have a dispatcher bottleneck.

If you look at your dispatcher queues in an instance SM51 - goto - server name - information - queue information what does it show under the max req wait column for the NOWP?, please check on each instance.

This is a classic reason why I get annoyed when people recommend increasing WP's without understanding the issue

Former Member
0 Kudos

We have maximums of around 150 for NOWP. We had checked this before.

I really think that our servers are too busy managing processes which basically do nothing.

Also, the HP kitrace provided no clue on the problem.

Thanks for the procedure for ccms history. I tried it but can't find the MTE Class when I want to define the history program. I'll go through the procedure again and read up on the documentation.

Thanks again.

Thierry

Former Member
0 Kudos

150 is a lot to be honest, especially if it is happening on a frequent basis, which is harder to tell.

Equally as you say, the system has to perform a lot of context switches just to handle the WP loops.

I would say that if you have a server with 4 instances each with around 110 WP's I would be tempted just to reduce the WP's on one of those and see if the performance just improves on that one instance. It should give you an idea if it is the dispatcher or the general system that is slowing things down. I suspect the dispatcher because the CPU is never 0% idle.

I may have typed the MTE class wrong (not at a SAP system at the mo).

good luck, will leave you in peace