Production Stalling, Very large Wait times

Former Member
0 Kudos

Hi

We had a problem with our production ERP ECC 5 system yesterday. We are running ECC 5 on Windows 2003 64-bit with 24 GB RAM, 8 CPUs and MS SQL 2000.

Yesterday afternoon we had an intermittent problem for about 1 hour, where SAP was freezing on the users' PCs in lots of transactions. I could see the system was slow from my own admin transactions: SM50 was showing lots of 'load program' actions and was generally very slow to refresh, with about 25 dialog processes showing 'running' and having an action/report against them. The UPD processes were also very slow at updating. Normally we have only a few dialog processes active and they change quickly, as performance is pretty good.

The system returned to normal after about an hour. Looking at ST03N, I have found very large wait times per dialog step at the time we had the issues. In expert mode under time profile, our usual average wait time per dialog step is about 0.5 ms; yesterday when we had the problem it was just over 2000 ms for that hour. The total wait time is normally about 7 s; yesterday during the problem it was just over 24,000 s. This then dropped back down to about 7 s the next hour, with an average wait time per dialog step of about 0.4 ms again, so back to normal. Average GUI time was also about 3 times the usual amount, but I guess this is a knock-on effect of the large wait time.
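As a rough cross-check of the ST03N figures quoted above, dividing the total wait time by the average wait time per dialog step gives an implied number of dialog steps for the hour. The following is only a minimal Python sketch using the rounded values from this post; the function name is hypothetical and the inputs are approximate ("about", "just over").

```python
def implied_dialog_steps(total_wait_s: float, avg_wait_ms: float) -> float:
    """Estimate the number of dialog steps from the total wait time (seconds)
    and the average wait time per dialog step (milliseconds)."""
    # Convert the total to milliseconds first, then divide by the per-step average.
    return (total_wait_s * 1000.0) / avg_wait_ms


# Rounded figures from the post (assumed to cover one hour each).
normal_steps = implied_dialog_steps(total_wait_s=7, avg_wait_ms=0.5)
problem_steps = implied_dialog_steps(total_wait_s=24_000, avg_wait_ms=2000)

print(f"normal hour:  ~{normal_steps:.0f} dialog steps")
print(f"problem hour: ~{problem_steps:.0f} dialog steps")
```

With these rounded inputs the two hours come out at a similar number of steps (roughly 14,000 versus 12,000), which would suggest the number of requests was not unusual and that each request simply waited far longer for a free work process.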

In ST06 at the same time, CPU was about 90% idle, 8% user and 2% system. We have 8 CPUs, so they were not being hammered.

I have checked SM37 and there were no long running background jobs around the time that could have caused this.

A few months ago we went live in one of our big European countries, and concurrent users on the system have gone from 100 to about 180. Yesterday when I looked at SM04 we had 172 users active in 240 sessions.

On the system we currently have 30 dialog work processes (DIA), 7 UPD, 8 BGD, 2 SPO and 2 UP2. The SAP help in ST03 says large wait times are generally down to high CPU usage, but ours looked OK at the time. It also says the cause could be the number of work processes, which was my thought too, as the dispatcher appeared to be stalling while trying to allocate work processes. I remember from the ADM100 course that SAP generally suggested about 7 users to 1 dialog work process as a rule of thumb, so I guess with 30 dialog processes we may need to increase that.

Memory usage was good: EM used about 5 GB with 5 GB free, and ST06 was showing free memory at about 17 GB.
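The 7-users-per-dialog-process rule of thumb mentioned above can be worked through for the numbers in this post. This is a minimal Python sketch, not an SAP sizing tool: the function name is hypothetical, and whether "users" should mean logged-on users or sessions is an assumption either way, so both are shown.

```python
import math


def dialog_wps_needed(count: int, users_per_wp: int = 7) -> int:
    """Rule-of-thumb number of dialog work processes for a given user
    (or session) count, rounded up to a whole work process."""
    return math.ceil(count / users_per_wp)


configured_dia = 30  # dialog work processes currently configured

by_users = dialog_wps_needed(172)     # users seen in SM04
by_target = dialog_wps_needed(180)    # approximate concurrent users after go-live
by_sessions = dialog_wps_needed(240)  # sessions seen in SM04

print(f"by users: {by_users}, by ~180 users: {by_target}, by sessions: {by_sessions}")
print(f"configured: {configured_dia}")
```

Counted by users, the rule gives 25 to 26 dialog processes, which is below the 30 configured; counted by sessions it gives 35, which is above. So the rule of thumb alone does not settle whether 30 is too few.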

I spoke to our network team at the time and they said all was ok from their view.

Has anyone got any ideas what could cause this problem or an area I can look into further?

Thanks for any help.

Accepted Solutions (0)

Answers (2)


Former Member
0 Kudos

Hello ecnrip,

Check which task type has a high value for "wait time".

In ST03, select the day on which you saw the values that you sent us.

Click on each task type button (dialog, background, RFC, etc.) and check the "wait time" value each time.

When you have found the task type that causes your high wait time, click on the "time profile" button. The "wait time" column will show you when the problem appeared.

You can double-click to see what was running when the problem appeared.

It may be that you do not have enough work processes of type dialog or background. You could also change the scheduling of some background jobs if that is possible, add more work processes if you have enough resources on your system, or reduce parallel processing.

Have you any CPU resource problem?

Regards,

Jafer

Former Member
0 Kudos

Hi Jafer

The CPU for the time we had the problem was 90% idle. We have 4 dual-core multithreaded processors; Task Manager and SAP see 8 processors.

Former Member
0 Kudos

Dear encrip, Did you try increasing the number of dialog work processes?

Regards

Shantanu

Former Member
0 Kudos

Hi Shantanu

I have not changed the work processes yet. As this is a production instance it is not easy to get downtime, since we run 24x7 factories, so before I do anything I want to make sure it is correct (as much as you can!).

I found this documentation from SAP :

http://help.sap.com/saphelp_nw04/helpdata/en/02/962817538111d1891b0000e8322f96/content.htm

It states that for a 4-CPU Windows platform, which we have, you should have 20-25 work processes. I am more used to Unix, where the rule of thumb was to use your concurrent users as a measure for the number of dialog work processes, around 7 users to 1 work process.

We have dual-core multithreaded processors, so SAP/Windows actually recognise 8 CPUs, whereas we have 4 physical CPUs in the server, so I'm not sure what the SAP recommendation for that is.

Former Member
0 Kudos

Dear encrip,

SAP collects the hardware-related information using SAPOSCOL, which simply collects data from the operating system. As far as memory or CPU is concerned, this is where the information comes from.

Hence, I think that if your OS recognizes 8 CPUs then SAP will function accordingly, and you should configure the number of work processes based on that.

As I said, the switch in work process numbers between dialog and background can also be done using Operation Modes and you do not need to restart the system for that.

I was wondering if you could try that.

Here is some info on that

http://help.sap.com/saphelp_nw70/helpdata/en/c4/3a5e4f505211d189550000e829fbbd/frameset.htm

Regards

Shantanu

Former Member
0 Kudos

Hi Shantanu

I have operation modes set up already, reducing dialog and increasing BGD processes for the evening. I cannot really increase the DIA processes during the day, as the BGD processes are frequently used as they are set up, and I do not want to create a bottleneck in job processing. We run quite a lot of frequent jobs as part of our daily operation, so I do not want to impact them by leaving too few BGD work processes.

We have had no repeat of the problem since Tuesday, so I am going to monitor performance over the next week. The hourly average wait time has been between 0.2 and 0.8 ms, so it is good.

When I look at the CPU times in SM50, the last 4 dialog processes all have CPU times below 2 minutes. The system has been up since the start of September, so if there were a shortage of dialog processes I would expect their CPU times to be higher, as they would be used frequently.

I agree about the OS collector. I just never realised that on Windows the number of work processes was linked to the number of CPUs, rather than to the user base/memory, which is more significant on Unix.

Former Member
0 Kudos

Man.. I just realized i jumbled up your name lol

Sorry bout that ecnrip!!

Former Member
0 Kudos

Let me clarify the requirement for dialog work processes in the system.

Normally, when you size for the first time, I would recommend choosing the number of dialog processes = 4 * number of CPU cores. In your case that is 4 * 8 = 32.

The dialog process bottleneck that you have seen can happen for many reasons. First I would look at the 'User' and 'Type' of the running processes. Most of the time it may be caused by transactional RFCs.

In this case, you may look at your RFC resource configuration and adjust the number of dialog processes assigned for RFCs.

The average wait time over at least a month of workload is the deciding factor for adjusting the number of work processes. Still, the question is whether you can increase the number of processes on the same system or whether you need an additional application server. The deciding factor here is the average CPU time for dialog tasks.
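The sizing rule in the paragraph above can be written out as a small calculation. This is only a Python sketch of that rule as stated in this reply; the function name is hypothetical, and the two core counts reflect the thread's open question of whether to count the 8 CPUs the OS reports or the 4 physical processors.

```python
def initial_dialog_wps(cpu_cores: int, factor: int = 4) -> int:
    """Starting point for the number of dialog work processes:
    factor * number of CPU cores (factor 4 as suggested in this reply)."""
    return factor * cpu_cores


as_seen_by_os = initial_dialog_wps(8)  # 8 CPUs reported by Windows/SAP
physical_only = initial_dialog_wps(4)  # 4 physical processors

print(f"8 cores -> {as_seen_by_os} dialog processes")
print(f"4 cores -> {physical_only} dialog processes")
```

Using the 8 CPUs seen by the OS gives 32, slightly above the 30 dialog processes currently configured; using 4 would give 16.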

/Manoj

Former Member
0 Kudos

Hello encrip,

Also check whether some dialog processes had gone into PRIV mode at that time, as a result of some transaction accessing large amounts of data from the database.

This would result in those work processes not being able to multiplex, so the number of effectively available work processes is reduced, causing high wait time.

Regards

Shantanu

Former Member
0 Kudos

Hi Shantanu

No work processes went into PRIV mode at the time, I have checked memory statistics through ST03 and no Private Memory was used yesterday.

Former Member
0 Kudos

Hey,

In that case, my view is that high wait time can only be caused by an insufficient number of work processes. Try increasing the number of work processes in the instance profile if you can afford to reboot the box. Otherwise, please try configuring operation modes so that, for daytime, the number of BGD processes is reduced to 4 and the number of dialog work processes is increased by 4. Remember, though, that the total number of work processes should remain constant.
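To illustrate the constraint that the total must stay constant, here is a minimal Python sketch of the proposed daytime split, starting from the work process counts given in the original post (30 DIA, 7 UPD, 8 BGD, 2 SPO, 2 UP2). The class and function names are hypothetical and this only models the arithmetic, not the actual operation mode configuration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WorkProcesses:
    dia: int
    upd: int
    bgd: int
    spo: int
    up2: int

    @property
    def total(self) -> int:
        return self.dia + self.upd + self.bgd + self.spo + self.up2


def shift_bgd_to_dia(wp: WorkProcesses, count: int) -> WorkProcesses:
    """Move `count` work processes from background to dialog,
    leaving the total number of work processes unchanged."""
    if count < 0 or count > wp.bgd:
        raise ValueError("cannot move more background processes than exist")
    return replace(wp, dia=wp.dia + count, bgd=wp.bgd - count)


current = WorkProcesses(dia=30, upd=7, bgd=8, spo=2, up2=2)
day_mode = shift_bgd_to_dia(current, 4)

print(day_mode, "total:", day_mode.total)
```

With the figures from this thread, the daytime mode would have 34 dialog and 4 background processes, with the total unchanged at 49.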

Hope this helps

Regards

Shantanu A Sardeshmukh