
High CPU System Utilization : DIA Agents

Farid
Active Participant
0 Kudos

Hello,

The problem described in this post is different from the one described in this thread:

http://scn.sap.com/thread/3165911

So I created a dedicated thread, even though the consequences are the same (namely, the Diagnostic Agents are eating all the CPU).

In our case the SMD agents do not appear as top CPU processes, but the system CPU utilization is huge.

We had the problem for months and thought it was a hardware problem, until today, when I switched off the Diagnostic Agents.

We are running SAP Solution Manager 7.1 SPS6 alongside several other SAP production systems on our HP-UX Superdome.B.11.31 U ia64. The SAP Diagnostic Agent "On The Fly" feature has been activated.

We are dealing with huge CPU performance issues: on average, 30% of the system CPU was used by "something".

We have opened SAP customer calls and HP-UX calls, and performed a firmware update, without any tangible result.

I did not think the DIA agents were responsible, since they were not listed among the top CPU processes in transaction ST06, but I am starting to think that system CPU usage is not reflected there.

Anyway, after reading that thread, I thought I would try stopping the Diagnostic Agent:

Before stopping the Diagnostic Agent

After stopping the Diagnostic Agent

I think the screenshots speak for themselves: the bloody Diagnostic Agents were eating all our CPU (both user and system CPU).

Bravo to me, I have solved our CPU bottleneck... but our production systems are no longer monitored. Anything can happen now (DB, SAP, Unix, application) and we won't be alerted, which might be a problem to explain to the business.

Our SAP Host Agent patch number has been updated to 168.

Our SOL LM-Service patch level is SPS6 patch 2. SAP asked us to patch it to level 3. I will do that, but I am quite certain it will solve nothing, based on what is said in the previous post.

Is anyone from SAP reading this thread?

Thanks and Regards

Accepted Solutions (1)


0 Kudos

Hello Raoul,

I have encountered this kind of unusually high CPU consumption due to Diagnostics Agents from time to time. It is probably caused by the jstart.exe process.

The very first thing I would do is update LMSERVICE06P to the latest patch level.

NB: patch 5 is available.

Most of the issues I have experienced with RCA/MAI on different SP levels were eventually solved by an LMSERVICE or Wily patch.

The only problem is that sometimes I had to wait many weeks for the proper patch 😉

Worth a try at least...

Sebastien

Answers (5)


Former Member
0 Kudos

We upgraded to the very latest LM-Service SP3.

Have you checked your ulimit settings for the agent user ID (like smdadm)? SAP suggested the settings described here (look for ulimit):

http://wiki.scn.sap.com/wiki/display/SMSETUP/Diagnostics+Agent+Troubleshooting

We tried this on our test server, which was running high CPU for the agents, and it is behaving better now. We are currently making the same change to our assurance box.
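For anyone who wants to record the effective limits before and after such a change, here is a minimal sketch using Python's standard `resource` module. It assumes Python is available on the host and that the script is run as the agent user (for example smdadm or da1adm). The choice of limits to print is my own assumption, not SAP's list, so compare the output against the values on the wiki page linked in this post.

```python
import resource

# Limits often reviewed for a Java-based agent user. This selection is an
# assumption for illustration; the SAP wiki page is the authority on values.
LIMITS = {
    "open files (nofile)": "RLIMIT_NOFILE",
    "data segment (data)": "RLIMIT_DATA",
    "stack size (stack)": "RLIMIT_STACK",
    "max processes (nproc)": "RLIMIT_NPROC",
}


def fmt(value):
    """Render a limit value, mapping the 'infinity' marker to text."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for the limits this platform exposes."""
    found = {}
    for label, name in LIMITS.items():
        const = getattr(resource, name, None)  # some constants are platform-specific
        if const is None:
            continue
        soft, hard = resource.getrlimit(const)
        found[label] = (fmt(soft), fmt(hard))
    return found


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label:24s} soft={soft:>12s} hard={hard:>12s}")
```

The script only reports the limits of the process it runs in, so it reflects what a shell of that user would get, which is not necessarily what an already running agent process was started with.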

Farid
Active Participant
0 Kudos

Hello,

I got authorization from the business team to restart the DIA agents this week.

These are the changes I made beforehand:

  • Updated the Solution Manager component LM-SERVICE to the latest patch: sap.com LM-SERVICE 7.10 SP6 (1000.7.10.6.5.20130904125700)
  • Set the parameter job.dbinfo.disable = true for the main agent + agents on the fly
  • Checked the ulimit settings of user da1adm; they are OK

We noticed some improvement after restarting the DIA agents:

When the SMD agents are down, CPU idle is around 40-60%; when the SMD agents are up and running, CPU idle is now around 15-35%.

This means that we can let the DIA agents run during periods of low or moderate business activity, but we will have to shut them down during month end.

I am obviously still trying to improve things with SAP Support, but the problem seems complex: when looking at the top CPU processes (glance, top, ST06), the da1adm processes are not displayed as top consumers, which would suggest that it is not the agents themselves that cause the high CPU usage but rather the calls that they make.

FYI, here is a screenshot of table E2E_RESOURCES

Thanks

Former Member
0 Kudos

Can you attach a list of the top processes?

Farid
Active Participant
0 Kudos

Hi Roman,

Here is the output of the top command :

Farid
Active Participant
0 Kudos

Hello All,

FYI.

SAP Support has analyzed the latest thread dumps and suggests the following:

In order to isolate the problem, I kindly ask you to temporarily disable the e2edcc dbinfo job. You can do this by updating the following property from Agent Administration, application com.sap.smd.agent.application.e2edcc, for the concerned agents (main agent + agents on-the-fly):

job.dbinfo.disable = true

I am still waiting for the business team to give me approval to restart the Diagnostic Agents, so that I can test the new settings. I will let you know.

Have a nice day

Lluis
Active Contributor
0 Kudos

Hello Raoul,

We had a similar situation after a Solution Manager upgrade; for us, the problem was the configuration of table E2E_RESOURCES.

Can you show us the content of that table ?

Former Member
0 Kudos

I have a similar situation with the on-the-fly option on HP-UX Superdome.B.11.31 U ia64. Can you tell us what changes you made to table E2E_RESOURCES?

TomCenens
Active Contributor
0 Kudos

Hi Raoul

You should really go forward through SAP support for this. It's good that you notify the community of potential problems. Based on this discussion for example, I've asked infrastructure to check if we see strange behaviour like this here but I'm not so sure because it can be very case specific - bound to combinations of OS / params / software / ...

I assume that this is not general behaviour though. SAP cannot possibly test every combination that exists because there are too many elements involved, too many combinations that can be made.

If you're not getting anywhere with SAP support in the end, I can try to get hold of the right people who would be interested in this, but normally you should get there through SAP support as well.

Best regards

Tom

Farid
Active Participant
0 Kudos

Hello Tom,

I did indeed open an SAP customer call last week.

SAP Support asked me to generate a thread dump of the DIA agent, and I sent them the results.

Here is another BEFORE/AFTER comparison, taken when I started the SMD agent last week in order to generate the dumps. It is quite impressive.

BEFORE :

Moderate User CPU utilization

Very Low System CPU utilization

CPU Idle is High

AFTER (less than 15 minutes after the first screenshot) :

Very High User CPU utilization

High System CPU utilization

CPU Idle is almost null

The phone is ringing and users are yelling: "SAP is slow!"

Each one of the 10 CPUs on the server is impacted... and we only have 7 Diagnostic Agents.

TomCenens
Active Contributor
0 Kudos

Hi Raoul

I've received feedback from our infrastructure support team, who monitored the DIA agents after I notified them about this thread, and we don't see the same behaviour. CPU usage of the DIA agents is very low here.

We also have the DIA agents on-the-fly mechanism set up and running. If you want, I can get the technical details of the combo that is in use.

Best regards


Tom

Former Member
0 Kudos

Can you attach the generated thread dump?

Farid
Active Participant
0 Kudos

Hi Roman,

Here is the thread dump; I only changed the server name.

Thanks and regards

Former Member
0 Kudos

In addition to CPU utilization, can you also provide the output of the vmstat command (OS level) or a screenshot of memory utilization (OS07N) before and after starting the SMD agent?

Farid
Active Participant
0 Kudos

Hi Roman,

I will try to restart the DIA Agents, but since we are approaching the month end, I might need to wait for a few days ...

I haven't had any answer from SAP Support yet

Former Member
0 Kudos

Please collect CPU utilization per core with any OS tool (like sar). At the same time, also collect the output of the vmstat command. Attach the results to the message.
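To compare a "before" and an "after" capture without eyeballing the columns, a small Python sketch like the one below can average the user, system and idle CPU percentages from saved vmstat text. It assumes the column header row contains `us`, `sy` and `id` in that order (vmstat layouts differ between operating systems). The function name and the sample figures are hypothetical and are not measurements from this thread.

```python
def average_vmstat_cpu(text):
    """Average the us/sy/id CPU columns over all numeric rows of vmstat text."""
    header = None
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if "us" in tokens and "id" in tokens:
            header = tokens  # column-name row, e.g. "... us sy id"
            continue
        if header is None or len(tokens) != len(header):
            continue
        if all(t.isdigit() for t in tokens):
            rows.append([int(t) for t in tokens])
    if header is None or not rows:
        raise ValueError("no vmstat samples with us/sy/id columns found")

    # "sy" can appear twice (syscalls and system CPU), so anchor on the last "us".
    us = len(header) - 1 - header[::-1].index("us")
    sy, idle = us + 1, us + 2
    if header[sy:idle + 1] != ["sy", "id"]:
        raise ValueError("unexpected CPU column order")

    n = len(rows)
    return {
        "us": sum(r[us] for r in rows) / n,
        "sy": sum(r[sy] for r in rows) / n,
        "id": sum(r[idle] for r in rows) / n,
        "samples": n,
    }


# Hypothetical capture for illustration only.
SAMPLE = """\
         procs           memory                   page                              faults       cpu
    r     b     w      avm    free   re   at    pi   po    fr   de    sr     in     sy    cs  us sy id
    3     0     0   812345  204800    0    0     0    0     0    0     0   1200   9000  2500  20  5 75
    9     1     0   845678  198400    0    0     0    0     0    0     0   4100  52000  9800  55 40  5
"""

if __name__ == "__main__":
    print(average_vmstat_cpu(SAMPLE))
```

On many systems the first vmstat row reports averages since boot rather than a live interval, so it may be worth dropping that row before averaging.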