Very slow transactions/ SQL queries - CPU=100% idle and Memory=70% free

former_member193294
Active Participant
0 Kudos

Hi all.

I have a very strange problem with a test SAP system after it was refreshed from a PROD copy. Many users are complaining that their transactions hang, and indeed I can see in SM66 that their processes run for a long time and then time out. Some users call Workflow transactions that take ~50 min to respond. Others run customized programs that execute SELECT queries on Z tables.

However, in ST06 the CPU is continuously 100% idle (IBM p6 Series; 24 CPUs on the application server, 4 CPUs on the database server, in different LPARs). Total memory is 72 GB, and 50 GB is always free. It is such a powerful machine, and still the users are complaining.

I checked ST03n and I see long wait times for RFC.

Could you please throw in an idea of where else I could look for hints? I am running out of ideas. I have checked all possible logs and transactions (no dumps, SM21 is clean, and on the O/S (IBM AIX) wp_disp and dev_w* showed no errors).

Thanks in advance.

Loukas

Accepted Solutions (0)

Answers (2)

Former Member
0 Kudos

Any hints in ST02? SAP or database memory parameters configured much too low perhaps? Where did you get your current parameters from?

What is your database by the way?

former_member193294
Active Participant
0 Kudos

Thanks guys for the replies.

The database is Oracle 10.2, which, by the way, was patched to 10.2.0.4 (Patch Set 3) after the refresh.

If the patch were the issue, the transactions would also be slow on the second test machine, where everything runs as expected.

ST03n shows :

~28.000ms response time for DIA

~7.000ms response time for RFC

~3.000ms response time for BTC/AutoABAP

The rest are about 1.000ms or less.

Long "Roll Wait Time" also for RFC ~6.300ms
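To put the RFC figures above in proportion, here is a minimal Python sketch that expresses the roll wait time as a share of the RFC response time. It uses only the approximate values quoted above and assumes that the roll wait time is counted inside the response time; the function name is purely illustrative.

```python
def share_of_response(component_ms: float, response_ms: float) -> float:
    """Return a component's share of the total response time, in percent."""
    if response_ms <= 0:
        raise ValueError("response time must be positive")
    return 100.0 * component_ms / response_ms


# Approximate ST03n averages quoted in this post (milliseconds).
rfc_response_ms = 7000
rfc_roll_wait_ms = 6300

roll_wait_share = share_of_response(rfc_roll_wait_ms, rfc_response_ms)
print(f"RFC roll wait share: {roll_wait_share:.0f}%")
```

Under that assumption, about 90% of the average RFC response time is roll wait, which would mean the work processes are mostly waiting rather than computing.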

Regarding indexes, I have compared the tables and indexes of the main affected transactions between the two systems that use the same PROD copy, and they are identical.

I have not run SQL traces yet; they are deactivated. I would have to activate them while the users are running the transactions and deactivate them afterwards. That is a bit difficult to synchronize, as the users are mainly in France and Spain and I am in Germany.

The parameters are the same as on our PROD SAP system, as it was recently refreshed. The test system with the issue has the same hardware as the PROD one:

Appl Server 24CPUs / 72GB RAM

DB Server 4 CPUs / 26 GB RAM

DB statistics are running every day and they are successful. No missing indexes etc.

By the way, how could I check the compiling?

Thanks again,

Loukas

Edited by: Loukas Rougkalas on Aug 24, 2009 2:56 PM

Former Member
0 Kudos

Not sure but check your network speed too.

Former Member
0 Kudos

Yet another idea: Check CPU usage from OS level as well. Just to make sure values from ST06 are correct. They might not be updated, because of a resource bottleneck...

former_member193294
Active Participant
0 Kudos

I have done it already:

CPU  User%  Kern%  Wait%  Idle%  Physc  Entc
ALL    2.1    2.2    0.0   95.7   0.14   4.5

PAGING          MEMORY
Faults   6      Real,MB    71680
Steals   0      % Comp     17
PgspIn   0      % Noncomp  16
PgspOut  0      % Client   16
PageIn   0
PageOut  0
Sios     0

PAGING SPACE
Size,MB  65536
% Used   0
% Free   100

Name      PID      CPU%  PgSp  Owner
disp+wor  2019352  0.9   29.1  st2adm
saposcol  1986642  0.2    3.9  st2adm
topas     2023494  0.1    3.0  st2adm
PatrolAg   618708  0.1   22.1  patrol
snmpmagt   770066  0.0    3.5  patrol
p_ctmag   2175074  0.0    0.4  root
java       475370  0.0   44.5  root
igspw_mt  2093082  0.0   16.1  st2adm
igsmux_m  1220840  0.0   10.4  st2adm
igspw_mt  2097178  0.0   16.1  st2adm
rt-fcpar   360596  0.0    0.4  root
gil        249978  0.0    0.9  root
rt-fcpar   471216  0.0    0.4  root
java       647370  0.0   33.4  st2adm
prole      352280  0.0    3.0  root
gwrd       745674  0.0    6.9  st2adm
sshd      2183354  0.0    0.7  lrougka
rtcmd      303136  0.0    1.1  root
sendmail   291014  0.0    1.0  root
nmon12e_  2113560  0.0    4.8  root

Edited by: Loukas Rougkalas on Aug 24, 2009 3:20 PM

Former Member
0 Kudos

And disk response times in ST06?

former_member193294
Active Participant
0 Kudos

From topas:

Disk     Busy%  KBPS  TPS  KB-Read  KB-Writ
hdisk1     2.0  22.0  5.0      0.0     22.0
hdisk0     1.0  22.0  5.0      0.0     22.0
dac1       0.0   0.0  0.0      0.0      0.0
dac1utm    0.0   0.0  0.0      0.0      0.0
dac2       0.0   0.0  0.0      0.0      0.0
dac2utm    0.0   0.0  0.0      0.0      0.0
dac3       0.0   0.0  0.0      0.0      0.0
dac3utm    0.0   0.0  0.0      0.0      0.0
hdisk2     0.0   0.0  0.0      0.0      0.0
hdisk3     0.0   0.0  0.0      0.0      0.0
hdisk4     0.0   0.0  0.0      0.0      0.0
hdisk5     0.0   0.0  0.0      0.0      0.0
hdisk6     0.0   0.0  0.0      0.0      0.0
hdisk7     0.0   0.0  0.0      0.0      0.0
hdisk8     0.0   0.0  0.0      0.0      0.0
hdisk9     0.0   0.0  0.0      0.0      0.0
hdisk10    0.0   0.0  0.0      0.0      0.0
hdisk11    0.0   0.0  0.0      0.0      0.0

ST06 is not giving any statistics on any of our systems.

Former Member
0 Kudos

I can't comment on this output, because I am not familiar with topas.

And I haven't got any other idea now, sorry.

JPReyes
Active Contributor
0 Kudos

ST03n shows :

~28.000ms response time for DIA

~7.000ms response time for RFC

~3.000ms response time for BTC/AutoABAP

Judging by these, there must be something wrong with your settings... have you compared the db settings against your PRD system?...

Also, you can do an SQL trace via ST01 and see what sort of response you're getting from the statements.

Regards

Juan

former_member193294
Active Participant
0 Kudos

Sorry my mistake:

After the system refresh, only the application server parameters are the same as on PROD. The DB parameters remain the original ones of the test system.

However, we performed another system refresh on this test system back in June, and the system worked fine afterwards, with the application parameters overwritten by PROD and the DB parameters left at their original values.

I ran an SQL trace in ST01 and I see some high "Lasts (us)" values in the 3rd column:

14:31:17:373 SQL 5861,540 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 8,434 Ret.Value: 1,403
14:31:23:236 SQL 4 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 8,434 Ret.Value: 0
14:31:23:236 SQL 5903,161 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 8,434 Ret.Value: 1,403
14:31:29:140 SQL 4 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 8,434 Ret.Value: 0
14:31:29:140 SQL 5873,093 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 8,434 Ret.Value: 1,403
14:31:35:15 SQL 3 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 8,434 Ret.Value: 0
14:31:35:15 SQL 5895,909 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 8,434 Ret.Value: 1,403
14:31:40:913 SQL 4 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 8,434 Ret.Value: 0
14:31:40:913 SQL 5865,997 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 8,434 Ret.Value: 1,403
14:31:46:780 SQL 3 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 1,739 Ret.Value: 0
14:31:46:780 SQL 562 ZAOPTD_INV_HDR Prog: ZAOP_CL_GDBL_BUFFER===========CP Row: 1,739 Ret.Value: 1,403
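For anyone who wants to total such trace lines instead of reading them by eye, here is a rough Python sketch. It assumes the comma in the third column is a thousands separator, so "5861,540" means 5,861,540 microseconds; that reading fits the roughly 5.9 second gap between consecutive timestamps. The sample lines are a shortened copy of the output above, and the function names and the one-second threshold are made up for illustration.

```python
import re

# Sample lines copied from the ST01 output above (program name omitted).
TRACE = """\
14:31:17:373 SQL 5861,540 ZAOPTD_INV_HDR Row: 8,434 Ret.Value: 1,403
14:31:23:236 SQL 4 ZAOPTD_INV_HDR Row: 8,434 Ret.Value: 0
14:31:23:236 SQL 5903,161 ZAOPTD_INV_HDR Row: 8,434 Ret.Value: 1,403
14:31:46:780 SQL 562 ZAOPTD_INV_HDR Row: 1,739 Ret.Value: 1,403
"""

# timestamp, literal "SQL", duration in microseconds, table name
LINE = re.compile(r"^(\d+:\d+:\d+:\d+)\s+SQL\s+([\d,]+)\s+(\S+)")


def parse_trace(text):
    """Yield (timestamp, duration_us, table) for each SQL trace line."""
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            start, duration, table = match.groups()
            # Assumption: commas are thousands separators, not decimals.
            yield start, int(duration.replace(",", "")), table


def slow_calls(text, threshold_us=1_000_000):
    """Return the trace records that lasted at least threshold_us."""
    return [rec for rec in parse_trace(text) if rec[1] >= threshold_us]


records = list(parse_trace(TRACE))
slow = slow_calls(TRACE)
total_s = sum(duration for _, duration, _ in records) / 1_000_000
print(f"{len(slow)} of {len(records)} calls took 1 s or longer")
print(f"total database time: {total_s:.2f} s")
```

On the four sample lines, two calls exceed one second and account for practically all of the elapsed time, which points at the statement itself rather than at the number of calls.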

Here is also the execution plan:

SELECT DISTINCT "GUID"
FROM "ZAOPTD_AST_HDR"
WHERE "MANDT" = :A0 AND "OBJID" = :A1 AND "IS_DELETED" = :A2

Execution Plan

SELECT STATEMENT ( Estimated Costs = 1,576 , Estimated #Rows = 1 )
  3 HASH UNIQUE
    ( Estim. Costs = 1,576 , Estim. #Rows = 1 )
    Estim. CPU-Costs = 86,771,521 Estim. IO-Costs = 1,561
    2 TABLE ACCESS BY INDEX ROWID ZAOPTD_AST_HDR
      ( Estim. Costs = 1,575 , Estim. #Rows = 1 )
      Estim. CPU-Costs = 80,717,708 Estim. IO-Costs = 1,561
      1 INDEX RANGE SCAN ZAOPTD_AST_HDR~01
        ( Estim. Costs = 1,574 , Estim. #Rows = 1 )
        Search Columns: 2
        Estim. CPU-Costs = 80,716,218 Estim. IO-Costs = 1,561

Access Predicates Filter Predicates

Edited by: Loukas Rougkalas on Aug 24, 2009 4:44 PM

Edited by: Loukas Rougkalas on Aug 24, 2009 4:45 PM

Edited by: Loukas Rougkalas on Aug 24, 2009 4:48 PM

Former Member
0 Kudos

Your memory, CPU and disk values are nominal.

It looks like the main chunk is in the dialog response time.

In ST03, in order to determine whether the issue lies in SAP or in the database, look at the DB response time that corresponds to the ~28000ms dialog response time.

Also, the top dialog response times in ST03N can give an idea of which transactions are involved, along with their response time and network time.

A high roll wait time indicates that the delay could be due to RFC communication between systems.

Is this system connected to any BI system, which can obviously put a lot of load on it?

Can you confirm whether the kernel, Oracle patch and Oracle client versions are the same on your PROD and this test system?

My other suspected areas are invalid generation, SAPGUI and network.

Go to SGEN => Regenerate existing loads => Only generate objects with invalid load.

This will regenerate the objects whose loads are invalid.

Was there any change in SAPGUI compared to the prior refresh?

former_member193294
Active Participant
0 Kudos

Thanks, Vijay, for your response.

Regarding Oracle patches, Oracle client and kernel, the situation is as follows:

The kernel is identical to PROD.

Oracle patch is 10.2.0.2 on PROD vs 10.2.0.4 on TEST. I am planning to upgrade PROD as well in September.

Basically the same patch (10.2.0.4) has been applied on another test system (using the same PROD copy as the problematic one), where the transactions are much faster.

Indeed there was an issue with SAP GUI. We had several users on SAP GUI 6.20, some on 6.40, and some had upgraded to 7.10.

Most of them have been advised to upgrade to 7.10, patch 13. But in what way would the SAP GUI version affect the running transactions/SQL queries? The slow responses have come from mixed users, mainly from the ones using 6.20 and 7.10.

By the way, the SAP systems PROD and TEST are 4.7 EE SR2.0.

Our O/S guys, together with the NW people, have ruled out any network problems between the application and database servers.

The last thing I am going to do is to append the Oracle parameters from initSID.ora for TEST and PROD.
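For comparing the two parameter files side by side, a small Python sketch along these lines might help. It assumes plain "name = value" lines with "#" comments, which is the usual init.ora layout; the parameter values shown are hypothetical and not taken from the systems in this thread, and the function names are illustrative only.

```python
def parse_init_ora(text):
    """Parse 'name = value' lines of an init.ora style text into a dict."""
    params = {}
    for raw in text.splitlines():
        # Drop comments; assumes '#' never appears inside a value.
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        name = name.strip().lower()
        if name.startswith("*."):  # optional instance prefix
            name = name[2:]
        params[name] = value.strip()
    return params


def diff_params(prod, test):
    """Return {name: (prod_value, test_value)} for parameters that differ."""
    names = sorted(set(prod) | set(test))
    return {
        name: (prod.get(name), test.get(name))
        for name in names
        if prod.get(name) != test.get(name)
    }


# Hypothetical values, for illustration only.
prod_text = """
db_cache_size = 4G
shared_pool_size = 2G
optimizer_index_cost_adj = 20   # set on PROD only
"""
test_text = """
db_cache_size = 1G
shared_pool_size = 2G
"""

differences = diff_params(parse_init_ora(prod_text), parse_init_ora(test_text))
for name, (prod_value, test_value) in differences.items():
    print(f"{name}: PROD={prod_value} TEST={test_value}")
```

A value of None on one side means the parameter is missing from that file and therefore running at its default. In practice the two texts would be read from the respective initSID.ora files, whose location depends on the installation.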

I will also run the SGEN to check the invalid state of the objects.

Thanks anyway for the moment.

Rgds,

Loukas

Former Member
0 Kudos

Hello Loukas,

One more question: you upgraded the test system to Oracle 10.2.0.4. Did you also apply all the interim patches that are listed in SAP note 1137346?

There are quite a few among them that are designed to solve performance problems, especially Optimizer-Merge-Patch 8599814.

regards

former_member193294
Active Participant
0 Kudos

Hi Joe,

I have installed the 10g Release 2 (10.2.0.4) Patch Set 3 for AIX 5L Based Systems (64-Bit), dated 15.07.2008, from SAP NET.

The README file of the patch does not mention the bug fix you wrote about.

Is it a prerequisite for the Oracle upgrade to apply several interim patches?

Rgds,

Loukas

Former Member
0 Kudos

Yes, SAP note 1137346 says those patches are required. And there is a reason for this. I am quite sure performance will be much better afterwards.

Try and see!

regards

JPReyes
Active Contributor
0 Kudos

Indeed very odd.

Now, have you done an SQL trace to see if the indexes are being used properly? Also, is it compiling? Have you checked the response times in ST03n? Have you run the statistics?

What version is your DB?

Regards

Juan