
After table partition, some transactions hang

darren_keenan
Explorer
0 Kudos

I am running SAP 4.6 and Oracle 10.2

I recently used online reorg to partition table DFKKOP (over 100 million records) into two parts. One partition was for "closed" orders, and the other was for "open" orders (less than 2% of the rows).

Since most of the queries against this table were against open orders, access times improved dramatically (one query went from 20 minutes to 20 seconds).

Unfortunately, while it was on a test system, our developers discovered that this caused transactions FPL9 and CIC0 to hang. To prove that it was the partition causing the hang, we used our "snapback" ability to return the database to the point in time prior to the partition (we did this twice), and the problem disappeared. Other than these hangs, the database seemed to work normally.

Does anyone have an idea why this might be happening? BTW - I had "row movement enabled", so that's not it. I have found nothing in SAP notes or on this forum.

Thanks.

Accepted Solutions (1)

Former Member
0 Kudos

Hello Darren,

I have seen several SAP customers in the IS-U area use partitioning with success. I recently partitioned e.g. DBERCHZ1 / DBERCHZ2 / DBERCHZ3 at one IS-U customer based on BELNR number ranges (DBERCHZ1 had a size of 2.8 TB, so it was important to make this table manageable again by cutting it into smaller pieces). So nothing speaks against partitioning, and if you are able to separate 2 % of useful data from 98 % of old data, partitioning can really be a good idea.

There can be various reasons why certain activities slow down after partitioning - particularly if you don't pay attention to some details, e.g.:

- Proper local and global CBO statistics must be in place (see the statistics sketch after this list)

- The index design (local, global, non-partitioned) must be considered carefully.
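For the first point, here is roughly what gathering both partition-level and global statistics could look like (the schema and table names are the ones used elsewhere in this thread; the parameters are only illustrative, not a recommendation for your system):

SQL> begin
  dbms_stats.gather_table_stats(
    ownname          => 'saperp',
    tabname          => 'dfkkop',
    granularity      => 'ALL',                        -- global and partition-level statistics
    estimate_percent => dbms_stats.auto_sample_size,
    cascade          => true);                        -- include the indexes
end;
/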

In 2009, SAP will release a "partitioning engine" integrated into the SAP DDIC transactions that helps to set up and administer partitioning in a good way.

Kind regards

Martin

darren_keenan
Explorer
0 Kudos

Thanks, Martin.

I used a single global index. From what I have read it seems there is nothing wrong with this approach (as opposed to a partitioned index).

Just so you know, the performance didn't degrade over time. There was an immediate performance improvement whenever a query hit the DFKKOP table. However, after the partitioning was in place, the two transactions I mentioned earlier either hung or timed out.

As soon as I rolled back the system, those two transactions started working again, but the performance dropped for queries against DFKKOP.

One possible answer, it seemed to me, was that a row being updated and forced to move from one partition to another might cause a hang. However, I just had a conversation with the developer who discovered the problem, and he said neither of these transactions was updating data, so that hypothesis is blown.

The other likely possibility we considered was that one of the (seven) indexes against this table was never recreated after the partitioning. There is no way to check that at this point, so it's only speculation, but it seems unlikely. In the first place, I don't think performance would have improved if we were missing any of the other 7 indexes. It also seems unlikely we would not have noticed this on two different systems. Nevertheless, I will check this when I get another system to test on.

This is a big issue, because we are not allowed, for regulatory reasons, to archive off old data, and we have some data going back a decade.

If we could partition several large tables, we might be able to cut the size of our "active" database by over 50%, and some estimates go much higher. Consider what would happen to the size and speed of your database if you could chop it down to just 18 months' worth of data "most of the time".

Another possibility is that if we found some of the data to be relatively static, we could take one backup of that data, and then stop running backups for that portion of the data. Anyone out there who would like to cut their backup window in half?

Well, hopefully, this will all be resolved at least when the partitioning engine comes out.

Thanks.

Former Member
0 Kudos

Yes, with a global index you should be on the safe side. I would not expect any significant performance degradation due to this index type.

Next time you experience hanging transactions related to database accesses, you can do a SQL analysis. See SAP note 766349 for more information.

Regards

Martin

Answers (5)

darren_keenan
Explorer
0 Kudos

After much research and experimentation, it looks like the problem is in the partition type.

Two basic ways to partition are range-based (like 1-3, 4-6, 7-9, for a total of 3 partitions) and value-based, i.e. list partitioning (like 1, 2, 3, 4, 5, 6, 7, 8, 9, for a total of 9 partitions).

In my case I tried a value (list) partition, but it appears that it creates a problem with certain types of queries against the table, which made it look as if certain transactions were affected by the partitioning.

Once I redid the partitioning using ranges instead of discrete values, the transactions stopped hanging.
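To make the difference concrete, here is a rough sketch of the two variants (the column list and the partition values are only illustrative, loosely based on the AUGST status field used elsewhere in this thread):

SQL> -- list (value) partitioning: one partition per discrete value - the variant that caused me trouble
create table saperp.dfkkop_list
  (opbel varchar2(12), augst varchar2(1))
  partition by list (augst)
  (partition p_open   values (' '),
   partition p_closed values ('9'),
   partition p_other  values (default));

SQL> -- range partitioning: partitions cover ranges of values - the variant that worked
create table saperp.dfkkop_range
  (opbel varchar2(12), augst varchar2(1))
  partition by range (augst)
  (partition p_open   values less than ('0'),
   partition p_closed values less than (maxvalue));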

Former Member
0 Kudos

I am also thinking of partitioning a large table - a custom table that we use instead of GLPCA. It is 40 GB. Would you happen to have a how-to approach for doing that? How did you handle the ABAP Dictionary side - or do you just do it in Oracle? Thanks!

PS: I was thinking of doing something by fiscal year, or by account number range. Not sure how I could partition by FY for years moving forward, meaning FY 2002 is one partition, 2003 is another, etc.

darren_keenan
Explorer
0 Kudos

Sandeep,

I did most of the work at the Oracle level - although I might have taken a peek at DB02 or some such - and I used SE14 to activate/build the indexes. The number of steps is not too large. Here is the outline I used and some hints to save you some trouble.

Tips:

A) You need LOTS of storage to do this. The process creates logs, adds to temp, and you are adding a new index - and that is on top of the size of the new table itself. Rule of thumb: at least two times the size of the table.

B) Issue the following statement before starting.

SQL> ALTER SESSION ENABLE RESUMABLE TIMEOUT N

(N = seconds, default is 3600)

-If you monitor for problems (such as running out of space), this causes the process to simply suspend for up to N seconds, giving you time to fix the problem, instead of erroring out and forcing you to start all over. This can be really handy when you are 6 hours into a 7-hour process. (See also the DBA_RESUMABLE check at the end of these tips.)

C) If you do blow up, you have to run this:

SQL> exec dbms_redefinition.abort_redef_table('saperp','dfkkop','dfkkop2');

D) I built a temporary tablespace just for the purposes of the reorg ("temp4reorg") to keep things clean.

E) Don't forget to delete the old, non-partitioned table when you are done.
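As a follow-up to tip B: if a statement does suspend, the standard DBA_RESUMABLE view shows what it is waiting for. This is nothing specific to the reorg, just plain Oracle:

SQL> select name, status, error_msg from dba_resumable;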

Basic Steps:

1) Test whether the table can be partitioned (no blobs)

SQL> begin
  dbms_redefinition.can_redef_table('saperp', 'dfkkop');
end;
/

2) Create a modified copy of the table structure (not data) to a new table. This new table should have the partitions built in.

SQL> create table saperp.dfkkop2
  storage (initial 150M next 150M)
  partition by range (augst)
    (partition pspace values less than ('0') tablespace temp4reorg,
     partition p9     values less than ('A') tablespace temp4reorg,
     partition pnull  values less than (maxvalue) tablespace temp4reorg)
  as select * from saperp.dfkkop
  where 1 = 2;

-Don't forget this:

SQL> alter table saperp.dfkkop2 enable row movement;

3) Populate the table using the DBMS_REDEFINITION tools. This can take hours.

SQL> begin
  dbms_redefinition.start_redef_table(
    'saperp',                           -- schema
    'dfkkop',                           -- original table
    'dfkkop2',                          -- interim (partitioned) table
    null,                               -- no column mapping
    dbms_redefinition.cons_use_rowid);  -- use rowids instead of a primary key
end;
/

4) Depending on how busy that table was with write operations, you may need to resynch (which "catches up" all the data changed since you started the conversion). This is usually quick.

SQL> begin
  dbms_redefinition.sync_interim_table('saperp', 'dfkkop', 'dfkkop2');
end;
/

5) Copy dependent objects. I have had some trouble with this. It would be good if you had access to a tool like Toad to make sure everything is copied correctly.

declare
  retval number(5);
begin
  dbms_redefinition.copy_table_dependents(
    'saperp',
    'dfkkop',
    'dfkkop2',
    1,        -- copy indexes with their original parameters
    true,     -- copy triggers
    true,     -- copy constraints
    true,     -- copy privileges
    true,     -- ignore errors
    retval);  -- out parameter: number of errors encountered
  dbms_output.put_line(retval);
end;
/

6) Use the tools to automatically swap the partitioned table in for the old one (you still have to delete the old table afterwards). This step is very fast.

SQL> begin
  dbms_redefinition.finish_redef_table('saperp', 'dfkkop', 'dfkkop2');
end;
/

7) Build the new index for the partitioned table. This can also take a long time.

create index saperp.dfkkop_part
  on saperp.dfkkop (augst)
  tablespace psapdfkkopi
  storage (initial 50M maxextents 500 next 50M)
  parallel 4;
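8) Optional sanity check: verify that the partitions really exist and hold data. This is just a plain dictionary query, not part of the redefinition itself (num_rows is only filled once statistics have been gathered):

SQL> select partition_name, high_value, num_rows
  from dba_tab_partitions
  where table_owner = 'SAPERP'
  and table_name = 'DFKKOP';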

Former Member
0 Kudos

>

> I am also thinking of partitioning a large table - a custom table that we use instead of GLPCA. It is 40 GB. Would you happen to have a how-to approach for doing that? How did you handle the ABAP Dictionary side - or do you just do it in Oracle? Thanks!

>

> PS: I was thinking of doing something by fiscal year, or by account number range. Not sure how I could partition by FY for years moving forward, meaning FY 2002 is one partition, 2003 is another, etc.

Hi Sandeep,

Partitioning is a really cool feature - but it must be weighed against the primary goal you want to achieve.

It influences WHICH partitioning method you have to use AND which partitioning keys you have to find.

If you see it like a triangle you can have these goals:


                               Performance
                                     /   \
                                  /   (*)  \
                               /               \
                            /                     \
                         /                           \
                       /                                 \
                     /- - - - - - -  - - - - - - - - - - - \ 
     Administration                                  Availability

You will hardly achieve all three (which means you generate some overhead for the other goals).

It may be that you sit in the corner of one goal - see the location of (*).

Our goal was to improve write performance (in the first place):

We did it on an SAP BW system, on our own Z tables which were growing bigger and bigger (several hundreds of GB). As we debugged how BW itself manages its partitions, we used this knowledge and a little ABAP to maintain the partitions for our own tables.

Ours was range partitioning, with an identifier (INT4) used as a sequence to flag current and historical calculations as the partition key. We could now TRUNCATE partitions instead of DELETE (the UNDO generation had blown up the batch time window for our computations of figures).

And yes, we benefit from partition pruning - insofar as you had a lot of full table scans, which are now turned into partition scans. Index accesses don't benefit that much from partitioning - but it really depends on HOW your application uses the data.

All your queries should use the partition key as a criterion (in your case FISCYEAR) - otherwise you make things worse.
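A quick way to check whether a given statement really prunes (a generic sketch, using the DFKKOP example from earlier in this thread - look at the Pstart/Pstop columns of the plan output):

SQL> explain plan for
  select count(*) from saperp.dfkkop where augst = ' ';

SQL> select * from table(dbms_xplan.display);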

Be aware that you have to use the right data type (imagine a range partition on a VARCHAR2 column that is used like a NUMBER data type, as in SAP tables) and evaluate it carefully against the partition type you choose.

  • There are some pitfalls, like moving partition keys (which you have to avoid).

  • There is the overhead of creating / dropping / merging partitions, and someone has to be responsible for it (in our case the application and not an SAP admin).

  • Maintain all indexes as local.

You can use BRSPACE to partition the table; with your 40 GB, SE14 could also be used, as you can define the partitions there (hopefully you are at least on a 6.20 system).

For our system it was essential

1) to fully understand the concept of partitioning (seems obvious, but guess where things most often fail),

2) that it is supported by standard SAP tools (as in SE14 or BRSPACE),

3) to write an interface to handle the partitioning without involving a DBA

(imagine you have a process chain that uses your partitioned table and you have to provide a new partition while the chain is running).

We have been running it for more than a year now, and it was a good decision to move to partitioning.

Next we will extend the feature and use it for archiving on a partition basis.

Bye

yk

Former Member
0 Kudos

Thank you for your input.

Former Member
0 Kudos

Thank you for your input.

darren_keenan
Explorer
0 Kudos

I am working on a theory about why this problem happens. Hopefully it will help someone else.

Let me restate the problem. I was interested in partitioning a very large orders table based on whether the orders were "open" or "closed". This table has over 100 million records, and a simple count of open records took 20 minutes. After I partitioned the table, the same count dropped to about 25 seconds. This kind of performance improvement is well worth investigating, and there were equivalent gains in other queries (depending on how central this table was to the query).

Unfortunately, two key SAP transactions appeared to hang, so there could be no implementation unless that issue was handled.

I currently believe that the transactions did not hang, but timed out because there was a problem with the indexes. I say this because after the table reorganization was done, I deleted all the indexes on this table, and rebuilt them. After this, the problematic transactions ran (and more quickly than before the partition).

In order to prove my "fix" worked, I went through the whole process again, but "unfortunately" I didn't have a problem. The transactions worked, and it is difficult to fix what isn't broken. Still, I could not implement the partitioning without understanding what had gone wrong before.

Today, I came across some information that the reorganization tool may not be able to automatically build the indexes if you partition by list. This would explain why rebuilding the indexes seemed to fix the problem. It also would explain why there was no need to fix the problem when I repeated the test because the second time I used range partitioning instead of list partitioning.
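For the record, the check I plan to run the next time I get a test system is just a before/after comparison of the index list and status. These are plain dictionary queries; the LIKE filter assumes SAP's usual TABLE~ID index naming:

SQL> select index_name, uniqueness, status
  from dba_indexes
  where owner = 'SAPERP' and table_name = 'DFKKOP'
  order by index_name;

SQL> select index_name, partition_name, status
  from dba_ind_partitions
  where index_owner = 'SAPERP' and index_name like 'DFKKOP%'
  order by index_name, partition_name;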

lbreddemann
Active Contributor
0 Kudos

> Since most of the queries against this table were against open orders, access times improved dramatically (one query went from 20 minutes to 20 seconds).

Nonstandard feature usage and now in trouble ... ts ts ts..

> Unfortunately, while it was on a test system, our developers discovered that this caused transactions FPL9 and CIC0 to hang. To prove that it was the partition causing the hang, we used our "snapback" ability to return the database to the point in time prior to the partition (we did this twice), and the problem disappeared.

Ok, actually that's no proof at all. You think or believe that it must be the partitioning but you cannot be sure.

> Other than these hangs, the database seemed to work normally.

I like that one - it's like saying "other than being dead, the patient is fine".

To move on, you'll need to figure out what the sessions are doing when they 'hang'.

Check ST04 - Sessions, perform a ST05 SQL trace ...

In any case it would be a good idea to open a support message for it.

Then people (trained ones if you're lucky) will have a look at the system and try to help you out of this misery.

> Does anyone have an idea why this might be happening? BTW - I had "row movement enabled", so that's not it. I have found nothing in SAP notes or on this forum.

What for? As far as I understood you, you used a reorganisation (DBMS_REDEFINITION or EXP/IMP) to turn the non-partitioned table into a partitioned one.

Where do you believe is row movement necessary for that?

regards,

Lars

stefan_koehler
Active Contributor
0 Kudos

Hey Lars,

> In any case it would be a good idea to open a support message for it.

> Then people (trained ones if you're lucky) will have a look to the system and try to help you out of this misery.

I was really amused by your last sentence. :)

>What for? As far as I understood you, you used a reorganisation (DBMS_REDEFINITION or EXP/IMP) to turn the non-partitioned table into a partitioned one. Where do you believe is row movement necessary for that?

I think he means that row movement was enabled after the reorganisation. If he updates a key value so that the row would fit into another partition, row movement must be enabled on that object.

Regards

Stefan

lbreddemann
Active Contributor
0 Kudos

> I think he means that row movement was enabled after the reorganisation. If he updates a key value so that the row would fit into another partition, row movement must be enabled on that object.

Updating the partitioning key? GREAT idea! Let's move the data around a bit...

Ok, that shouldn't be the cause of his problem.

Would really like to see the issue on the system...

regards,

Lars

stefan_koehler
Active Contributor
0 Kudos

> Updating the partitioning key? GREAT idea! Let's move the data around a bit...

We don't know the keys that he has defined... everything is possible (for example with updated date fields). :)

Regards

Stefan

darren_keenan
Explorer
0 Kudos

Stefan and MHO,

Thanks for your support. I have not responded to any of this because I have had to do an upgrade from 4.6 to 6.0, and "fun stuff" like partitioning for performance improvement went quickly to "not that important right now".

I did want to mention that Lars, you can buzz off. You are not helpful, and you obviously know less about partitioning than I do (or even enough to be helpful).

-


What for? As far as I understood you, you used a reorganisation (DBMS_REDEFINITION or EXP/IMP) to turn the non-partitioned table into a partitioned one.

Where do you believe is row movement necessary for that?

Updating the partitioning key? GREAT idea! Let's move the data around a bit...

-


Stefan was smart enough to realize that if you divide a table into "open" and "closed" orders, occasionally some of those open orders will be closed. If Oracle wasn't okay with moving data between partitions, they probably wouldn't have created a clause called "enable row movement".

However, this does bring up a point. I can't find anything regarding this problem in SAPnotes on "row movement" issues, but this seems like a good possibility for causing the problem.

-


Nonstandard feature usage and now in trouble ... ts ts ts..

Ok, actually that's no proof at all. You think or believe that it must be the partitioning but you cannot be sure.

-


This was less of a problem than your smart-alec answer was non-informative. I snapped our test DB's back in about 5 minutes, and the problem went away. That's why we do testing on a test system, so we don't have these problems in production.

It should be noted that this is an option that is supported by SAP - at least since Oracle 9. Anyone who has implemented an SAP system that is completely standard, please raise your hand ....

(no one raises their hand)

Also, I did this on two different systems, and in both cases, the problem cleared up as soon as I rolled back the partition change. One of those systems had been completely unused other than the partitioning action. So, while I haven't proven it beyond all doubt, I'll settle for 99% certainty.

Enough ranting against a flamebaiter.

I have a hard time believing that I'm the first SAP/Basis guy to try partitioning. So, I'm hoping someone (who has implemented partitioning on SAP) has some experience in this area.

At this point, the only way I can get the ST04 data is to re-partition a test system. As I mentioned, this is a two day process, and I currently don't have a system freed up anyway. I will eventually get a system, but it would be nice to leverage someone else's experience if it is available.

Thanks for your help.

lbreddemann
Active Contributor
0 Kudos

> I did want to mention that Lars, you can buzz off. You are not helpful, and you obviously know less about partitioning than I do (or even enough to be helpful).

Sorry to learn you see it that way.

> Stefan was smart enough to realize that if you divide a table into "open" and "closed" orders, that occasionally, some of those open orders will be closed. If Oracle wasn't okay with moving data between partitions, they probably wouldn't have created a parameter called "row movement enable".

There are many options in Oracle that wouldn't be "okay" to use in a productive environment - still, these options are there...

> However, this does bring up a point. I can't find anything regarding this problem in SAPnotes on "row movement" issues, but this seems like a good possibility for causing the problem.

That's why I wanted to know what you see when you look into ST04.

I did not doubt that the problem had something to do with the partitioning - but finding out what exactly the problem was would have been the goal of further investigation.

> -


> Nonstandard feature usage and now in trouble ... ts ts ts..

>

> Ok, actually that's no proof at all. You think or believe that it must be the partitioning but you cannot be sure.

> -


> This was less of a problem than your smart-alec answer was non-informative. I snapped our test DB's back in about 5 minutes, and the problem went away. That's why we do testing on a test system, so we don't have these problems in production.

> It should be noted that this is an option that is supported by SAP - at least since Oracle 9. Anyone who has implemented an SAP system that is completely standard, please raise your hand ....

That's true for partitioning, but not for row movement.

I also did not write that it's not supported - it's just that it usually does not help to use some feature as if it were a silver bullet for your problems.

> Also, I did this on two different systems, and in both cases, the problem cleared up as soon as I rolled back the partition change. One of those system had been completely unused other than the partitioning action. So, while I haven't proven it beyond all doubt, I'll settle for 99% certainty.

Right - the problem went away, and with it the option for you to understand what was causing the issue.

Since you don't know AWR, you cannot even use the historic wait information - so at the moment all you/we know is this: the way the tables had been partitioned led to some issues.

> Enough ranting against a flamebaiter.

> I have a hard time believing that I'm the first SAP/Basis guy to try partitioning. So, I'm hoping someone (who has implemented partitioning on SAP) has some experience in this area.

I neither wanted to 'flame' you, nor did I write that partitioning has not been done before or is forbidden.

Maybe you took my comments a bit too personally? Be sure they weren't meant to be mean towards you.

> At this point, the only way I can get the ST04 data is to re-partition a test system. As I mentioned, this is a two day process, and I currently don't have a system freed up anyway. I will eventually get a system, but it would be nice to leverage someone else's experience if it is available.

As already mentioned, with Oracle 10g you may (if this feature is licensed for the system in question) be able to access historical performance data via ASH/AWR.

Anyhow, I wish you luck with your partitioning project.

Lars

former_member204746
Active Contributor
0 Kudos

Personally, I have never seen an R/3 or ECC system using partitions.

For BW systems this is expected, but not for R/3 or ECC.

I might be wrong, but this is my feeling.

Former Member
0 Kudos

> I never saw an R/3 or ECC system using partitions.

I know guys who did it for large tables in ECC systems. If done correctly, then:

- You can enforce a certain order of the data, especially when the table has different retention periods for archiving/deletion. You avoid table fragmentation; all the new data is inserted into one partition.

- Most queries should go to only one partition, reducing physical I/O and saving buffer cache blocks.

- Archiving is much faster: you don't have to scan a huge table, and in the best case you can simply drop or merge partitions (see the sketch below).
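As a sketch of the last point (the table and partition names here are made up, in the spirit of Sandeep's fiscal-year example):

SQL> -- dropping a whole fiscal year in one dictionary operation ...
alter table saperp.zglpca drop partition p2002 update global indexes;

SQL> -- ... instead of deleting row by row, with all the corresponding undo/redo
delete from saperp.zglpca where fiscyear = '2002';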

The benefits can be huge, but the administration is still non-standard and quite sophisticated. One of the most critical things is the index design.

@Darren: to help you further, we will definitely need an execution plan for the long-running queries plus the table and index information. You can, for example, obtain the plans with transaction ST04 -> Performance -> Wait Event Analysis -> Session Monitor -> select the statement and explain it.

Regards

Michael

darren_keenan
Explorer
0 Kudos

The problem starts when a transaction is called (either FPL9 or CIC0). At that point, the transaction just hangs.

Do you know how I would figure out the SQL statement that is executed when I call a transaction?

Former Member
0 Kudos

As I said, you can use ST04; I guess there won't be hundreds of sessions active on your test system. To make sure you have the right statement, compare it to SM50 - you can see the table and maybe the SQL there as well. The work process PID is the client PID in the Session Monitor.

Regards Michael

stefan_koehler
Active Contributor
0 Kudos

Hello Darren,

> Does anyone have an idea why this might be happening? BTW - I had "row movement enabled", so that's not it

It is very hard to say something without having any information about the wait event / blocking sessions.

But has this situation only occurred since the reorganisation was run?

Keep in mind that a row movement works as a delete plus an insert (moving one row from one partition to another). If you still have the AWR data for this timeframe, please query the history tables or generate an AWR report.
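(If the Diagnostics Pack is licensed on that system, the standard AWR report can be generated with the script shipped with the database, and the ASH history can be queried directly, e.g.:)

SQL> @?/rdbms/admin/awrrpt.sql

SQL> select sample_time, session_id, sql_id, event, blocking_session
  from dba_hist_active_sess_history
  where sample_time > sysdate - 1
  order by sample_time;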

We need a little bit more information, because most of us don't know the SQL statements which are executed by FPL9 or CIC0.

Regards

Stefan

darren_keenan
Explorer
0 Kudos

I don't know what AWR is.

However, I had to remove (that is roll back in time) all the partition work because our developers couldn't use the test system the way it was. We use NetApp's snapback technology, but there are other products that do this as well.

Currently, the DB exists as if it had never been partitioned. So, if the data you are looking for is from that timeframe, it no longer exists. In order to recreate it at this point, I would need 2 days of access to a test system that the developers were not using.

I was hoping someone had come across something like this before since what I did is supposed to be supported, and my partitions are pretty basic, most of SAP works fine, and nothing shows up in the system checks. It's just these 2 transactions that hang. Perhaps they were running (slowly), but we will never know since the transaction timed out. I also thought that maybe there was a problem with the indexes, but I checked and they were there.

Thanks