cancel
Showing results for 
Search instead for 
Did you mean: 

locating a batch that causes checkpoint is not complete

benoit-schmid
Contributor
0 Kudos

Hello,

In our PRD system, each night at 23:2x, we have oracle alert checkpoint is not complete.

I am suspecting a background job that is generating lots of insert/update/delete with a very small commit interval.

I would like to find which job is causing that.

Which methodology do you use to locate that kind of jobs?

I have thought of using st03 with transaction profile and db access sort to locate these jobs.

But you may have a well define procedure to find that kind of job as you have encountered similar case.

If this is the case, could you explain how you proceed?

Thanks in advance for your answer.

Accepted Solutions (0)

Answers (2)

Answers (2)

Former Member
0 Kudos

Hi,

Sorry, I think i overlooked the post in detail.

From the error logs posted, this seems to be a symptom of a high transactional activity.

You should check the size of your redo log files and the checkpoint parameter values?

Depending on the values, most probably, you are required to tune the checkpoint process and increase the size of the redo logs

The following links could be helpful.

http://shenblog.com/2010/11/26/thread-1-cannot-allocate-new-log-sequence-19-checkpoint-not-complete/

http://www.dba-oracle.com/t_oracle_checkpoint_not_complete.htm

Regards,

Varun

benoit-schmid
Contributor
0 Kudos

> From the error logs posted, this seems to be a symptom of a high transactional activity.

> You should check the size of your redo log files and the checkpoint parameter values?

Hello,

I am going to tune the oracle redo.

But my question is not concerning this point.

My question is how can I find the bacth job running in sap that is causing that.

We can also check that the abap report causing this intensive DB activity needs to to be tuned.

Therefore how can I find, at the abap AS level, the job doing this db acitivity?

Thanks in advance for your answer.

Former Member
0 Kudos

A few thoughts about how to find the batch job running in sap that is causing that :

You have got the time from Oracle alert log.

Now go to SM37 and find the jobs running at that time. There will be quite a lot.

But a process of elimination could help you to proceed:

Exclude all jobs that run for a few seconds only.

Exclude all periodic jobs that also run at times when there isn't a high database load.

Once you have got a few suspects, you might try to run them at a different time and see what happens.

Or a quite different approach: Analyze what's in the redo log files.

There is an Oracle tool called Logminer. Never tried it myself though.

Or on a simpler level, try Unix strings command.

This should give you names of tables and users involved here.

hope this helps

benoit-schmid
Contributor
0 Kudos

> You have got the time from Oracle alert log.

> Now go to SM37 and find the jobs running at that time. There will be quite a lot.

> But a process of elimination could help you to proceed:

>

> Exclude all jobs that run for a few seconds only.

> Exclude all periodic jobs that also run at times when there isn't a high database load.

>

Hello,

This is what I have done, but it makes a list of 20 jobs (>120s).

I am going to try logminer. This is a good idea to have access at the log and see the tables accessed.

From st03, can I have more "usefull" info on the 20 jobs, that could help me which one is very db modify intensive?

The thing is that the commit interval is much more important than the modification volume.

Thanks in advance for your answer.

Former Member
0 Kudos

I don't understand this:

The thing is that the commit interval is much more important than the modification volume.

Why?

For all I know the commit intervall of applications like SAP isn't related to the log switch frequency of Oracle.

A log switch occurs whenever the modification volume has reached a certain level, or in other words whenever an online redo log file is full.

Or what am I missing here?

benoit-schmid
Contributor
0 Kudos

I don't understand this:

> The thing is that the commit interval is much more important than the modification volume.

> Why?

>

> For all I know the commit intervall of applications like SAP isn't related to the log switch frequency of Oracle.

> A log switch occurs whenever the modification volume has reached a certain level, or in other words whenever an online redo log file is full.

>

> Or what am I missing here?

Hello,

If you delete 1000 rows and commit you generate much less redo log than if you commit 1000 (after each row deletion).

The disadvantage of committing late is that you are locking rows before committing.

Therefore you have to find a trade off between locking objects and slowing db by committing too often and thus generating excessive redo.

By the way, I have done a strings on my redo. It is very helpful as I see patterns that correspond to row or table names.

Regards,

Former Member
0 Kudos

Hi,

1. Please check the user id mentioned in the syslogs for the oracle checkpoint error.

2. Next, check for jobs being run by that user during the time period mentioned.

3. Check for corresponding jobs logs and lock entries that exist during that time. Also check for other jobs being run during the same time - to understand if there are conflicts during table access.

You can also try activating trace during that time frame using appropriate filters to better analyze the issue.

Hope this helps.

Thanks,

Varun

benoit-schmid
Contributor
0 Kudos

> 1. Please check the user id mentioned in the syslogs for the oracle checkpoint error.

Hello,

I do not understand what you mean.

The checkpoint warning is in the oracle alert:

Archived Log entry 1121 added for thread 1 sequence 3228 ID 0x6cc67f67 dest 1:

Wed Sep 21 23:28:30 2011

Thread 1 cannot allocate new log, sequence 3230

Checkpoint not complete

Current log# 1 seq# 3229 mem# 0: /oracle/PRD/origlogA/log_g11m1.dbf

Current log# 1 seq# 3229 mem# 1: /oracle/PRD/mirrlogA/log_g11m2.dbf

Wed Sep 21 23:28:32 2011

Completed checkpoint up to RBA [0xc9b.2.10], SCN: 46401017

Beginning log switch checkpoint up to RBA [0xc9e.2.10], SCN: 46473577

There is no user info.

All my oracle processes run as oraprd.

Which file are you referring as syslogs for the oracle?

Thanks for your answer.