
Number of Oracle online redolog groups

Former Member
0 Kudos

Hi

Currently we have 4 groups of Oracle redo log files with 2 members each, 50 MB per member, from the standard installation.

Our DBA team wants to increase this to 10 groups with 3 members each, 100 MB per member.

I want to know whether there is any guideline or standard from SAP on the number and size of redo log files.
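
For reference, the current layout can be checked with the standard Oracle dictionary views (this is generic Oracle, nothing SAP-specific):

-- one row per redo log group: size, number of members, status
SELECT group#, bytes/1024/1024 AS size_mb, members, status FROM v$log;

-- member file names per group
SELECT group#, member FROM v$logfile ORDER BY group#;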

Your reply will be highly appreciated

Thanks

-Al

Accepted Solutions (1)

Former Member
0 Kudos

The size and number of redo log groups need to be tuned according to your requirements. They depend on the system load (number of users, update operations), security aspects, disk performance, and so on.

Oversized redo log files mean that changes are written to disk less often (less frequent checkpoints) and thus pose a risk of losing more data during a crash; they also increase the time needed for instance recovery.

Undersized files can cause performance problems due to frequent checkpoints ("checkpoint incomplete"), but less data is lost during a crash.

If you have many users and heavy update operations, you need to configure more groups. Only one group is used at a time, and when it fills up the next one in the queue is opened; the previously used one is then archived by the archiver. Depending on how fast the log groups are written, more than one log group may be waiting to be archived, so you should also evaluate your archiver performance (mainly disk performance) when choosing the number of groups.
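
As a small illustration of that cycle (generic Oracle view, nothing SAP-specific): the STATUS column shows which group is CURRENT, which ones are still ACTIVE, and the ARCHIVED column shows whether they have been archived yet.

-- which group is in use, and whether the others are archived yet
SELECT group#, sequence#, status, archived FROM v$log ORDER BY sequence#;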

stefan_koehler
Active Contributor
0 Kudos

Hello,

> Oversized redo log files mean that changes are written to disk less often (less frequent checkpoints) and thus pose a risk of losing more data during a crash; they also increase the time needed for instance recovery.

... completely wrong: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:19311485023372

Regards

Stefan

Former Member
0 Kudos

Not sure what you meant by pointing me to that page.

Are you suggesting that using large online redo log files with the fast_start_mttr or log_checkpoint parameters set will give lower recovery time and faster performance?

I'm not completely convinced by that solution, as I have real-life experience with a very large retail system.

An aggressive DBWR can cause poor performance by putting stress on your CPU and I/O.

How about your experience?

Former Member
0 Kudos

The data is written to disk at commit time (at the latest), so you never lose data!

But back to the original question:

- do not configure a 3rd member unless your DBA has a really good explanation for doing so

- 10 groups with 100 MB does not sound bad; personally I would rather go for 8 groups with 100 MB or 200 MB each

As a guideline, size your redo logs big enough that roughly one log switch occurs per minute during periods of higher system load. On larger systems, sizes can easily be over 1 GB; systems nowadays often write more than 20 MB/s of redo.
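
To apply that guideline, the recent switch frequency and redo volume can be checked roughly like this (the 7-day window is only an example):

-- log switches per hour over the last 7 days
SELECT TRUNC(first_time, 'HH24') AS hour, COUNT(*) AS switches
  FROM v$log_history
 WHERE first_time > SYSDATE - 7
 GROUP BY TRUNC(first_time, 'HH24')
 ORDER BY 1;

-- total redo written since instance start (bytes)
SELECT value FROM v$sysstat WHERE name = 'redo size';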

Cheers Michael

stefan_koehler
Active Contributor
0 Kudos

Hello,

> Are you suggesting that using large online redo log files with the fast_start_mttr or log_checkpoint parameters set will give lower recovery time and faster performance?

The previously quoted statement was about losing "more" data and the time for instance recovery. If you don't have a huge load on your system with "big" online redo log file members, FAST_START_MTTR_TARGET is the solution for recovery time. ... and on the other hand you never lose data (depending on the log file size).

> very large retail system

What's very large? Amount of redo log data? Database Size? Concurrent SQLs / Users?

> An aggressive DBWR can cause poor performance by putting stress on your CPU and I/O.

If you really have a huge system with a high change rate, you normally have a (huge) DBWR I/O load anyway (because of space pressure, cleaning out dirty buffers, etc.). By the way, the algorithm around FAST_START_MTTR_TARGET has also been improved from version to version.
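
For illustration only (the 300-second target is a placeholder, and SCOPE = BOTH assumes an spfile is in use):

-- target instance recovery time in seconds
ALTER SYSTEM SET fast_start_mttr_target = 300 SCOPE = BOTH;

-- compare target vs. estimated recovery time and the extra writes the target causes
SELECT target_mttr, estimated_mttr, writes_mttr FROM v$instance_recovery;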

Regards

Stefan

Former Member
0 Kudos

Hi Michael,

Thanks for your reply.

What is the rationale for not using a 3rd member within a group? Is there any technical explanation behind it?

I was told that the number of SAP redo log groups should be a multiple of 4; you are right, it should be either 4, 8, or 12. What is the technical reason? Please help.

Thanks Again

-Al

volker_borowski2
Active Contributor
0 Kudos

Hi once more,

> What is the rationale for not using a 3rd member within a group? Is there any technical explanation behind it?

In general, a RAID controller does mirroring in hardware better than software does.

From the OS point of view, it is then a single write to a device.

A software mirror means two (or three) OS writes plus the software management needed to ensure that all writes succeeded.

This is more expensive.

But if you do not trust your hardware, go for a 3rd mirror! Safety first!

We use the hardware mirror and keep the software mirror for the luxury of having a last resort against accidental deletion of a log file.
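
If a third member is wanted anyway, it is added and removed per group with standard DDL; the path below is only a placeholder:

-- add a third member to group 1
ALTER DATABASE ADD LOGFILE MEMBER '/oracle/<SID>/mirrlogB/log_g01m3.dbf' TO GROUP 1;

-- and drop it again the same way
ALTER DATABASE DROP LOGFILE MEMBER '/oracle/<SID>/mirrlogB/log_g01m3.dbf';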

> I was told that the number of SAP redo log groups should be a multiple of 4; you are right, it should be either 4, 8, or 12. What is the technical reason? Please help.

I heard it should be a matter of "times 2" the number of filesystems (A / B).

That is just to keep the log writer and the archiver active on different filesystems.

(A log switch from A to B makes the log writer write on B while the archiver reads from A.)

In fact, you cannot always keep this up effectively in real life.

The system I described above uses 8 DBWR processes to manage the checkpoints involved with each log switch.

Once we managed to get rid of "checkpoint incomplete", we were immediately confronted with "all logs need archiving", which was when more archivers were brought in (see the parameter sketch below).

But in this world you sometimes have a backup ending during high activity, forcing a log switch while only a few MB have been written. ARC4 is then faster than average and, whoops, the jump from log group 14 finds 15 still being archived but 16 already processed, so it jumps from 14 to 16 to avoid waiting for 15 to become ready.

And with only 2 filesystems A/B, you then have read and write activity on either A or B at the same time. This happened once a week with A and B in our system.

Now that we have switched to A / B / C / D, it happens once a quarter that the sequence gets messed up, but even when it does, it is normally not so bad if the switching goes from

A -> B -> C -> D

to

A -> C -> D -> B

The second reason for 4 filesystems was that the log writer manages to write the 1200 MB in 20 seconds, but the archivers need significantly longer, as they are all competing for oraarch as a target. So even when switching back from B to A with 8 groups, the archiver was not yet done on A.
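
Bringing in more archivers, as mentioned above, is only a parameter change (the value 4 is an example, and SCOPE = BOTH assumes an spfile):

-- allow more concurrent ARCn processes
ALTER SYSTEM SET log_archive_max_processes = 4 SCOPE = BOTH;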

Volker

Answers (1)

volker_borowski2
Active Contributor
0 Kudos

Hi,

I have a large system with 8 groups of 1200 MB logs, but only 2 members each.

They are distributed to filesystems

origlog A / B / C / D

mirrlog A / B / C / D

with two members on each filesystem.

They switch about every 20 seconds at peak load, and 6 archiver processes are active then.

So how to size these really depends on system activity.

I think two Oracle software mirrors on separate hardware-mirrored filesystems are sufficient (which means 4 copies).

If your logs are on RAID-0 devices or plain disks, a third member might be a sensible choice for safety.
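
For illustration, one group of that layout could be created roughly like this (group number, SID and paths are placeholders):

-- new group with one member on origlog and one on mirrlog, 1200 MB each
ALTER DATABASE ADD LOGFILE GROUP 9 (
  '/oracle/<SID>/origlogA/log_g09m1.dbf',
  '/oracle/<SID>/mirrlogA/log_g09m2.dbf'
) SIZE 1200M;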

Did you ever measure how long a critical crash recovery took, and how big the part spent reading the current online log was?

I have never seen it take more than a few minutes (very rarely more than 10, usually more around 3-5 minutes), but this might be system specific, and your systems might be different from mine.

I suppose it depends on the type of user activity.

But I would not tend to size the logs for those rare 10 minutes "after crash" at the expense of all the other minutes of uptime the system has.

But this aspect might be different in your case as well.
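
Without waiting for a real crash, Oracle's own estimate already gives a rough idea (values depend on the current load and the MTTR settings):

-- estimated crash recovery time in seconds and Oracle's suggested redo log size in MB
-- (optimal_logfile_size is only populated when fast_start_mttr_target is set)
SELECT estimated_mttr, optimal_logfile_size FROM v$instance_recovery;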

Volker