
SAP GoLive: File System Response Times and Online Redo Logs Design

Former Member

Hello,

A SAP Going Live Verification session has just been performed on our SAP production environment.

SAP ECC6

Oracle 10.2.0.2

Solaris 10

As usual, we received database configuration instructions, but I'm a little bit skeptical about two of them:

1/

We have been told that our file system read response times "do not meet the standard requirements".

The following datafile has been flagged as having too high an average read time per block:

File name: /oracle/PMA/sapdata5/sr3700_10/sr3700.data10
Blocks read: 67534
Avg. read time (ms): 23
Total read time per datafile (ms): 1553282

I'm surprised that an average read time of 23 ms is considered a high value. What exactly are those "standard requirements"?
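For what it's worth, the "Avg. read time" in such reports is simply the total read time divided by the number of blocks read; a quick check with the figures from the table above:

```python
# Sanity check of the figures above: the average read time per block
# should equal the total read time divided by the blocks read.
blocks_read = 67534
total_read_time_ms = 1553282

avg_read_time_ms = total_read_time_ms / blocks_read
print(f"{avg_read_time_ms:.1f} ms per block")  # 23.0 ms per block
```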

2/

We have been asked to increase the size of the online redo logs, which are already quite large (54 MB).

Actually, we have BW loading that generates "Checkpoint not complete" messages every night.

I've read in sap note 79341 that :

"The disadvantage of big redo log files is the lower checkpoint frequency and the longer time Oracle needs for an instance recovery."

Frankly, I have problems understanding this sentence.

Frequent checkpoints mean more redo log file switches, which means more archived redo log files generated, right?

But how is it that frequent checkpoints should decrease the time necessary for recovery?

Thank you.

Any useful help would be appreciated.

Accepted Solutions (1)


stefan_koehler
Active Contributor

Hello

>> I'm surprised that an average read time of 23ms is considered a high value. What are exactly those "standard requirements" ?

The recommended ("standard") values are published at the end of SAP Note #322896.

23 ms does seem a bit high to me. For comparison, we see roughly 4 to 6 ms on our productive system (with SAN storage).

>> Frequent checkpoints mean more redo log file switches, which means more archived redo log files generated, right?

Correct.

>> But how is it that frequent checkpoints should decrease the time necessary for recovery?

A checkpoint occurs on every log switch (of the online redo log files). On a checkpoint event, the following three things happen in an Oracle database:

  • Every dirty block in the buffer cache is written down to the datafiles

  • The latest SCN is written (updated) into the datafile headers

  • The latest SCN is also written to the controlfiles

If your redo log files are larger, checkpoints happen less often, and the dirty buffers are not written down to the datafiles (unless free space is needed in the buffer cache). So if your instance crashes, you need to apply more redo to the datafiles to bring them to a consistent state (roll forward). If you have smaller redo log files, more log switches occur, so the SCNs in the datafile headers (and the corresponding data) stay closer to the newest SCN -> ergo the recovery is faster.
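A rough way to see this trade-off is to count how much redo would have to be replayed after a crash. The sketch below is a deliberately simplified model (not Oracle's actual algorithm), assuming a checkpoint fires at every log switch and recovery must replay everything written since the last checkpoint; the rates and sizes are made-up examples:

```python
# Simplified model: a checkpoint fires at every log switch, and crash
# recovery must replay all redo generated since the last checkpoint.
# Rates and times are illustrative, not measured values.

def redo_to_replay_mb(redo_rate_mb_per_min, log_size_mb, crash_at_min):
    """Redo (MB) written since the last checkpoint at the moment of a crash."""
    minutes_per_switch = log_size_mb / redo_rate_mb_per_min
    # Time elapsed since the most recent log switch (= last checkpoint):
    since_last_checkpoint = crash_at_min % minutes_per_switch
    return redo_rate_mb_per_min * since_last_checkpoint

# 10 MB of redo per minute, crash 100 minutes into the run:
small_logs = redo_to_replay_mb(10, 54, crash_at_min=100)    # 54 MB logs
big_logs = redo_to_replay_mb(10, 600, crash_at_min=100)     # 600 MB logs

print(f"54 MB logs:  replay ~{small_logs:.0f} MB of redo")
print(f"600 MB logs: replay ~{big_logs:.0f} MB of redo")
```

With the smaller logs, the last checkpoint is at most a few minutes old, so far less redo needs to be replayed, which is exactly the point of the SAP note's sentence.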

But this concept does not fully match reality, because Oracle implements algorithms to reduce the workload for the DBWR in the case of a checkpoint.

There are also several parameters (depending on the Oracle version) which ensure that a required recovery time is met (for example FAST_START_MTTR_TARGET).

Regards

Stefan

Answers (1)


fidel_vales
Employee

Hello,

Regarding the first one, the expected value depends on whether you have local disks or attached storage (with read/write cache memory).

If I recall properly, for SAN the expected value is 15-20 ms, so you are a bit higher.

Regarding the second one.

The amount of redo log data generated will be the same; what changes is how it is stored.

You mention that 54 MB is quite large; I do not agree.

The SAP default installation uses 50 MB, which is good for small systems. I've seen systems with a redo log size of 600 MB.

Ideally, you should not get more than one log switch per minute.

If you get "Checkpoint not complete", then you are generating an amount of redo information that fills all your online redo logs before the first one can be reused. This must be avoided, as Oracle "freezes" until the log it is waiting for has been archived.
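To put numbers on the "no more than one log switch per minute" rule of thumb, you can estimate the minimum redo log size from the peak redo generation rate. The rates below are hypothetical examples, not measurements from this system:

```python
# Rule of thumb from above: aim for at most one log switch per minute.
# Given a peak redo generation rate, the minimum log size follows directly.
# The rates here are hypothetical examples, not measured values.

def min_log_size_mb(peak_redo_mb_per_min, switches_per_min=1.0):
    """Smallest online redo log size that keeps the switch rate at the target."""
    return peak_redo_mb_per_min / switches_per_min

for rate in (10, 54, 120):  # peak redo in MB/min during e.g. a nightly BW load
    print(f"peak {rate} MB/min -> logs should be >= {min_log_size_mb(rate):.0f} MB")
```

So a nightly load peaking well above 54 MB of redo per minute would by itself justify larger online redo logs, independent of the "Checkpoint not complete" messages.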

Rgds

Fidel