
Oracle 10g 10.2.0.2 on Solaris 10 ZFS file system performance

Former Member

Gurus,

Questions:

Is anyone out there running Oracle 10g and SAP NW2004s (7.0) on Solaris 10 SPARC with ZFS instead of the typical UFS/VxFS?

We jumped into it without doing any tuning whatsoever, and now we're seeing LOTS of I/O bottlenecks and having to deal with them after going LIVE.

Specifically, we've got OLTP systems such as XI and MDM, and when just a few messages hit our XI outbound queues, I/O, CPU, and mutex spins go crazy. Our Wily tools show specific I/O bottlenecks in "avg queue length", etc.
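(If you want to see the same thing at the OS level, plain iostat works - the interval below is just an example:)

# extended device statistics every 5 seconds, skipping idle devices;
# watch the wait/actv queue columns and %b
iostat -xnz 5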

We're on Solaris 10 SPARC release 11/06. Our SAN is NetApp.

For everyone's information, we're now trying to follow the ZFS tuning guides and blogs below:

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide

http://blogs.sun.com/realneel/category/ZFS

ZFS is WAY too new and requires a high degree of tuning. I miss UFS...

So far we've made the following changes:

1. Disabled Oracle checksums because ZFS is doing it too:

Alter system set db_block_checksum=OFF scope=both;

2. Increased the size of the Oracle redo/archive logs. They were set to 50 MB, but the logs were rolling over faster than they could be archived and we were getting lots of "checkpoint not complete" messages. So we followed OSS note 79341.
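For anyone curious, the mechanics were roughly as follows (group numbers, sizes and paths are just examples - the real procedure is in the note):

sqlplus "/ as sysdba" <<EOF
-- add bigger groups, switch logs until the old ones are INACTIVE, then drop them
alter database add logfile group 5 ('/oracle/SID/origlogA/log_g5m1.dbf') size 200M;
alter database add logfile group 6 ('/oracle/SID/origlogB/log_g6m1.dbf') size 200M;
-- alter system switch logfile;
-- alter database drop logfile group 1;
EOF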

We plan on making these changes very soon (rough commands are sketched after the list):

1. Turn off ZFS "caching" because the SAN is already doing it (see the tuning guide above)

2. Limit the amount of memory ZFS can use - it tries to use everything by default (again, see the tuning guide)

3. Change the ZFS "record size" to match Oracle. The ZFS default is 128K, Oracle's block size is 8K, and according to the tuning guide they need to be the same - except for the redo log files, which the guide says should stay at 128K. I'm assuming that means origlogA/B, mirrlogA/B and oraarch.

3a. The change only applies to "new" files; ones that already exist stay at 128K. The only way to fix that is a full/cold backup, then copying the files back down after changing the record size.

4. Separate out the origlog, mirrlog and oraarch directories: create new SAN LUNs and zpools for them. Again, the tuning guide says performance will be better.
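Roughly, here's what those changes look like as commands (pool names, LUNs and sizes are just examples from my notes - double-check the guides above before copying anything):

# 2: cap the ARC so ZFS stops grabbing all memory (4 GB here is only an
# example; the setting goes into /etc/system and takes effect after a reboot)
echo "set zfs:zfs_arc_max = 0x100000000" >> /etc/system

# 1: keep ZFS from caching file data since the SAN cache already does it
# (the primarycache property needs a newer ZFS than our 11/06 release,
# so on older builds the ARC cap above is the main lever)
zfs set primarycache=metadata sapdata/oradata

# 3: match Oracle's 8K block size for datafiles; leave the redo log
# filesystems at the 128K default
zfs set recordsize=8k sapdata/oradata

# 4: dedicated pools for origlog/mirrlog/oraarch (LUN names are placeholders)
zpool create origlog c4t600A0B800011D2B4d0
zpool create oraarch c4t600A0B800011D2B5d0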

Other than these changes, we don't know what else to do. We followed the Oracle parameter tuning guide for 10g, Note 830576, completely.

Any thoughts/suggestions? I hope people read this and appreciate what needs to be done to use ZFS correctly. This has been a nightmare for us, and there really aren't any good SAP notes related to ZFS out there.

Another thing: TCODE OS06, for checking file system sizes, does not recognize any ZFS file systems. For us, only /, /var, and the other UFS file systems show up.

So be aware of that too.....
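In the meantime you can pull the same numbers straight from the OS:

# list ZFS filesystems with used/available space
zfs list
# or the generic view, limited to ZFS
df -h -F zfs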

--NICK

Accepted Solutions (0)

Answers (2)


markus_doehr2
Active Contributor

> Any thoughts/suggestions? I hope people read this and appreciate what needs to be done to use ZFS correctly. This has been a nightmare for us, and there really aren't any good SAP notes related to ZFS out there.

You know this one:

Note 1343733 - Support for SAP on ZFS

Another question: is this a Sun T-series (CoolThreads) server?

Markus

lbreddemann
Active Contributor

Hi Nicholas,

I can't tell you too much about ZFS (I'm not sure we actually support it for SAP installations...).

But:

> We jumped into it without doing any tuning whatsoever, and now we're seeing LOTS of I/O bottlenecks and having to deal with them after going LIVE.

Why? Where was the testing? What was the reasoning behind this decision? Why do you keep the pain and stay on a filesystem you don't know how to tune?

Stop it now. Make an offline backup, set up the file system from scratch with options you master, restore the database, and there you go!
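In rough terms (devices and paths below are placeholders only):

# SAP and the database down first, then save the ZFS-hosted files
tar cvf /backup/SID_sapdata.tar /oracle/SID/sapdata1

# rebuild the storage with a filesystem you know, e.g. UFS
zpool destroy sapdata
newfs /dev/rdsk/c4t6d0s0
mount /dev/dsk/c4t6d0s0 /oracle/SID/sapdata1

# put the files back and carry on
tar xvf /backup/SID_sapdata.tar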

If you still want to find out about ZFS - well, I'm sure there are test servers available...

> So far we've made the following changes:
>
> 1. Disabled Oracle checksums because ZFS is doing it too:
>
> Alter system set db_block_checksum=OFF scope=both;

DON'T do that! Disabling the Oracle block checksum won't make your physical I/O any faster. Calculating the checksum is a CPU job - and it doesn't take much CPU.

What you lose is that DBVERIFY recognizes all kinds of data corruption that can happen while the block structure is still valid. You also prevent Oracle from recognizing corruptions when it reads blocks.

Effectively you allow the data in the "payload" of a valid data block to change without the database "knowing" about it.

This road just leads to trouble!
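If you've switched it off already, turn it back on (TYPICAL is the default; FULL checks even more, at a higher CPU cost):

sqlplus "/ as sysdba" <<EOF
alter system set db_block_checksum=TYPICAL scope=both;
EOF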

> We plan on making these changes very soon:

>

> 1. Turn off ZFS "caching" because the SAN is already doing it (see the tuning guide above)

OK, that is a setup option that is valid for all file systems. Whenever possible, DirectIO (or a similar option) should be used. The server's memory should be managed by Oracle - the database knows best which blocks should be in memory and which shouldn't.

Disable the caching and use the memory for the buffer cache.
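For example (the sizes are placeholders - pick what fits your machine):

sqlplus "/ as sysdba" <<EOF
-- hand the memory freed from the ZFS cache to Oracle instead
alter system set sga_max_size=8G scope=spfile;
alter system set sga_target=8G scope=spfile;
EOF
# restart the instance; sga_max_size is not changeable online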

> 2. Limit the amount of memory ZFS can use - it tries to use everything by default (again, see the tuning guide)

Same as 1.)

> 3. Change the ZFS "record size" to match Oracle. The ZFS default is 128K, Oracle's block size is 8K, and according to the tuning guide they need to be the same - except for the redo log files, which the guide says should stay at 128K. I'm assuming that means origlogA/B, mirrlogA/B and oraarch.

Hmm... file system block sizes should in general match the database block size. But the redo logs are written in 512-byte blocks - I don't see the reasoning for a 128K record size for those files. BTW: the archive log (oraarch) setup should not be a performance issue.

It's nothing your sessions wait for - archiving is done in the background. So no change should be required there.

> 3a. The change only applies to "new" files; ones that already exist stay at 128K. The only way to fix that is a full/cold backup, then copying the files back down after changing the record size.

Yep - that will do the trick.
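With the database shut down, a plain copy is enough to rewrite each file with the new record size (paths are only examples):

# rewrite datafiles so they pick up the new 8K record size
for f in /oracle/SID/sapdata1/*/*.data1; do
    cp "$f" "$f.tmp" && mv "$f.tmp" "$f"
done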

> 4. Separate out the origlog, mirrlog and oraarch directories: create new SAN LUNs and zpools for them. Again, the tuning guide says performance will be better.

Different drives/LUNs should be used for origlog, mirrlog and oraarch anyhow. Mirroring just doesn't make much sense if it's done onto the very same drives... does it?

Hmmm... most of the topics are already covered in note

#793113 - FAQ: Oracle I/O configuration

OK, I guess this kind of problem is rather classic: a go-live with no or too little testing, and the use of unknown technology (unknown to you) in a critical path of your project.

These are all no-nos. Don't do them! They always lead to project failure.

In most cases I've seen, the only option to keep the damage small was to eliminate the unknown technology and move on with something the people in the project know.

Maybe ZFS gives you better opportunities in future steps of your project - right now it does the opposite. So better stop it now and pursue the ZFS option in a non-critical path - on the test system.

Just my 2 cents...

KR Lars

Former Member

Lars,

Thanks for your detailed reply. To quickly answer the big question:

Q: Why in the world are we using this technology instead of UFS?

A: The genius who designed the system didn't know what they were doing - they "figured" it would work. And yes, SAP does support ZFS. I came in a month after prod was stood up. But like you said, we can always go back to UFS with a cold backup and a restore. I'm going to give the tuning a chance anyway and see if we can learn anything. Sad, I know.

So do you recommend a smaller "record size" at the file system level for the redo logs instead of the 128K default, then? And when I say "redo logs", are we talking about origlogA/B and mirrlogA/B, not oraarch?

Yeah, you're right that we should have separated out the log files long ago. It doesn't make sense for mirrors to be on the same drives... duh (dumb on our side).

Thanks again,

NICK

Former Member

Were you able to tune this up? If so, can you please summarize what made the most difference, and whether you could make ZFS perform better for an OLTP system (ECC)?