on 06-27-2011 6:42 AM
Hello,
We are currently using Networker backint for brbackup.
With Networker, we write directly to tapes, with hardware compression, through a robot.
The bandwidth we get, on average, is 60-80 MBytes/second.
We also have a TSM infrastructure.
With TSM we first back up to a disk staging area.
From this staging area we back up to tape.
With TSM, the average backup bandwidth is 15 MBytes/second.
This makes the backups four times longer.
I am wondering what backup speed other backint users get.
So, if you are using backint, could you post
your backup speed and infrastructure?
Thanks in advance for your answers.
Symantec NetBackup (formerly Veritas) master with several media servers, and ESL 712e and IBM TS3500 libraries.
Speeds for LAN backups vary from 7mb/s up to 120mb/s.
For a few systems we have a SAN backup (a split mirror backup with disks and tape drives directly attached to a backup server); there we see up to 400mb/s, i.e. a 6tb database backed up in 4.5 hours.
You should check disks, network and CPU (if you use compression) when investigating backup speed issues.
Cheers Michael
Hello Michael,
> Symantec NetBackup (formerly Veritas) master with several media servers, and ESL 712e and IBM TS3500 libraries.
>
> Speeds for LAN backups vary from 7mb/s up to 120mb/s.
When you say 120 mb/s, do you mean bits per second or bytes per second?
When you back up at 120 mb/s, does NetBackup write to a disk staging area or directly to tape?
> For a few systems we have a SAN backup (a split mirror backup with disks and tape drives directly attached to a backup server); there we see up to 400mb/s, i.e. a 6tb database backed up in 4.5 hours.
Could you post the init<SID>.sap and your brbackup scripts so that I can better understand how you perform your split mirror backup?
> You should check disks, network and CPU (if you use compression) when investigating backup speed issues.
In our case the bottleneck is not the network, as the servers have 1 Gb/s interfaces.
The machines do not look CPU-overloaded.
The bottleneck really seems to be the I/O (an HDS array; FC disk speed on the server).
But from what I have observed on the SAP server, I could not reach 120 MB/s.
I say that because I do not reach 100 MB/s when I do a disk-to-disk copy with several cp commands running.
Thanks in advance for your answer.
> When you say 120 mb/s, do you mean bits per second or bytes per second?
I meant bytes; basically I just divided the database size in MB by the runtime in seconds. But 120mb/s is really the top end of it; I think the average is somewhere between 7mb/s and 120mb/s. I would consider 60-80mb/s good on a 1gb LAN, and values around 140mb/s for a 2gb LAN interface.
> When you back up at 120 mb/s, does NetBackup write to a disk staging area or directly to tape?
I am not 100% sure, but I think those went directly to tape. It could be that this database had quite an amount of free space, so the throughput would be quite high (the data is compressed on the tapes).
These are the special settings in initSID.sap:
backup_type = online_split
backup_dev_type = util_file
split_cmd = "/usr/openv/netbackup/dbext/hdssplit/bin/nbuhdssplit -dbsvr server1 bc import -split -f $"
resync_cmd = "/usr/openv/netbackup/dbext/hdssplit/bin/nbuhdssplit -dbsvr server1 bc export -resync"
I cannot give out the scripts, because there are licensing restrictions. But the documentation is available here: [Split Mirror Backup|http://help.sap.com/saphelp_nw73/helpdata/en/46/ad6871c5b462d1e10000000a1553f7/frameset.htm]
> I say that because I do not reach 100 MB/s when I do a disk-to-disk copy with several cp commands running.
I did a cp datafile /dev/null on a 10gb file (10000mb to be exact); it took 74 seconds, or 135mb/s. Then I copied 4x 10gb in parallel, 4 different files to /dev/null; the longest took 223 seconds. That is 40000 / 223 = 179mb/s. During the second test the disks were fully utilised.
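For anyone who wants to reproduce that kind of measurement, here is a minimal shell sketch of the parallel read test. The file names and sizes are placeholders; for honest numbers the files must be well above RAM size, since freshly written files may be served from the filesystem cache rather than the disks.

```shell
# Rough parallel read-throughput test, along the lines of the cp runs above.
# COUNT is MB per scratch file; 1 MB keeps the sketch runnable,
# use e.g. 10000 for a real 10 GB-per-file measurement.
COUNT=1
for i in 1 2 3 4; do
  dd if=/dev/zero of=/tmp/readtest.$i bs=1M count=$COUNT 2>/dev/null
done
START=$(date +%s)
for i in 1 2 3 4; do
  cp /tmp/readtest.$i /dev/null &      # four concurrent readers
done
wait                                   # wait for the slowest stream
END=$(date +%s)
TOTAL=$((COUNT * 4))
echo "read ${TOTAL} MB in $((END - START))s"
rm -f /tmp/readtest.1 /tmp/readtest.2 /tmp/readtest.3 /tmp/readtest.4
```

Total MB divided by the elapsed seconds gives the aggregate rate, as in the 40000 / 223 calculation above.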
Cheers Michael
Hello Michael,
> I did a cp datafile /dev/null on a 10gb file (10000mb to be exact), it took 74 seconds, or 135mb/s. Then i copied 4x 10gb in parallel, 4 different files to /dev/null, the longest took 223 seconds. This would be 40000 / 223 = 179mb/s. During the second test disk were fully utilised.
>
> Cheers Michael
What kind of disks were you using (SATA/FC/SAS), and how were they attached (direct attach/SAN/NAS)?
Thanks for your answer.
Hello Benoit,
I believe the backup speed mainly depends on your requirements (time to recover) and the available money. If you use LTO5 tape drives you can achieve a backup throughput of 2 TB / hour per drive. Of course that assumes that none of the other components imposes a bottleneck, so you have to use e.g. 8 Gbit Fibre Channel links and some high-end disk array. As long as you don't have such high-end demands you can scale down to some other (less costly) combination.
Performing backups to disk implies that performance isn't your concern and that you have lots of time available for backups. If you want to perform backups quickly (and with minimal impact on your disk array), there is no other choice but to use tape drives.
Regards,
Mark
Hello Mark,
Hello Benoit,
>
> I believe the backup speed mainly depends on your requirements (time to recover) and the available money. If you use LTO5 tape drives you can achieve a backup throughput of 2 TB / hour per drive. Of course that assumes that none of the other components imposes a bottleneck, so you have to use e.g. 8 Gbit Fibre Channel links and some high-end disk array. As long as you don't have such high-end demands you can scale down to some other (less costly) combination.
2 TB = 2097152 MB
2097152 / 3600 ~ 580 MB/s
Have you experienced such I/O throughput?
If yes, what kind of disk/bay was it?
With FC disks in a Hitachi array and 4 Gbit HBAs on Solaris servers, I have never experienced such a transfer rate.
It was less than 100 MB/s.
Thanks in advance for your answer.
Hello Mark,
> Performing backups to disk implies that performance isn't your concern and that you have lots of time available for performing backups. If you want to perform backups quickly (and with minimal impact to your disk array), there is no other choice but use tape drives.
How many tape drives were you using in parallel to achieve such throughput?
Thanks in advance for your answer.
> 2097152 / 3600 ~ 580 MB/s
> Have you experienced such I/O throughput?
> If yes, what kind of disk/bay was it?
> With FC disks in a Hitachi array and 4 Gbit HBAs on Solaris servers, I have never experienced such a transfer rate.
> It was less than 100 MB/s.
Yes, I have seen a backup throughput of 650 MB/s to one LTO5 drive. This was on an HP XP disk array with 8 Gbit HBAs on HP-UX servers, using Data Protector. Perform a deep performance analysis of your SAN; there are many potential bottlenecks which can ruin the throughput.
> What kind of disks were you using?
This was an HDS AMS 2500 (midrange storage system), SAN-attached. The large systems are HDS XP 24000 (enterprise storage systems).
Mark provided some helpful input; another idea could be to use RMAN and incremental backups. This will definitely shorten the backup times and you will need less space on tape. BUT this of course comes at the big cost of increased restore time.
RMAN also has a bunch of nice features: it does not back up unused blocks, it can automatically check block consistency, you can do backup compression on the host side, etc.
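For orientation only, the BR*Tools switch to RMAN with host-side compression boils down to a few init<SID>.sap parameters. The fragment below is illustrative: the values are made up, and the exact parameter set depends on your BR*Tools release, so check the documentation before using anything like it.

```
backup_dev_type = rman_util
rman_compress = yes
rman_channels = 4
rman_files_per_set = 4
```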
Cheers Michael
Hello Mark,
> If you use LTO5 tape drives you can achieve a backup throughput of 2 TB / hour per drive.
What surprises me is that you get more than four times the maximum native speed (140 MB/s) of LTO5 drives:
http://en.wikipedia.org/wiki/Linear_Tape-Open
Does it mean that your compression rate was higher than four?
Thanks in advance for your answer.
> Does it mean that your compression rate was higher than four?
Yes, but it is difficult to measure the exact compression rate; typically the value is only estimated. A compression rate of 4 is quite normal for SAP systems on Oracle. This example was a BW system, which showed even somewhat better compression rates than R/3 systems. The goal was to make the LTO5 tape drive the bottleneck, and we succeeded: 650 MB/s was the average calculated over the whole backup time.
Hi,
No matter what device is behind TSM, you should be able to utilise nearly 90% of a dedicated 1 Gbit/s LAN interface.
So this would be ~100 MB/s, taking a byte as 10 bits for ease of calculation.
You need to dig into the sessions and multiplexing parameters, and maybe into compression, to get the best out of your environment.
The TSM sessions parameter defines how many data streams you use over the network;
multiplexing defines how many datafiles you read within a single session.
So basically you have 3 (4) bottlenecks to measure:
backup device bandwidth
LAN transfer bandwidth
disk read bandwidth
and (4), if compression is involved: CPU shortage (esp. with rman_compress)
I'll look up my values tomorrow, but 15 MB/s suggests
that you use TSM with RL compression and that somewhere in your LAN path there is a 100 Mbit/s hop, maybe due to wrong autonegotiation or wrong routing. The compression is why you get a bit more than 100 Mbit.
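That suspicion is easy to sanity-check with arithmetic: a 100 Mbit/s hop carries at most about 12.5 MB/s of raw data, so a sustained 15 MB/s could only pass through such a hop if the stream were compressed beforehand.

```shell
# Upper bound of a 100 Mbit/s link, at 8 bits per byte, ignoring protocol overhead
awk 'BEGIN { printf "%.1f MB/s\n", 100 / 8 }'   # prints 12.5 MB/s
```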
Volker
Hello Volker,
> No matter what device is behind TSM, you should be able to utilise nearly 90% of a dedicated 1 Gbit/s LAN interface.
> So this would be ~100 MB/s, taking a byte as 10 bits for ease of calculation.
Why do you say that?
Is that what you have experienced?
> You need to dig into the sessions and multiplexing parameters, and maybe into compression, to get the best out of your environment.
> The TSM sessions parameter defines how many data streams you use over the network;
> multiplexing defines how many datafiles you read within a single session.
Which parameters do you generally use/recommend?
As I am not familiar with TSM, are there stats tools that you can run on the client?
> backup device bandwidth
This is the one I suspect the most, as the backup bandwidth increased when my colleagues tested
with a server with less busy disks.
By the way, on your TSM server, does backint back up first to a disk area, or directly to tape?
> LAN transfer bandwidth
Well, it would limit all our backups.
I would not get fast Networker backups if it were the source.
They are going to install a 10 Gb/s card. This should confirm that the network is not the bottleneck, at least on the server side.
> disk read bandwidth
I guess this is what limits my Networker backups (60-80 MB/s), but not my TSM ones.
> and (4), if compression is involved: CPU shortage (esp. with rman_compress)
No RMAN compression, and the CPUs do not seem overloaded on these machines.
> I'll look up my values tomorrow
Thanks
> but 15 MB/s suggests
> that you use TSM with RL compression and that somewhere in your LAN path there is a 100 Mbit/s hop, maybe due to wrong autonegotiation or wrong routing.
There is no 100 Mb/s Ethernet switch path in between;
there are only 10 Gb/s and 1 Gb/s links.
If it were an autonegotiation problem, I would also see it with my Networker backups.
This is not the case.
Thanks in advance for your answers.
Hi,
So I checked 3 of my bigger systems.
8.8 TB backs up in ~24.5 hours using TSM util_file, 4 sessions, multiplexing set to 6 (24 read streams).
8800 GB / 24.5 / 3600 -> ~100 MB/s, which is close to the limit of its 1 Gbit LAN adapter.
8.2 TB (an earlier copy of the above beast, running on slower SATA disks, which brings down read performance)
backs up in 23 hours, using rman_util with compression.
This results in only ~1.5 TB actually being written, thanks to rman_compress.
4 RMAN channels, 4 files per set, meaning 4 of the 16 CPUs run at the edge with 100% utilisation during this run.
We currently assume this is the limit. Because of the slower disks, and the impact on application performance, we do not dare to read with more than 16 streams.
-> 8200 / 23 / 3600 means the same read throughput as above, although it is not a LAN limit in this case, as
-> 1500 (compressed) / 23 / 3600 means effectively only ~20 MB/s are transferred.
And I have another system
3.1 TB backs up in ~9 hours using TSM util_file, 2 sessions, multiplexing set to 8 (16 read streams).
3100 GB / 9 / 3600 -> ~95 MB/s, which is again close to the limit of its 1 Gbit LAN adapter.
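The arithmetic in these examples can be wrapped in a one-liner for spot-checking your own runs (decimal GB to MB, matching the figures above; `thr` is just an ad-hoc name, and the rounding lands within a couple of MB/s of the ~95/~100 values quoted):

```shell
# Backup throughput in MB/s from database size (GB) and runtime (hours)
thr() { awk -v gb="$1" -v h="$2" 'BEGIN { printf "%.0f MB/s\n", gb * 1000 / (h * 3600) }'; }
thr 8800 24.5   # prints 100 MB/s
thr 3100 9      # prints 96 MB/s
```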
All systems have dedicated interfaces for backup.
The 8.8 TB system is backed up either directly to tape drives attached to the TSM server
or to virtual disk tapes, also attached to the TSM server, depending on which is available.
The 3 TB system backs up to a disk pool and is migrated to tape later by the TSM people
(I am not involved in that procedure, so no data about this process).
The 8.2 TB system, thanks to the compression, also fits into the disk pool and is therefore saved to the pool.
Right now a session corresponds to a single backup device in our environment,
meaning either a physical or a virtual tape, or a stream to the disk pool.
With RMAN compression it might be useful to combine several backup sessions into fewer backup device streams,
but we have not yet worked out whether this really gives a benefit, as we are only at the beginning of evaluating rman_compress.
As for monitoring the backup: on AIX, nmon is quite useful for monitoring the vscsi throughput for disk reads and the LAN adapter throughput to the TSM server. If multiple machines back up to the same target interface on the TSM server, you might need to monitor the LAN there as well.
The TSM parameters are set in initSID.utl, and these are the most interesting:
sessions (there are several of these, for brbackup and brarchive)
multiplexing
buffsize
Especially buffsize depends on the TSM client release; older templates may still have 128k or 256k values.
On our big beast we use 4193792.
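Purely as an illustration, a Data Protection for SAP profile fragment with those parameters might look like this. Parameter spellings and defaults vary by TSM client release, so treat the names and values as placeholders and check them against your own initSID.utl template:

```
MAX_SESSIONS 4
MAX_BACK_SESSIONS 4
MULTIPLEXING 6
BUFFSIZE 4193792
```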
For rman, the parameters are set in initSID.sap:
rman_channels
rman_compress
rman_files_per_set
Volker
Hello Volker,
Thanks for this detailed answer.
> 3,1 TB backs up in ~ 9 hours using TSM util_file 2 sessions, multiplexing set to 8 (16 read streams)
> 3100GB / 9 / 3600 -> ~ 95MB / sec which is again close to the limit of its 1GB LAN adapter
> ...
> The 3 TB system backs up to a disk pool and is migrated to tape later by the tsm people
Could you please tell us what kind of disks/array you are using on the SAP server and on the TSM server?
Thanks in advance for your answer.
Hello Volker,
> 8,2 TB (earlier copy of the above beast, running on slower SATA disks, which would bring down read performance)
> backs up in 23 hours, using rman_util with compression
> This results in only ~1.5 TB being really written due to rman_compress
You are confirming what Mark said, in so far as you get a ratio of about 5 with compression.
Before opening this thread I would never have thought that you could compress Oracle DBs so much.
See you,
Hi,
The TSM disk pool is on EMC disks as well.
Yes, the RMAN compression was pretty amazing.
We were surprised as well. But it costs a lot of CPU.
And we have not fully checked restore times yet.
To check disk read bandwidth, you can run an RMAN incremental backup right after a full one, without block change tracking enabled.
It will need to read all datafiles for checking while writing almost nothing, so this is the easiest way to measure GB/hour for an almost read-all-write-nothing operation.
Volker
Hi,
In my landscape, we are running HP Data Protector with a VTL library and get somewhere around 16.00 MB/s, but I believe there are many points to consider for backup performance, such as bandwidth, system performance and so on...
Best regards,
Orkun Gedik