
Looking for HA experiences for SAP on IBM i

Former Member

Hi,

We are very experienced in disasters. It hardly seems possible, but it is true: in the last three years we have suffered three major outages, with long downtimes (from 13 to 36 hours), in our main production system, our R/3, which is supposed to be a 24x7 system.

We started building our HA architecture several years ago, using MIMIX to keep a logical replica of our production database. It worked, but at that time it needed a lot of administration. So we moved to a hardware replica, using DS8000 storage subsystems and moving SAP to an iASP; we also use the Rochester Copy Services toolkit to manage this whole landscape. At first we used Metro Mirror, a synchronous protocol, because we still owned our machines and they were physically close. Recently we evolved our HA architecture, moving our systems to different sites of an outsourcing partner, and we were forced to change the replication protocol to Global Mirror, which is asynchronous, because of the distances.
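
To illustrate why distance pushes a setup from synchronous to asynchronous replication, here is a minimal sketch of the latency floor that fiber distance alone adds to each synchronous write. It assumes light travels through optical fiber at roughly 200,000 km/s (about 5 microseconds per kilometre one way) and that each write needs one round trip to the remote site; the function name and the sample distances are hypothetical and are not taken from any IBM tool or from the sites described here. Real figures will be higher once switches, protocol overhead and storage service times are included.

```python
# Rough physics only: light in optical fiber covers about 200,000 km/s,
# i.e. roughly 5 microseconds per kilometre in one direction (assumption).
FIBER_US_PER_KM = 5.0


def sync_write_penalty_ms(distance_km: float, round_trips: int = 1) -> float:
    """Minimum extra latency per synchronous write, in milliseconds.

    A synchronous write is only acknowledged after the remote site has
    confirmed it, so the signal has to travel there and back.
    """
    one_way_us = distance_km * FIBER_US_PER_KM
    return 2 * one_way_us * round_trips / 1000.0


if __name__ == "__main__":
    # Hypothetical distances, chosen only to show how the floor scales.
    for km in (10, 100, 300):
        print(f"{km:>4} km: at least {sync_write_penalty_ms(km):.1f} ms per write")
```

An asynchronous protocol acknowledges the write locally and ships it afterwards, so this penalty disappears from the application's write path; the price is that the remote copy can lag slightly behind.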

We know what a 'rare' hardware failure on our storage subsystem looks like: it started to write zeroes on a disk with practically no detection, and lots of SAP tables were damaged. We have also suffered a human error that deleted an online disk and killed the whole iASP (the first real tape recovery in my life). And finally we know what a power failure in our partner's technical room is like. Imagine all these failures with a database that is 3 TB in size. Do you know how much time is needed to restore from tape and run APYJRNCHGX or rebuild access paths? I know...
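
For a sense of scale, the raw tape-read part of such a restore can be estimated with simple arithmetic. The sketch below is only that: the sustained throughput values are hypothetical assumptions, not measurements from this system, and the result excludes applying journal changes and rebuilding access paths, which come on top.

```python
def restore_hours(db_size_tb: float, throughput_mb_s: float) -> float:
    """Hours needed just to read db_size_tb back from tape.

    Uses decimal units (1 TB = 1,000,000 MB). Journal apply and
    access path rebuilds are NOT included in this figure.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    size_mb = db_size_tb * 1_000_000
    return size_mb / throughput_mb_s / 3600


if __name__ == "__main__":
    # 3 TB is the database size mentioned above; the rates are assumed.
    for rate in (100, 200, 400):
        print(f"{rate} MB/s sustained: about {restore_hours(3, rate):.1f} h")
```

Even with optimistic sustained rates, the read alone takes hours, which is why the recovery times end up far from a 24x7 target once the remaining steps are added.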

As you can imagine, we have invested a lot of money trying to protect our data, and it worked, because we have never lost a single bit of information, but our recovery times are always far from what we need.

I'm looking for experiences of how other SAP on IBM i customers are managing HA in their critical systems and, if possible, I would like to compare real experiences of similar outages. What are we doing wrong? We cannot be the only ones...

Regards,

Joan B. Altadill

Accepted Solutions (1)

Former Member

Hi Joan,

We run MIMIX replication for our ERP system/partition and 4 other partitions with BW, Portal, PI, SRM, and Solution Mgr in them. There is some administration but it has been worth it for us. We have duplicate 570 hardware in an offsite DC 35 miles away for failover. We also do our backups on the replicated systems. We have been running MIMIX since going live with SAP in 1998.

Several years ago we used MIMIX replication to migrate to new servers during a lease replacement, which cut our migration downtime from 8 hrs for backup/restore to about 1 hr while we shut down the system on the old servers, started up the ERP system on the new servers, and checked all the interface connections.

But the real payoff came in March this year when our production server went down hard during a hot maintenance procedure. We were able to MIMIX switch to our DR server in under 1 hr and the business ran on the DR server for two weeks, while we reverse-replicated, then we switched back.

We have subsecond replication so we did not lose any data and there were no incomplete transactions on the DR side after the switch. MIMIX paid for itself, including administration, in that one incident.

Hope this helps,

Margie Teppo

Perrigo Co.

Answers (1)

Former Member

Hi Joan,

We are running on a hardware replica, using two DS storage systems; SAP is on an iASP too, and we use a customized IBM solution to manage this whole landscape. Our systems are in different locations, but we are using the Metro Mirror synchronous protocol because we have a direct fiber connection. So far we have not had any disaster situation, so I cannot give you any details, but as a test we have reproduced one twice by switching off the PRD system and switching on the other PRD clone. This operation is completed in approximately 1 hour.

Kind regards

Scott

Former Member

Hi,

We have used hardware replicas since 2004, and the three unplanned outages also affected the backup system. We are now planning to return to a software replica.

1 hour is the normal time for our planned switches. The question is whether you have tried to test an unplanned one, and how much time it needed.

Regards,

Joan

Former Member

Hi,

Your environment seems practically the same as ours; the only difference is the kind of PPRC: you are synchronous with Metro Mirror, we are asynchronous using Global Mirror. How do you manage the switches between environments? Do you use custom scripts or a product like the IBM Rochester Copy Services toolkit (we use this toolkit)?

Regards,

Joan

Edited by: Joan Baptista Altadill Elías on Nov 19, 2010 1:06 PM

Former Member

Hi Joan,

True, it looks very similar... The fiber switches are managed from both sides (both systems can reach both switches). Our solution was implemented with customized scripts and DSCLI commands.

We perform tests with a scheduled power-off of the production system.

Kind regards

Scottie