
Delayed replication of 4 hours and no delay?

former_member221487
Participant
0 Kudos

I have a client who is asking: can we have two standby databases, one with no delay on the replication at all and one with a delay of 4 hours?

Is it also possible to use SRS to have three ASEs: a primary, a failover with no delay, and a failover with a 4-hour delay? If so, could the auto-failover occur between the primary and the zero-delay failover, while invoking the 4-hour-delay failover remains a manual action?

Former Member
0 Kudos

In addition to what Mark wrote:


In the simplest setup you would create a database replication definition for the primary database and then two database subscriptions, one subscription for each replicate database.
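For reference, a minimal RCL sketch of that simplest setup. All server and database names here (PROD, HOT_STBY, DELAYED_STBY, mydb) are hypothetical, and the 4-hour delay shown uses the dsi_timer connection parameter, which only exists in Replication Server versions that support delayed apply, so check your version:

    -- database-level (MSA) replication definition for the primary
    create database replication definition mydb_repdef
        with primary at PROD.mydb
        replicate DDL
    go

    -- zero-delay replicate
    create subscription mydb_sub_hot
        for database replication definition mydb_repdef
        with primary at PROD.mydb
        with replicate at HOT_STBY.mydb
        without materialization
    go

    -- delayed replicate
    create subscription mydb_sub_delayed
        for database replication definition mydb_repdef
        with primary at PROD.mydb
        with replicate at DELAYED_STBY.mydb
        without materialization
    go

    -- hold transactions for 4 hours before applying them to the delayed copy
    suspend connection to DELAYED_STBY.mydb
    go
    alter connection to DELAYED_STBY.mydb set dsi_timer to '04:00'
    go
    resume connection to DELAYED_STBY.mydb
    go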

I would suggest configuring "Warm Standby" for the production server and the standby server with no delay. Based on the resulting logical connection (in a "Warm Standby" configuration the two servers are seen as one logical connection), I would create an MSA subscription replicating into the second standby server (the one with the 4-hour delay).

In that case I could do a "switch active" (fast failover) between the no-delay servers without any need to change the configuration for the second standby (it doesn't matter which server in the logical connection is active or standby). The second standby server will still receive all transactions.
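A sketch of the MSA half of that hybrid setup, assuming the warm standby logical connection LOG.mydb (PROD active, HOT_STBY standby) already exists; all names are hypothetical:

    -- MSA replication definition created against the logical connection,
    -- so it keeps working no matter which physical server is active
    create database replication definition mydb_ws_repdef
        with primary at LOG.mydb
    go

    -- subscribe the delayed third server to the logical primary
    create subscription mydb_sub_delayed
        for database replication definition mydb_ws_repdef
        with primary at LOG.mydb
        with replicate at DELAYED_STBY.mydb
        without materialization
    go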

Mark_A_Parsons
Contributor
0 Kudos
I would suggest configuring "Warm Standby" for the production server and the standby server with no delay. Based on the resulting logical connection (in a "Warm Standby" configuration the two servers are seen as one logical connection), I would create an MSA subscription replicating into the second standby server (the one with the 4-hour delay).

Yeah, that would work, too! 😉

I've had intermittent problems at a couple of recent clients where the 'switch active' command has been very finicky (e.g., it fails to complete the switch due to a stray/missing network packet), so I guess I've become a bit gun-shy about recommending warm standby.
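For reference, the warm standby switch is a single RCL command issued to the Replication Server (server and database names here are hypothetical):

    -- promote the standby side of the logical connection LOG.mydb
    switch active for LOG.mydb to HOT_STBY.mydb
    go
    -- check progress of the switch
    admin logical_status, LOG, mydb
    go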

I thought I'd seen Luc Van der Veurst do a write-up on an environment where he'd replaced all of his warm standby configs with MSA configs ... and how he was able to perform his switches faster with MSA than with warm standby.  I can't recall/find where that post is (assuming I'm not imagining the write-up) so perhaps Luc will see this thread and jump in, eh ...

Former Member
0 Kudos

I would like to add that using a traditional "warm standby" might not satisfy the requirements if the desire is to be able to also fail over to the delayed copy.

If the delayed copy is never to become the primary, then a warm standby pair subscribed to a third copy is OK. But if the intent is to be able to fail over to either standby location, I think we should recommend an MSA-only solution; the benefit would be that the failover procedures would be the same. With a mix of warm standby and MSA, you have to maintain two different failover processes, which might be confusing to maintain.

luc_vanderveurst
Participant
0 Kudos

Mark A. Parsons wrote:

I thought I'd seen Luc Van der Veurst do a write-up on an environment where he'd replaced all of his warm standby configs with MSA configs ... and how he was able to perform his switches faster with MSA than with warm standby.  I can't recall/find where that post is (assuming I'm not imagining the write-up) so perhaps Luc will see this thread and jump in, eh ...

I mentioned it in a thread about the admin quiesce_force_rsi command earlier this year, but that thread has been deleted (http://scn.sap.com/message/14764886 ). I still find it difficult to find information on the SAP website :-). I also thought I wrote something about it recently, but can't find it.

I posted the following on the ISUG SIG-Replication mailing list in October 2012:

____________________________________________________________________

I used to have 5 warm-standby pairs, one of which had 52 replicated databases.

I defined 4 replication servers for that one server, each replicating 13 databases, so that I could do a switch to standby in parallel.

It took some investigation to make the switch to standby as fast as possible, because trying to do things in parallel caused deadlocks in the RSSD, and then manual intervention was necessary, causing extra minutes of downtime.

Putting sleeps in between the commands increased the switch time too much, so I examined what happened in the system tables and wrote a script that checks some status columns in a loop and continues when a certain value is reached.

Switching to standby took 2 to 3 minutes, nerve-wracking minutes because there is always something that can go wrong.

    

About 2 years ago, I replaced all warm-standby pairs with MSA bi-directional replication.

We still use it in an active-passive setup, so no conflicts can arise as they could in an active-active situation.

Switching to standby now takes at most 9 seconds, and nothing can go wrong since there is no switch command that needs to be executed.

The actions that are performed during a switch are:

- lock the users on the active server
- remove the logical IP address from the active server
- kill all user connections on the active server
- insert a row into a table in all replicated databases on the active server and wait until the rows arrive at the standby server
- unlock the users on the standby server
- add the logical IP address to the standby server, which now becomes the active server

Almost all of the 9 seconds go to the test that all statements in the queues have been replicated.

(I also run this test before the switch so that the replication server has a connection to all databases and there is no delay in replicating.)
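Luc's actual scripts aren't shown here, but the marker-row test can be sketched in Transact-SQL along these lines; the switch_marker table and the marker id are hypothetical, and the table must exist (and be replicated) in every replicated database:

    -- one-time setup, on both sides of each replicated database:
    -- create table switch_marker (marker_id int, inserted_at datetime)

    -- on the ACTIVE server: the insert travels through the queues like any other DML
    insert into switch_marker (marker_id, inserted_at) values (42, getdate())
    go

    -- on the STANDBY server: poll until the marker arrives, i.e. the queues are drained
    declare @tries int
    select @tries = 0
    while not exists (select 1 from switch_marker where marker_id = 42)
    begin
        waitfor delay '00:00:01'    -- check once per second
        select @tries = @tries + 1
        if @tries > 60
        begin
            raiserror 30000 "switch marker 42 did not arrive within 60 seconds"
            return
        end
    end
    print "queues drained, safe to unlock users on the standby"
    go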

The 9-second switch time is for the server with 52 replicated databases.

Switching a server with only 10 replicated databases takes about 6 seconds.

Our application servers reconnect to the database server automatically, so a user who doesn't issue a database request within those 9 seconds doesn't notice that the database servers were switched.

I work in a hospital where we have 4 time windows of 3 hours per year to do maintenance.

If a host needs maintenance, I can now free it up in seconds, so this can be done without warning the users, thanks to MSA.

There are some disadvantages.

    

- Setting it up requires more statements.

- Replication definitions, necessary to define a primary key, now require all columns, so each time you alter a table to add or change a column, you also need to modify the replication definitions (you need two, one on each side); see the first sketch after this list.

- We have DML replication set to on, so I have to be careful to set replication off when I want to do some maintenance on the standby server; see the second sketch after this list.

- We also have table replication from one MSA pair to another; this also requires more setup, since after a switch the server receives last-commit information from another server. To address this, I copy the rs_lastcommit information from active to standby during the switch process.

- ...
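A sketch of the first point, in RCL; table, column, and server names are hypothetical, and in a bi-directional setup the same definitions are needed on both sides:

    -- every column must be listed, so a table change means a repdef change too
    create replication definition mytable_repdef
        with primary at PROD.mydb
        with all tables named 'mytable'
        (id int, name varchar(50), updated_at datetime)
        primary key (id)
    go

    -- after 'alter table mytable add note varchar(100)' at the primary,
    -- the replication definition must be altered as well:
    alter replication definition mytable_repdef
        add note varchar(100)
    go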
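And a sketch of the second point: ASE's session-level set replication switch (it requires replication_role) keeps maintenance DML on the standby out of log transfer; the update shown is a hypothetical example:

    set replication off
    go
    update mytable set name = 'fixed' where id = 42
    go
    set replication on
    go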

But I value the advantages much more highly:

- a faster switch, without stress :-), since basically it's just an IP address that needs to be moved to another system

- the disadvantage of having to define all columns in a replication definition becomes an advantage when you want to remove columns from a table (that can be done at the standby site first, after removing the columns from the replication definition)

- and it's possible to add a 3rd (4th, ...) server to the setup, which makes it possible to test new versions of ASE with the ability to return to the previous version without breaking your MSA setup

Our current situation is that we have 6 pairs of ASE 12.5.4 servers.

We have 2 datacenters, and each pair has one server running in each datacenter.
To each pair, I added a third ASE 15.7 server with bi-directional replication to the other two.

Any of the 3 can become the active server within at most 9 seconds and will then replicate to the other 2.

_________________________________________________________________________________

I'm still happy with this setup :-).

We don't have automatic failover (we didn't have it when we were using warm standby either).

There are too many unknown factors: is the server really down? What's the status of the replication system? Will there be data loss? ...

We prefer to decide ourselves whether to restart the server on the same host or to switch to the standby system.

Luc.

Accepted Solutions (0)

Answers (1)


former_member221487
Participant
0 Kudos

Guys, thank you for your quick responses and invaluable help!