HANA High Availability in Scale-Out

Former Member
0 Kudos

I understand that one of the HA solutions for single-node / scale-up HANA installations is HANA System Replication. For scale-out installations, one or more standby nodes provide some level of HA (via the Host Auto-Failover method), but this does not protect against multiple failures. Some possible extreme designs might be to:

  1. Add a standby node after every 'n' nodes (where n<=2). I believe this can be accomplished using the host group concept in HANA.
  2. Create a completely isolated 'n' to 'n' HA system, i.e. if the primary has 4 nodes, create a secondary system with another 4 nodes and set up system replication. This feels like an expensive solution for HA (maybe okay for DR).
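To make the cost difference between the two designs concrete, here is a rough host-count sketch. The function names and the formula (one standby per group of n workers, versus a full second site) are illustrative assumptions drawn only from the two options above; they are not taken from any SAP sizing tool.

```python
import math


def hosts_with_standby_groups(workers: int, group_size: int) -> int:
    """Total hosts for option 1: one standby node per group of `group_size` workers.

    Hypothetical helper for comparison only.
    """
    if workers <= 0 or group_size <= 0:
        raise ValueError("workers and group_size must be positive")
    standbys = math.ceil(workers / group_size)
    return workers + standbys


def hosts_with_full_replica(workers: int, standbys_per_site: int = 0) -> int:
    """Total hosts for option 2: an identical secondary system ('n' to 'n')."""
    if workers <= 0 or standbys_per_site < 0:
        raise ValueError("workers must be positive and standbys non-negative")
    return 2 * (workers + standbys_per_site)


if __name__ == "__main__":
    workers = 4
    print("Option 1, standby per 2 workers:", hosts_with_standby_groups(workers, 2))
    print("Option 2, full secondary system:", hosts_with_full_replica(workers))
```

For the 4-node example in option 2, this gives 6 hosts for option 1 (with n = 2) against 8 hosts for a full secondary system, before counting any standby nodes on either site.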

Please note: I am deliberately ignoring disk-based replication and recovery, because the scope here is strictly HA, where quick system availability after an incident matters most. Disk-based replication again feels like a good option for DR.


Now coming to the question:

Can someone share any insights on how you have set up HA for your scale-out systems? What do you like about it, and what do you find difficult to operate?


Accepted Solutions (1)

Former Member
0 Kudos

Hi Mangesh,

We have been using a distributed HANA setup (3+1 compute nodes) for host failover, plus HANA System Replication for high availability (SYNCMEM) and disaster recovery (ASYNC), for BW on HANA.

Our experience with host failover has been encouraging so far. On a couple of occasions since go-live, the indexserver service on one of the HANA compute nodes went into a hung state, but that did not trigger a failover because the nameserver and daemon processes were never interrupted. As a result, we have not had an actual host failover so far.

As far as System Replication goes, we have been facing some challenges with ASYNC replication to the secondary datacenter for disaster recovery.

Part of the problem could be that the network setup between the primary and secondary datacenters cannot accommodate the replication traffic.

We have had situations where System Replication in ASYNC mode had to be stopped to avoid a performance impact on the production system.
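As a rough way to reason about whether a link can keep up, the sketch below estimates how much unshipped log would pile up when the log generation rate exceeds the usable bandwidth. It is a toy model with made-up numbers: the function name, the constant-rate assumption, and the sample figures are all hypothetical, and it does not reflect how HANA actually buffers or throttles log shipping.

```python
def replication_backlog_gb(log_rate_mb_s: float, link_mb_s: float, duration_s: float) -> float:
    """Estimate unshipped log (in GB) after `duration_s` seconds.

    Assumes constant rates; any surplus of log generation over link
    throughput accumulates as backlog. Hypothetical helper, not a HANA API.
    """
    if log_rate_mb_s < 0 or link_mb_s < 0 or duration_s < 0:
        raise ValueError("rates and duration must be non-negative")
    surplus_mb_s = max(0.0, log_rate_mb_s - link_mb_s)
    return surplus_mb_s * duration_s / 1024


if __name__ == "__main__":
    # Illustrative figures only: 100 MB/s of log during a load window,
    # 60 MB/s usable between datacenters, sustained for one hour.
    print(replication_backlog_gb(100, 60, 3600))
```

Comparing the measured peak log generation on the primary with the bandwidth actually available between sites is one way to check whether the network is the limiting factor before deciding to stop replication.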

Hope this helps and provides some insight into the different options.

Answers (0)