Automate SAP HANA System Replication with SLES for SAP Applications
SAP HANA System Replication on SLES for SAP Applications
Would you like to get started right away?
If you would like to know how to implement the solution with SUSE Linux Enterprise Server for SAP Applications, please read our setup guide, available at: https://www.suse.com/products/sles-for-sap/resource-library/sap-best-practices.html
What is this solution about?
The new solution created by SUSE automates the takeover in SAP HANA system replication setups.
The basic idea is that synchronizing the data to a second SAP HANA instance is not enough on its own, because it only solves the problem of shipping the data to a second instance. To increase availability you need a cluster solution that controls the takeover by the second instance and also provides the service address clients use to access the database.
The first picture shows an SAP HANA system "PR1" in a system replication setup. The left node runs the primary SAP HANA instance, which is the instance clients should access for read/write operations.
The second picture shows what happens first when either node1 or the SAP HANA instance on that node fails. The setup now has a "broken" SAP HANA primary, and synchronization to the second node has stopped.
Picture 3 shows the cluster's first reaction: the secondary is switched to primary, and this new primary is configured as the new source of the system replication. Because node1 or its SAP HANA instance is still down, the synchronization is not in "active" mode.
Picture 4 shows the situation when node1 (or its SAP HANA instance) is back. Depending on the resource parameters, the cluster registers the former primary as the new secondary and the system replication starts working again.
If you do not want the cluster to register the former primary automatically, you can change the resource parameters so that the cluster keeps the "broken" former primary shut down. This can make sense if administrators first want to find out in detail what happened on this instance, or for other operational reasons.
When automated registration is switched off, the administrator can register the former primary at any time. The cluster resource agent detects the new status during the next monitor action.
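The sequence in pictures 1 to 4 can be sketched as a small state model. This is only an illustration of the behaviour described above, not the resource agent's actual logic; the class name, the role labels ("failed", "stopped") and the method names are hypothetical and chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class ReplicationPair:
    """Hypothetical model of a two-node SAP HANA system replication pair."""
    roles: dict                       # node name -> "primary", "secondary", "failed" or "stopped"
    automated_register: bool = False  # mirrors the AUTOMATED_REGISTER resource parameter
    replication_active: bool = True

    def _node_with_role(self, role):
        return next(name for name, r in self.roles.items() if r == role)

    def fail_primary(self):
        """Pictures 2 and 3: the primary fails and the cluster takes over."""
        old_primary = self._node_with_role("primary")
        new_primary = self._node_with_role("secondary")
        self.roles[old_primary] = "failed"
        self.roles[new_primary] = "primary"   # takeover; the service address follows
        self.replication_active = False       # no secondary is available yet
        return new_primary

    def node_returns(self, node):
        """Picture 4: the former primary (or its instance) is back."""
        if self.automated_register:
            self.register_as_secondary(node)
        else:
            self.roles[node] = "stopped"      # kept down until an administrator acts

    def register_as_secondary(self, node):
        """Registration, either automatic or done manually by the administrator."""
        self.roles[node] = "secondary"
        self.replication_active = True


pair = ReplicationPair({"node1": "primary", "node2": "secondary"})
pair.fail_primary()
pair.node_returns("node1")            # stays "stopped" with automated_register=False
pair.register_as_secondary("node1")   # manual registration by the administrator
print(pair.roles, pair.replication_active)
```

With `automated_register=True` the call to `node_returns` alone brings node1 back as the secondary, which corresponds to the automated registration described above.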
What are the current limitations?
For the first version of the SAPHanaSR resource agent software package, we limit support to the following scenarios and parameters:
- Two-node clusters. We have also tested three-node clusters, but the best practice guides focus on two nodes
- The cluster must include a valid STONITH method like SBD or IPMI
- Scale-Up (single-box to single-box) system replication
- Both nodes are in the same network segment (layer 2). In some environments, such as AWS, there are other solutions which fulfil this requirement
- Technical users and groups such as sidadm are defined locally in the Linux system
- Name resolution of the cluster nodes and the virtual IP address can be done locally on all cluster nodes
- Time synchronization between the cluster nodes (like NTP)
- For the performance-optimized scenario, no other SAP HANA system (like QAS or TST) runs on the replicating node that would need to be stopped during takeover. The cost-optimized scenario is explained in my article: HOW TO SET UP SAPHanaSR IN THE COST OPTIMIZED SAP HANA SR SCENARIO - PART I
- Only one system replication for the SAP HANA database. You could also implement multi-SID setups if you have a SAP HANA revision > 090 and resource agent version >= 0.152. The best practice guides focus on a single pair. To automate more than one SAP HANA pair, you need to limit the resource consumption of the SAP HANA RDBMS, and the pairs must be strictly independent in the cluster setup to allow a separate takeover per pair.
- Both SAP HANA instances have the same SAP Identifier (SID) and instance number
- If the cluster nodes are installed in different data centers or data center areas, the environment must match the requirements of the SLE HAE cluster product. In particular, this concerns the network latency between the nodes and the recommended maximum distance. Please review our product documentation for SLE HAE for those recommendations.
- As a good starting configuration for PoCs and projects, we recommend switching off the automated registration of a failed primary. The setting AUTOMATED_REGISTER="false" is also the default since version 0.139.
- The SAPHOSTAGENT must be installed on both nodes.
Note: Without a valid STONITH method the complete cluster is unsupported and will not work properly.
If you need to implement a different scenario, we strongly recommend defining a PoC with SUSE. This PoC will focus on testing the existing solution in your scenario. Most of the limitations above are due to testing limits.
The resource agent supports SAP HANA in system replication beginning with SAP HANA version 1.0 SPS 7 patch level 70.
Besides SAP HANA, you need SAPHOSTAGENT installed on your system. Automated start of SAP HANA instances during system boot must be switched off.
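The version thresholds mentioned in this section can be written down as a small check. This is a minimal sketch; the function names are hypothetical, and it assumes that resource agent versions such as "0.152" compare numerically component by component and that the SAP HANA revision is available as a plain integer.

```python
MIN_HANA_REVISION = 70          # SAP HANA 1.0 SPS 7 patch level 70
MULTI_SID_AGENT = (0, 152)      # resource agent version >= 0.152
AUTOREG_DEFAULT_FALSE = (0, 139)  # AUTOMATED_REGISTER="false" is the default since 0.139


def parse_version(text):
    """Turn "0.152" into (0, 152). Assumes purely numeric, dot-separated parts."""
    return tuple(int(part) for part in text.split("."))


def hana_revision_supported(revision):
    """Single-pair system replication: revision 70 or later."""
    return revision >= MIN_HANA_REVISION


def multi_sid_possible(revision, agent_version):
    """Multi-SID setups: SAP HANA revision > 090 and resource agent >= 0.152."""
    return revision > 90 and parse_version(agent_version) >= MULTI_SID_AGENT


def automated_register_defaults_to_false(agent_version):
    """True if this agent version already ships AUTOMATED_REGISTER="false" as default."""
    return parse_version(agent_version) >= AUTOREG_DEFAULT_FALSE


print(multi_sid_possible(97, "0.152"))
print(automated_register_defaults_to_false("0.130"))
```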
Which general steps are needed to get this solution into operation?
- Install two systems using SUSE Linux Enterprise Server for SAP Applications. For update levels, package selection and so on, see also SAP notes like 1310037, 1944799 and 1855805.
- Install SAP HANA on both nodes using the same SID and instance number. Please refer to the SAP installation guides for details.
- Check whether SAPHOSTAGENT is installed on both nodes. If not, please install that software on both nodes.
- Create a user in the database and user store keys at Linux level, so that the resource agent can check the SAP HANA internal synchronization status. This step is described in the SAP HANA administration manuals as well as in our setup guide (Automate your SAP HANA System Replication Failover | SUSE).
- Perform a database backup on the node that will host the primary side and enable the system replication on this instance.
- Register the other SAP HANA instance to be the secondary.
- Install the SUSE pattern for high availability cluster and install the new resource agents.
- Set up the cluster base configuration on the first node and make it available on the second node.
- Perform a first tuning of the cluster bootstrap parameters.
- Do not forget to set up a STONITH method.
- Simply call our HAWK configuration wizard and enter the SID, the instance number and the IP address. Another method is to use crmsh, the command line interface (CLI) of our cluster solution.
- Enjoy the cluster controlling your SAP HANA instance in a scale-up scenario.
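For the crmsh route, the three inputs named above (SID, instance number, IP address) are enough to derive a resource configuration. The following is a minimal sketch that renders such a configuration as text. The resource names, timeouts, intervals and the colocation score are assumptions modelled on the conventions I recall from the setup guide, and the helper function itself is hypothetical; please verify every value against the setup guide and the resource agent's own documentation before using it.

```python
import ipaddress


def render_crm_config(sid, instance_number, ip, automated_register=False):
    """Render an illustrative crmsh configuration for one SAP HANA scale-up pair."""
    sid = sid.upper()
    if len(sid) != 3 or not sid.isalnum():
        raise ValueError("SID must be three alphanumeric characters")
    number = int(instance_number)
    if not 0 <= number <= 99:
        raise ValueError("instance number must be between 00 and 99")
    ipaddress.ip_address(ip)            # raises ValueError for an invalid address
    inst = f"{number:02d}"
    tag = f"{sid}_HDB{inst}"            # naming scheme is an assumption
    auto = "true" if automated_register else "false"
    # Raw f-string so the trailing backslashes stay in the output as
    # crmsh line continuations.
    return rf"""primitive rsc_SAPHanaTopology_{tag} ocf:suse:SAPHanaTopology \
    params SID="{sid}" InstanceNumber="{inst}" \
    op monitor interval="10" timeout="600"
clone cln_SAPHanaTopology_{tag} rsc_SAPHanaTopology_{tag} \
    meta clone-node-max="1" interleave="true"
primitive rsc_SAPHana_{tag} ocf:suse:SAPHana \
    params SID="{sid}" InstanceNumber="{inst}" \
        PREFER_SITE_TAKEOVER="true" AUTOMATED_REGISTER="{auto}" \
    op monitor interval="60" role="Master" timeout="700" \
    op monitor interval="61" role="Slave" timeout="700"
ms msl_SAPHana_{tag} rsc_SAPHana_{tag} \
    meta clone-max="2" clone-node-max="1" interleave="true"
primitive rsc_ip_{tag} ocf:heartbeat:IPaddr2 \
    params ip="{ip}" \
    op monitor interval="10s" timeout="20s"
colocation col_saphana_ip_{tag} 2000: rsc_ip_{tag}:Started msl_SAPHana_{tag}:Master
order ord_SAPHana_{tag} Optional: cln_SAPHanaTopology_{tag} msl_SAPHana_{tag}
"""


# Example values only: SID "PR1" as in the pictures, a made-up instance number and IP.
print(render_crm_config("PR1", "00", "192.168.1.10"))
```

The rendered text could be saved to a file and loaded with crmsh (for example via `crm configure load update <file>`). A STONITH resource is deliberately not part of this sketch, because it depends on the fencing method chosen for the cluster.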
Are there already customers running this solution?
Yes. At this point in time I cannot name the customers as references, but we already have customer and partner installations.
Additional content in SCN