
HOW TO SET UP SAPHanaSR IN THE COST OPTIMIZED SAP HANA SR SCENARIO - PART II


Besides this SCN document, we also have a more detailed best practice available online at: Best Practices - Resource Library | SAP Applications | SUSE

To make this article a bit easier to edit, I have split it into three parts.

4  Set up and configure SUSE Linux Enterprise High Availability

This section of the article describes the configuration of the cluster framework SUSE Linux Enterprise High Availability Extension, which is part of SUSE Linux Enterprise Server for SAP Applications, and of the SAP HANA database integration.

4.1  Install the HA software packages and SAPHanaSR resource agents on both cluster nodes

If not already done, install the software packages with YaST. Alternatively, you can install them from the command line with zypper. This has to be done on both nodes.

suse01:~ # zypper in -t pattern ha_sles
suse01:~ # zypper in SAPHanaSR
suse01:~ # zypper in ClusterTools2   # optional

For more information, please read section 3.2.2, Initial Cluster Setup, of the SUSE Linux Enterprise High Availability Extension documentation.

4.2  Configure cluster communication on the first node

First we have to set up the base cluster framework. For convenience, use YaST2 or the sleha-init (on node 1) and sleha-join (on node 2) scripts. These scripts create a working configuration proposal, but for production you have to customize this configuration. It is also strongly recommended to add a second cluster communication ring, to change the cluster communication protocol to unicast (UCAST), and to adjust the timeout values to your network characteristics.

We call sleha-init to create a cluster configuration on node 1 (suse01). After adapting the configuration we will run sleha-join on node 2 (suse02) to complete the basic cluster configuration.
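On node 1 the call could look like the following (sleha-init runs interactively and asks, among other things, about the network interface to use and about an SBD device; accept or adapt the proposals as needed for your environment):

suse01:~ # sleha-init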

To configure the cluster, stop the cluster services on the first node and change the cluster communication settings with your preferred editor. These changes are activated when we start the cluster again.

suse01:~ # rcopenais stop

suse01:~ # vi /etc/corosync/corosync.conf

The following sample configuration for /etc/corosync/corosync.conf shows how to add a second ring and switch from multicast to unicast (UCAST):

totem {
    ...
    interface {
        #The ringnumber assigned to this interface setting
        ringnumber:     0
        #Network address to be bound for this interface setting
        bindnetaddr:    192.168.1.0
        member {
            memberaddr: 192.168.1.10
        }
        member {
            memberaddr: 192.168.1.11
        }
        #The multicast port to be used
        mcastport:      5405
    }
    interface {
        #The ringnumber assigned to this interface setting
        ringnumber:     1
        #Network address to be bound for this interface setting
        bindnetaddr:    192.168.2.0
        member {
            memberaddr: 192.168.2.10
        }
        member {
            memberaddr: 192.168.2.11
        }
        #The multicast port to be used
        mcastport:      5415
    }
    #How many threads should be used to encrypt and send messages
    threads:        4
    #The transport protocol; udpu enables unicast (UCAST) communication
    transport:      udpu
    #How many token retransmits should be attempted before forming
    #a new configuration
    token_retransmits_before_loss_const:    10
    #To make sure the auto-generated nodeid is positive
    clear_node_high_bit:    yes
    ...
}

4.2.1  Configure SBD for STONITH fencing

You can skip this section if you do not have any SBD devices, but be sure to implement another supported fencing mechanism. If you use the latest version of the pacemaker packages from the SUSE maintenance channels, the -P option (check Pacemaker quorum and node health) is available. It prevents the cluster nodes from self-fencing when the SBD devices are lost but Pacemaker communication is still available. The timeout parameter -t defines the reprobe interval for the SBD device in seconds. It is highly recommended to configure a watchdog for the SBD devices to protect against failures of the SBD process itself. Please refer to the SLES manual for setting up a watchdog.

Replace /dev/disk/by-id/SBDA and /dev/disk/by-id/SBDB with your SBD device names in the example configuration file.

# /etc/sysconfig/sbd
SBD_DEVICE="/dev/disk/by-id/SBDA;/dev/disk/by-id/SBDB"
SBD_OPTS="-W -S 1 -P -t 300"
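Setting up the watchdog itself is covered in the SLES manual. As a minimal sketch, assuming no hardware watchdog (for example an IPMI or iTCO watchdog) is available, the software watchdog module softdog can be loaded and made persistent; the sysconfig variable shown below is the standard SLES 11 mechanism, so verify it on your system:

suse01:~ # modprobe softdog
suse01:~ # ls -l /dev/watchdog

# /etc/sysconfig/kernel
MODULES_LOADED_ON_BOOT="softdog"

The -W option in SBD_OPTS above tells the SBD daemon to use this watchdog device.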

4.2.2  Verify the SBD device

You can skip this section if you do not have any SBD devices, but be sure to implement another supported fencing mechanism.

It is good practice to check that the SBD devices can be accessed from both nodes and contain valid records. Check this for all devices configured in /etc/sysconfig/sbd.

suse01:~ # sbd -d /dev/disk/by-id/SBDA dump
==Dumping header on disk /dev/disk/by-id/SBDA
Header version     : 2.1
UUID               : 0f4ea13e-fab8-4147-b9b2-3cdcfff07f86
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 20
Timeout (allocate) : 2
Timeout (loop)     : 1
Timeout (msgwait)  : 40
==Header on disk /dev/disk/by-id/SBDA is dumped

The timeout values in our sample are only typical values, which need to be tuned to your environment.

To check the current SBD entries for the various cluster nodes you can use “sbd list”. If all entries are “clear” no fencing task is marked in the SBD device.

suse01:~ # sbd -d /dev/disk/by-id/SBDA list
0       suse01  clear
1       suse02  clear

For more information on SBD configuration parameters, please read section 17.1.3 of the SUSE Linux Enterprise High Availability Extension documentation.

Now it is time to start the cluster on node 1 (suse01) again:
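suse01:~ # rcopenais start

Once the cluster stack is up, you can check the status of both communication rings, for example with:

suse01:~ # corosync-cfgtool -s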

4.3  Configure cluster communication on the second node

After we have adapted the cluster configuration on node 1 and prepared the SBD fencing (if you use this type of STONITH device) we can join node 2 (suse02) to the cluster. The procedure will inherit all changes from node 1:

suse02:~ # sleha-join
suse02:~ # rcopenais status
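To check that both nodes are now online in the cluster, you can, for example, display the cluster status once with crm_mon, which is part of the pacemaker packages installed above:

suse02:~ # crm_mon -r -1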

4.4  Basic pacemaker configuration

After we have completed the cluster communication setup and have two active nodes in our cluster, we can proceed with the pacemaker cluster configuration:

node suse01 \
    description="PRDp"
node suse02 \
    description="PRDs and QAS"
property $id="cib-bootstrap-options" \
        no-quorum-policy="ignore" \
        stonith-enabled="true" \
        stonith-action="reboot" \
        stonith-timeout="150s"
rsc_defaults $id="rsc-options" \
        resource-stickiness="1000" \
        migration-threshold="5000"
op_defaults $id="op-options" \
        timeout="600"

To add resources, constraints, or other definitions, you should create a text file containing the crm definitions and load it into the cluster:

suse01:~ # vi crm-bs.txt
...
suse01:~ # crm configure load update crm-bs.txt
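To verify what has been loaded, you can display the resulting cluster configuration, for example with:

suse01:~ # crm configure show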

4.5  Setup the STONITH fence mechanism

4.5.1  Using SBD as fencing mechanism

The integration of the SBD STONITH mechanism into the pacemaker cluster configuration is quite easy. Just add the following crm resource definition to your cluster:

primitive stonith-sbd stonith:external/sbd \
    op start interval="0" timeout="15" start-delay="5"

As before, create a text file containing the crm definition and load it into the cluster:

suse01:~ # vi crm-sbd.txt
...
suse01:~ # crm configure load update crm-sbd.txt

4.5.2  Using IPMI as fencing mechanism

Instead of using SBD for fencing you can use remote management boards compatible with the IPMI standard. Our cluster configuration example below shows the usage of the IPMI STONITH devices.

For further information about fencing, see SUSE Linux Enterprise High Availability Guide.

For IPMI-based fencing you need to configure one primitive per cluster node. Each resource is responsible for fencing exactly one cluster node. You need to adapt the IP addresses and the login user/password of the remote management boards in the STONITH resource agents. We recommend creating a dedicated STONITH user instead of providing root access to the management board. The location rules at the end ensure that a host never runs its own STONITH resource:

primitive rsc_suse01_stonith stonith:external/ipmi \
    params hostname="suse01" ipaddr="10.1.1.210" userid="stonith" passwd="k1llM3" interface="lanplus" \
    op monitor interval="1800" timeout="30"
primitive rsc_suse02_stonith stonith:external/ipmi \
    params hostname="suse02" ipaddr="10.1.1.211" userid="stonith" passwd="k1llM3" interface="lanplus" \
    op monitor interval="1800" timeout="30"
location loc_suse01_stonith rsc_suse01_stonith -inf: suse01
location loc_suse02_stonith rsc_suse02_stonith -inf: suse02
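As with the SBD example above, you can save these definitions to a text file (the file name crm-ipmi.txt below is only an example) and load them into the cluster:

suse01:~ # vi crm-ipmi.txt
...
suse01:~ # crm configure load update crm-ipmi.txt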

4.5.3  Using other STONITH fencing mechanisms

We recommend using SBD (best practice) or IPMI (second choice) as the STONITH mechanism. The SUSE Linux Enterprise High Availability product also supports additional fencing mechanisms not covered here.

For further information about fencing, see SUSE Linux Enterprise High Availability Guide.

5  Integrating the Control of the HANA databases into the Cluster

5.1  Setup of cluster resources for SAPHanaSR

Follow the setup guide of SAPHanaSR.

Remark: For SAP HANA SPS9+ and SAPHanaSR 0.152+ you can skip all activities to create SAP HANA database users and secure store keys for the productive system. The system controlled via SAPDatabase and SAPHOSTAGENT still needs the user secure store key.
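For the QAS system such a key can be created with the hdbuserstore tool as the <sid>adm user of the QAS database. The key name, monitoring user, and password below are placeholders only, and the port assumes a single-container system with instance number 10 (SQL port 3<instance number>15); use the key name and user that your SAPDatabase/SAPHOSTAGENT setup expects, as described in the SAPHanaSR setup guide:

suse02:~ # su - qasadm
qasadm@suse02:~> hdbuserstore SET <KEY> localhost:31015 <MONITORING_USER> <PASSWORD>
qasadm@suse02:~> hdbuserstore LIST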

This snippet describes the cluster configuration part for the productive SAP HANA database (SID SLE) and the virtual IP address for client access. We refer to the cluster nodes as suse01 and suse02, respectively:

primitive rsc_ip_SLE_HDB00 ocf:heartbeat:IPaddr2 \
    meta target-role="Started" \
    op monitor interval="10s" timeout="20s" \
    params ip="192.168.1.100"
primitive rsc_SAPHanaTopology_SLE_HDB00 ocf:suse:SAPHanaTopology \
    op monitor interval="10" timeout="30" start_delay="10" \
    op start interval="0" timeout="600" \
    op stop interval="0" timeout="600" \
    params SID="SLE" InstanceNumber="00"
primitive rsc_SAPHana_SLE_HDB00 ocf:suse:SAPHana \
    op monitor interval="60" role="Master" timeout="3600" \
    op monitor interval="61" role="Slave" timeout="3600" \
    op promote interval="0" timeout="3600" \
    op start interval="0" timeout="3600" \
    op stop interval="0" timeout="3600" \
    params SID="SLE" InstanceNumber="00" \
      PREFER_SITE_TAKEOVER="false" \
      DUPLICATE_PRIMARY_TIMEOUT="7200" \
      AUTOMATED_REGISTER="false"
ms msl_SAPHana_SLE_HDB00 rsc_SAPHana_SLE_HDB00 \
    meta clone-max="2" clone-node-max="1" master-max="1" \
      interleave="true" priority="1000"
clone cln_SAPHanaTopology_SLE_HDB00 rsc_SAPHanaTopology_SLE_HDB00 \
    meta clone-node-max="1" target-role="Started" interleave="true"
colocation col_saphana_ip_SLE_HDB00 3000: rsc_ip_SLE_HDB00 \
      msl_SAPHana_SLE_HDB00:Master
order ord_SAPHana_SLE_HDB00 Optional: cln_SAPHanaTopology_SLE_HDB00 \
      msl_SAPHana_SLE_HDB00
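These definitions can again be added by saving them to a text file (the name crm-saphana.txt is just an example) and loading it into the cluster:

suse01:~ # vi crm-saphana.txt
...
suse01:~ # crm configure load update crm-saphana.txt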

5.1.1  Add the cluster resource for the non-productive SAP HANA database

This snippet describes the cluster configuration part for the SAP HANA QAS database. We refer to the cluster nodes as suse01 and suse02, respectively:

primitive rsc_SAP_QAS_HDB10 ocf:heartbeat:SAPDatabase \
    params DBTYPE="HDB" SID="QAS" \
    MONITOR_SERVICES="hdbindexserver|hdbnameserver" \
    op start interval="0" timeout="600" \
    op monitor interval="120" timeout="700" \
    op stop interval="0" timeout="300" \
    meta priority="100"

5.1.2  Adding the cluster rules for the automatic shutdown of SAP HANA QAS

We refer to the cluster nodes as suse01 and suse02 respectively:

location loc_QAS_never_suse01 rsc_SAP_QAS_HDB10 -inf: suse01
colocation col_QAS_never_with_PRDip -inf: rsc_SAP_QAS_HDB10:Started rsc_ip_SLE_HDB00
order ord_QASstop_before_PRDpromote inf: rsc_SAP_QAS_HDB10:stop msl_SAPHana_SLE_HDB00:promote
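The QAS resource and the constraints above can be loaded in the same way (again, the file name is only an example):

suse01:~ # vi crm-qas.txt
...
suse01:~ # crm configure load update crm-qas.txt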

The next step is now to run the cluster tests.