
SAP System in OS Cluster -- failover test

Former Member
0 Kudos

Hi,

We have installed our PI production system in an OS cluster (Linux cluster), in which the mount points shift to the failover node if the primary node goes down.

The (DB+CI) instance is on the primary node.

When we bring the primary node down, we are not able to get the system up and running on the failover node.

The data is on SAN storage, and both the primary and failover nodes are connected to the SAN.

Can anyone help with the prerequisites from the SAP BASIS end to get the system up and running on the secondary node?

Thanks and Regards,

Moulinath Ray

Accepted Solutions (1)

Former Member
0 Kudos

1. All SAP directories (/sapmnt, the database directories, /usr/sap) must be mounted on the other node when the failover event occurs, and don't forget to adjust the hostname and IP configuration.

2. The same users must exist on the other node (same UID and GID); copy the contents of the home directory of each SAP-related user (<sid>adm, ora<sid>, sap<sid>) from the primary node.

3. Some files need to be copied manually; search the HA guide for your platform.

4. If you have already performed steps 1 and 2, try running the database manually: log on as the database user and start the database, then see whether it runs and, if not, what error occurs (see the sketch after this list).

5. Some HA platforms already provide a specific solution for SAP software, which is more secure and easier to implement.
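A minimal sketch of those checks on the failover node (placeholders as in the steps above; db2<sid> and db2start assume a DB2 database, as in the configuration posted later in this thread):

# verify that the shared file systems are really mounted on this node
df -h /sapmnt/<SID> /usr/sap/<SID>

# verify the SAP users exist with the same UID/GID as on the primary node
id <sid>adm
id db2<sid>

# try to start the database manually as the database user
su - db2<sid>
db2start

# then try to start SAP as <sid>adm
su - <sid>adm
startsap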

Former Member
0 Kudos

Hi,

We are stuck at a very early stage.

Our SAP application is up and running on the primary node. In this situation, when we shut down the primary node, the mount points (grouped as a service in the cluster) are not shifted to the secondary node. Our Linux team says the cluster cannot shift the mount points because it finds processes (e.g. sapstartsrv) still running on the primary node.

After the primary node was shut down completely, the Linux admin executed the following commands:

clusvcadm -d

clusvcadm -e

After this, the mount points appeared on the failover node.

From the BASIS side, what can we do so that the mount points are automatically shifted to the failover node when the primary node shuts down? Or could the current problem be due to improper cluster configuration by the OS team?

Regards,

Moulinath

Former Member
0 Kudos

sapstartsrv is running by default but has to be shut down before the file system can be switched over, so you have to force the switching operation. This can be done with the parameter force_unmount="yes", added to the "fs" resource type in your cluster configuration. The force_unmount flag shuts down any process still running on a file system that is about to be switched over.

Example:

<fs name="fs_rhc_ascs" mountpoint="/usr/sap/RHC/ASCS00" device="/dev/vg_rhc_ext3/lv_ascs" fstype="ext3" force_unmount="yes"/>

Matthias

Former Member
0 Kudos

Hi,

Thanks for the input.

Since the running processes belonged to /usr/sap/<SID> and /sapmnt/<SID>, we enabled forced unmounting for these two file systems only. After that we encountered another problem: while executing the stopsap or startsap command, the file system service (which contains all the mount points) failed and the mount points were unmounted.

We reverted the change and rebooted the primary and secondary nodes. All the mount points then appeared on the primary node again, but when we executed the startsap command, the mount points were unmounted and the file system service failed.

What can be the problem?

I am attaching the cluster configuration file here.

Thanks and Regards,

Moulinath

Edited by: Moulinath Ray on Jul 15, 2011 2:57 PM

[root@piproddbpr ~]# cat /etc/cluster/cluster.conf

<?xml version="1.0"?>

<cluster alias="sappicluster" config_version="228" name="sappicluster">

<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>

<clusternodes>

<clusternode name="db2.example.com" nodeid="1" votes="1">

<fence>

<method name="1">

<device lanplus="" name="db2-rsa"/>

</method>

</fence>

<multicast addr="239.192.212.137"/>

</clusternode>

<clusternode name="db1.example.com" nodeid="2" votes="1">

<fence>

<method name="1">

<device lanplus="" name="db1-rsa"/>

</method>

</fence>

<multicast addr="239.192.212.137"/>

</clusternode>

</clusternodes>

<cman expected_votes="1" two_node="1">

<multicast addr="239.192.212.137"/>

</cman>

<fencedevices>

<fencedevice agent="fence_ipmilan" ipaddr="10.251.248.30" login="USERID" name="db1-rsa" passwd="PASSW0RD"/>

<fencedevice agent="fence_ipmilan" ipaddr="10.251.248.20" login="USERID" name="db2-rsa" passwd="PASSW0RD"/>

</fencedevices>

<rm>

<failoverdomains>

<failoverdomain name="sappidomain1" nofailback="0" ordered="1" restricted="1">

<failoverdomainnode name="db2.example.com" priority="2"/>

<failoverdomainnode name="db1.example.com" priority="1"/>

</failoverdomain>

</failoverdomains>

<resources>

<ip address="10.251.16.70" monitor_link="1"/>

<fs device="/dev/mapper/pivg1-db2event" force_fsck="0" force_unmount="0" fsid="30712" fstype="ext3" mountpoint="/db2/WPX/db2event" name="db2event" self_fence="0"/>

<fs device="/dev/mapper/pivg1-sqllib" force_fsck="0" force_unmount="0" fsid="48598" fstype="ext3" mountpoint="/db2/WPX/sqllib" name="sqllib" self_fence="0"/>

<fs device="/dev/mapper/pivg1-acs" force_fsck="0" force_unmount="0" fsid="24807" fstype="ext3" mountpoint="/db2/WPX/acs" name="acs" self_fence="0"/>

<fs device="/dev/mapper/pivg1-db2dump" force_fsck="0" force_unmount="0" fsid="23558" fstype="ext3" mountpoint="/db2/WPX/db2dump" name="db2dump" self_fence="0"/>

<netfs export="/sapdump" force_unmount="0" fstype="nfs" host="10.251.21.60" mountpoint="/sapdump" name="sapdump" options=""/>

<fs device="/dev/mapper/pivg1-WPX1" force_fsck="0" force_unmount="0" fsid="50548" fstype="ext3" mountpoint="/sapmnt/WPX" name="sapmntWPX" options="" self_fence="0"/>

<fs device="/dev/mapper/pivg1-WPX2" force_fsck="0" force_unmount="0" fsid="63724" fstype="ext3" mountpoint="/usr/sap/WPX" name="usrsapWPX" options="" self_fence="0"/>

<fs device="/dev/mapper/pivg1-db2WPX" force_fsck="0" force_unmount="0" fsid="35307" fstype="ext3" mountpoint="/db2/WPX/db2WPX" name="WPXdb2WPX" self_fence="0"/>

<fs device="/dev/mapper/pivg1-db2WPX1" force_fsck="0" force_unmount="0" fsid="17487" fstype="ext3" mountpoint="/db2/db2WPX" name="db2WPX" self_fence="0"/>

<fs device="/dev/mapper/pivg1-WPX3" force_fsck="0" force_unmount="0" fsid="26037" fstype="ext3" mountpoint="/db2/WPX" name="WPX" self_fence="0"/>

<clusterfs device="/dev/mapper/pivg1-trans" force_unmount="0" fsid="45014" fstype="gfs" mountpoint="/usr/sap/trans" name="usrsaptrans" self_fence="0"/>

<nfsexport name="nfsexport"/>

<nfsclient allow_recover="0" name="nfsclientqa" options="rw" target="10.251.21.66"/>

<nfsclient allow_recover="0" name="nfsclientdev" options="rw" target="10.251.21.60"/>

<fs device="/dev/mapper/pivg1-sapdata1" force_fsck="0" force_unmount="0" fsid="55345" fstype="ext3" mountpoint="/db2/WPX/sapdata1" name="sapdata1" self_fence="0"/>

<fs device="/dev/mapper/pivg1-sapdata2" force_fsck="0" force_unmount="0" fsid="35211" fstype="ext3" mountpoint="/db2/WPX/sapdata2" name="sapdata2" self_fence="0"/>

<fs device="/dev/mapper/pivg1-sapdata3" force_fsck="0" force_unmount="0" fsid="47389" fstype="ext3" mountpoint="/db2/WPX/sapdata3" name="sapdata3" self_fence="0"/>

<fs device="/dev/mapper/pivg1-sapdata4" force_fsck="0" force_unmount="0" fsid="11454" fstype="ext3" mountpoint="/db2/WPX/sapdata4" name="sapdata4" self_fence="0"/>

<fs device="/dev/mapper/pivg1-sapdata5" force_fsck="0" force_unmount="0" fsid="7208" fstype="ext3" mountpoint="/db2/WPX/sapdata5" name="sapdata5" self_fence="0"/>

<fs device="/dev/mapper/pivg1-log_dir" force_fsck="0" force_unmount="0" fsid="21451" fstype="ext3" mountpoint="/db2/WPX/log_dir" name="log_dir" self_fence="0"/>

<fs device="/dev/mapper/pivg1-log_archive" force_fsck="0" force_unmount="0" fsid="52748" fstype="ext3" mountpoint="/db2/WPX/log_archieve" name="log_archieve" self_fence="0"/>

<fs device="/dev/mapper/pivg1-log_retrieve" force_fsck="0" force_unmount="0" fsid="59268" fstype="ext3" mountpoint="/db2/WPX/log_retrieve" name="log_retrieve" self_fence="0"/>

</resources>

<service autostart="1" domain="sappidomain1" exclusive="0" name="ip" recovery="relocate">

<ip ref="10.251.16.70"/>

</service>

<service autostart="1" domain="sappidomain1" name="nfsclient_sapdump" recovery="relocate">

<netfs ref="sapdump"/>

</service>

<service autostart="1" domain="sappidomain1" exclusive="0" name="nfstrans" recovery="relocate">

<clusterfs ref="usrsaptrans">

<nfsexport ref="nfsexport">

<nfsclient ref="nfsclientqa"/>

<nfsclient ref="nfsclientdev"/>

</nfsexport>

</clusterfs>

</service>

<service autostart="1" domain="sappidomain1" exclusive="0" name="filesystem" recovery="relocate">

<fs ref="db2event"/>

<fs ref="sqllib"/>

<fs ref="acs"/>

<fs ref="db2dump"/>

<fs ref="sapmntWPX"/>

<fs ref="usrsapWPX"/>

<fs ref="WPXdb2WPX"/>

<fs ref="db2WPX"/>

<fs ref="WPX"/>

<fs ref="sapdata1"/>

<fs ref="sapdata2"/>

<fs ref="sapdata3"/>

<fs ref="sapdata4"/>

<fs ref="sapdata5"/>

<fs ref="log_dir"/>

<fs ref="log_archieve"/>

<fs ref="log_retrieve"/>

</service>

</rm>

</cluster>

Former Member
0 Kudos

Hi,

Here is the cluster config file

[root@piproddbpr ~]# cat /etc/cluster/cluster.conf

xml removed, it was too long to display correctly

Edited by: Matthias Schlarb on Jul 15, 2011 3:21 PM

Edited by: Matthias Schlarb on Jul 15, 2011 3:24 PM

Former Member
0 Kudos

Hello,

You have only resources of type "fs" in your configured services, and all of the file systems are in one service group. I don't know what kind of approach this is supposed to be. Where do you configure the "SAPInstance" and "SAPDatabase" resource types to be operated by a service? Why do you not split DB, ASCS, CI, etc. into separate services? Where are the virtual IP addresses configured? Please read the above-mentioned white paper on Red Hat Cluster Suite carefully.
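A minimal sketch of the direction this should take (the resource names are from your posted cluster.conf; the SAPInstance/SAPDatabase attributes follow the resource agents described in the Red Hat white paper, and the instance name WPX_DVEBMGS00_<virtual-host> is only an assumption about your CI):

<service autostart="1" domain="sappidomain1" name="svc_WPX_db" recovery="relocate">
  <ip ref="10.251.16.70"/>
  <fs ref="WPX"/>
  <fs ref="sapdata1"/>
  <!-- ...remaining database file systems... -->
  <SAPDatabase SID="WPX" DBTYPE="DB6"/>
</service>

<service autostart="1" domain="sappidomain1" name="svc_WPX_ci" recovery="relocate">
  <fs ref="sapmntWPX"/>
  <fs ref="usrsapWPX"/>
  <SAPInstance InstanceName="WPX_DVEBMGS00_<virtual-host>"/>
</service>

Each service would normally also get its own virtual IP resource, so that DB and CI can fail over independently.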

If you run into trouble, I'll always need the error messages you receive.

Kind regards,

Matthias

Answers (3)

Former Member
0 Kudos

Please read the implementation guides for SAP HA on Linux.

SUSE Linux Enterprise 11:

http://www.novell.com/saptechdocs

([direct link|http://www.novell.com/docrep/2011/04/sap_on_sles11_simple_stack.pdf])

Red Hat Enterprise Linux 5:

http://www.redhat.com/sap

([direct link|http://www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf])

Former Member
0 Kudos

Is your problem solved? If RHCS is in use, are you using the web-based interface Conga? The Cluster Administration PDF on the Red Hat site covers the general cluster configuration, and www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf covers the general SAP HA configuration. Use GFS for any file system that must be readable by both systems at the same time; the cluster storage layer will then manage the locks on its own.
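For a controlled failover test with RHCS, a sketch using the service and node names from the configuration posted above:

# show cluster membership and service status
clustat

# relocate the file system service to the other node and watch whether it starts there
clusvcadm -r filesystem -m db2.example.com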

Former Member
0 Kudos

Hi,

Are you using a VCS cluster, or which cluster service is being used?

Former Member
0 Kudos

You have not given any logs, just an overview of the error, which makes it difficult to pinpoint the issue.

First, have you gone through the installation guide and performed all the prerequisites? Is the cluster correctly configured by the UNIX admin? You have not mentioned the Linux version and flavor, so please also check whether it is supported.

The UNIX admin should go through the cluster log and explain what is going wrong and why the file systems are not being transferred to the other node.
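On RHEL with RHCS, the cluster components log to syslog by default, so a first look could be (a sketch assuming the default logging setup):

# cluster membership and fencing events
grep -iE 'cman|fenced' /var/log/messages

# rgmanager service start/stop/relocation messages
grep -i rgmanager /var/log/messages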