
AS Java HA Windows failover test

Former Member

Hi everyone,

I have a small question about an HA failover test scenario.

I have installed a NW 7.5 Java stack on Windows and configured two nodes for HA; each node has a Java instance. When I kill the msg_server.exe process, I expect the message server to be restarted automatically, and it is. After that restart, however, both Java instances go into a yellow state and only come back to green after about 3 minutes.

When we switch the SCS over from one node to the other manually, the Java instances stay green the whole time.

So may I ask: is it normal for the Java instances to restart after we kill the msg_server process?

Here are some logs from the work directory:

########################################

Dev_datcol

F  ********************************************************************************

F  Process datcol started with pid 22040

F  ********************************************************************************

F  [Thr 22388] *** LOG => Process datcol started (pid 22040).

F [Thr 22388] Fri Aug 12 15:27:27 2016

F  [Thr 22388] *** LOG => Process datcol stopping (pid 22040).

F [Thr 20752] Fri Aug 12 15:27:27 2016

F  [Thr 20752] *** LOG => Signal 13 SIGCHLD.

F  [Thr 22388] *** LOG => Process datcol stopped (pid 22040).

F  [Thr 22388] *** LOG => exiting (exitcode 0, retcode 0).

Jvm_datcol

Aug 12, 2016 3:27:27 PM com.sap.engine.datcol.Task error

SEVERE: while trying to get the length of a null array loaded from a local variable at slot 6

java.lang.NullPointerException: while trying to get the length of a null array loaded from a local variable at slot 6

at com.sap.engine.datcol.internal.Scanner.scanPattern(Scanner.java:67)

at com.sap.engine.datcol.internal.Scanner.scan(Scanner.java:31)

at com.sap.engine.datcol.tasks.Copy.execute(Copy.java:50)

at com.sap.engine.datcol.Task.perform(Task.java:96)

at com.sap.engine.datcol.internal.DataSet.execute(DataSet.java:28)

at com.sap.engine.datcol.internal.DataCollectorApp.run(DataCollectorApp.java:191)

at com.sap.engine.datcol.internal.DataCollectorApp.main(DataCollectorApp.java:31)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:497)

at com.sap.engine.offline.OfflineToolStart.main(OfflineToolStart.java:162)

Dev_icm

[Thr 11044] Fri Aug 12 15:26:54 2016

[Thr 11044] JNCMReconnectAsync: successfully reconnected to message server. Waiting 30 sec for consistent cluster.

[Thr 11044] Fri Aug 12 15:27:15 2016

[Thr 11044] *** WARNING => P4RecvHandShake: read failed: NIECONN_BROKEN(-6) [p4_plg.c     3158]

[Thr 11044] *** WARNING => P4PlugInReadHandler(id=6/16781): P4RecvHandShake failed: Network error (NI)(-8) [p4_plg.c     1245]

[Thr 11044] *** WARNING => P4RecvHandShake: read failed: NIECONN_BROKEN(-6) [p4_plg.c     3158]

[Thr 11044] *** WARNING => P4PlugInReadHandler(id=1/16780): P4RecvHandShake failed: Network error (NI)(-8) [p4_plg.c     1245]

[Thr 11044] Fri Aug 12 15:27:29 2016

[Thr 11044] JNCMReconnectAsync: inconsistent cluster reconnect [2 nodes are still not connected]

[Thr 11044] JNCMIReconnectMerge: can't find node [17181751] in reconnect list -> element loss

[Thr 11044] JNCMIReconnectMerge: can't find node [17181750] in reconnect list -> element loss

[Thr 11044] JNCMIHttpMsPutLogon: set http logon port (port:50100) (lbcount: 2)

[Thr 11044] JNCMIHttpMsPutLogon: set https logon port (port:0) (lbcount: 2)

[Thr 11044] JNCMIP4MsPutLogon: set p4 logon port (port:50104) (lbcount: 2)

[Thr 11044] JNCMIIIOPMsPutLogon: set iiop logon port (port:50107) (lbcount: 2)

[Thr 11044] JNCMITelnetMsPutLogon: set telnet logon port (port:50108) (lbcount: 2)

[Thr 11044] JNCMIHttpMsPutLogon: set http logon port (port:50100) (lbcount: 1)

[Thr 11044] JNCMIHttpMsPutLogon: set https logon port (port:0) (lbcount: 1)

[Thr 11044] Fri Aug 12 15:36:12 2016

[Thr 11044] *** ERROR => can't delete node [cluster id:24168720] [jncmxx.c     2268]
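To see how long the cluster took to settle, the timestamp lines in a trace like the one above can be turned into durations. This is only a rough sketch: the function name `trace_timestamps` is made up for illustration, and it assumes the timestamps always have the `Fri Aug 12 15:26:54 2016` form shown in the excerpt (English day and month names).

```python
import re
from datetime import datetime

# Matches a trailing timestamp such as "Fri Aug 12 15:26:54 2016".
STAMP = re.compile(
    r"([A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s*$"
)

def trace_timestamps(text):
    """Return every timestamp found at the end of a trace line, in file order."""
    stamps = []
    for line in text.splitlines():
        match = STAMP.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y"))
    return stamps

sample = """[Thr 11044] Fri Aug 12 15:26:54 2016
[Thr 11044] JNCMReconnectAsync: successfully reconnected to message server. Waiting 30 sec for consistent cluster.
[Thr 11044] Fri Aug 12 15:27:29 2016
[Thr 11044] JNCMReconnectAsync: inconsistent cluster reconnect [2 nodes are still not connected]
"""

stamps = trace_timestamps(sample)
elapsed = (stamps[-1] - stamps[0]).total_seconds()
print(elapsed)  # seconds between the reconnect and the "inconsistent cluster" message
```

Running the same function over the full dev_icm and dev_server0 files would give a timeline to compare against the roughly 3 minutes of yellow state.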

Dev_server0

J Fri Aug 12 15:26:51 2016

J  Heap

J   par new generation   reserved 1397760K, committed 1397760K, used 405765K [0x00000006f0000000, 0x0000000745500000, 0x0000000745500000)

J    eden space 1048320K,  22% used [0x00000006f0000000, 0x00000006fe1a1568, 0x000000072ffc0000)

J    from space 174720K, 100% used [0x000000073aa60000, 0x0000000745500000, 0x0000000745500000)

J    to   space 174720K,   0% used [0x000000072ffc0000, 0x000000072ffc0000, 0x000000073aa60000)

J   concurrent mark-sweep generation reserved 2796544K, committed 2796544K, used 372127K [0x0000000745500000, 0x00000007f0000000, 0x00000007f0000000)

J   Metaspace       used 323690K, capacity 359164K, committed 359808K, reserved 575488K

J    class space    used 38446K, capacity 47840K, committed 48124K, reserved 262144K

F [Thr 20432] Fri Aug 12 15:26:52 2016

F  [Thr 20432] *** LOG => SfCJavaVm: exit hook is called. (rc = 11114)

F  ********************************************************************************

F  *** ERROR => Java node 'server0' terminated with exit code 11114.

F  ***

F  *** Please see section 'Java program exit codes'

F  *** in SAP Note 1316652 for additional information and trouble shooting advice.

F  ********************************************************************************

F  [Thr 20432] *** LOG => exiting (exitcode 11114, retcode 1).

M  [Thr 20432] CCMS: CCMS Monitoring Cleanup finished successfully.
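The line that matters most in the dev_server0 excerpt above is the exit-code message. As a small sketch (the helper name `find_node_exits` is hypothetical, and the pattern is based only on the wording of the line shown here), such lines could be pulled out of the developer traces of both instances like this:

```python
import re

# Pattern taken from the trace line quoted above:
# "*** ERROR => Java node 'server0' terminated with exit code 11114."
EXIT_LINE = re.compile(r"Java node '([^']+)' terminated with exit code (-?\d+)")

def find_node_exits(trace_text):
    """Return (node name, exit code) pairs for every terminated Java node."""
    return [(m.group(1), int(m.group(2))) for m in EXIT_LINE.finditer(trace_text)]

sample = """F  [Thr 20432] *** LOG => SfCJavaVm: exit hook is called. (rc = 11114)
F  *** ERROR => Java node 'server0' terminated with exit code 11114.
F  [Thr 20432] *** LOG => exiting (exitcode 11114, retcode 1).
"""

exits = find_node_exits(sample)
print(exits)  # [('server0', 11114)]
```

The exit code found this way can then be looked up in the 'Java program exit codes' section of SAP Note 1316652, as the trace itself suggests.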

Accepted Solutions (0)

Answers (1)


Sriram2009
Active Contributor

Hi Minas.


So may I ask: is it normal for the Java instances to restart after we kill the msg_server process?

No, it will not restart in a Windows failover cluster environment. You can perform a manual failover.

You can use both nodes: one node holds the SAP Java group and the other holds the database.

Refer to the HA FAQ.

BR

SS

Former Member

Yes, normally it will not restart during a manual or automatic failover. However, the following Java parameter controls whether the Java instance restarts when the enqueue server is restarted on one node. When msg_server.exe is killed, MSCS tries to restart the whole SCS instance. The SCS, which includes the enqueue server, is therefore restarted, and the Java instances restart as a result.

enqu.check.restart

Check if the enqueue server restart has taken place. This flag indicates if action should be taken if the enqueue server is restarted while the Application Server instances are running.

The value true means that the check is enabled; false means that the check is disabled.
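Put as a decision rule, the description above amounts to the following. This is an illustrative sketch only: the function names are hypothetical, and it assumes, as described earlier in this post, that the "action" is a restart of the Java instance. It is not SAP code.

```python
def parse_flag(value: str) -> bool:
    """Interpret the parameter value, which is documented as true or false."""
    normalized = value.strip().lower()
    if normalized not in ("true", "false"):
        raise ValueError(f"unexpected value for enqu.check.restart: {value!r}")
    return normalized == "true"

def instance_reacts_to_enqueue_restart(enqu_check_restart: str,
                                       enqueue_restarted: bool) -> bool:
    """Assumed rule: react only if the check is enabled AND a restart occurred."""
    return parse_flag(enqu_check_restart) and enqueue_restarted

# Killing msg_server.exe made MSCS restart the whole SCS, enqueue server included:
print(instance_reacts_to_enqueue_restart("true", True))
# With the check disabled, the same event would be ignored:
print(instance_reacts_to_enqueue_restart("false", True))
```

Under this reading, a manual switchover keeps the instances green because they never detect a restarted enqueue server that has lost its state.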

BR.

Minas

Sriram2009
Active Contributor

Hi Minas.

Yes, the ERS holds all the SAP table locks and is normally active on both nodes in Windows MSCS. If you restart the ERS, it will restart the message server and SCS instances.

If you want to test the Windows failover cluster, you can unplug the MSCS cluster network cable (do not remove the cluster heartbeat card) from the active node that currently holds the SAP group. All cluster resources will then fail over to the other node.

BR

SS