EP server nodes restarting

Former Member

Dear Gurus,

We are running a NW2004s EP 7.0 system on SP21. The kernel is Release 700, SP236.

We were experiencing Java performance problems, and the system was extremely slow to respond.

Initially we had a single server node with a heap size of 1 GB.

To deal with the problem we added two more Java server nodes and assigned them 1280 MB and 1536 MB of heap, respectively.

Since then the system has been running fine, but nodes 2 and 3 restart very frequently.

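Idx | Name       | PID   | State    | Error | Restart
--- | ---------- | ----- | -------- | ----- | -------
0   | dispatcher | 7494  | Running  | 0     | yes
1   | server0    | 7495  | Running  | 0     | yes
2   | server1    | 29075 | Starting | 1     | yes
3   | server2    | 29268 | Starting | 0     | yes
4   | SDM        | 7498  | Running  | 0     | yes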


As far as I know, insufficient heap memory is one reason a node can restart.

To rule out any possible Java parameter issue, I copied all the parameters from node 1 to nodes 2 and 3.

Please find below the node properties of the restarting nodes:

-Djava.security.policy=./java.policy

-Djava.security.egd=file:/dev/urandom

-Dorg.omg.CORBA.ORBClass=com.sap.engine.system.ORBProxy

-Dorg.omg.CORBA.ORBSingletonClass=com.sap.engine.system.ORBSingletonProxy

-Dorg.omg.PortableInterceptor.ORBInitializerClass.com.sap.engine.services.ts.jts.ots.PortableInterceptor.JTSInitializer

-Djavax.rmi.CORBA.PortableRemoteObjectClass=com.sap.engine.system.PortableRemoteObjectProxy

-Djco.jarm=1

-XX:MaxPermSize=512M

-XX:PermSize=512M

-Xmx1536m

-Xms1536M

-XX:NewSize=307M

-XX:MaxNewSize=307M

-XX:+DisableExplicitGC

-verbose:gc

-Xloggc:GC.log

-XX:+PrintGCDetails

-XX:+PrintGCTimeStamps

-Djava.awt.headless=true

-Dsun.io.useCanonCaches=false

-XX:SoftRefLRUPolicyMSPerMB=1

-XX:SurvivorRatio=2

-XX:TargetSurvivorRatio=90

-XX:+UseParNewGC

-XX:+PrintClassHistogram

-XX:ReservedCodeCacheSize=64M

-XX:CodeCacheMinimumFreeSpace=2M

*********************************************
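
For a rough sense of how much memory one node with the settings above can reserve, here is a minimal Python sketch that adds up the maximum heap, the permanent generation, and the code cache from the flag list. It is only a lower bound, on the assumption that these three areas dominate; thread stacks and other native memory are not counted, and the helper names `size_mb` and `node_footprint_mb` are hypothetical.

```python
import re

# Memory-related flags copied from the node properties listed above.
FLAGS = [
    "-XX:MaxPermSize=512M",
    "-XX:PermSize=512M",
    "-Xmx1536m",
    "-Xms1536M",
    "-XX:NewSize=307M",
    "-XX:MaxNewSize=307M",
    "-XX:ReservedCodeCacheSize=64M",
]

# Multipliers to convert each JVM size suffix to megabytes.
UNIT_MB = {"k": 1 / 1024, "m": 1, "g": 1024}


def size_mb(text):
    """Convert a JVM size such as '1536m' or '64M' to megabytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG])", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * UNIT_MB[match.group(2).lower()]


def node_footprint_mb(flags):
    """Sum max heap, max perm gen and reserved code cache (hypothetical helper)."""
    total = 0
    for flag in flags:
        if flag.startswith("-Xmx"):
            total += size_mb(flag[len("-Xmx"):])
        elif flag.startswith(("-XX:MaxPermSize=", "-XX:ReservedCodeCacheSize=")):
            total += size_mb(flag.split("=", 1)[1])
    return total


print(node_footprint_mb(FLAGS))  # 1536 + 512 + 64 = 2112
```

With these flags a single node can reserve roughly 2.1 GB, so the real requirement per node is noticeably higher than the -Xmx value alone.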

I cannot find any useful entries in the thread dumps, the trace files, or the default logs.

The restarts happen every half hour.

Please advise.

Regards

Vishal

Accepted Solutions (0)

Answers (1)


sunny_pahuja2
Active Contributor

Hi.

Does the server have enough resources to support two more server nodes? The number of server nodes on a host can be estimated with the formula below:

#ServerNodes = (AvailableMemory / 1.5 GB)
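
As a quick worked example of the formula above, here is a minimal Python sketch. The function name and the sample memory figures are illustrative assumptions, not values from this system; the result is rounded down because a fractional node cannot be started.

```python
import math

GB_PER_NODE = 1.5  # divisor from the formula above


def max_server_nodes(available_memory_gb):
    """Return the whole number of server nodes that fit (hypothetical helper)."""
    if available_memory_gb < 0:
        raise ValueError("available memory must not be negative")
    return math.floor(available_memory_gb / GB_PER_NODE)


# Illustrative figures only; substitute the memory actually available for Java.
for memory_gb in (4, 6, 8):
    print(memory_gb, "GB ->", max_server_nodes(memory_gb), "node(s)")
```

For instance, 4 GB of available memory gives 2 nodes by this rule, not 3.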

Please also check whether you have set all parameters in accordance with SAP note 723909.

Thanks

Sunny

Former Member

Thanks, Sunny. The system has free memory and CPU. I had already checked this and kept in mind the prerequisites for adding additional server nodes.

This morning I found one of the server nodes stopped.

In the std_server2.out file I found the following log entries. Please check whether they make any sense.

stdout/stderr redirect

-


node name : server2

pid : 7354

system name : WEP

system nr. : 00

started at : Wed Feb 23 10:38:02 2011

[Thr 1] Wed Feb 23 10:38:03 2011

[Thr 1] MtxInit: 10003 0 0

SAP J2EE Engine Version 7.00 PatchLevel 76340.450 is starting...

Loading: LogManager ... 837 ms.

Loading: PoolManager ... 4 ms.

Loading: ApplicationThreadManager ... 121 ms.

Loading: ThreadManager ... 38 ms.

Loading: IpVerificationManager ... 15 ms.

Loading: ClassLoaderManager ... 17 ms.

Loading: ClusterManager ... [Framework -> criticalShutdown] Exiting Listener Loop. This requires a restart of the node. Possible reason is an interrupted reconnect session to the message server.

Feb 23, 2011 10:38:09... com.sap.engine.core.Framework [SAP J2EE Engine|MS Socket Listener] Fatal: Critical shutdown was invoked. Reason is: Exiting Listener Loop. This requires a restart of the node. Possible reason is an interrupted reconnect session to the message server.

Feb 23, 2011 10:38:09... ...anagerImpl.init(java.util.Properties) [Thread[Thread-1,5,main]] Fatal: Cannot attach to the Message Server. cluster Id is not unique.

Loading: ClusterManager returned false!

Kernel not loaded. System halted.
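
When a node halts during startup like this, the decisive lines are the ones marked `Fatal:`. The following is a minimal Python sketch for pulling them out of a std_server*.out file; the function name `fatal_messages` is hypothetical, and the sample text is an abridged copy of the log above.

```python
def fatal_messages(log_text):
    """Return the text after 'Fatal:' for every line that contains that marker."""
    marker = "Fatal:"
    messages = []
    for line in log_text.splitlines():
        if marker in line:
            messages.append(line.split(marker, 1)[1].strip())
    return messages


# Abridged copy of the std_server2.out excerpt above.
SAMPLE = """\
Loading: ClassLoaderManager ... 17 ms.
Feb 23, 2011 10:38:09... com.sap.engine.core.Framework [SAP J2EE Engine|MS Socket Listener] Fatal: Critical shutdown was invoked. Reason is: Exiting Listener Loop.
Feb 23, 2011 10:38:09... ...anagerImpl.init(java.util.Properties) [Thread[Thread-1,5,main]] Fatal: Cannot attach to the Message Server. cluster Id is not unique.
Loading: ClusterManager returned false!
"""

for message in fatal_messages(SAMPLE):
    print(message)
```

To scan a real file, read it first and pass its contents to the function, for example `fatal_messages(open("std_server2.out").read())`, assuming the file is in the current directory.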

Regards,

Vishal

sunny_pahuja2
Active Contributor

Hi,

> Loop. This requires a restart of the node. Possible reason is an interrupted reconnect session to the message server.

> Feb 23, 2011 10:38:09... ...anagerImpl.init(java.util.Properties) [Thread[Thread-1,5,main]] Fatal: Cannot attach to the Message Server. cluster Id is not unique.

>

> Loading: ClusterManager returned false!

> Kernel not loaded. System halted.

>

According to the log, the cluster ID of this node is not unique. Please check the link below:

Thanks

Sunny