on 01-28-2009 11:10 PM
Version: PI 7.0
Our J2EE Engine fails with the following error, as recorded in the defaultTrace[n].trc file:
FATAL: Caught OutOfMemoryError! Node will exit with exit code 666
java.lang.OutOfMemoryError
Full thread dump ...
After we found that the service "com.sap.aii.af.ms.svc" could not be started, we implemented Step 1 [(a)-(d)] of OSS Note 994433. According to the Note, no messages should then be processed and the J2EE Engine should be restartable. That is not the case: after a complete system shutdown and restart, the error above occurs again and no J2EE services are accessible.
The error appears to point to one of the newer XI Scenarios. However, until we can gain access to Java to remove or de-activate that Scenario, we have no way of confirming this.
We are looking for a way to restart the J2EE Engine, or a way to de-activate the problem XI Scenario through the SAP GUI (if that is even possible).
Any insight or suggestions most welcome!!!
John
OSS Note 994433 follows:
25.01.2009 Page 1 of 5
Note 994433 - XI AF on J2EE Engine terminates with OutOfMemoryError
Note Language: English Version: 5 Validity: Valid from 08.10.2008
Summary
Symptom
Your J2EE Engine terminates with "java.lang.OutOfMemoryError". You are
running SAP XI Adapter Framework. Engine restart fails periodically with
the same error. The error message is written to the cluster node's
std_server<N>.out file. Example:
/usr/sap/<SID>/<Instance>/work/std_server0.out
FATAL: Caught OutOfMemoryError! Node will exit with exit code 666
java.lang.OutOfMemoryError
Full thread dump ...
Note: The "Solution" section of this OSS note is NOT relevant if an
OutOfMemoryError happens sporadically. The "Reasons and Prerequisites"
section explains possible reasons for an OutOfMemoryError when the XI
Adapter Framework is used. It is not applicable to other cases.
More Terms
OutOfMemory Exception, jlaunch
Cause and Prerequisites
You are running SAP XI Adapter Framework and sending large messages through
the system. This can be a polling JDBC, File or any other sender Adapter,
or XI messages coming from the Integration Server (IS). Since the Adapter
Framework's message processing is multithreaded it can happen that messages
enter the system faster than they leave it. This causes a variable memory
load on the system. A 'large' message (>10 MB) can enter the system via
thread A in a moment where the memory consumption is low, but leave the
system in a high memory usage situation via another thread B. Thread A puts
the message into a queue, thread B picks it up from that queue.
The time that passes by between thread A and B depends on the number of
configured threads, number of messages in the queue and the overall system
performance.
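The queueing effect described above can be sketched with a small
discrete-time model. This is only an illustration, not the Adapter
Framework's actual scheduler: the function name, the tick-based timing and
the sample sizes are assumptions made up for the example.

```python
from collections import deque


def backlog_over_time(arrivals_mb, consumed_per_tick):
    """Return the queued payload (MB) left over after each tick.

    arrivals_mb: one list of message sizes per tick (what "thread A" enqueues).
    consumed_per_tick: how many messages "thread B" consumers dequeue per tick.
    Both names are hypothetical; the model ignores processing time per message.
    """
    queue = deque()
    history = []
    for batch in arrivals_mb:
        queue.extend(batch)  # thread A puts messages into the queue
        for _ in range(min(consumed_per_tick, len(queue))):
            queue.popleft()  # thread B picks messages up in arrival order
        history.append(sum(queue))
    return history


# Messages arrive faster than one consumer can drain them, so the backlog
# keeps growing until the arrivals stop.
history = backlog_over_time([[5, 5, 5], [5, 5], [20], []], consumed_per_tick=1)
print(history)
```

In this model the large 20 MB message enters while the backlog is still
moderate, but it is only picked up after the queue has grown further, which
is the situation the text describes for threads A and B.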
Other factors like heap space fragmentation, garbage collection, the
operating system, database performance, Java mappings (triggered by JCo
calls from IS), user interaction via a UI and so on also affect the memory
load in an unpredictable fashion.
This can lead to the following situations:
(a) Messaging System: One asynchronous message in database
Receiver thread A writes one large asynchronous message to the database and
puts it into the receive queue. After that a consumer thread B takes the
message and starts to process it. When thread B starts, the system cannot
allocate enough contiguous memory for that message --> OutOfMemoryError.
(b) Messaging System: Several asynchronous messages in database
Several large messages can arrive, while all receive consumer threads B
still work on other 'small' messages. So two or more large messages could
get into the database/queue (in a row), but be processed in parallel later -->
OutOfMemoryError.
In both cases (a) and (b) the system will restart automatically. Since
large messages were accepted, they reside in the database with a 'for
processing' status. The messaging system service will pick them up during
service startup. This causes the OutOfMemoryError again which might end in
a cyclic restart.
Note: Synchronous messages cannot harm the system like that, because they
are not persisted in the database.
(c) Polling Adapters (synchronous (BestEffort) and asynchronous
(ExactlyOnce/InOrder) messages)
A polling Adapter represents an Adapter that picks up a message from
somewhere (e.g. from file or database - Sender File/JDBC Adapter ) and
passes it to the AF Messaging System. If an OutOfMemoryError happens before
the message has reached Messaging System's persist layer (async) or queue
(sync and async), the Adapter will try to process the message again after
the next restart. This also causes periodic restarts of the J2EE server
(sync and async messages).
(d) Pushing Adapters (sync and async messages)
A pushing Adapter represents an Adapter that gets a message from somewhere
outside XI (e.g. R/3 system - RFC Adapter) and passes it to the Messaging
System. Again: If an OutOfMemoryError happens before the message has
reached Messaging System's persist layer (async) or queue (sync and async),
the sending backend system may send the message again and force periodic
restarts.
Solution
Step 0: Check startup logs.
Open the std_server.out files in the cluster node's work folder:
/usr/sap/<SID>/<Instance>/work/std_server[n].out
The startup order of the AF Services is:
Service com.sap.aii.af.cpa.svc started
Service com.sap.aii.af.security.service started
Service com.sap.aii.af.svc started
Service com.sap.aii.af.ms.svc started
Service com.sap.aii.adapter.marketplace.svc started
Service com.sap.aii.adapter.bc.svc started
Service com.sap.aii.adapter.jms.svc started
Service com.sap.aii.adapter.xi.svc started
Service com.sap.aii.adapter.jdbc.svc started
Service com.sap.aii.adapter.rfc.svc started
Service com.sap.aii.adapter.mail.svc started
Service com.sap.aii.adapter.file.svc started
Service com.sap.aii.af.ispeak.svc started
Check which one was the last successfully started service A and determine
the next service B in the list. Service B causes the OutOfMemory and cyclic
restart and it can be either the Messaging System Service
(com.sap.aii.af.ms.svc) or an Adapter (com.sap.aii.adapter.* and
com.sap.aii.af.ispeak.svc).
If service "com.sap.aii.af.ms.svc" could not be started, this is case
(a)/(b). If this service starts, but one of the Adapter services fails,
this is case (c)/(d). Check this carefully.
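The check in Step 0 can be sketched as a small script. This is a rough
helper under stated assumptions, not an SAP tool: it assumes the text passed
in is the excerpt of std_server[n].out for the most recent startup attempt
only, and that each started service appears as "Service <name> started" as
in the list above. The function names are made up for the example.

```python
import re

# Startup order of the AF services, copied from the list in the note.
AF_STARTUP_ORDER = [
    "com.sap.aii.af.cpa.svc",
    "com.sap.aii.af.security.service",
    "com.sap.aii.af.svc",
    "com.sap.aii.af.ms.svc",
    "com.sap.aii.adapter.marketplace.svc",
    "com.sap.aii.adapter.bc.svc",
    "com.sap.aii.adapter.jms.svc",
    "com.sap.aii.adapter.xi.svc",
    "com.sap.aii.adapter.jdbc.svc",
    "com.sap.aii.adapter.rfc.svc",
    "com.sap.aii.adapter.mail.svc",
    "com.sap.aii.adapter.file.svc",
    "com.sap.aii.af.ispeak.svc",
]

STARTED_LINE = re.compile(r"Service (\S+) started")


def find_service_b(log_excerpt):
    """Return the first AF service in startup order that did not start, or None."""
    started = set(STARTED_LINE.findall(log_excerpt))
    for service in AF_STARTUP_ORDER:
        if service not in started:
            return service
    return None


def classify_case(service_b):
    """Map service B to the case labels used in the note."""
    if service_b is None:
        return None
    if service_b == "com.sap.aii.af.ms.svc":
        return "(a)/(b)"
    if service_b.startswith("com.sap.aii.adapter.") or service_b == "com.sap.aii.af.ispeak.svc":
        return "(c)/(d)"
    return "not covered by the note"


# Hypothetical excerpt: the first three services started, then the node died.
excerpt = "\n".join(f"Service {name} started" for name in AF_STARTUP_ORDER[:3])
service_b = find_service_b(excerpt)
print(service_b, classify_case(service_b))
```

Because std_server[n].out may contain output from several restart attempts,
cut the excerpt to the last attempt before running such a check; otherwise a
service that started in an earlier attempt would be counted as started.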
Step 1 [(a)-(d)]: Set Adapter Framework service startup to 'manual'.
Start the offline config tool from shell:
/usr/sap/<SID>/<Instance>/j2ee/configtool/offlinecfgeditor.(sh/bat)
Browse to "cluster_data/server/<NodeID>/cfg/services" and switch to 'edit'
mode. Open Propertysheet "com.sap.aii.af.ra.ms-runtime" and set
startup-mode = manual. Save and restart the cluster node. Since the service
is disabled no messages are processed.
Since all Adapters have references to the Messaging System Service they
will also be stopped. Your Engine should reboot now. Log on to the Visual
Administrator to check this.
Step 2 [(a)-(d)]: Increase trace level
Case (a) and (b):
Log on to the Visual Administrator and browse to the 'Log Configurator'
service. Go to 'location' "com.sap.aii.af.ra.ms" and set the trace severity
to DEBUG. Propagate this setting to the subtree. Save for all server nodes.
Now you are able to see from the default trace the message number of the
last successfully processed message before the OutOfMemoryError happens.
This will help you to find the subsequent message in the database.
Case (c) and (d):
Log on to the Visual Administrator and browse to the 'Log Configurator'
service. Go to the 'location' of the concerned Adapter and set the trace
severity to DEBUG. This is com.sap.aii.adapter.* (e.g. for File Adapter) or
com.sap.aii.af.service.* (e.g. for RFC Adapter). Save for all server nodes.
This should help you to identify the channel where the message comes from.
If you are not sure at all what causes the OutOfMemoryError, increase the
trace level for the whole Adapter Framework (com.sap.aii.*) to DEBUG.
Note: Be aware that trace level DEBUG slows down the system performance
dramatically. Change it back to ERROR once the trace has been created!
Step 3 [(a),(b)]: Messaging System: Reduce parallelization
If XI/PI version is < XI 3.0 SP19 / PI 7.0 SP11 then
log on to the Visual Administrator and browse to 'SAP XI AF Messaging'
service's properties. In property "messaging.connections" set for
connection 'AFW' Send.maxConsumers=1 and Recv.maxConsumers=1.
If XI/PI version is >= XI 3.0 SP19 / PI 7.0 SP11 then
refer to the new queueing scheme note 1129604 and, using the Visual
Administrator, set the 'SAP XI AF Core' service's property
'messaging.connectionDefinition' values Send.maxConsumers and
Recv.maxConsumers both to 1.
Save and restart the service. Now the asynchronous messages are processed
serially. If this works you are done. Continue with step 5. Otherwise
continue with step 4.
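Why setting maxConsumers to 1 helps in cases (a) and (b) can be shown with a
deliberately simplified model. It assumes consumers pick up messages in
lock-step batches and that each message occupies memory equal to its size
while it is processed; the real Messaging System is not this regular. The
function name and sample sizes are assumptions for illustration only.

```python
def peak_in_flight_mb(sizes_mb, max_consumers):
    """Largest total payload (MB) processed at the same time.

    sizes_mb: queued message sizes in processing order.
    max_consumers: number of consumer threads working in parallel.
    """
    if max_consumers < 1:
        raise ValueError("max_consumers must be at least 1")
    peak = 0
    for start in range(0, len(sizes_mb), max_consumers):
        # Messages in one batch are held in memory together.
        peak = max(peak, sum(sizes_mb[start:start + max_consumers]))
    return peak


queued_mb = [1, 80, 90, 2]  # two large messages in a row, as in case (b)
for consumers in (4, 2, 1):
    print(consumers, peak_in_flight_mb(queued_mb, consumers))
```

With serial processing the peak in this model is the size of the single
largest message, so a message that still does not fit on its own is the
situation that step 4 addresses.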
Step 3 [(c),(d)]: Adapter: Disable polling/pushing
Analyse the trace files and disable the polling/pushing Adapter. Check why
the payload is getting so large, e.g. an overly large payload file for the
File Adapter, a large select statement result set for the JDBC Adapter, and
so on.
Step 4 [(a),(b)]: (only for Messaging System if step 3 fails)
If step 3 doesn't solve the problem you have to change the status of the
message that has to be processed next (by the respective server node) to
'FAIL'. This must be done via an SQL command on database level. The name of
the message table is SAP<SID>DB.XI_AF_MSG.
Example for the respective select statement:
select * from "SAP<SID>DB"."XI_AF_MSG" where SENT_RECV_TIME >
'<your_OutOfMemory_timestamp> - <delta_time>' and NODE = '<your_node>'
Check the result set for MSG_ID, STATUS and SENT_RECV_TIME and analyse the
default trace for the last processed message:
/usr/sap/<SID>/<Inst.>/j2ee/cluster/server<N>/log/defaultTrace.<n>.trc
Change the STATUS of the next message to 'FAIL'. Messages with this status
cannot be processed any longer. If you are unsure which message is
affected, change the status of all messages with SENT_RECV_TIME >
'<your_OutOfMemory_timestamp>' to 'NDLV'. Such messages can only be
restarted manually. After that restart the server node.
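The select-then-update flow of step 4 can be rehearsed against a throwaway
table before touching the real database. The sketch below uses an in-memory
SQLite table as a stand-in. Only the table name and the columns MSG_ID,
STATUS, SENT_RECV_TIME and NODE come from the note; the column types, the
'PENDING' placeholder status, the node names, the timestamps and the helper
name are assumptions. The real table lives in the SAP<SID>DB schema and its
SQL dialect may differ.

```python
import sqlite3


def mark_messages(conn, node, since, new_status):
    """Set STATUS for messages on `node` received after `since`.

    Returns the affected MSG_IDs so they can be checked against the trace.
    """
    rows = conn.execute(
        "SELECT MSG_ID FROM XI_AF_MSG "
        "WHERE SENT_RECV_TIME > ? AND NODE = ? ORDER BY SENT_RECV_TIME",
        (since, node),
    ).fetchall()
    msg_ids = [row[0] for row in rows]
    conn.executemany(
        "UPDATE XI_AF_MSG SET STATUS = ? WHERE MSG_ID = ?",
        [(new_status, msg_id) for msg_id in msg_ids],
    )
    conn.commit()
    return msg_ids


# Stand-in table with hypothetical rows; 'PENDING' is a placeholder status.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE XI_AF_MSG "
    "(MSG_ID TEXT PRIMARY KEY, STATUS TEXT, SENT_RECV_TIME TEXT, NODE TEXT)"
)
conn.executemany(
    "INSERT INTO XI_AF_MSG VALUES (?, ?, ?, ?)",
    [
        ("m1", "PENDING", "2009-01-28 22:00:00", "node0"),
        ("m2", "PENDING", "2009-01-28 22:50:00", "node0"),
        ("m3", "PENDING", "2009-01-28 22:55:00", "node1"),
    ],
)

changed = mark_messages(conn, "node0", "2009-01-28 22:30:00", "NDLV")
print(changed)
```

Running the select first and keeping the list of MSG_IDs makes it possible
to restart exactly those messages manually later on.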
Step 5 [(a)-(d)]: Clean up
Revert the settings from steps 1, 2 and 3.
Note:
If you plan to process large messages (10-100 MB or larger) frequently, we
recommend setting up a decentral Adapter Engine with serial processing (see
step 3 [(a),(b)]).
Header Data
Release Status: Released for Customer
Released on: 13.10.2008 11:55:28
Priority: Recommendations/additional info
Category: Performance
Main Component BC-XI-CON-MSG Messaging System
Additional Components:
BC-XI-CON-AFW J2EE Adapter Framework
The note is not release-dependent.
Related Notes
Number Short Text
1129604 New Queueing Scheme in XI 3.0 SP19 / 7.0 SP11
Hi
Check the solution here
FATAL: Caught OutOfMemoryError! Node will exit with exit code 666 java.lang.OutOfMemoryError: J
https://www.sdn.sap.com/irj/scn/wiki?path=/display/jsts/(Kernel)OOM
Regards
Abhishek
Check this blog /people/markus.kohler/blog/2006/06/07/welcome-to-my-new-netweaver-performance-blog
Thanks!