
SMD agents go offline in Agent Administration

Former Member
0 Kudos

Hi,

In my landscape all the SMD agents are directly registered to Solman SPS27.

Each time my Solman is restarted, all the agents in Agent Administration go offline, but they are actually still running at OS level.

After we do a separate restart of the Solman Java stack, the agents come back online in Agent Administration. This behaviour is confusing.

Any idea how we can overcome this issue?

Regards,

Shyam

Accepted Solutions (1)


guilherme_balbinot
Active Participant

Hello Shyam,

Firstly, make sure you are using the latest LM-SERVICE applicable to your SP level.

I helped a customer in the past to solve a similar issue. His problem was that a database backup was being performed. During this backup, the DB was shut down and brought up again afterwards. While this was happening, the agents had problems authenticating.

The underlying reason the agents shut down is that an agent systematically shuts down when it receives an authentication error.

Why? Because there is no way to determine the exact root cause of the failure, so rather than risk locking the user by retrying endlessly, the agent shuts down.

The agents cannot authenticate because the ABAP stack cannot connect to the DB to authenticate the user.
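To illustrate the behaviour described above, here is a small Python sketch of the logic. It is not SAP's actual agent code; the names `ServerUnavailable`, `AuthenticationError` and `connect_loop` are made up for the example. The idea is that the connector keeps retrying while the server is merely unreachable, but gives up on the first authentication error so that repeated failed logons cannot lock the user.

```python
import time


class ServerUnavailable(Exception):
    """Hypothetical: the server did not answer (e.g. a ping timeout)."""


class AuthenticationError(Exception):
    """Hypothetical: the server rejected the logon credentials."""


def connect_loop(connect, max_attempts=5, retry_delay=1.0, sleep=time.sleep):
    """Try to connect; retry on unavailability, stop on authentication errors.

    `connect` is any callable that returns normally on success or raises one
    of the two exceptions above. Returns "online", "disabled" or "offline".
    """
    for _ in range(max_attempts):
        try:
            connect()
            return "online"
        except ServerUnavailable:
            # Safe to retry: no failed logon was counted against the user.
            sleep(retry_delay)
        except AuthenticationError:
            # Root cause unknown (wrong password? backend DB down?).
            # Retrying could lock the user, so stop until someone intervenes.
            return "disabled"
    return "offline"
```

In this picture, an agent that only sees timeouts during the restart stays in the retry branch, while an agent that happens to reach the logon step too early ends up "disabled".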

Now, back to your situation:

I can see a similar scenario in your case. When rebooting the Solution Manager it is common for one stack to start before the other. It is possible that the agents are running into authentication problems and then shutting down. After you restart the Java stack, the agents are able to authenticate.

Another possibility is that the agent status is stored in the database as online. If you shut the system down, the status might get updated to OFFLINE even though the agents are still running. When the agents then try to connect again, they receive an error because the Solution Manager already has a connection ID for the JCO. So we can assume that the restart leads to an inconsistent state in which the agents are not able to communicate.

Please note that the above assumptions are hypothetical, but I'm pretty sure something similar is happening because this SMD Agent + Solution Manager connection/relationship is fragile.

With Solution Manager 7.1 it will be possible to have a P4 connection + SLD connection (double connection) from the SMD Agent to the Solution Manager. This connection will act as a fallback and allow the agents to be controlled in a smarter way, thus avoiding problems like this.

Finally, the workaround:

Go to Agent Administration.

Switch maintenance mode ON before stopping the Solution Manager.

Switch off the maintenance mode when the Solution Manager is back up.

I have searched our knowledge database for a way to do this automatically. Please refer to page 37 of the Diagnostics Agent Troubleshooting Guide, available at

http://service.sap.com/diagnostics -> media library

I'm sure there's a way to incorporate this command into a shell script or a .BAT file.
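A minimal sketch of such a wrapper, written in Python rather than shell. The three commands are placeholders that you have to supply yourself: the actual maintenance-mode command from the guide is not reproduced here, and the function name `restart_with_maintenance` is made up. The sketch also assumes that the restart command only returns once the Solution Manager is fully back up.

```python
import subprocess


def restart_with_maintenance(enable_cmd, restart_cmd, disable_cmd,
                             run=subprocess.run):
    """Switch maintenance mode on, restart, then switch maintenance mode off.

    Each argument is a command given as a list of strings. With check=True,
    subprocess.run raises CalledProcessError on a non-zero exit code, so a
    failed step aborts the sequence.
    """
    # 1. Maintenance mode ON before the Solution Manager goes down.
    run(enable_cmd, check=True)

    # 2. Restart (assumed to block until the system is up again).
    #    If this fails, maintenance mode is deliberately left ON so the
    #    agents do not try to authenticate against a half-started system.
    run(restart_cmd, check=True)

    # 3. Maintenance mode OFF once the Solution Manager is back up.
    run(disable_cmd, check=True)
```

It would be called with something like `restart_with_maintenance(["./mm_on.sh"], ["./restart_solman.sh"], ["./mm_off.sh"])`, where all three script names are hypothetical.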

I hope this solves your issue.

Best regards,

Guilherme Balbinot

Edited by: Guilherme Balbinot on Aug 16, 2011 5:57 PM

Former Member
0 Kudos

Thanks Guilherme for this very good clarification!

I have a similar problem at the moment and can confirm that the agent connections get disabled because the agents are not able to authenticate. I will try to implement the workaround with the telnet command to put the agents into maintenance mode. This should help to solve the issue.

Anyway, what I don't understand is that not all agents get disconnected!

For example:

SMDsystem.log of Agent which is offline:

Aug 20, 2011 1:01:15 AM [Thread[Connector,5,main]] Info Checking server availability...

Aug 20, 2011 1:01:16 AM [Thread[Connector,5,main]] Info Authentication in progress ...

Aug 20, 2011 1:02:07 AM [Thread[Connector,5,main]] Error Failed to connect to SMD server - user: SMD_ADMIN

Aug 20, 2011 1:02:07 AM [Thread[Connector,5,main]] Error Connecting to SMD server ms://jusmp00:8104/P4 failed

com.sap.engine.services.jndi.persistent.exceptions.NoPermissionException: Exception during getInitialContext operation. Wrong security principle/credentials.

Aug 20, 2011 1:02:08 AM [Thread[Connector,5,main]] Warning Access denied occurs during the registration, smd connector has been disabled (waiting new SLD GUID).

Aug 20, 2011 1:43:20 AM [Thread[SLD-Registration,1,main]] Info Next try to push instance data in SLD is Sat Aug 20 13:43:06 AST 2011

SMDsystem.log of Agent which is online:

Aug 20, 2011 1:01:12 AM [Thread[Connector,5,main]] Error Ping failed : com.sap.smd.server.exec.asio.TimeOutError - null

Time out occurred when calling method 'ping' on object after 30001 ms.

possible cause: com.sap.smd.server.util.concurrent.TimeoutException

at com.sap.smd.server.exec.asio.AsioInvocationHandler.invoke(AsioInvocationHandler.java:130)

at $Proxy1.ping(Unknown Source)

at com.sap.smd.agent.connection.SMDConnector$SMDConnectionTask.updateConnectionStatus(SMDConnector.java:230)

at com.sap.smd.agent.connection.SMDConnector$SMDConnectionTask.run(SMDConnector.java:189)

at java.lang.Thread.run(Thread.java:664)

Caused by: com.sap.smd.server.util.concurrent.TimeoutException

at com.sap.smd.server.util.concurrent.FutureResult.timedGet(FutureResult.java:159)

at com.sap.smd.server.exec.asio.AsioInvocationHandler.invoke(AsioInvocationHandler.java:122)

... 4 more

Aug 20, 2011 1:01:12 AM [Thread[Connector,5,main]] Info Checking server availability...

Aug 20, 2011 1:40:27 AM [Thread[Connector,5,main]] Info Authentication in progress ...

Aug 20, 2011 1:40:29 AM [Thread[Connector,5,main]] Info Connection established .

Aug 20, 2011 1:40:30 AM [Thread[Connector,5,main]] Info Registering agent on server ...

So you can see that the second agent waits about 40 minutes and only then starts the authentication process.

Both agents are 7.11 and application version 7.01.7.3.20110311130810.

Any idea about that?
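To check many agents quickly, the two log excerpts above can be told apart with a small script. This is only a sketch: the function name `connector_state` is made up, and the two marker strings are simply copied from the log lines quoted above, so other agent versions may word these messages differently.

```python
# Marker strings taken from the SMDsystem.log excerpts quoted above.
DISABLED_MARKER = "smd connector has been disabled"
CONNECTED_MARKER = "Connection established"


def connector_state(log_lines):
    """Return the connector state implied by the last relevant log line.

    "disabled"  : the agent gave up after an access-denied error
    "connected" : the agent established its connection
    "unknown"   : neither message was found
    """
    state = "unknown"
    for line in log_lines:
        if DISABLED_MARKER in line:
            state = "disabled"
        elif CONNECTED_MARKER in line:
            state = "connected"
    return state
```

For a real log you would pass the lines of the file, for example `connector_state(open(path, encoding="utf-8", errors="replace"))`, where `path` points to the agent's SMDsystem.log.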

Former Member
0 Kudos

Hello all,

I have the same problem: my agents get disconnected, and it happens on both HP-UX ia64 and Linux x86.

It's really the same behaviour as Thomas describes, and even when I test switching maintenance mode on and off, I can lose the connection just like I lost it after some hours. I don't really know why, and I would have to leave maintenance mode on all night because our Solman is backed up every day.

If you find a solution, I would be very happy to hear about it.

guilherme_balbinot
Active Participant
0 Kudos

Hello,

Thank you for the feedback. Based on your logs I can confirm that your case is exactly as I described; it is the same error.

In Solution Manager 7.1 this problem is corrected by using SMD Agent 7.30, which supports both an SLD and a direct connection. This provides a fallback in cases like yours, so the agent won't run into this undesired "asleep" state.

This behavior has been identified by SAP and it is corrected in 7.1.

Note that upgrading the agent to 7.30 on Solution Manager 7.0 EHP1 won't suffice.

For both of your cases it is recommended to switch maintenance mode on and then switch it off again afterwards.

Please share your results here so we can help others!

I intend to put it on the Wiki at

http://wiki.sdn.sap.com/wiki/display/SM

Best regards,

Guilherme Balbinot

Former Member
0 Kudos

Hi Experts,

In our case the SMD agents for the satellite systems are online, as we can see at OS level. But in SMD Agent Administration the agents are shown as offline. Could you please suggest why the status at OS level differs from the status in SMD Agent Administration?

Please advise.

Answers (0)