More Enterprise Manager silliness..

bernie_krause
Participant
0 Kudos

Ok, first the details. Environment is Sun Solaris, Oracle, SolMan 7.1 SP7, EM 9.1, latest SMD agents. EM is a new installation that has been installed and running for several months. Solman has been running for years, on SP7 for a few months now.

We can call up Introscope Webview and log in with standard ID, interface works.  Enterprise Manager starts and runs with no issues.

Problem: for some reason EM disconnected itself from SolMan last week, and now we cannot get Solman to see the running instance of EM again. Managed Systems Config now fails because EM is not seen, so we can't finish configuring some systems. EWA reports are coming in gray because of missing Introscope metrics. We currently have 16 systems configured through Managed Systems Config. All were reporting in quite nicely until noon Friday. No system outages occurred that day; we had restarted Solman 2 days prior to bump the Shared Memory setting up to 300m to avoid short dumps caused by monitoring activity. Xmx and Xms are set to 2048 in the lax file (they were 1024; I increased them to see if the error would go away. It did not).

Stopped/started EM several times, stopped/started SMD agents. Short of rebooting Solman (production system, not easy to get a time slice to do that), I'm not sure where to look any more.

Not really seeing any errors in EM logs.  As far as they're concerned, it's running fine.  There are SOME metrics being reported in to Introscope, not sure why some get there and others not.  EM Self Monitoring screen in Introscope shows 137 agents connected (about right), 2,626 metrics (seems low).  So the agents ARE getting there, but something is still blocking the connection.

Seeing "failed to bind to server socket... Address already in use" in the Introscope log. Checked those ports (8081 and 6001) and nothing else seems to be using them. When I stop EM, the 6001 entry goes away.
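One way to double-check the "Address already in use" message is to try binding the port yourself while EM is stopped. The following is a minimal Python sketch, not anything shipped with Introscope; the function name `can_bind` is made up for illustration, and the port numbers are the ones mentioned in this thread.

```python
import errno
import socket


def can_bind(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                # Some other socket already holds this address/port.
                return False
            raise  # e.g. permission problems: surface them unchanged
        return True


if __name__ == "__main__":
    # Ports taken from the thread; run this on the EM host with EM stopped.
    for port in (6001, 8081):
        print(port, "free" if can_bind(port) else "already in use")
```

If this reports a port as free while EM still logs the bind failure on startup, the conflict is probably between EM's own listeners or configuration rather than with a foreign process, but that is only an inference.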

Also seeing "Error accessing to Enterprise Manager (socketTest) ... java.net.ConnectException: Connection refused" on the app server that EM is running on, but that's a symptom of the real problem and not really useful (to me, anyway).

I've gone through a lot of the posts here already, tried various things, getting close to reboot time.  Any suggestions to try before I have to go that route?

Thanks.
Bernie

Accepted Solutions (1)

bernie_krause
Participant
0 Kudos

Unbelievable.  Today mysteriously port 6001 decides to start working again and EM is getting metrics ...  and of course "no one did anything".. 

400 agents and 355k+ metrics.  Looks like we're back to "normal".  Thanks all for all the suggestions - next time I'll go corral a network guy first. 

bxiv
Active Contributor
0 Kudos

Don't forget to mark your response as the answer, for future viewers.

Answers (1)

Former Member
0 Kudos

Can you attach full log files with errors (from SolMan and EM)?

bernie_krause
Participant
0 Kudos

Zip file attached. The seven10.txt file is the Introscope log from July 10. The other is the app server log. I had to delete the first portion of the trc file to get the upload to accept the size.

Tried telnetting to the server on port 6001 and also got an error: connection refused. Yet nothing seems to be using that port.

bxiv
Active Contributor
0 Kudos

Found this in the defaultTrace.18 zip, is this the SMD agent for SolMan?

#1.5 #00144F879C2400600000864A000016550004E12A3C3D6B59#1373471422376#com.sap.engine.services.httpserver.server.Log##com.sap.engine.services.httpserver.server.Log#J2EE_GUEST#0##6DE757BFE97811E2C55800000018FB1E#6de757bfe97811e2c55800000018fb1e-0#6de757bfe97811e2c55800000018fb1e#SAPEngine_Application_Thread[impl:3]_22##0#0#Info#1#/System/HttpAccess/Access#Plain###10.70.71.161 : POST /GRMGHeartBeat/EntryPoint HTTP/1.1 200 3927 [97] d[100] c[16192256]#

#1.5 #00144F879C240030000083A1000016550004E12A3D2246B9#1373471437375#com.sap.smd.server##com.sap.smd.server#SMD_ADMIN#27498##E382A6F7E97711E2C0E900000018FB1E#e382a6f7e97711e2c0e900000018fb1e-0#e382a6f7e97711e2c0e900000018fb1e#SAPEngine_Application_Thread[impl:3]_1##0#0#Error##Plain###[SMDManager.registerPendingAgent] Receive registration for an already existing entry. Registration REJECTED

Existing entry :
AgentHandleEntry: com.sap.smd.SMDManager$AgentHandleEntry@1c33f25e
           JNDI key         : fssprmap03_SMD_SMDA97@2013.07.10-10.43.01.913
           Server Name      : fssprmap03
           ID               : fssprmap03_SMD_SMDA97
           CanonicalHostName: fssprmap03
           HostName         : fssprmap03
           Address          : 10.70.74.150
           AgentHandleWrap  : com.sap.smd.local.AgentHandleWrapper@74a508f7
Incoming entry:
AgentHandleEntry: com.sap.smd.SMDManager$AgentHandleEntry@7a3c43cd
           JNDI key         : null
           Server Name      : fssprmap03
           ID               : fsssx4050_SMD_SMDA97
           CanonicalHostName: fsssx4050
           HostName         : fsssx4050
           Address          : 10.70.74.142
           AgentHandleWrap  : com.sap.smd.local.AgentHandleWrapper@4a9225d6#

#1.5 #00144F879C240030000083A2000016550004E12A3D224BBD#1373471437376#com.sap.smd.server##com.sap.smd.server#SMD_ADMIN#27498##E382A6F7E97711E2C0E900000018FB1E#e382a6f7e97711e2c0e900000018fb1e-0#e382a6f7e97711e2c0e900000018fb1e#SAPEngine_Application_Thread[impl:3]_1##0#0#Error##Java###Agent Registration failed

[EXCEPTION]

{0}#1#com.sap.smd.server.manager.SMDException: [SMDManager.registerPendingAgent] Receive registration for an already existing entry. Registration REJECTED

Existing entry :
AgentHandleEntry: com.sap.smd.SMDManager$AgentHandleEntry@1c33f25e
           JNDI key         : fssprmap03_SMD_SMDA97@2013.07.10-10.43.01.913
           Server Name      : fssprmap03
           ID               : fssprmap03_SMD_SMDA97
           CanonicalHostName: fssprmap03
           HostName         : fssprmap03
           Address          : 10.70.74.150
           AgentHandleWrap  : com.sap.smd.local.AgentHandleWrapper@74a508f7
Incoming entry:
AgentHandleEntry: com.sap.smd.SMDManager$AgentHandleEntry@7a3c43cd
           JNDI key         : null
           Server Name      : fssprmap03
           ID               : fsssx4050_SMD_SMDA97
           CanonicalHostName: fsssx4050
           HostName         : fsssx4050
           Address          : 10.70.74.142
           AgentHandleWrap  : com.sap.smd.local.AgentHandleWrapper@4a9225d6
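The rejected registration above is easier to read when the two entries are compared field by field: the incoming agent announces Server Name fssprmap03 while its ID and host name say fsssx4050. Below is a rough Python sketch for pulling that out of a trace excerpt. The helper names `parse_entries` and `id_host_mismatch` are hypothetical, and the "ID should start with the Server Name" check is only a heuristic drawn from this one log entry, not a documented SMD rule.

```python
import re

# "key : value" lines as they appear in the registerPendingAgent message.
FIELD = re.compile(r"^\s*([A-Za-z ]+?)\s*:\s*(\S.*?)\s*$")

# Excerpt copied from the defaultTrace entry quoted in this thread.
SAMPLE = r"""
Existing entry :
           JNDI key         : fssprmap03_SMD_SMDA97@2013.07.10-10.43.01.913
           Server Name      : fssprmap03
           ID               : fssprmap03_SMD_SMDA97
           CanonicalHostName: fssprmap03
           Address          : 10.70.74.150
Incoming entry:
           JNDI key         : null
           Server Name      : fssprmap03
           ID               : fsssx4050_SMD_SMDA97
           CanonicalHostName: fsssx4050
           Address          : 10.70.74.142
"""


def parse_entries(text):
    """Split a rejection message into {'existing': {...}, 'incoming': {...}}."""
    entries, current = {}, None
    for line in text.splitlines():
        lowered = line.strip().lower()
        if lowered.startswith("existing entry"):
            current = entries.setdefault("existing", {})
        elif lowered.startswith("incoming entry"):
            current = entries.setdefault("incoming", {})
        elif current is not None:
            match = FIELD.match(line)
            if match:
                current[match.group(1)] = match.group(2)
    return entries


def id_host_mismatch(entry):
    """True when the agent ID does not start with the entry's Server Name."""
    return not entry["ID"].startswith(entry["Server Name"] + "_")


if __name__ == "__main__":
    for name, entry in parse_entries(SAMPLE).items():
        print(name, entry["ID"], "mismatch:", id_host_mismatch(entry))
```

Running the same check over a whole trace file would show whether more than one agent is registering under a server name that does not match its own host.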

As for your seven10 file, the only thing that stands out, from what I saw, is the dashboard errors, which could just mean you have missing files.

Have you tried just restarting the Introscope service on the server?

Does 'telnet IP.add.re.ss 6001' provide any useful information? If successful, it should just show a blank terminal and, after about 10 secs, a connection lost message.

What versions are the LM* components on your Java side currently at?

Have you tried just restarting the Java side?
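For anyone without telnet on the box, the same port check can be scripted. This is a small Python sketch under my own assumptions: the function name `probe` and the 5 second timeout are arbitrary, and the host name in the usage note is a placeholder. It separates "connection refused" (the host answered, but nothing accepted the connection on that port) from a timeout (no answer at all, which often points at something in the network path).

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The target replied with a reset: no listener reachable on this port.
        return "refused"
    except socket.timeout:
        # No reply within the timeout: packets may be dropped on the way.
        return "timeout"
    except OSError as exc:
        # Anything else (name resolution, unreachable network, ...).
        return f"error: {exc}"
```

Usage would look like `probe("em-host.example", 6001)` and `probe("em-host.example", 8081)`, run once from the SolMan host and once on the EM host itself against 127.0.0.1, to see whether the result depends on where the connection comes from.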

Former Member
0 Kudos

Can you check both ports 8081 and 6001 with netstat command?

bernie_krause
Participant
0 Kudos

No, the agent listed there is for a different system.  Someone installed it with the virtual host name instead of VIP name, now it's running on a different host and causing problems..  Grrr...

Netstat 6001 gives "connection refused".  Netstat 8081 connects just fine. 

Have not tried just restarting Java, may give that a shot.

LM Tools 7.02 SP11 (1000.7.02.11.0.20120216212322)

LM Services 7.10 SP7 (1000.7.10.7.1.20121219150800)  as of 02/13/13

bxiv
Active Contributor
0 Kudos

LM Tools has SP13 patch 2

LM Service has SP08

However, knowing that 6001 doesn't work in your system and that someone else installed the EM, at this point it may be in your interest to install the EM again yourself. 6001 is the port the SMD agents use to send data/stats in; 8081 is the interface that is used to view the data.

I can imagine that your system is going to be grumpy if you try to remove the EM. You may have to install a second, temporary EM with different ports, go into SolMan and reassign all of the SMD agents to that installation, then uninstall/reinstall the original EM to fix the 6001 issue, reassign every agent back to it, and finally uninstall the temporary EM setup.