on 10-01-2013 3:28 AM
Hi Experts,
We are using MDM 7.1, and the MDS server frequently stops abnormally. Our analysis found no load other than a few users working in Data Manager.
Please advise with any suggestions, SAP Notes, or earlier discussions.
Thanks
Sridhar T A
Hi All,
The issue is finally resolved.
As I could not get any detailed information from the MDS, MDIS, and MDSS logs or the audit files, I installed the MDM Info Collector tool and upgraded the Java JRE to version 1.6 in order to run it.
The installation and support guide are in SAP Note 1522125.
I also could not get the core dump file that should have been created when the server stopped, so I checked and set all the parameters mentioned below.
The root cause is still not really known, because either
1) the core dump was not created, or
2) the core dump was removed.
1. As per Note 1541213, checked the file block ulimit settings and set them to unlimited.
$ ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) 1048576
stack(kbytes) 131072
memory(kbytes) unlimited
coredump(blocks) 4194303
2. As per Note 1522123, then checked the file descriptor limit in the nohup.out file.
Starting: /usr/sap/***/***00/exe/mds-r - Fri Oct 4 03:26:36 xxx 2013
-----------------------------
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) 1048576
stack(kbytes) 131072
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) 8192
3. As per Note 172747, checked the tunable parameters maxfiles, maxfiles_lim and nflocks (with SAM).
The values should be 8192, 8192 and 4096 respectively.
4. As per Note 1250193, made sure that there is no limit on the core dump file size.
$ limit
cputime unlimited
filesize unlimited
datasize 1048576 kbytes
stacksize 131072 kbytes
coredumpsize unlimited
descriptors 8192 files
memoryuse unlimited
Then we tried to replicate the issue by purposely bringing down the system. As a result, core files were generated only for MDIS and MDSS.
Generated the debug files using the commands below (a separate output file for each, so that the second command does not overwrite the first):
gettrace `which mdis-r` <core dump filename> > debug_mdis.txt
gettrace `which mdss-r` <core dump filename> > debug_mdss.txt
From those, got the following information:
Program terminated with signal 11, Segmentation fault.
SEGV_ACCERR - Invalid Permissions for object
#0 A2i::A2iSingleLock::Lock ()
As per Note 1541213, also set the following values:
data(kbytes) 1048576
stack(kbytes) 131072
In addition, applied SAP Note 1620182 - MDM Clients leave open connections on Console when closed.
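To see whether descriptor exhaustion comes back after these changes, the accept() failures can be counted in the MDS log. This is only a minimal sketch: the function name is made up for illustration, and it assumes the failures are logged with the same wording as the "accept() failed with error RC:24" lines quoted elsewhere in this thread.

```shell
# Count descriptor-exhaustion failures in an MDS log file.
# $1: path to the log file (pass whichever MDS log you are checking).
# Assumption: the failure is logged with the exact text matched below.
count_fd_exhaustion() {
    # -F matches the text literally, -c prints the number of matching lines
    grep -c -F 'accept() failed with error RC:24' "$1"
}
```

A count that keeps growing after a restart would suggest that connections are still being left open.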
Thanks,
Sridhar T A
Hi Sridhar,
Is this issue resolved? If so, could you please update us on what the reason was?
regards
mohan
Hi Sridhar,
An MDS shutdown can be related to the space available on the DB, so try having a look at that. The number of connections open at one time can also cause problems. Check in the Console under Admin --> Connections whether closed connections are actually getting closed. Sometimes they remain open as active connections and may cause problems.
Thanks and Regards,
Shahid
Hi Ravi / Mohan / Thamiz
Thanks for the replies. Here is some information from the log, captured before we restarted MDS.
recv() returned error 232, "Connection reset by peer" Socket [4117], Remote IP: xx.xxx.x.xx, Remote Port: xxxxx, Local Port: xxxxx
Caught BadConnection exception "OS network layer error occurred during Recv()" while handing an incoming message. <Connection [0x6000000069DA1F10], Socket [UNINITIALIZED], Remote IP: xxxx, Remote Port: xxx, Local Port: xxxx, connection source [], Client IP xxxxxx., Client version 7.1.08.276; connection is currently in Dead state>
accept() failed with error RC:24 ErrMsg: Too many open files
accept() failed with error RC:24 ErrMsg: Too many open files
Failed to open INI file for read: /usr/sap/MP0/MDS00/config/mds.ini
To debug this, I executed the script for the MDM Info Collector tool. I am not sure whether the core dump file for MDS was never created or was removed, and because of this I could not identify anything.
Please advise.
Thanks,
Sridhar T A
Hi Sridhar,
From the output above I cannot see the ulimit file descriptors value. You can usually get it with the command ulimit -n, so please check it.
Please note that the MDS process does not always use the same ulimit values as the <SID>adm session. The effective ulimits are written by MDS during startup into the trace file nohup.out in the MDS work folder. To check the ulimit settings at the most recent startup, search for the last trace line that starts with "Starting". The current MDS ulimits are reported under that line.
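The lookup described above can be sketched in shell. This is a minimal sketch, assuming every startup block in nohup.out begins with a line starting with "Starting" and that the descriptor limit is logged as "nofiles(descriptors) <value>"; the function names are made up for illustration.

```shell
# Print the block that MDS logged at its most recent startup.
# $1: path to nohup.out in the MDS work folder (path supplied by the caller).
# Assumption: each startup block begins with a line starting with "Starting".
last_mds_limits() {
    awk '
        /^Starting/ { block = ""; found = 1 }   # a new startup: drop the older block
        found       { block = block $0 "\n" }   # collect lines of the current block
        END         { printf "%s", block }
    ' "$1"
}

# Print only the file descriptor limit from that block.
# Assumption: the limit is logged as "nofiles(descriptors) <value>".
last_mds_nofiles() {
    last_mds_limits "$1" | awk '$1 ~ /^nofiles/ { print $2 }'
}
```

For example, running last_mds_nofiles against your nohup.out prints the limit MDS actually started with, which can then be compared with the output of ulimit -n in the <SID>adm session.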
HTH,
Tal.
Hi Tal,
Thanks for the quick response. I checked the nohup.out file, and the most recent values (captured during the latest application restart) are:
Starting: /usr/sap/***/***00/exe/mds-r - Fri Oct 4 03:26:36 xxx 2013
-----------------------------
HP-UX vlunx019 B.11.31 U ia64 2660043047 unlimited-user license
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) 1048576
stack(kbytes) 131072
memory(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) 8192
Thanks,
Sridhar T A
Hi Tal,
As I don't have access to SAM, please advise whether I can see the tunable parameters at the server level. I was actually looking for the parms.tx file, but it is not available.
http://internotes.fieb.org.br/help/readme.nsf/848e87c87edd168c8525636a00718f88/432751bc2c05176a85256...
Thanks,
Sridhar T A
Hi Sridhar,
You can use kctune to check the values, for example on my HP server:
bash-4.0# kctune | grep nflocks
nflocks 4096 Default Auto
bash-4.0# kctune | grep maxfiles
maxfiles 8192 8192
maxfiles_lim 8192 8192 Immed
You can also use kctune to change those values, but please consult your IT/BASIS team first, since doing it incorrectly can damage your server.
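To compare all three tunables in one go, something like the following could be used. It is only a sketch: check_tunables is a made-up helper name, the expected values (8192, 8192 and 4096) are the ones quoted in this thread from Note 172747, and it assumes kctune prints the tunable name in the first column and the current value in the second, as in the output above. It only reads values and changes nothing.

```shell
# Read kctune output on stdin and report whether maxfiles, maxfiles_lim
# and nflocks match the values quoted in this thread.
# Assumption: column 1 is the tunable name, column 2 its current value.
check_tunables() {
    awk '
        BEGIN {
            want["maxfiles"]     = 8192
            want["maxfiles_lim"] = 8192
            want["nflocks"]      = 4096
        }
        ($1 in want) {
            status = ($2 == want[$1]) ? "ok" : "differs"
            printf "%s %s (expected %s) %s\n", $1, $2, want[$1], status
        }
    '
}
```

Usage would be: kctune | check_tunables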
Regards,
Tal
Hi Sridhar,
If you still do not find a resolution, I would suggest raising an OSS message and attaching the Info Collector logs, CLIX monitoring logs, and DB and CPU utilization logs for the MDS and MDIS systems. This will help SAP proceed with the analysis quickly.
Please check these things and let us know regarding the same.
Thanks and Regards,
Ankush
Hi Sridhar,
As suggested by Ravi and Mohan, analyzing the logs will help you understand what the issue is. We recently had the same issue, and analysis showed it to be a DB issue, which the DB team has since sorted out. The MDS issue was caused by a space constraint. Normally, the archive frequency should be set so that it does not affect the space allocated for that particular MDS in the underlying DB.
In most cases, restarting the DB helps resolve the issue.
Please post your replies.
Kind Regards,
Thamizharasi N
Hi Sridhar,
Please post the errors and warnings from MDS logs.
Is a port being skipped, or is MDIS getting stopped?
Thanks,
Ravi
Hi Sridhar,
Could you provide more details, such as the number of repositories mounted, any recent upgrade at the MDM or DB level, which server is affected, and during which operation MDS is getting stopped?
This looks more like a server or DB-MDM connectivity issue. Please check the following:
1. Is the DB-MDM connectivity healthy?
2. Check the MDS and OS level logs.
3. Is enough space available for MDS?
4. Restart the DB and MDM servers and observe.
Please update us with your findings.
Regards
mohan kumar