
SAP MDS is stopping abnormally

Former Member
0 Kudos

Hi Experts,

We are using MDM 7.1, and the MDS server frequently stops abnormally. Our analysis found no load on the system other than a few users working in Data Manager.

Please advise with any suggestions, SAP Notes, or earlier discussions.

Thanks

Sridhar T A

Accepted Solutions (1)


Former Member
0 Kudos

Hi All,

The issue is finally resolved.

As I was not able to get any detailed information from the MDS, MDIS, and MDSS logs or the audit files, I installed the MDM Info Collector tool and upgraded the Java JRE to version 1.6 to run it.

The installation and support guide are in SAP Note 1522125.

I was also not able to find the core dump file that should have been created when the server stopped, so I checked and set all the parameters listed below.

I do not know why the core dump is missing. Either:

1) the core dump was not created, or

2) the core dump was removed.

1. As per SAP Note 1541213, checked the file(blocks) ulimit setting and set it to unlimited.

$ ulimit -a

time(seconds)        unlimited

file(blocks)         unlimited

data(kbytes)         1048576

stack(kbytes)        131072

memory(kbytes)       unlimited

coredump(blocks)     4194303

2. As per SAP Note 1522123, then checked the file descriptor limit in the nohup.out file:

Starting: /usr/sap/***/***00/exe/mds-r - Fri Oct  4 03:26:36 xxx 2013

-----------------------------

time(seconds)        unlimited

file(blocks)         unlimited

data(kbytes)         1048576

stack(kbytes)        131072

memory(kbytes)       unlimited

coredump(blocks)     unlimited

nofiles(descriptors) 8192

3. As per SAP Note 172747, checked the tunable parameters maxfiles, maxfiles_lim, and nflocks (with SAM).

The values should be: 8192, 8192 and 4096 respectively.

4. As per SAP Note 1250193, made sure there is no limit on the core dump file size:

$ limit

cputime         unlimited

filesize        unlimited

datasize        1048576 kbytes

stacksize       131072 kbytes

coredumpsize    unlimited

descriptors     8192 files

memoryuse       unlimited
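
For sh, ksh, or bash sessions (the `limit` command above is the csh form), the core dump limit can be read with `ulimit -c`. The following is only a sketch: the function name `describe_core_limit` is made up for illustration and is not part of any SAP tool, and the size of a "block" depends on the shell.

```shell
# Hypothetical helper: turn the value printed by `ulimit -c` into a readable message.
# The block size behind the number depends on the shell, so only the raw value is echoed.
describe_core_limit() {
    case "$1" in
        unlimited) echo "core dumps: unlimited" ;;
        0)         echo "core dumps: disabled" ;;
        *)         echo "core dumps: capped at $1 blocks" ;;
    esac
}

# Usage in the session that starts the server:
#   describe_core_limit "$(ulimit -c)"
```

A capped value such as 4194303 can truncate the core file of a large process, which is one possible reason a usable dump never appears.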

Then we tried to replicate the issue by purposely bringing down the system. As a result, core files were generated only for MDIS and MDSS.

Generated the debug files using the commands below:

gettrace `which mdis-r` <core dump filename> > debug.txt

gettrace `which mdss-r` <core dump filename> > debug.txt

From that, got the following information:

Program terminated with signal 11, Segmentation fault.

SEGV_ACCERR - Invalid Permissions for object

#0 A2i::A2iSingleLock::Lock ()

And as per SAP Note 1541213, changed the following too:

data(kbytes) 1048576

stack(kbytes) 131072

In addition, applied SAP Note 1620182 - MDM Clients leave open connections on Console when closed.

Thanks,

Sridhar T A

Former Member
0 Kudos

Thanks for the detailed solution; it should be helpful to others at some point.

Regards,

Shahid Noolvi

Answers (6)


0 Kudos

Hi Sridhar,

Is this issue resolved? If so, could you please update us on what the reason was?

regards

mohan

Former Member
0 Kudos

Hi Sridhar,

An MDS shutdown can be related to space on the DB; try having a look at that. The number of simultaneous connections can also cause problems. Check whether closed connections are actually being closed in the Console under Admin --> Connections. Sometimes they remain open as active connections and may cause problems.

Thanks and Regards,

Shahid

Former Member
0 Kudos

Hi Ravi / Mohan / Thamiz

Thanks for the replies. Here is some information from the log, captured before we restarted MDS.

recv() returned error 232, "Connection reset by peer" Socket [4117], Remote IP: xx.xxx.x.xx, Remote Port: xxxxx, Local Port: xxxxx

Caught BadConnection exception "OS network layer error occurred during Recv()" while handing an incoming message. <Connection [0x6000000069DA1F10], Socket [UNINITIALIZED], Remote IP: xxxx, Remote Port: xxx, Local Port: xxxx, connection source [], Client IP xxxxxx., Client version 7.1.08.276; connection is currently in Dead state>

accept() failed with error  RC:24 ErrMsg: Too many open files

accept() failed with error  RC:24 ErrMsg: Too many open files

Failed to open INI file for read: /usr/sap/MP0/MDS00/config/mds.ini
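
To see how often the descriptor limit was hit, the matching log lines can be counted. This is a minimal sketch that assumes the MDS log is a plain-text file; the function name `count_fd_errors` and the log path in the usage comment are placeholders, not names from SAP tools.

```shell
# Hypothetical helper: count log lines that report descriptor exhaustion.
# $1 is the path to a plain-text log file (an assumption; adjust to your landscape).
count_fd_errors() {
    grep -c 'Too many open files' "$1"
}

# Usage (placeholder path):
#   count_fd_errors /path/to/mds.log
```

A count that keeps rising between restarts suggests that descriptors are being leaked rather than briefly exhausted by a load spike.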

To debug this, I ran the script for the MDM Info Collector tool. I am not sure whether the core dump file for MDS was never created or was removed, so I could not identify anything.

Please advise.

Thanks,

Sridhar T A

Former Member
0 Kudos

Hi Sridhar,

It looks like you are facing issues with file handling. Please check that the file descriptor ulimit setting is set correctly on your UNIX/Linux machine by following SAP Note 1522123.

HTH,

Tal.

Former Member
0 Kudos

Hi Tal,

I guess the parameters are set correctly.

$ ulimit -a

time(seconds)        unlimited

file(blocks)         unlimited

data(kbytes)         1048576

stack(kbytes)        131072

memory(kbytes)       unlimited

coredump(blocks)     4194303

Thanks,

Sridhar T A

Former Member
0 Kudos

Hi Sridhar,

From the output above I cannot see the file descriptor ulimit value. You can usually get it with the command ulimit -n; please check it.

Please note that the MDS process does not always use the same ulimit values as the <SID>adm session. The effective ulimits are written by MDS during startup into the trace file nohup.out in the MDS work folder. To check the ulimit settings at the most recent startup, search for the last trace line that starts with "Starting". The current MDS ulimits are reported under that line.
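
The search for the last "Starting" line can be sketched with awk. The function name `last_start_block` is made up for illustration; it assumes only that each startup section of nohup.out begins with a line starting with "Starting", as described above.

```shell
# Hypothetical helper: print everything from the last line that begins with
# "Starting" to the end of the given file. Prints nothing if no such line exists.
last_start_block() {
    awk '
        /^Starting/ { buf = ""; seen = 1 }   # a new startup section begins: drop older text
        seen        { buf = buf $0 "\n" }    # collect lines of the current section
        END         { printf "%s", buf }
    ' "$1"
}

# Usage, run from the MDS work folder:
#   last_start_block nohup.out | grep nofiles
```

Comparing that nofiles value with the output of ulimit -n in the <SID>adm session shows whether the running process actually inherited the intended limit.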

HTH,

Tal.

Former Member
0 Kudos

Hi Tal,

Thanks for the quick response. I checked the nohup.out file, and the most recent values (captured during the latest application restart) are:

Starting: /usr/sap/***/***00/exe/mds-r - Fri Oct  4 03:26:36 xxx 2013

-----------------------------

HP-UX vlunx019 B.11.31 U ia64 2660043047 unlimited-user license

time(seconds)        unlimited

file(blocks)         unlimited

data(kbytes)         1048576

stack(kbytes)        131072

memory(kbytes)       unlimited

coredump(blocks)     unlimited

nofiles(descriptors) 8192

Thanks,

Sridhar T A

Former Member
0 Kudos

Hi Sridhar,

Can you check the tunable parameters maxfiles, maxfiles_lim, and nflocks (with SAM)?

The values should be: 8192, 8192 and 4096 respectively.

Thanks,

Tal.

Former Member
0 Kudos

Hi Tal,

As I don't have access to SAM, please advise whether I can see the tunable parameters at the server level. I was looking for the parms.tx file, but it is not available.

http://internotes.fieb.org.br/help/readme.nsf/848e87c87edd168c8525636a00718f88/432751bc2c05176a85256...

Thanks,

Sridhar T A

Former Member
0 Kudos

Hi Sridhar,

You can use kctune to check the values, for example on my HP server:

bash-4.0# kctune | grep nflocks

nflocks                             4096  Default      Auto

bash-4.0# kctune | grep maxfiles

maxfiles                            8192  8192

maxfiles_lim                        8192  8192         Immed

You can also use kctune to change those values, but please consult your IT/BASIS team first, since doing it incorrectly can damage your server.
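
The comparison against the recommended values can be scripted as a read-only check. This is a sketch under assumptions: the function name `check_tunable` is made up, it expects kctune-style lines on standard input with the tunable name in the first column and its value in the second (as in the output above), and it treats the recommended values as minimums.

```shell
# Hypothetical helper: read kctune-style output on stdin and verify one tunable.
# $1 = tunable name, $2 = recommended value (treated as a minimum, an assumption).
# Exit status is 0 if the tunable is present and not below $2, otherwise 1.
check_tunable() {
    awk -v name="$1" -v want="$2" '
        $1 == name {
            found = 1
            if ($2 + 0 < want + 0) {
                printf "%s is %s, expected at least %s\n", name, $2, want
                bad = 1
            }
        }
        END {
            if (!found) {
                printf "%s not found in input\n", name
                bad = 1
            }
            exit bad + 0
        }
    '
}

# Usage on the HP-UX host (read-only, changes nothing):
#   kctune | check_tunable maxfiles 8192
#   kctune | check_tunable maxfiles_lim 8192
#   kctune | check_tunable nflocks 4096
```

The name is matched exactly, so checking maxfiles does not accidentally pick up the maxfiles_lim line.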

Regards,

Tal

Former Member
0 Kudos

Hi Sridhar,

  • From the logs, you can see that you are facing a socket error.
  • It generally happens when the number of hits on the MDM system exceeds the maximum number of concurrent sessions across all sources, such as client connections, Java API connections, or other web service calls.
  • Maybe your system has a large number of Java API calls whose connections are not closed or terminated after processing. These open connections keep accumulating, cause memory issues, and ultimately lead to a server breakdown.
  • So I would suggest you check this as well and see whether it is the issue.

If you still do not find a resolution, I would suggest raising an OSS message and attaching the Info Collector logs, CLIX monitoring logs, and DB and CPU utilization logs for the MDS and MDIS systems. This will help SAP proceed with the analysis quickly.

Please check these things and let us know.

Thanks and Regards,

Ankush

shanthi_kumar
Active Participant
0 Kudos

Hi Sridhar,

As suggested by Ravi and Mohan, analyzing the logs will help you understand what the issue is. We recently had the same issue, and analysis showed it to be a DB issue, which was sorted out by the DB team. The MDS issue was caused by a space constraint. Normally, the archiving frequency for the Archive folder should be set so that it does not affect the space allocated for that particular MDS in the underlying DB.

In most cases, restarting the DB helps resolve the issue.

Please post your replies.

Kind Regards,

Thamizharasi N

Former Member
0 Kudos

Hi Sridhar,

Please post the errors and warnings from MDS logs.

Is a port being skipped, or is MDIS getting stopped?

Thanks,
Ravi

0 Kudos

Hi Sridhar,

Could you provide more details, such as the number of repositories mounted, any recent upgrade at the MDM or DB level, which server is affected, and during what operation MDS is getting stopped?

This looks more like a server or DB-MDM connectivity issue. Please check the following:

1. Is the DB-MDM connectivity healthy?

2. Check the MDS and OS-level logs.

3. Is enough space available for MDS?

4. Restart the DB and MDM servers and observe.

Please update us with your findings.

Regards

mohan kumar