on 09-27-2007 11:26 AM
Hello
We're experiencing some issues with Linux SLES 9 and I need help.
We have 3 machines, each one hosting the DEV and TST systems (R3, BW and CRM).
On R3 we couldn't keep the 2 instances up at the same time; we solved that by raising the shmmax and shmall parameters to 4 GB (instead of 1 GB).
After this, we were able to keep the 2 instances up.
Now, on both the BW and CRM machines, we can only keep 1 instance alive.
We tried to increase the shm parameters, but the errors persist.
There are no errors at startup, but on the 2nd instance to come up (the 1st one to start stays online) only some of the dw.sap* processes keep running (3 or 4 instead of 8).
The processes that stay online write to the dev_w logs continuously.
Here is part of dev_w4:
N =================================================
N === SSF INITIALIZATION:
N ===...SSF Security Toolkit name SAPSECULIB .
N ===...SSF trace level is 0 .
N ===...SSF library is /usr/sap/DCM/SYS/exe/run/libsapsecu.so .
N ===...SSF hash algorithm is SHA1 .
N ===...SSF symmetric encryption algorithm is DES-CBC .
N ===...sucessfully completed.
N =================================================
N MskiInitLogonTicketCacheHandle: Logon Ticket cache pointer retrieved from shared memory.
N MskiInitLogonTicketCacheHandle: Workprocess runs with Logon Ticket cache.
W =================================================
W === ipl_Init() called
W ITSP Running against db release 620!
W ITSP Disable Kernel Web GUI functionality
W === ipl_Init() returns 2, ITSPE_DISABLED: Service is disabled (sapparam)
W =================================================
I *** ERROR => e=28 semget(20145,1,2047) (28: No space left on device) [semux.c 1013]
I *** ERROR => SemStat: Implicit SemInit failed. Key=45 [semux.c 1969]
M ***LOG R31=> SosSemCheck, SemStat ( 01) [thxxtool2.c 512]
M in_ThErrHandle: 1
M *** ERROR => SosSemCheck: SemStat (step 5, th_errno 5, action 2, level 1) [thxxhead.c 9428]
Here is the part of dev_disp where the error appears:
Thu Sep 27 11:21:33 2007
CCMS: start to initalize 3.X shared alert area (first segment).
DpMsgAdmin: Set release to 6400, patchlevel 0
MBUF state PREPARED
MBUF component UP
DpMBufHwIdSet: set Hardware-ID
***LOG Q1C=> DpMBufHwIdSet [dpxxmbuf.c 1025]
DpMsgAdmin: Set patchno for this platform to 54
Release check o.K.
Thu Sep 27 11:21:37 2007
ERROR => W0 (pid 5335) died [dpxxdisp.c 12193]
ERROR => W2 (pid 5337) died [dpxxdisp.c 12193]
ERROR => W4 (pid 5339) died [dpxxdisp.c 12193]
ERROR => W5 (pid 5340) died [dpxxdisp.c 12193]
ERROR => W6 (pid 5341) died [dpxxdisp.c 12193]
ERROR => W7 (pid 5342) died [dpxxdisp.c 12193]
ERROR => W0 (pid 5403) died [dpxxdisp.c 12193]
ERROR => W1 (pid 5336) died [dpxxdisp.c 12193]
ERROR => W2 (pid 5404) died [dpxxdisp.c 12193]
my types changed after wp death/restart 0xbf --> 0xbd
ERROR => W4 (pid 5405) died [dpxxdisp.c 12193]
ERROR => W5 (pid 5408) died [dpxxdisp.c 12193]
ERROR => W1 (pid 5421) died [dpxxdisp.c 12193]
my types changed after wp death/restart 0xbd --> 0xbc
ERROR => W6 (pid 5409) died [dpxxdisp.c 12193]
ERROR => W7 (pid 5412) died [dpxxdisp.c 12193]
ERROR => W4 (pid 5422) died [dpxxdisp.c 12193]
ERROR => W5 (pid 5425) died [dpxxdisp.c 12193]
...
We have reviewed all parameters and they are identical to those on the R3 instances (which are OK).
We also tried raising shmall and shmmax to 6 GB, but it still dies...
Does anyone have any idea what this can be?
Hi Joaquim,
Please check whether the following parameters on your system are set as follows:
# cat /proc/sys/kernel/sem
1250 256000 100 8192
# cat /proc/sys/kernel/msgmni
1024
Thanks,
Hannes Kuehnemund
SAP LinuxLab Walldorf
Hello Hannes,
I have the following:
svsapdqcrm01:~ # cat /proc/sys/kernel/sem
250 32000 32 128
svsapdqcrm01:~ # cat /proc/sys/kernel/msgmni
16
On R3 the parameters are the same, and it's working.
I'm new to operating system administration, so I don't know what these parameters are.
What are they?
Message was edited by:
Joaquim Pereira
Dear Joaquim,
according to the kernel documentation:
/proc/sys/kernel/sem - The maximum number and size of semaphore sets that can be allocated.
/proc/sys/kernel/msgmni - Sets maximum number of message queues
Anyway, your values are far too small. On SLES 9 the suse-sapinit RPM package should adjust these values by default. I assume that you did not install this package. Please do so, or set the values to the ones in my last reply.
Thanks
Hannes Kuehnemund
SAP LinuxLab
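The comparison described in this reply can be scripted. What follows is only a minimal sketch, assuming the target values for kernel.sem quoted earlier in the thread (1250 256000 100 8192). The function name check_sem is made up for illustration, and it takes the four fields as arguments so it can be tried without touching a live system:

```shell
#!/bin/sh
# Hypothetical helper: compare the four kernel.sem fields with the
# values recommended earlier in this thread.
# Usage: check_sem SEMMSL SEMMNS SEMOPM SEMMNI
check_sem() {
    if [ "$#" -ne 4 ]; then
        echo "usage: check_sem SEMMSL SEMMNS SEMOPM SEMMNI" >&2
        return 2
    fi
    status=0
    # Each entry is name:current:recommended
    for pair in "SEMMSL:$1:1250" "SEMMNS:$2:256000" "SEMOPM:$3:100" "SEMMNI:$4:8192"; do
        name=${pair%%:*}
        rest=${pair#*:}
        have=${rest%%:*}
        want=${rest#*:}
        if [ "$have" -lt "$want" ]; then
            echo "$name too small: $have < $want"
            status=1
        fi
    done
    return $status
}

# On a live system (word splitting of the file contents is intended):
#   check_sem $(cat /proc/sys/kernel/sem)
```

The function prints one line per field that is below the recommended value and returns non-zero if any field is too small.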
Just for completeness, as an example:
dev_disp:
*** ERROR => e=28 semget(21032,1,2047) (28: No space left on device) [semux.c 1027]
*** ERROR => e=28 semget(21032,1,2047) (28: No space left on device) [semux.c 1027]
*** ERROR => SemRq: Implicit SemInit failed. Key=32 [semux.c 1633]
*** ERROR => SemRel: Ill. internal Handle. Key=32 [semux.c 1823]
<ES> client 0 initializing ....
*** ERROR => e=28 semget(21033,1,2047) (28: No space left on device) [semux.c 1027]
***LOG R63=> ThWpHalt, halt wp () [thxxhead.c 14979]
Error: "No space left on device" -> ENOSPC
man 2 semget:
Indicates that either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded.
Settings used in this example:
# cat /proc/sys/kernel/sem
1250    32000   100     25
MSL     MNS     OPM     MNI
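To see how close a system is to the fourth value (MNI), the number of semaphore sets currently allocated can be compared with it. This is a rough sketch, assuming `ipcs -s` prints one line per set starting with the key in hex (as the util-linux version does); count_sem_sets and sem_sets_left are illustrative names only:

```shell
#!/bin/sh
# Count semaphore sets in `ipcs -s` output read from stdin.
# Lines describing a set are assumed to start with a hex key (0x...).
count_sem_sets() {
    grep -c '^0x' || true   # grep exits non-zero when the count is 0
}

# Sets still available: SEMMNI minus sets in use.
# Usage: sem_sets_left SEMMNI USED
sem_sets_left() {
    echo $(( $1 - $2 ))
}

# On a live system:
#   used=$(ipcs -s | count_sem_sets)
#   mni=$(awk '{print $4}' /proc/sys/kernel/sem)
#   sem_sets_left "$mni" "$used"
```

A result at or near zero would be consistent with semget failing with ENOSPC as in the log above.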
> I *** ERROR => e=28 semget(20145,1,2047) (28: No space left on device) [semux.c 1013]
The system tries to map a small (20 MB) memory segment, which is not available.
How much swap do you have configured? Is this 32bit or 64bit?
--
Markus
I have 30 GB on each server.
It's 64-bit.
We have changed the parameters as Hannes said, and we are now able to keep both instances alive on all servers.
To make the changes permanent I'll have to put this in /etc/sysctl.conf, and the entries would be something like:
kernel.sem="1250 256000 100 8192"
kernel.msgmni=1024
Correct?
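For reference, a sketch of how those two entries would usually look in /etc/sysctl.conf, assuming the stock "key = value" syntax described in the sysctl.conf man page, where the value is written without quotes (quotes are only needed on a shell command line such as sysctl -w kernel.sem="..."):

```
# /etc/sysctl.conf
# Order of the kernel.sem fields: SEMMSL SEMMNS SEMOPM SEMMNI
kernel.sem = 1250 256000 100 8192
kernel.msgmni = 1024
```

Running `sysctl -p` as root should load the file without a reboot; `cat /proc/sys/kernel/sem` then shows whether the values took effect.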
Message was edited by:
Joaquim Pereira