Production Server - got Restarted.

Former Member
0 Kudos

Hi,

Recently our production server restarted by itself.

We could not find the reason. Below are the entries from the dev_disp log.

It stopped around 20:23 and started again at 21:49.

The logs contain entries like:

>>Operating system call WSASend failed

>>1 possible network problems detected - check tracefile and adjus |

I searched SMP and found some notes relating to NIPING:

55147 WinNT: Connection reset by peer

500235 Network Diagnosis with NIPING

However, I would like to find out whether anyone has faced the same problem and, if so, how you solved it.

Thanks & Regards

L Raghunahth

DEV_DISP


|16:44:04|<<SERVER NAME>>|DP | | | | | |Q0 |4|Connection to user 769 (USR NAME), terminal 49 (GCC81559 ) lost|
|16:50:39|<<SERVER NAME>>|DP | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|16:50:44|<<SERVER NAME>>|DP | | | | | |Q0 |4|Connection to user 783 (USR NAME), terminal 46 (GCC14636 ) lost|
|16:54:36|<<SERVER NAME>>|DIA |000|100|K-KURASAWA|SESS| |US |1|User <usrname> locked due to incorrect logon|
|20:23:15|<<SERVER NAME>>|DP | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:23:19|<<SERVER NAME>>|DP | | | | | |Q0 |4|Connection to user 1748 (USR NAME), terminal 51 (GCC11196 ) lost|
|20:23:38|<<SERVER NAME>>|DP | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DIA |001| | | | |R0 |Z|The update dispatch info was reset|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:39|<<SERVER NAME>>|DP | | | | | |Q0 |G|Request (type DIA) cannot be processed|
|20:23:40|<<SERVER NAME>>|DP | | | | | |Q0 |I|Operating system call connect failed (error no. 10061)|
|20:23:40|<<SERVER NAME>>|DP | | | | | |Q0 |N|Failed to send a request to the message server|
|20:23:40|<<SERVER NAME>>|DIA |001| | | | |R0 |R|The update has been deactivated following a system error|
|20:23:40|<<SERVER NAME>>|DIA |001| | | | |R0 |Z|The update dispatch info was reset|
|20:23:40|<<SERVER NAME>>|DP | | | | | |Q0 |N|Failed to send a request to the message server|
|20:23:40|<<SERVER NAME>>|DIA |001| | | | |R0 |R|The update has been deactivated following a system error|
|20:23:41|<<SERVER NAME>>|DP | | | | | |Q0 |N|Failed to send a request to the message server|
|20:23:41|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:23:45|<<SERVER NAME>>|DP | | | | | |Q0 |I|Operating system call connect failed (error no. 10061)|
|20:23:56|<<SERVER NAME>>|DIA |001| | | | |Q0 |I|Operating system call WSASend failed (error no. 10054)|
|20:23:57|<<SERVER NAME>>|DIA |001| | | | |Q0 |I|Operating system call connect failed (error no. 10061)|
|20:23:58|<<SERVER NAME>>|UP2 |030| | | | |Q0 |2|Stop Workproc30, PID 9540|
|20:23:58|<<SERVER NAME>>|DIA |009| | | | |Q0 |2|Stop Workproc 9, PID 10992|
|20:23:58|<<SERVER NAME>>|DIA |016| | | | |Q0 |2|Stop Workproc16, PID 7360|
|20:23:58|<<SERVER NAME>>|BTC |027| | | | |Q0 |2|Stop Workproc27, PID 7892|
|20:23:58|<<SERVER NAME>>|BTC |028| | | | |Q0 |2|Stop Workproc28, PID 9988|
|20:23:58|<<SERVER NAME>>|DIA |003| | | | |Q0 |2|Stop Workproc 3, PID 10036|
|20:23:58|<<SERVER NAME>>|DIA |006| | | | |Q0 |2|Stop Workproc 6, PID 4360|
|20:23:58|<<SERVER NAME>>|BTC |025| | | | |Q0 |2|Stop Workproc25, PID 8528|
|20:23:58|<<SERVER NAME>>|DIA |002| | | | |Q0 |2|Stop Workproc 2, PID 4336|
|20:23:59|<<SERVER NAME>>|UP1 |021| | | | |Q0 |2|Stop Workproc21, PID 9652|
|20:23:59|<<SERVER NAME>>|DIA |013| | | | |Q0 |2|Stop Workproc13, PID 7260|
|20:23:59|<<SERVER NAME>>|BTC |026| | | | |Q0 |2|Stop Workproc26, PID 1060|
|20:23:59|<<SERVER NAME>>|DIA |015| | | | |Q0 |2|Stop Workproc15, PID 8388|
|20:23:59|<<SERVER NAME>>|UP1 |020| | | | |Q0 |2|Stop Workproc20, PID 10048|
|20:23:59|<<SERVER NAME>>|DIA |005| | | | |Q0 |2|Stop Workproc 5, PID 572|
|20:23:59|<<SERVER NAME>>|DIA |011| | | | |Q0 |2|Stop Workproc11, PID 7224|
|20:24:00|<<SERVER NAME>>|DIA |014| | | | |Q0 |2|Stop Workproc14, PID 7956|
|20:24:00|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:00|<<SERVER NAME>>|DIA |012| | | | |Q0 |2|Stop Workproc12, PID 10592|
|20:24:00|<<SERVER NAME>>|SPO |029| | | | |Q0 |2|Stop Workproc29, PID 10760|
|20:24:00|<<SERVER NAME>>|DIA |017| | | | |Q0 |2|Stop Workproc17, PID 6264|
|20:24:00|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:00|<<SERVER NAME>>|DIA |018| | | | |Q0 |2|Stop Workproc18, PID 9332|
|20:24:00|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:00|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:00|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:00|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:00|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:00|<<SERVER NAME>>|DIA |008| | | | |Q0 |2|Stop Workproc 8, PID 9592|
|20:24:00|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:01|<<SERVER NAME>>|BTC |024| | | | |Q0 |2|Stop Workproc24, PID 6832|
|20:24:01|<<SERVER NAME>>|DIA |019| | | | |Q0 |2|Stop Workproc19, PID 1576|
|20:24:01|<<SERVER NAME>>|DIA |004| | | | |Q0 |2|Stop Workproc 4, PID 9968|
|20:24:01|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:01|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:01|<<SERVER NAME>>|DIA |007| | | | |Q0 |2|Stop Workproc 7, PID 10464|
|20:24:01|<<SERVER NAME>>|UP1 |023| | | | |Q0 |2|Stop Workproc23, PID 9512|
|20:24:01|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:02|<<SERVER NAME>>|UP1 |022| | | | |Q0 |2|Stop Workproc22, PID 8328|
|20:24:02|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:02|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:02|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:02|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:03|<<SERVER NAME>>|DIA |001| | | | |Q0 |I|Operating system call connect failed (error no. 10061)|
|20:24:04|<<SERVER NAME>>|DIA |010| | | | |Q0 |2|Stop Workproc10, PID 9764|
|20:24:09|<<SERVER NAME>>|DIA |001| | | | |Q0 |I|Operating system call connect failed (error no. 10061)|
|20:24:15|<<SERVER NAME>>|DIA |001| | | | |Q0 |I|Operating system call connect failed (error no. 10061)|
|20:24:20|<<SERVER NAME>>|DIA |001| | | | |GI |0|Error calling the central lock handler|
|20:24:20|<<SERVER NAME>>|DIA |001| | | | |GI |3|> Failed to clean up lock entries|
|20:24:20|<<SERVER NAME>>|DIA |001| | | | |Q0 |2|Stop Workproc 1, PID 9852|
|20:24:20|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|
|20:24:57|<<SERVER NAME>>|DIA |000|100|<USRNME>|Y_GE| |Q0 |2|Stop Workproc 0, PID 8212|
|20:24:57|<<SERVER NAME>>|RD | | | | | |S3 |0|SAP gateway was closed|
|20:25:24|<<SERVER NAME>>|DP | | | | | |Q0 |5|Stop SAP System, Dispatcher Pid 68|
|21:49:37|<<SERVER NAME>>|DP | | | | | |E1 |0|Buffer SCSA Generated with Length 4096|
|21:49:37|<<SERVER NAME>>|DP | | | | | |Q0 |0|Start SAP System, SAPSYSTEM 02, Dispatcher Pid 8116|
|21:49:42|<<SERVER NAME>>|DP | | | | | |GZ |Z|> 1 possible network problems detected - check tracefile and adjus|
|21:49:43|<<SERVER NAME>>|RD | | | | | |S0 |0|SAP Gateway Started (PID: 6368)|
|21:49:43|<<SERVER NAME>>|WRK |000| | | | |Q0 |Q|Start Workproc 1, Pid 11060|
|21:49:43|<<SERVER NAME>>|WRK |000| | | | |Q0 |Q|Start Workproc 4, Pid 9344|
|21:49:43|<<SERVER NAME>>|WRK |000| | | | |Q0 |Q|Start Workproc 0, Pid 10656|
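
The error numbers in these entries are Winsock codes: 10054 is WSAECONNRESET (connection reset by peer) and 10061 is WSAECONNREFUSED (connection refused). As a rough reading aid for a long log, here is a minimal Python sketch that tallies those codes per process type. It assumes the pipe-delimited layout of the first and last log lines above; the function name and the sample lines are illustrative only.

```python
import re
from collections import Counter

# Standard Windows Sockets meanings of the error numbers seen in the log.
WINSOCK_ERRORS = {
    10054: "WSAECONNRESET (connection reset by peer)",
    10061: "WSAECONNREFUSED (connection refused)",
}

ERROR_RE = re.compile(r"Operating system call (\w+) failed \(error no\. (\d+)\)")


def tally_os_call_errors(lines):
    """Count (process type, call, error number) triples in pipe-delimited log lines."""
    counts = Counter()
    for line in lines:
        # Drop optional leading/trailing pipes, then split into columns.
        fields = [f.strip() for f in line.strip().strip("|").split("|")]
        if len(fields) < 4:
            continue  # not a log entry
        process_type, text = fields[2], fields[-1]
        match = ERROR_RE.search(text)
        if match:
            counts[(process_type, match.group(1), int(match.group(2)))] += 1
    return counts


# Sample lines in the assumed layout (illustration only).
sample = [
    "|20:23:38|<<SERVER NAME>>|DP | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|",
    "|20:23:40|<<SERVER NAME>>|DP | | | | | |Q0 |I|Operating system call connect failed (error no. 10061)|",
    "|20:23:41|<<SERVER NAME>>|RD | | | | | |Q0 |I|Operating system call recv failed (error no. 10054)|",
    "|20:23:58|<<SERVER NAME>>|UP2 |030| | | | |Q0 |2|Stop Workproc30, PID 9540|",
]

for (ptype, call, code), n in sorted(tally_os_call_errors(sample).items()):
    print(f"{ptype:3} {call:8} {code} x{n}  {WINSOCK_ERRORS.get(code, 'unknown')}")
```

A burst of 10054 followed by 10061 only says that peers dropped connections and then refused new ones; it does not say why, which is why the replies point to the OS event log.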

Accepted Solutions (1)

Former Member
0 Kudos

Usually in Windows we have to reboot servers after a while, looks like yours is automatic!.... just kidding.

As Markus already said, if Windows rebooted by itself, it is very unlikely that a network card caused it. Just look at the Event Viewer, because the cause should be there. When the problem is OS- or hardware-related, the SAP logs will only show you consequences, not causes.

former_member204746
Active Contributor
0 Kudos

Maybe Windows Update installed a patch and automatically rebooted your server.

Check your Windows event log; if "USER32" rebooted your server, you have found the culprit.
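
That check can be sketched in a few lines of Python, assuming the System event log has been exported to rows of (source, event ID, message). It relies on two standard Windows events: ID 1074 from source USER32 (a process or user initiated the restart) and ID 6008 from source EventLog (the previous shutdown was unexpected). The row layout, the function name, and the sample rows are hypothetical and not taken from the server in question.

```python
# Well-known Windows System log events related to reboots.
REBOOT_EVENTS = {
    ("user32", 1074): "restart/shutdown initiated by a process or user",
    ("eventlog", 6008): "previous system shutdown was unexpected",
}


def find_reboot_events(rows):
    """Return (source, event_id, meaning, message) for reboot-related rows."""
    hits = []
    for source, event_id, message in rows:
        meaning = REBOOT_EVENTS.get((source.strip().lower(), int(event_id)))
        if meaning:
            hits.append((source, int(event_id), meaning, message))
    return hits


# Hypothetical sample rows, for illustration only.
rows = [
    ("Service Control Manager", 7036, "A service entered the running state."),
    ("USER32", 1074, "A process has initiated the restart of the computer."),
]

for source, event_id, meaning, message in find_reboot_events(rows):
    print(f"{source} {event_id}: {meaning}")
```

If neither event shows up around the time of the outage, the operating system probably did not reboot at all, and the restart was limited to the SAP services.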

Former Member
0 Kudos

Hi,

Thanks everyone for your valuable replies.

The OS log is in Japanese, so I have translated the entries from the time of the incident.

There are some more entries after 21:45; if you need them, I will translate those as well.


2008/10/30,20:23:34,Service Control Manager,Error,none,7034,N/A,SRV15752,SAPOsCol service terminated unexpectedly. It has done this 1 time(s).

2008/10/30,20:23:36,ClusSvc,Error,Failover Manager,1069,N/A,SRV15752,Resource Group 'SAP SRV' Cluster Resource 'SAP SRV SAPOsCol' has failed.

2008/10/30,20:23:38,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,The SAPSRV_01 service was successfully sent a Stop control.

2008/10/30,20:23:38,Service Control Manager,Info,none,7036,N/A,SRV15752,The SAPSRV_01 service entered the stopped state.

2008/10/30,20:23:38,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,The SAPSRV_00 service was successfully sent a Stop control.

2008/10/30,20:23:38,Service Control Manager,Info,none,7036,N/A,SRV15752,The SAPSRV_00 service entered the stopped state.

2008/10/30,20:24:09,Service Control Manager,Error,none,7011,N/A,SRV15752,Timeout (3000 milliseconds) waiting for transaction response from the SAPSRV_02 service.

2008/10/30,20:24:39,Service Control Manager,Error,none,7011,N/A,SRV15752, Timeout (3000 milliseconds) waiting for transaction response from the service.

2008/10/30,20:24:40,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_10 successfully sent a Stop control

The Cluster Service failed to bring the Resource Group "Cluster Group" completely online or offline.

2008/10/30,20:24:40,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_10 Service entered the stopped state.

2008/10/30,20:24:45,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPOsCol successfully sent a Start control

2008/10/30,20:24:45,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPOsCol service entered the running state.

2008/10/30,20:24:46,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_01 successfully sent a Start control

2008/10/30,20:24:47,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_00 successfully sent a Start control

2008/10/30,20:24:47,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_02 successfully sent a Start control

2008/10/30,20:24:47,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_10 successfully sent a Start control

2008/10/30,20:24:49,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_01 service entered the running state.

2008/10/30,20:24:50,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_00 service entered the running state.

2008/10/30,20:24:50,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_02 service entered the running state.

2008/10/30,20:24:51,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_10 service entered the running state.

2008/10/30,20:25:04,ClusSvc,Error,Failover Manager,1069,N/A,SRV15752,Resource Group 'SAP SRV' cluster resource 'SAP SRV 02 Instance' failed.

2008/10/30,20:25:25,ClusSvc,Error,Failover Manager,1069,N/A,SRV15752,Resource Group 'SAP SRV' cluster resource 'SAP SRV 02 Service' failed.

2008/10/30,20:25:25,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_02 successfully sent a Stop control

2008/10/30,20:25:25,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_02 Service entered the stopped state.

2008/10/30,20:25:26,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,SRV15752,SAPSRV_02 successfully sent a Start control

2008/10/30,20:25:28,Service Control Manager,Info,none,7036,N/A,SRV15752,SAPSRV_02 service entered the running state.

2008/10/30,21:45:41,ClusSvc,Info,Failover Manager ,1205,N/A,SRV15752, The Cluster Service failed to bring the Resource Group "SAP SRV" completely online or offline.

2008/10/30,21:45:41,ClusSvc,Warning,Failover Manager ,1146,N/A,SRV15752,The cluster resource monitor died unexpectedly, an attempt will be made to restart it.
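
To see the sequence at a glance, the translated entries can be filtered by severity. The Python sketch below assumes the comma-separated layout used above (date, time, source, type, category, event ID, user, computer, message); the field names and the function name are made up here for illustration.

```python
from datetime import datetime

FIELDS = ["date", "time", "source", "type", "category",
          "event_id", "user", "computer", "message"]


def parse_event_lines(lines):
    """Parse exported event log lines; the message itself may contain commas."""
    events = []
    for line in lines:
        # Split on the first 8 commas only, so the message stays in one piece.
        parts = line.strip().split(",", len(FIELDS) - 1)
        if len(parts) != len(FIELDS):
            continue  # continuation or malformed line
        event = dict(zip(FIELDS, (p.strip() for p in parts)))
        try:
            event["when"] = datetime.strptime(
                f"{event['date']} {event['time']}", "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # first two columns are not a date and time
        events.append(event)
    return events


sample = [
    "2008/10/30,20:23:34,Service Control Manager,Error,none,7034,N/A,SRV15752,"
    "SAPOsCol service terminated unexpectedly. It has done this 1 time(s).",
    "2008/10/30,20:23:36,ClusSvc,Error,Failover Manager,1069,N/A,SRV15752,"
    "Resource Group 'SAP SRV' Cluster Resource 'SAP SRV SAPOsCol' has failed.",
    r"2008/10/30,20:23:38,Service Control Manager,Info,none,7035,GPC-DM\sapcluster,"
    "SRV15752,The SAPSRV_01 service was successfully sent a Stop control.",
    "2008/10/30,21:45:41,ClusSvc,Warning,Failover Manager ,1146,N/A,SRV15752,"
    "The cluster resource monitor died unexpectedly, an attempt will be made to restart it.",
]

problems = [e for e in parse_event_lines(sample) if e["type"] in ("Error", "Warning")]
for e in problems:
    print(e["when"].time(), e["source"], e["event_id"], e["message"])
```

On the entries above, the earliest error is the unexpected termination of the SAPOsCol service at 20:23:34, two seconds before the cluster service reports that resource as failed.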


Regards,

Raghunahth L

rolfc_weber
Contributor
0 Kudos

Hi,

According to your OS log, it seems you are running an MSCS cluster.

We had similar problems in our clusters (SAP bouncing / being restarted), which were solved by updating the SAP cluster resource DLLs saprc.dll and saprcex.dll according to SAP Note [1043592|http://service.sap.com/sap/support/notes/1043592].

Maybe this could help you.

By the way, it does no harm to check the version of the DLLs and update them to the newest available.

Regards

Rolf

Former Member
0 Kudos

Hi

Thanks a lot for your answer.

Were you on Windows 2003 SP2?

Did you also uninstall SP2?

I could not follow the note, as it is written for an installation.

Can you explain exactly what you did to solve the problem?

Thanks a lot.

Best regards,

L Raghunahth

rolfc_weber
Contributor
0 Kudos

Hi,

No, we did not uninstall SP2.

We installed updated versions of the SAP DLLs.

Start by downloading the newest NTCLUST.SAR file matching your environment from the Service Marketplace.

(It can be found in the database-independent kernel folder.)

Then follow only step 1 of the solution in SAP Note [867521|http://service.sap.com/sap/support/notes/867521].

(This note describes other things to do, but the important part here is how to replace the files from the NTCLUST.SAR archive.)

Hope this helps....

Regards

Rolf

Former Member
0 Kudos

Thanks Rolf.

That has answered all my doubts.

Thanks a lot for your time.

Best Regards

L Raghunahth

Answers (4)

Former Member
0 Kudos

Can we see the work process event log?

markus_doehr2
Active Contributor
0 Kudos

> can we see the work process event log

If an operating system crashes or reboots the reason should be in the event log (in case of Windows). Windows 2003 writes something like "the operating system has been rebooted after an unexpected shutdown" when this happens. Messages before that message should give an idea about what happened.

The work process is an application running on TOP of the OS; it won't have any information about what happened.

Markus

markus_doehr2
Active Contributor
0 Kudos

What is in your Windows event log? I doubt that the problem is caused by the network card.

Markus

Former Member
0 Kudos

Hi Raghunath,

We have faced a similar problem. In our case it was a problem with a Windows patch: we were running 4.7 on Windows, and one application server used to restart by itself with a similar message. The problem was resolved by applying patches at OS level.

You can also check the message server and gateway logs, but you might not find anything substantial related to SAP other than the network problem.

Ardhian is quite right; it can be caused by a network problem as well.

Hope this information helps you.

cheers

dEE

Edited by: Deep Kwatra on Nov 17, 2008 1:44 PM

Former Member
0 Kudos

hi,

Please check your network connection. Also check your NIC card. If possible, try to replace it.

ardhian

http://sapbasis.wordpress.com

http://ardhian.kioslinux.com

Former Member
0 Kudos

Hi,

Thanks for your reply. Your inputs are very useful. I will check the NIC Card.

BTW, have you actually faced this problem in your system?

As this is the PRD server, I would like to collect inputs from people who actually faced the same problem.

Thanks & Best Regards,

L Raghunahth.