on 10-21-2010 12:36 PM
Hi Team,
Our production system runs on an MS SQL 2000 cluster. Today we hit an unexpected issue: one of the cluster services went offline without warning. The cluster is configured for failover, but no failover took place. It happened just after an archival job (a write-only job) completed. (That is only my suspicion; I don't know how closely it relates to our problem.) We also have log shipping to a DR site.
System details:
Windows: 2003 Server IA64 SP2
Database: MS SQL 2000 Ver 8 (SP4)
<removed_by_moderator>
Edited by: Juan Reyes on Oct 22, 2010 11:53 AM
You should be able to see all the logs in the Windows Event Viewer.
I am not sure how your cluster groups are configured. Each cluster group has failover parameters, i.e. how many times a resource is restarted on the same server before it is moved to another server.
>One of the Cluster Service gone offline
Can you please elaborate: is it the SAP, DB, or Webdisp resource, etc.?
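The restart-then-move behaviour described in this reply can be sketched as a small decision function. This is a simplified model only, not the MSCS API: the names `FailoverPolicy`, `restart_threshold`, `failover_threshold` and `next_action` are hypothetical, and the real cluster service also applies time periods to these counters, which the sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class FailoverPolicy:
    # Hypothetical fields for illustration, not real cluster properties.
    restart_threshold: int   # restarts allowed on the current node
    failover_threshold: int  # moves allowed for the whole group


def next_action(policy, restarts_on_node, group_failovers):
    """Decide what a simplified cluster would do after a resource failure."""
    if restarts_on_node < policy.restart_threshold:
        return "restart"      # try again on the same node
    if group_failovers < policy.failover_threshold:
        return "failover"     # move the group to another node
    return "stay_failed"      # both limits used up, leave it offline


policy = FailoverPolicy(restart_threshold=3, failover_threshold=10)
print(next_action(policy, restarts_on_node=0, group_failovers=0))
```

In this model a resource can end up offline with no failover once both limits are used up, so it may be worth comparing the configured values against how often the resource had already failed.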
Hello,
You need to update the SAP resource DLLs.
What errors do you see in the trace files in the work directory (dev_disp.log, dev_ms.log, and dev_w0) when the cluster fails? It might be a network issue.
regards,
John Feely
hello,
Are there any errors in the Windows Event Viewer when you try to start the cluster service?
Please update the SAP resource DLLs in the /system32 folder on both nodes with the versions from the marketplace, then reboot Windows.
See Note 828432, Enabling Microsoft Cluster Support in SAP MMC,
and Note 502203, SAP resource DLL.
regards,
John Feely
Dear ,
It is a production server with some 400 users logging in daily, so we cannot restart it. My only concern is what caused this. I checked the event log and the cluster log and found no clue.
Please check the cluster log below.
This is from the time the cluster went down:
04:19:59.713 INFO [Qfs] GetDiskFreeSpaceEx N:\MSCS\, status 0
00000b60.00000b6c::2010/10/21-04:20:03.583 INFO SAP Resource <SAP-R/3 EDP>: LooksAlive request.
00000ac8.000015d0::2010/10/21-04:20:03.603 INFO [CP] CppDepositCheckpoint checkpointing data to file N:\MSCS\c7092cf0-58f3-4a7e-b7b0-224102e71a9e\00000004.CPT
00000ac8.000015d0::2010/10/21-04:20:03.603 INFO [Qfs] QfsCreateDirectory N:\MSCS\c7092cf0-58f3-4a7e-b7b0-224102e71a9e, status 183
00000ac8.000015d0::2010/10/21-04:20:03.603 INFO [Qfs] QfsOpenFile N:\MSCS\c7092cf0-58f3-4a7e-b7b0-224102e71a9e\00000004.CPT => 2, 738 status 0
00000ac8.000015d0::2010/10/21-04:20:03.603 INFO [Qfs] WriteFile 738 (regf) 3860, status 0 (0=>0)
00000ac8.000015d0::2010/10/21-04:20:03.613 INFO [Qfs] WriteFile 738 (....) 4096, status 0 (0=>0)
00000ac8.000015d0::2010/10/21-04:20:03.613 INFO [Qfs] WriteFile 738 (M...) 4096, status 0 (0=>0)
00000ac8.000015d0::2010/10/21-04:20:03.613 INFO [Qfs] WriteFile 738 (....) 236, status 0 (0=>0)
00000ac8.000015d0::2010/10/21-04:20:03.613 INFO [Qfs] QfsFlushBuffers 738, status 0
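Since the excerpt above contains only INFO entries, it may help to filter the full cluster log for anything that is not INFO around the failure time. The following is a minimal sketch: the line layout is inferred only from the lines quoted in this post, and the names `parse_line` and `entries_near` are hypothetical helpers, not part of any cluster tooling.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, inferred only from the lines quoted above:
# <process id>.<thread id>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "pid": m.group("pid"),
        "tid": m.group("tid"),
        "time": datetime.strptime(m.group("ts"), "%Y/%m/%d-%H:%M:%S.%f"),
        "level": m.group("level"),
        "msg": m.group("msg"),
    }


def entries_near(lines, when, window_seconds=60, skip_levels=("INFO",)):
    """Yield parsed entries within the window whose level is not skipped."""
    window = timedelta(seconds=window_seconds)
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["level"] in skip_levels:
            continue
        if abs(entry["time"] - when) <= window:
            yield entry
```

For example, passing the open log file and `datetime(2010, 10, 21, 4, 20)` to `entries_near` would list the non-INFO entries within a minute of the outage. Lines that do not follow the assumed layout, such as the truncated first line of the excerpt, are skipped.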
As this is your Windows server, all the activity will be logged in the Event Viewer.
Have you checked the System log in the Event Viewer?
For example: The node lost communication with cluster node '<server_name>' on network 'PRIVATE LAN'.
If you had a heartbeat issue between the servers, your cluster services would also move from one node to another.
I suggest you look for errors and warnings under Start -> Settings -> Control Panel -> Administrative Tools -> Event Viewer -> System.
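If scrolling through the System log by hand is slow, one option is to save it from the Event Viewer as a text file and scan that. A minimal sketch, assuming a plain-text export with one event per line; the function name `find_cluster_events` and the search terms in `KEYWORDS` are hypothetical and should be adjusted to what your log actually contains.

```python
def find_cluster_events(lines, keywords):
    """Return the lines that contain any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    return [
        line.rstrip("\n")
        for line in lines
        if any(k in line.lower() for k in lowered)
    ]


# Hypothetical search terms; the first is taken from the example message above.
KEYWORDS = ["lost communication with cluster node", "clussvc", "clusnet"]

sample = [
    "The node lost communication with cluster node '<server_name>' on network 'PRIVATE LAN'.\n",
    "An unrelated event line.\n",
]
for hit in find_cluster_events(sample, KEYWORDS):
    print(hit)
```

With a real export you would pass the open file object instead of `sample`.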