on 05-23-2012 7:52 AM
Hello,
I'm using MaxDB 7.8.02.21 on Windows Server 2008 Standard Edition 64-bit (SP1).
My database freezes on an SQL command: statements wait for a response, but none arrives for hours, and it is not possible to work with the database (I can't connect using the sqlcli tool, SQL Studio, Database Studio, ...). The only thing I can still do is query the database state with the dbmcli tool (db_state reports that the database is ONLINE), but when I try to stop the database (db_offline), that hangs too. The only way to stop the database is to kill the kernel process. At the time of the freeze, INSERT statements are being executed (on tables without triggers), along with one procedure that inserts into 2 tables and then returns a cursor using a recursive cursor.
Here is part of the "Database messages" log (the database hung at 06:19:57):
*******************************************************************************************
6376: Thread 0x3960 Task 237 2012-05-23 06:09:27 Pager 20003: SVP(1) Start Write Data
6377: Thread 0x3960 Task 237 2012-05-23 06:09:42 Pager 20004: SVP(1) Stop Data IO, Cluster Pages: 0, Cluster IO: 0, Pages: 28735 IO: 6021
6378: Thread 0x3960 Task 237 2012-05-23 06:09:42 Pager 20005: SVP(2) Wait for last synchronizing task: 237
6379: Thread 0x3960 Task 237 2012-05-23 06:09:42 Pager 20006: SVP(2) Stop Wait for last synchronizing task, Pages: 0 IO: 0
6380: Thread 0x3960 Task 237 2012-05-23 06:09:42 DataCache 4: Mark data pages for savepoint (prepare phase)
6381: Thread 0x3960 Task 237 2012-05-23 06:09:42 Pager 20007: SVP(3) Start Write Data
6382: Thread 0x3960 Task 237 2012-05-23 06:09:43 Pager 20008: SVP(3) Stop Data IO, Cluster Pages: 0, Cluster IO: 0, Pages: 466 IO: 138
6383: Thread 0x3960 Task 237 2012-05-23 06:09:43 Pager 20009: SVP(3) Start Write Converter
6384: Thread 0x3960 Task 237 2012-05-23 06:09:43 Pager 20011: SVP(3) Stop Converter IO, Pages: 906 IO: 906
6385: Thread 0x3960 Task 237 2012-05-23 06:09:43 DataCache 3: Savepoint with ID 680 completed
6386: Thread 0x624 Task 1 2012-05-23 06:19:42 Savepoint 1: Savepoint (Time) started by T1
6387: Thread 0x3960 Task 237 2012-05-23 06:19:42 Pager 20003: SVP(1) Start Write Data
6388: Thread 0x3960 Task 237 2012-05-23 06:19:57 Pager 20004: SVP(1) Stop Data IO, Cluster Pages: 0, Cluster IO: 0, Pages: 28270 IO: 6352
6389: Thread 0x3960 Task 237 2012-05-23 06:19:57 Pager 20005: SVP(2) Wait for last synchronizing task: 237
6390: Thread 0x3960 Task 237 2012-05-23 06:19:57 Pager 20006: SVP(2) Stop Wait for last synchronizing task, Pages: 0 IO: 0
6391: Thread 0x4F30 Task 138 2012-05-23 07:36:26 CONNECT 19633: Connect req. (TLS_ENE2, T138, connection obj. 0xe4e02b8, Node:'MapSrv.Cassovia.cs', PID: 12220)
6392: Thread 0x4F30 Task 139 2012-05-23 07:41:04 CONNECT 19633: Connect req. (TLS_ENE2, T139, connection obj. 0xe9afd60, Node:'MapSrv.Cassovia.cs', PID: 16460)
6393: Thread 0x4F30 Task 140 2012-05-23 07:42:06 CONNECT 19633: Connect req. (TLS_ENE2, T140, connection obj. 0xe9b1f88, Node:'MapSrv.Cassovia.cs', PID: 14840)
6394: Thread 0x4F30 Task 141 2012-05-23 07:43:29 CONNECT 19633: Connect req. (TLS_ENE2, T141, connection obj. 0xe75d558, Node:'MapSrv.Cassovia.cs', PID: 9248)
6395: Thread 0x4F30 Task 142 2012-05-23 07:45:09 CONNECT 19633: Connect req. (TLS_ENE2, T142, connection obj. 0xe221ff0, Node:'MapSrv.Cassovia.cs', PID: 11408)
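One way to locate the stall in an excerpt like this is to look for unusually long pauses between consecutive entries. The following is only a minimal sketch: the function name `find_gaps` and the 15-minute threshold are assumptions chosen for illustration (the time-based savepoints in the excerpt are about 10 minutes apart), and the sample lines are copied from the log above.

```python
import re
from datetime import datetime, timedelta

# Matches the timestamp inside a log line such as
# "6390: Thread 0x3960 Task 237 2012-05-23 06:19:57 Pager 20006: ..."
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def find_gaps(lines, threshold=timedelta(minutes=15)):
    """Return (gap, line_before, line_after) for every pair of consecutive
    timestamped lines that are further apart than `threshold`."""
    gaps = []
    previous = None  # (timestamp, line) of the last timestamped entry seen
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # skip separators and lines without a timestamp
        stamp = datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S")
        if previous is not None and stamp - previous[0] > threshold:
            gaps.append((stamp - previous[0], previous[1], line))
        previous = (stamp, line)
    return gaps


# Four entries copied from the excerpt above.
sample = [
    "6385: Thread 0x3960 Task 237 2012-05-23 06:09:43 DataCache 3: Savepoint with ID 680 completed",
    "6386: Thread 0x624 Task 1 2012-05-23 06:19:42 Savepoint 1: Savepoint (Time) started by T1",
    "6390: Thread 0x3960 Task 237 2012-05-23 06:19:57 Pager 20006: SVP(2) Stop Wait for last synchronizing task, Pages: 0 IO: 0",
    "6391: Thread 0x4F30 Task 138 2012-05-23 07:36:26 CONNECT 19633: Connect req. (TLS_ENE2, T138, connection obj. 0xe4e02b8, Node:'MapSrv.Cassovia.cs', PID: 12220)",
]

for gap, before, after in find_gaps(sample):
    print(f"No entries for {gap} after: {before}")
```

On the sample lines this reports a single pause of 1:16:29, starting after entry 6390, the last savepoint message written at 06:19:57.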
This freeze has happened several times before, but I can't reproduce it. Sometimes the database runs for a few hours, sometimes for a few days, and then the freeze occurs.
Can you help me analyze what happened and how to avoid it?
Thank you, Dusan
Hi Dusan,
Since the problem is not reproducible, do the following so that the necessary data is available the next time the freeze occurs.
1) Restart the DBAnalyzer with a 1-minute interval. Call transaction DB50 and choose "Kernel Threads" >> "Task Manager". Activate the database time measurement with the clock button (fourth button on the left).
For more information, refer to section 2 of the solution in SAP Note 748225.
2) When the freeze occurs again, use "x_cons" to collect additional information by running the following command:
x_cons <SID> show all 10 10 > x_consshowall.txt
This collects information about the tasks that have been running or waiting for a long time. With this information we can identify the task that is causing the problem.
3) Attach the generated "x_consshowall.txt" to this thread.
4) If you are an SAP customer, create a message and attach the above information along with the diag_pack generated using the command:
dbmcli -d <db_name> -u <dbm_user>,<pwd> diag_pack
You can find the diag_pack in the run directory of the database.
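Because the freeze can strike at any time, it may help to have the two collection commands from steps 2 and 4 wrapped in a small script beforehand. The following is a minimal sketch, not an official tool: the helper names (`x_cons_command`, `diag_pack_command`, `collect`) and the 300-second timeout are assumptions made for illustration, and the command arguments are copied unchanged from the steps above. The timeout is there because, as described in the question, dbmcli commands may themselves hang while the database is frozen.

```python
import subprocess


def x_cons_command(sid):
    # Same arguments as in step 2; the two trailing values "10 10" are
    # copied unchanged from the command shown there.
    return ["x_cons", sid, "show", "all", "10", "10"]


def diag_pack_command(db_name, dbm_user, password):
    # Same arguments as in step 4: dbmcli -d <db_name> -u <dbm_user>,<pwd> diag_pack
    return ["dbmcli", "-d", db_name, "-u", f"{dbm_user},{password}", "diag_pack"]


def collect(command, out_path, timeout=300, run=subprocess.run):
    """Run `command`, redirect its output to `out_path`, and return True
    only if it finished with exit code 0 within `timeout` seconds."""
    try:
        with open(out_path, "w") as out:
            result = run(command, stdout=out, stderr=subprocess.STDOUT,
                         timeout=timeout)
    except subprocess.TimeoutExpired:
        # The tool itself hung; whatever was written so far stays in out_path.
        return False
    return result.returncode == 0
```

A call might look like `collect(x_cons_command("MYDB"), "x_consshowall.txt")`, followed by `collect(diag_pack_command("MYDB", "dbm", "secret"), "diag_pack_output.txt")`. Here `MYDB`, `dbm`, `secret`, and the second file name are placeholders, not values from this thread.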
Regards,
Yashwanth
Hello Dusan,
Can you post the knlmsg and knlmsg.err files for that time period?
Thanks