on 03-26-2013 3:08 PM
Hi everyone,
I've moved a database from Windows 2003 32-bit to another server running Windows 2003 64-bit, using dump/restore.
Everything worked fine for 5 days, and then the database crashed:
2013-03-19 03:03:08 0xDC4 ERR 20013 MOVECODE Module Data_PrimTreeStatistic.cpp call index 4
2013-03-19 03:03:08 0xDC4 ERR 20011 MOVECODE + Bad parameter: source size [-2] dest size [8096], source addr [0X7FCB9588F74]+1, dest addr [0X7FFCD5F0DF2]+1,
2013-03-19 03:03:08 0xDC4 ERR 20011 MOVECODE + -2 bytes to copy
2013-03-19 03:03:08 0xDC4 ERR 20015 Data Base error 'System error: BD Data page corrupted' during update statistic on record index 0 on page 2909347 of root
2013-03-19 03:03:08 0xDC4 ERR 20015 Data 2880955
2013-03-19 03:03:11 0xF08 ERR 51080 SYSERROR -9407 unexpected error
2013-03-19 03:03:12 0xF08 ERR 9 Data Aborting statistics computation due to system error
2013-03-19 03:03:12 0xF08 ERR 42 SrvTasks + Error while collecting statistics for table 0000000000271703; error data_page_corrupted
2013-03-19 03:03:12 0xF08 ERR 20013 MOVECODE + Module Data_PrimTreeStatistic.cpp call index 4
2013-03-19 03:03:12 0xF08 ERR 20011 MOVECODE + Bad parameter: source size [-2] dest size [8096], source addr [0X7FCB9588F74]+1, dest addr [0X7FFCD5F0DF2]+1,
2013-03-19 03:03:12 0xF08 ERR 20011 MOVECODE + -2 bytes to copy
... several times, and then
2013-03-22 11:04:02 0xD00 ERR 18793 EXCEPT EXCEPTION:'Access violation' (0xc0000005), The program code at IP:0x97c814 attempted to read to/from address:0x1fadef28
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE Using 'imagehlp.dll' version: 4.0.5
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE SymbolSearchPath: d:\sdb\data\wrk\TORPEDO;D:\Program Files\sdb\TORPEDO\pgm;D:\Program Files\sdb\TORPEDO\symbols;D:
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE \Program Files\sdb\TORPEDO\symbols;D:\Program Files\sdb\TORPEDO\sap;C:\WINDOWS;D:\Program Files\
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE sdb\TORPEDO\sap\
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE ----> Register Dump <----
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE rax=0x00000000ffffffff rbx=0x00000000caee5490 rcx=0x000000001fade030 rdx=0x00000000caee54d8
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE r8 =0x000000001fadef28 r9 =0x000000001fadcdd0 r10=0x0000000000000000 r11=0x000000001fadcea8
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE r12=0x00000000caee54d8 r13=0x000000001fade030 r14=0x00000000caee19c4 r15=0x0000000000005339
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE rip=0x000000000097c814 rsp=0x000000001fadcd20 rbp=0x000000001fadcf90
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE rsi=0x0000000000000000 rdi=0x000000001fade030
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE ----> Module List <----
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE |.text Start |.text End | Module File Name
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE |0x0000000000400000|0x0000000001070000| D:\Program Files\sdb\TORPEDO\pgm\kernel.exe
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE |0x0000000076050000|0x0000000076287000| D:\Program Files\sdb\TORPEDO\pgm\liboms.dll
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE |0x0000000078880000|0x00000000789c2000| C:\WINDOWS\system32\ntdll.dll
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE |0x0000000078c10000|0x0000000078d1c000| C:\WINDOWS\system32\USER32.dll
2013-03-22 11:04:03 0x138C ERR 19999 BTRACE |0x0000000078d20000|0x0000000078ea3000| C:\WINDOWS\system32\kernel32.dll
So I stopped the database and ran a "check database structure and clear converter in operational state ADMIN".
The check ran for 3 hours without any error. After that I restarted the database and ran an update statistics on the involved table. No error.
I thought the problem was solved, but today I found this in the database error file:
2013-03-26 07:50:03 0xCBC ERR 20013 MOVECODE Module Data_PrimTreeStatistic.cpp call index 4
2013-03-26 07:50:04 0xCBC ERR 20011 MOVECODE + Bad parameter: source size [-2] dest size [8096], source addr [0X7FC135B3096]+1, dest addr [0X7FFB64A0DF2]+1,
2013-03-26 07:50:04 0xCBC ERR 20011 MOVECODE + -2 bytes to copy
2013-03-26 07:50:04 0xCBC ERR 20015 Data Base error 'System error: BD Data page corrupted' during update statistic on record index 0 on page 2909347 of root
2013-03-26 07:50:04 0xCBC ERR 20015 Data 2880955
2013-03-26 07:50:04 0x12D4 ERR 51080 SYSERROR -9407 unexpected error
2013-03-26 07:50:04 0x12D4 ERR 9 Data Aborting statistics computation due to system error
2013-03-26 07:50:04 0x12D4 ERR 42 SrvTasks + Error while collecting statistics for table 0000000000271703; error data_page_corrupted
2013-03-26 07:50:04 0x12D4 ERR 20013 MOVECODE + Module Data_PrimTreeStatistic.cpp call index 4
2013-03-26 07:50:04 0x12D4 ERR 20011 MOVECODE + Bad parameter: source size [-2] dest size [8096], source addr [0X7FC135B3096]+1, dest addr [0X7FFB64A0DF2]+1,
2013-03-26 07:50:04 0x12D4 ERR 20011 MOVECODE + -2 bytes to copy
What can I do to correct this, and how can I avoid it?
TIA
Frédéric.
Hi,
No response...
But I have a question: the table 0000000000271703 is only used for statistics.
So I've renamed it, created a new one and populated it.
So for now I have the old table, containing data and "garbage", and the new one with correct data.
I've also changed the way I populate the table, so I don't get corrupted data anymore (except in my "old" table).
If I drop the old table, will that solve my problem?
Tatayo.
Hi,
yes, if a table is corrupt, you can rename it and copy all or as many entries as possible to a new table. Just keep in mind that during the 'rename' of a table, all existing objects like indexes, views etc. will continue to point to the original object. So what you would need to do afterwards is to recreate any of these objects...
And, yes, unless the 'corrupt' table is dropped, you will continue to get errors whenever that corrupt structure is accessed e.g. during a 'check data' or 'statistic update'. Therefore you need to drop it first.
In general, I recommend running a 'check data' once a week. This check verifies the integrity of your database and is very important for identifying errors early on. In my opinion, these errors are almost always hardware/driver/IO system related and are not caused by the database.
On the other hand, if you have a test setup that repeatedly damages a table (as you have hinted at with your select union statement), let me know the steps to reproduce it and I will check it.
Thorsten
Hi,
Thanks for your reply.
I ran a "check database structure and clear converter cache in operational state ADMIN" last Monday and didn't get any error.
I've also set the sample value of the involved table to 0, and I no longer get error messages during statistics update.
Here is the structure of the table:
CREATE TABLE "TORPEDO"."INDICATEUR_OLD"
(
"SKU_C_CODE" Char (17) ASCII NOT NULL,
"STO_C_CODE" Char (5) ASCII NOT NULL,
"TI_C_CODE" Char (10) ASCII NOT NULL,
"IND_F_VALEUR" Float (16),
"IND_D_CREATION" Timestamp DEFAULT TIMESTAMP,
PRIMARY KEY("SKU_C_CODE", "STO_C_CODE", "TI_C_CODE"),
FOREIGN KEY "TYPE IND" ("TI_C_CODE") REFERENCES "TORPEDO"."TYPE_INDICATEUR" ("TI_C_CODE") ON DELETE RESTRICT
)
And the query used to populate it:
insert into indicateur(sku_c_code,sto_c_code,ti_c_code,ind_f_valeur)
select '0100012','2ANGE','PA',23 from dual
union
select '0100012','2ANGE','PV',17.5 from dual
union
... (between 2 and 10 unions)
update duplicates
The goal was to insert/update several records with one query. The process ran 5 times faster than running one query per insert/update.
I run something like 100,000 queries like that every day, and I've only found fewer than 10 corrupted records, once or twice a week.
The problem only occurs on the 64-bit system.
Frédéric.
Ok, this sounds more like a MaxDB bug.
Can you clarify how you have identified table "TORPEDO" being accessed when the error occurred? Did you use the root# given in the knldiag message file to find out the table name?
What about the check data? Did you run it while the supposedly damaged table was still in the database or after you had dropped that table?
Which database version are you using on your 32 bit Windows server - is it really the same?
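To make it easier to cross-check which table and page the messages refer to, the relevant numbers can be pulled out of the error lines. The following is only a minimal Python sketch, assuming the line format shown in the log excerpts earlier in this thread; the function name `summarize_corruption` is hypothetical, and reading the number on the wrapped "ERR 20015 Data" line as the root is an assumption.

```python
import re

# Patterns derived only from the error lines quoted in this thread.
PAGE_RE = re.compile(r"on page (\d+) of root")
# Assumption: the root number is wrapped onto the following "Data <number>" line.
ROOT_RE = re.compile(r"ERR\s+20015\s+Data\s+(\d+)\s*$")
TABLE_RE = re.compile(r"collecting statistics for table ([0-9A-Fa-f]+); error (\w+)")


def summarize_corruption(lines):
    """Collect page, root and table identifiers from corruption messages."""
    pages, roots, tables = set(), set(), set()
    for line in lines:
        m = PAGE_RE.search(line)
        if m:
            pages.add(int(m.group(1)))
        m = ROOT_RE.search(line)
        if m:
            roots.add(int(m.group(1)))
        m = TABLE_RE.search(line)
        if m and m.group(2) == "data_page_corrupted":
            tables.add(m.group(1))
    return {"pages": sorted(pages), "roots": sorted(roots), "tables": sorted(tables)}


sample = [
    "2013-03-26 07:50:04 0xCBC ERR 20015 Data Base error 'System error: BD Data page corrupted' "
    "during update statistic on record index 0 on page 2909347 of root",
    "2013-03-26 07:50:04 0xCBC ERR 20015 Data 2880955",
    "2013-03-26 07:50:04 0x12D4 ERR 42 SrvTasks + Error while collecting statistics "
    "for table 0000000000271703; error data_page_corrupted",
]
print(summarize_corruption(sample))
```

Running this over the whole message file would show whether every occurrence points at the same page and table, which is what the questions above are trying to establish.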
Thorsten
Hi Frederic,
What process did you follow to move the database ?
Regards,
Deepak Kori
Hi,
I backed up the database on the original server and restored it on the new server.
I did this on our test server, and ran several tests over 3 weeks to see if everything was OK.
I've investigated a bit more and found something interesting: in both cases the same table was "damaged". And this is the only table I've populated with queries like this:
insert into MyTable(col1,col2,col3)
select 1,2,3 from dual
union all
select 3,5,2 from dual
union all
select 6,2,2 from dual
...
I've renamed my table, created a new one with the same name/structure, and changed my queries to something more "classic" (insert into() values() ).
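For illustration, the switch from one multi-row UNION insert to one statement per row could look roughly like the Python sketch below. It only builds the statement text and parameter tuples and does not talk to the database. The helper name `build_single_row_inserts`, the `?` placeholder style, and keeping the `UPDATE DUPLICATES` clause on a `VALUES` insert are assumptions, not something confirmed in this thread.

```python
def build_single_row_inserts(table, columns, rows):
    """Return one parameterized INSERT per row instead of a single UNION insert."""
    col_list = ", ".join(columns)
    # Assumption: the driver in use accepts "?" as parameter placeholder.
    placeholders = ", ".join("?" for _ in columns)
    # Assumption: UPDATE DUPLICATES is kept so existing keys are still updated.
    sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) UPDATE DUPLICATES"
    statements = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match column count")
        statements.append((sql, tuple(row)))
    return statements


statements = build_single_row_inserts(
    "indicateur",
    ["sku_c_code", "sto_c_code", "ti_c_code", "ind_f_valeur"],
    [("0100012", "2ANGE", "PA", 23), ("0100012", "2ANGE", "PV", 17.5)],
)
for sql, params in statements:
    print(sql, params)
```

Each (statement, parameters) pair would then be executed separately, which is slower than the UNION variant but avoids the construct that seemed to be involved in the corruption.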
Frédéric.