Archiving to SAP Content Server/MaxDB: TCP/IP errors when deleting in paral

Former Member
0 Kudos

Dear experts.

We are struggling to find the optimal setup for archiving to SAP Content Server using MaxDB. After running one or more archiving jobs for a few hours we suddenly lose the connection to the Content Server, and we need to restart the Content Server before continuing. For a long time we thought the log space was running full, but this has been ruled out. Now it seems that the error occurs when 4 or more deletion jobs are running in parallel. (Write->Store->Delete).

The run times for a typical archiving job of about 3 GB of data would be W: 3000 sec, S: 40 sec, D: 12000 sec. In general the delete job takes about 3 times as long as the write job.

Everything seems fine until we suddenly get the TCP/IP error after some heavy archiving jobs.

Has anyone experienced anything like this, and are there any good ideas on how to avoid it?

Thank you very much for your inputs on this.

Accepted Solutions (0)

Answers (2)

Former Member
0 Kudos

Thank you for your generous willingness to help. Our workaround is working so well that we have chosen not to pursue this problem until we upgrade all systems. It appears that, due to compatibility issues with the current releases of R/3, Content Server and Win 2003 Server, we could not go ahead and update the Content Server alone. Best regards,

Syver

Melanie
Advisor
0 Kudos

Hello Syver,

Thomas did not recommend updating the Content Server (although that would be a good idea, too); he recommended updating the ODBC driver.

You could do that and keep the content server itself on its current version.

Regards,

Melanie

lbreddemann
Active Contributor
0 Kudos

This does not seem to be a MaxDB problem, but rather one of the Content Server webserver part.

Are there any error messages in the database message file (KNLDIAG)?

regards,

Lars

Former Member
0 Kudos

Thank you for your reply, Lars.

The Content Server has been running flawlessly for the last two days, but in that time only CO/PA items have been archived, and the archive job only works on one file at a time. So there are no errors to see in the KNLDIAG file. I will check the next time the error occurs and update this post, probably over the weekend.

It seems that the TCP/IP error comes after the write job has finished, while the delete job is running. That is when R/3 verifies the stored archive as the data is deleted. When several files are being deleted at the same time, it fails. The Content Server is running on Win 2003 Server.

Thank you.

/Syver

TTK
Employee
0 Kudos

Hello Syver

I remember similar problems on Windows systems and TCP/IP, when the OS resources are heavily used, i.e. here open/close many connections to the db.

If you use Windows:

Please look at

[http://support.microsoft.com/default.aspx?scid=kb;en-us;196271]

and

[http://technet.microsoft.com/de-de/library/bb726981(en-us).aspx]

Did you use connection pooling?

Regards Thomas
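
As background to the resource concern above, here is a rough back-of-the-envelope sketch in Python of why opening and closing many short-lived connections can exhaust client ports on Windows. The figures are assumptions, not measurements from this system: ephemeral ports 1025 to 5000 and a TIME_WAIT of 240 seconds are, as far as I know, the Windows Server 2003 defaults behind the `MaxUserPort` and `TcpTimedWaitDelay` registry values, and the "tuned" values are just an example. The function name is made up for illustration.

```python
def max_sustained_connects_per_sec(max_user_port=5000, first_port=1025, time_wait_sec=240):
    """Rough upper bound on new outbound connections per second.

    Each closed connection keeps its local port in TIME_WAIT for
    time_wait_sec seconds, so the usable port range divided by that
    delay bounds the sustainable connect rate.
    """
    usable_ports = max_user_port - first_port + 1
    return usable_ports / time_wait_sec


# Assumed defaults versus an example of raised limits (both hypothetical here).
default_rate = max_sustained_connects_per_sec()
tuned_rate = max_sustained_connects_per_sec(max_user_port=65534, time_wait_sec=30)

print(f"default: {default_rate:.1f} connects/s, tuned: {tuned_rate:.1f} connects/s")
```

If several parallel jobs each open and close connections faster than this bound, new connects can start to fail even though the network itself looks healthy.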

Former Member
0 Kudos

We were running 3 parallel delete jobs yesterday, and it all went fine until I started the 4th job; then all jobs stopped immediately. The KNLDIAG file at the exact time of the problem looks like this:

2008-10-08 22:57:48 0x918 19617 DEVIO Single I/O attach, 'D:\sapdb\SDB\sapdata\DISKD0025', UKT:8

2008-10-08 22:57:48 0x918 19617 DEVIO Single I/O attach, 'D:\sapdb\SDB\sapdata\DISKD0003', UKT:8

2008-10-08 23:31:14 0x918 19637 CONNECT 'vreceive', COMMAND TIMEOUT, T74

2008-10-08 23:31:14 0x914 19637 CONNECT 'vreceive', COMMAND TIMEOUT, T72

2008-10-08 23:31:14 0x918 19651 CONNECT Connection released, T74

2008-10-08 23:31:14 0x914 19651 CONNECT Connection released, T72

This morning I ran another test with 2 jobs, which also failed. The two job logs contain:

09.10.2008 07:06:41 Archive file 000851-001EC_PCA_ITM is being verified

09.10.2008 07:16:54 Archive file 000851-001EC_PCA_ITM is being processed

09.10.2008 07:16:55 Starting deleting data

09.10.2008 07:36:28 Connection to http://192.1.4.5:1090/ContentServer/ContentServer.: TCP/IP error

and

09.10.2008 07:13:30 Archive file 000851-010EC_PCA_ITM is being verified

09.10.2008 07:20:55 Archive file 000851-010EC_PCA_ITM is being processed

09.10.2008 07:20:56 Starting deleting data

09.10.2008 07:39:28 Connection to http://192.1.4.5:1090/ContentServer/ContentServer.: Time limit exceeded

and the KNLDIAG file looked like this:

2008-10-09 07:11:34 0x914 19617 DEVIO Single I/O attach, 'D:\sapdb\SDB\sapdata\DISKD0016', UKT:7

2008-10-09 07:11:34 0x914 19617 DEVIO Single I/O attach, 'D:\sapdb\SDB\sapdata\DISKD0023', UKT:7

2008-10-09 07:11:35 0x90C 53040 SAVPOINT (3) Stop Conv I/O Pages 2119 IO 265

2008-10-09 07:11:35 0x90C 53071 SAVPOINT B20SVP_COMPLETED: 689

2008-10-09 07:12:26 0x918 19633 CONNECT Connect req. (T74, Node:'', PID:4272)

2008-10-09 07:12:26 0x918 19617 DEVIO Single I/O attach, 'D:\sapdb\SDB\sapdata\DISKD0001', UKT:8

2008-10-09 07:12:26 0x918 19617 DEVIO Single I/O attach, 'D:\sapdb\SDB\sapdata\DISKD0015', UKT:8

2008-10-09 07:12:26 0x918 19617 DEVIO Single I/O attach, 'D:\sapdb\SDB\sapdata\DISKD0004', UKT:8

2008-10-09 07:12:26 0x918 19651 CONNECT Connection released, T74

2008-10-09 07:12:35 0x918 19633 CONNECT Connect req. (T74, Node:'', PID:4272)

2008-10-09 07:12:35 0x918 19651 CONNECT Connection released, T74

2008-10-09 07:12:46 0x918 19633 CONNECT Connect req. (T74, Node:'', PID:4272)

2008-10-09 07:12:47 0x918 19651 CONNECT Connection released, T74

2008-10-09 07:31:47 0x914 19637 CONNECT 'vreceive', COMMAND TIMEOUT, T72

2008-10-09 07:31:47 0x914 19651 CONNECT Connection released, T72

2008-10-09 07:47:20 0x914 19633 CONNECT Connect req. (T72, Node:'', PID:4272)

2008-10-09 07:47:20 0x914 19651 CONNECT Connection released, T72

2008-10-09 07:49:01 0x914 19633 CONNECT Connect req. (T72, Node:'', PID:4272)

2008-10-09 07:49:01 0x914 19651 CONNECT Connection released, T72

2008-10-09 07:49:27 0x910 19637 CONNECT 'vreceive', COMMAND TIMEOUT, T70

2008-10-09 07:49:27 0x910 19651 CONNECT Connection released, T70

We have not yet looked at the tips regarding the Windows errors, but will do that as soon as we get the chance. In the meantime, can you read anything from the above? It looks like when several jobs crash at the same time, one of them gets "Time limit exceeded" and the rest get the TCP/IP error.

Could it possibly be the MAXUSERSESSIONS parameter, or are you still leaning towards something outside the database?
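
A minimal Python sketch for pulling the COMMAND TIMEOUT entries out of a KNLDIAG excerpt like the ones above, so that they can be lined up against the job log times. The field layout (timestamp, thread id, message number, label, text) is only inferred from the lines pasted in this thread, and the function and variable names are illustrative.

```python
import re
from datetime import datetime

# Layout inferred from the excerpts in this thread (an assumption):
# date time thread-id message-number label text
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<tid>0x[0-9A-Fa-f]+)\s+(?P<msg_id>\d+)\s+(?P<label>\S+)\s+(?P<text>.*)$"
)
TASK_RE = re.compile(r"\bT(\d+)\b")  # task ids such as T72


def find_command_timeouts(lines):
    """Return (timestamp, task) tuples for every COMMAND TIMEOUT line."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or "COMMAND TIMEOUT" not in m["text"]:
            continue
        task = TASK_RE.search(m["text"])
        timestamp = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
        events.append((timestamp, f"T{task.group(1)}" if task else None))
    return events


# Three lines copied from the excerpt above.
sample = [
    "2008-10-09 07:12:47 0x918 19651 CONNECT Connection released, T74",
    "2008-10-09 07:31:47 0x914 19637 CONNECT 'vreceive', COMMAND TIMEOUT, T72",
    "2008-10-09 07:49:27 0x910 19637 CONNECT 'vreceive', COMMAND TIMEOUT, T70",
]

events = find_command_timeouts(sample)
for timestamp, task in events:
    print(timestamp, task)
```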

TTK
Employee
0 Kudos

"TCP/IP error" sounds more like a resource problem of the OS. At least I cannot see anything pointing to a DB problem.

If the resource problem is addressed by the links I sent previously, connection pooling of the driver manager might help, too.
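
To illustrate what connection pooling means in general, here is a small Python sketch. It is not how the ODBC driver manager implements pooling and it uses no real database; `FakeConnection` and `ConnectionPool` are hypothetical stand-ins that only show the idea of reusing a few open connections instead of opening and closing one per request.

```python
import queue


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""

    opened = 0  # counts how many connections were ever opened

    def __init__(self):
        FakeConnection.opened += 1


class ConnectionPool:
    """Keeps a fixed set of open connections and hands them out for reuse."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeConnection())

    def acquire(self):
        # Blocks until a connection is free instead of opening a new one.
        return self._idle.get()

    def release(self, conn):
        # Return the connection to the pool; it stays open.
        self._idle.put(conn)


pool = ConnectionPool(size=2)
for _ in range(100):
    conn = pool.acquire()
    # ... one request would run on conn here ...
    pool.release(conn)

print(FakeConnection.opened)
```

With pooling, 100 requests are served by the same 2 connections, so far fewer sockets are opened and closed at the operating system level.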

Regards Thomas

Former Member
0 Kudos

Thank you Thomas.

I will implement the registry change now and retry, and then have a look at "connection pooling of the driver manager". I would first need to find someone who can help me with this, as I do not know what it means...

I will update you after the next tests.

Thank you for your reply and regards,

Syver

Former Member
0 Kudos

All right... I have now implemented both of the suggested changes, but alas, no improvement. I was optimistic after my first test with the increased number of user ports, when I ran several heavy jobs in test mode successfully, but in production it crashed again... Connection pooling is activated for SAP DB, but that brought no improvement either...

Job log in SAP:

13.10.2008 17:03:08 Connection to http://192.1.4.5:1090/ContentServer/ContentServer.: TCP/IP error

13.10.2008 17:03:08 Error during verification when accessing archive file 000851-015EC_PCA_ITM

Knldiag in CS:

2008-10-13 16:46:11 0x90C 19617 DEVIO Single I/O attach, 'D:\sapdb\SDB\sapdata\DISKD0029', UKT:7

2008-10-13 17:16:07 0x90C 19637 CONNECT 'vreceive', COMMAND TIMEOUT, T72

2008-10-13 17:16:07 0x90C 19651 CONNECT Connection released, T72

I have had network experts monitor the network during the error, and they found nothing exceptional.

This is a SAP DB Version 7.4.3.23 running on 2003 Server installed on 64bit hardware (32 bit compatible). Could this cause any problem?

I have however found a workaround in scheduling the delete jobs using RSARCHD with max jobs:1. However, the core of the problem still exists...

Best regards,

Syver

TTK
Employee
0 Kudos

If the version of the ODBC driver is also 7.4.3.23, please upgrade to the latest driver available in 7.4, 7.5 or 7.6. Build 23 is quite old (over 5 years, IIRC). Please check the OSS notes before updating.

You can find the version via the "Properties" context menu entry in Explorer.

Regards Thomas