incremental backup errors?

former_member192710
Participant
0 Kudos

Folks;

in our environment, we run an incremental backup every hour. Browsing through the log files, I found that these backups sometimes fail for reasons that are not completely clear to me. Two different kinds of errors show up:

- Most of the failed incremental backups failed like this:

-24988,ERR_SQL: SQL error
-903,Host file I/O error
3,Data backup failed

- Some other incrementals, however, failed in a more verbose yet still somewhat obscure way (for example, the volume BFDATA_CURRENT3 was created without any problem during the same backup run the day before):

-24988,ERR_SQL: SQL error
-903,Host file I/O error
3,Data backup failed
1,Backupmedium #1 (BFDATA_CURRENT3) Could not create volume
6,Backup error occured, Error code 3700 "hostfile_error"
17,Servertask Info: because Error in backup task occured
10,Job 1 (Backup / Restore Medium Task) [executing] WaitingT154 Result=3700
6,Error in backup task occured, Error code 3700 "hostfile_error"

As there seems to be no real pattern as to which error appears, or why and when a backup fails or succeeds, I am pretty confused about this. At the moment, this is still MaxDB 7.6.03 on 32-bit Linux. Ideas, anyone?

Thanks in advance and kind regards,

Kristian

Accepted Solutions (0)

Answers (1)

lbreddemann
Active Contributor
0 Kudos

Hi there,

this looks like a problem with actually accessing the backup file.

Is there enough space in the file system for it?

Can the Kernel access the file?

regards,

Lars

former_member192710
Participant
0 Kudos

Hi Lars;

and first off, thanks for your quick response on that. To answer your questions:

- As far as I can tell, there's plenty of room left in the file system to store the backups (72 GB volume, 44 GB free; the database itself is around 10 GB, with each of the incrementals between 3 and 250 MB).

- The folder where these files are stored is rwx for user sdb and group sdba; the files in there are rw- for both user sdb and group sdba. The machine is attached to an Active Directory domain; the backup is triggered by cron under one of the AD user accounts, which is a member of the local sdba group.

I am not really sure, but what surprises me is that it doesn't always fail... The full backup (done at night) has always worked so far. The incremental backup (done 12 times per day), for example, failed three times yesterday and five times the day before... Strange. Does this fit in somewhere?

Cheers and thanks again,

Kristian

Melanie
Advisor
0 Kudos

Hi Kristian,

have you checked the knldiag or knldiag.err file to find more information about these backup errors?

Are you always using the same backup template for the backups? That is, are you always writing the backup to the same file?

Do you use the overwrite option for the backup template?

Regards, Melanie

former_member192710
Participant
0 Kudos

Hi Melanie;

and first off, thanks very much for pointing me there. This is what knldiag shows:

2009-09-03 11:00:17  4793 ERR 11277 IPC      create_sem: semget error, No space left on device
2009-09-03 11:00:17  4793 ERR 11000 d0_aopen Error during creating semaphore, resource problem 
2009-09-03 11:00:17  4809 ERR 52012 SAVE     error occured, basis_err 3700
2009-09-03 11:00:17  4809 ERR     3 Backup   Data backup failed
2009-09-03 11:00:17  4809 ERR     1 Backup    +   Backupmedium #1 (BFDATA_CURRENT4) Could not create volume
2009-09-03 11:00:17  4809 ERR     6 KernelCo  +   Backup error occured, Error code 3700 "hostfile_error"
2009-09-03 11:00:17  4809 ERR    17 SrvTasks  +   Servertask Info: because Error in backup task occured
2009-09-03 11:00:17  4809 ERR    10 SrvTasks  +   Job 1 (Backup / Restore Medium Task) [executing] WaitingT148 Result=3700
2009-09-03 11:00:17  4809 ERR     6 KernelCo  +   Error in backup task occured, Error code 3700 "hostfile_error"
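When several ERR lines share one timestamp, as in the excerpt above, the first of them is often closer to the root cause than the backup messages that follow it. A minimal Python sketch for pulling out that first line per timestamp is shown here. The column layout (date, time, thread id, level, message number, component, text) is only inferred from the excerpt, and the function name `first_errors` is made up for illustration.

```python
import re
from collections import OrderedDict

# Layout assumed from the knldiag excerpt above:
# date time, thread id, level, message number, component, message text.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\d+)\s+(ERR)\s+(\d+)\s+(\S+)\s+(.*)$"
)


def first_errors(lines):
    """Return {timestamp: (component, text)} for the first ERR line per timestamp."""
    firsts = OrderedDict()
    for line in lines:
        match = LINE_RE.match(line.rstrip())
        if match is None:
            continue  # not an ERR line in the assumed layout
        stamp, _thread, _level, _msgno, component, text = match.groups()
        # setdefault keeps the earliest entry for each timestamp
        firsts.setdefault(stamp, (component, text.strip()))
    return firsts


sample = [
    "2009-09-03 11:00:17  4793 ERR 11277 IPC      create_sem: semget error, No space left on device",
    "2009-09-03 11:00:17  4793 ERR 11000 d0_aopen Error during creating semaphore, resource problem ",
    "2009-09-03 11:00:17  4809 ERR     3 Backup   Data backup failed",
]

for stamp, (component, text) in first_errors(sample).items():
    print(stamp, component, text)
```

To run it against a real file, pass the open file object instead of `sample`; where knldiag lives depends on the installation.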

Well... My backup volumes have no initial size set (created using dbmgui) and are set to "overwrite", re-used daily. My initial assumption was that, this way, the backup medium would be overwritten and (as a file) recreated whenever it is used for a backup, but I am coming to the conclusion that this might not be how things work... ?

Cheers and thanks again,

Kristian

Melanie
Advisor
0 Kudos

Hello Kristian,

well, the error is not about space in the file system, but about semaphores. The database instance requires additional semaphores to perform the backup. However, all of the semaphores available for the system are already being used. This is why the backup terminates.

You need to increase the number of available semaphores at OS level. That should solve the problem.
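On Linux, the current limits can be read from /proc/sys/kernel/sem, which holds four numbers in the order SEMMSL, SEMMNS, SEMOPM, SEMMNI. semget() reports ENOSPC ("No space left on device") when the system-wide limit on semaphore sets (SEMMNI) or on semaphores in total (SEMMNS) would be exceeded. A minimal Python sketch for reading these values follows; the function names are made up for illustration, and the sample numbers are only an example, not a recommendation.

```python
from pathlib import Path

# Field order of /proc/sys/kernel/sem on Linux.
SEM_FIELDS = ("semmsl", "semmns", "semopm", "semmni")


def parse_sem(text):
    """Parse the content of /proc/sys/kernel/sem into a dict of limits."""
    values = [int(v) for v in text.split()]
    if len(values) != len(SEM_FIELDS):
        raise ValueError("expected 4 values, got %d" % len(values))
    return dict(zip(SEM_FIELDS, values))


def read_sem(path="/proc/sys/kernel/sem"):
    """Read the limits from a running Linux system."""
    return parse_sem(Path(path).read_text())


# Example input only; real values differ from system to system.
limits = parse_sem("250\t32000\t32\t128")
print(limits["semmni"])  # limit on the number of semaphore sets
```

The limits are changed through the kernel.sem sysctl setting, which takes the same four values in the same order.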

Your assumptions regarding the overwriting of the backup files are correct.

You should just make sure that you save the backup file to a different location before the next backup starts overwriting the file! Otherwise you might end up with no backup at all - e.g. if the second backup fails...
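As a rough illustration of saving the file before it gets overwritten, here is a minimal Python sketch that copies a backup file into an archive directory under a timestamped name. The function name `archive_backup` and all paths are hypothetical; the demo at the end only uses a temporary directory with a dummy file.

```python
import shutil
import tempfile
import time
from pathlib import Path


def archive_backup(backup_file, archive_dir, now=None):
    """Copy backup_file into archive_dir as '<name>.<YYYYmmdd-HHMMSS>'."""
    src = Path(backup_file)
    dest_dir = Path(archive_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # now=None means "use the current time"
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    dest = dest_dir / (src.name + "." + stamp)
    shutil.copy2(src, dest)  # copy2 also preserves the modification time
    return dest


# Demo with a dummy file; the file name is only borrowed from the thread.
with tempfile.TemporaryDirectory() as tmp:
    dummy = Path(tmp) / "BFDATA_CURRENT3"
    dummy.write_bytes(b"dummy backup content")
    copied = archive_backup(dummy, Path(tmp) / "archive")
    copied_name_ok = copied.name.startswith("BFDATA_CURRENT3.")
    copied_content_ok = copied.read_bytes() == b"dummy backup content"
```

Such a step would have to run after a backup has finished and before the next one starts, and only when the backup actually succeeded.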

Best regards,

Melanie

former_member192710
Participant
0 Kudos

Hi Melanie;

and first off, thanks very much for your pointers. So I am about to increase the number of semaphores available to the system and see how this works out. By the way, I remember another RDBMS having pretty extensive documentation on which Linux kernel parameters should be set or modified before even considering an installation. Is there something comparable for MaxDB? Should I adjust other kernel settings besides the number of available semaphores to be absolutely sure?

>

> Your assumptions regarding the overwriting of the backup files are correct.

> You should just make sure that you save the backup file to a different location before the next backup starts overwriting the file! Otherwise you might end up with no backup at all - e.g. if the second backup fails...

Well yes I am at least copying out the volumes to some other host after the backup has been done in order to have them dumped to tape each evening. But my question is another one: Is an existing backup file deleted and recreated before backup starts, or is it merely treated as a "volume" which will be written to? Can a "small" incremental backup file prevent a "larger" incremental dump from being fully written?

Thanks again and best regards,

Kristian

lbreddemann
Active Contributor
0 Kudos

> is there something comparable to this related to MaxDB? Should I play with other kernel settings than the number of available semaphores to be absolutely sure?

Sure there is such a thing

Check SAP note [628131 SAP DB / MaxDB operating system parameters on Unix|https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/oss_notes/sdn_oss_bc_db/~form/handler%7b5f4150503d3030323030363832353030303030303031393732265f4556454e543d444953504c4159265f4e4e554d3d363238313331%7d].

In general, the semmni parameter should be well above 9000.

A rule-of-thumb formula for the value would be (stolen from the note):


MaxDBs semmni >=     MAXUSERTASKS 
                        + (number of UKTs + 10) 
                        + (number of volumes * _IOPROCS_PER_DEV) 
                        + number of data volumes 
                        + number of backup devices
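The rule of thumb above is plain addition, so it can be sketched in a few lines of Python. The function name `min_semmni` and the input numbers are made up for illustration; the real values have to come from the parameters of the database instance.

```python
def min_semmni(maxusertasks, ukts, volumes, ioprocs_per_dev,
               data_volumes, backup_devices):
    """Lower bound for semmni according to the rule of thumb quoted above."""
    return (maxusertasks
            + (ukts + 10)
            + volumes * ioprocs_per_dev
            + data_volumes
            + backup_devices)


# Hypothetical instance: 50 user tasks, 8 UKTs, 5 volumes with
# _IOPROCS_PER_DEV = 2, 4 data volumes, 2 backup devices.
example = min_semmni(maxusertasks=50, ukts=8, volumes=5,
                     ioprocs_per_dev=2, data_volumes=4, backup_devices=2)
print(example)  # 50 + 18 + 10 + 4 + 2 = 84
```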

>

> Well yes I am at least copying out the volumes to some other host after the backup has been done in order to have them dumped to tape each evening. But my question is another one: Is an existing backup file deleted and recreated before backup starts, or is it merely treated as a "volume" which will be written to? Can a "small" incremental backup file prevent a "larger" incremental dump from being fully written?

No, the new backup is effectively a new file - otherwise the file size would stay as large as the largest backup you wrote to it, which is not exactly what one might want here...

regards,

Lars

former_member192710
Participant
0 Kudos

Hi Lars;

and again thanks for the pointers.

> Sure there is such a thing

> Check SAP note [628131 SAP DB / MaxDB operating system parameters on Unix|https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/com.sap.km.cm.docs/oss_notes/sdn_oss_bc_db/~form/handler%7b5f4150503d3030323030363832353030303030303031393732265f4556454e543d444953504c4159265f4e4e554d3d363238313331%7d].

Hmmm. The document doesn't seem to be accessible to me (403). Is it limited to SAP customers only?

> In general, the semmni parameter should be well above 9000.

> A rule-of-thumb formula for the value would be (stolen from the note):

>


> MaxDBs semmni >=     MAXUSERTASKS 
>                         + (number of UKTs + 10) 
>                         + (number of volumes * _IOPROCS_PER_DEV) 
>                         + number of data volumes 
>                         + number of backup devices

>

OK. I will revise my settings and see how things move along.

> No, the new backup is effectively a new file - otherwise the file size would stay as large as the largest backup you wrote to it, which is not exactly what one might want here...

Thanks for clarifying this.

Best regards,

Kristian

Edited by: Kristian Rink on Sep 4, 2009 10:33 AM
