/hana/data/ filled up at 100% - after volume increase HDB will not start

Former Member
0 Kudos

Good morning guys,

About a week ago, our database (512GB RAM) "crashed". Upon investigation, we realized that the data volume had reached 100% (750GB). We subsequently increased it by 500GB and then tried to start the DB. The indexserver would not start, and other services dependent on it also would not, complaining that the "Master index server is not available yet". We left this to run overnight. This morning, we found that our new 1.3TB has also filled up. The index server and other services still will not start. Is there a reason for this? Where can one look? The index server trace file raises a DISKFULL event. Even if we increase this by another 1TB, will it not fill up again? Assistance will be appreciated.
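Editor's sketch: before and after each volume increase, it can help to record how full the filesystem really is, so that growth during a start attempt is visible as numbers (750GB plus 500GB is 1250GB, the roughly 1.3TB mentioned above). The following is a minimal sketch using only the Python standard library. The function names are illustrative, and the percentage may differ slightly from what `df` prints because `df` excludes reserved blocks.

```python
import shutil


def usage_percent(path):
    """Return how full the filesystem holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def free_gib(path):
    """Return the free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # "." is a placeholder; point this at the data volume mount point instead.
    print(f"{usage_percent('.'):.1f}% used, {free_gib('.'):.1f} GiB free")
```

Running this every few minutes while the indexserver tries to start would show how quickly the new space is being consumed.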

Accepted Solutions (0)

Answers (2)

former_member183326
Active Contributor
0 Kudos

Index server shows:

733]{-1}[-1/-1] 2016-06-14 08:02:03.353555 i EventHandler     EventManagerImpl.cpp(00689) : --event not handled: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= HANDLING]

[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353652 i EventHandler     EventManagerImpl.cpp(00683) : handleEvents(1)

[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353661 i EventHandler     EventManagerImpl.cpp(00686) : --handleEvent: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= NEW]

[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353664 i EventHandler     LocalFileCallback.cpp(00426) : [DISKFULL] restarting queue with 10751 requests

[9289]{-1}[-1/-1] 2016-06-14 08:03:03.354360 i EventHandler     EventManagerImpl.cpp(00689) : --event not handled: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= HANDLING]

[9640]{-1}[-1/-1] 2016-06-14 08:04:03.354451 i EventHandler     EventManagerImpl.cpp(00683) : handleEvents(1)

[9640]{-1}[-1/-1] 2016-06-14 08:04:03.354458 i EventHandler     EventManagerImpl.cpp(00686) : --handleEvent: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= NEW]

[9640]{-1}[-1/-1] 2016-06-14 08:04:03.354462 i EventHandler     LocalFileCallback.cpp(00426) : [DISKFULL] restarting queue with 14326 requests

[9640]{-1}[-1/-1] 2016-06-14 08:04:03.355172 i EventHandler     EventManagerImpl.cpp(00689) : --event not handled: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= HANDLING]

[9316]{-1}[-1/-1] 2016-06-14 08:05:03.355261 i EventHandler     EventManagerImpl.cpp(00683) : handleEvents(1)

[9

Please see SAP Note 1679938

From this we can see that the disk is full: there is no space left in the data volume /hana/data/.
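Editor's sketch: the trace excerpt above can be summarized mechanically. The snippet below is a minimal sketch, assuming the trace lines keep the format shown in this thread. The regular expressions and the function name are illustrative and are not part of any SAP tool.

```python
import re

# Patterns written against the trace lines quoted in this thread (an assumption:
# other HANA revisions may format these messages differently).
EVENT_RE = re.compile(r"DiskFullEvent\[id= *(\d+), path= *([^,\]]+), state= *(\w+)\]")
QUEUE_RE = re.compile(r"\[DISKFULL\] restarting queue with (\d+) requests")


def summarize_diskfull(lines):
    """Return (sorted affected paths, queue sizes in order of appearance)."""
    paths = set()
    queue_sizes = []
    for line in lines:
        event = EVENT_RE.search(line)
        if event:
            paths.add(event.group(2).strip())
        queue = QUEUE_RE.search(line)
        if queue:
            queue_sizes.append(int(queue.group(1)))
    return sorted(paths), queue_sizes


# Three lines copied from the excerpt above.
SAMPLE = [
    "[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353661 i EventHandler     EventManagerImpl.cpp(00686) : --handleEvent: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= NEW]",
    "[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353664 i EventHandler     LocalFileCallback.cpp(00426) : [DISKFULL] restarting queue with 10751 requests",
    "[9640]{-1}[-1/-1] 2016-06-14 08:04:03.354462 i EventHandler     LocalFileCallback.cpp(00426) : [DISKFULL] restarting queue with 14326 requests",
]

if __name__ == "__main__":
    print(summarize_diskfull(SAMPLE))
```

On the sample lines, the only affected path is the hdb00003 data volume, and the number of queued requests grows from one minute to the next, which suggests that writes keep piling up while the event stays unhandled.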

The only way to free space in /hana/data/ is to back up your data, so you need to run a successful backup of what is in /hana/data/ to some other location. If the backup takes time, you may be able to free space by following the workaround in Note 1679938, applied in this case to the data volume rather than the log volume. What happens when you try this workaround?

Were any changes made in the system lately, such as large data loads? Note 2146989 could explain the large disk size in your case. Please attach the output of SYS.M_UNIFIED_TABLE_PERSISTENCE_STATISTICS.

Also, please try ALTER SYSTEM RECLAIM DATAVOLUME.

Please see Question 33 in Note 2000003 and also the information in Note 1870858.

BR

Michael

Former Member
0 Kudos

Thanks Mike.

Note 1679938 - Log Volume is full talks about the log volume.  However, in our case, it is the data volume which has filled up.  We have subsequently increased the space by an additional 500GB. However, the DB still does not come up.

We have copied the data in /hana/data to another location; however, we are unable to run a DB backup with Studio because the DB is down.

former_member183326
Active Contributor
0 Kudos

Hi,

Please check SAP Note 2083715 - Analyzing log volume full situations and work with your OS team to check the filesystem space.

You may want to remove some trace files, or any crash dump or core files that were created in the past.
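Editor's sketch: to find candidates such as old trace files or core dumps, one option is to list the largest files under a directory. This is a minimal sketch with the standard library only. `largest_files` is an illustrative name, and the directory to scan is left as a parameter because the thread does not state where the trace files live.

```python
import os


def largest_files(root, count=10):
    """Return up to `count` (size_in_bytes, path) tuples under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                # File vanished or is unreadable (e.g. a broken symlink); skip it.
                continue
    sizes.sort(reverse=True)
    return sizes[:count]


if __name__ == "__main__":
    # "." is a placeholder; pass the directory you want to inspect.
    for size, path in largest_files("."):
        print(f"{size / 2**20:10.1f} MiB  {path}")
```

The script only reports files. Deciding what is safe to delete is still a matter for the DBA and the OS team.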

Former Member
0 Kudos

Hi

I moved the files and restarted the DB. No change...system still does not come up. I ran the "top" command and noticed that the hdbindexserver utilization is over 100%.  Is this normal? Could this be contributing to the DB not starting?

davidebruno
Participant
0 Kudos

there is something strange..

can you please stop HANA, and clean shared memory.

Before restarting the instance, you have to be sure that there is no memory still allocated by the HANA user on the OS

(command ipcs from Linux)

after the cleanipc command, try to restart the hana instance

former_member183326
Active Contributor
0 Kudos

Can you please check if the THP is deactivated:

SAP Note 2131662

Former Member
0 Kudos

Hi.  THP is deactivated.

former_member183326
Active Contributor
0 Kudos

Did you go through everything in my reply?

Did you try ALTER SYSTEM RECLAIM DATAVOLUME?

Former Member
0 Kudos

Don't I need to execute that in hdbsql? The DB is offline, so I cannot even connect with hdbsql. We have concluded that it is better to restore from a DB backup. I am also facing another strange issue there. We have an initial backup (taken immediately after installation) and want to restore that. Here is what I do and the error I receive. Kindly correct me if I am wrong.

HDBSettings.sh recoverSys.py --command="RECOVER DATABASE UNTIL TIMESTAMP '2015-05-15 17:00:00' USING DATA PATH ('location of the backup') CLEAR LOG"

After I run that command...the output I receive is this...

own pid: 40416

recoverSys started: 2016-06-15 02:16:05

testing master: jssfmd0

jssfmd0 is master

shutdown database, timeout is 120

stop system

stop system: jssfmd0

stopping system: 2016-06-15 02:16:05

stopped system: 2016-06-15 02:16:05

creating file recoverInstance.sql

restart database

restart master nameserver: 2016-06-15 02:16:20

start system: jssfmd0

2016-06-15T02:16:23+02:00  P041741      155516a8602 ERROR   RECOVERY RECOVER DATA finished with error: [448] recovery could not be completed, [71000257] incorrect syntax near "CLEAR"

2016-06-15T02:16:23+02:00  P041741      155516a8602 ERROR   RECOVERY RECOVER DATA finished with error: [448] recovery could not be completed, [71000257] incorrect syntax near "CLEAR"

start system: jssfmd0 failed: start of nameserver failed

recoverSys failed: 2016-06-15 02:16:24

But where is the incorrect syntax near CLEAR in my command???

former_member182967
Active Contributor
0 Kudos

Hi Polokego,

I'm not sure if it is a syntax problem. Can you initiate the recovery directly from HANA studio?

Regards,

Ning

Former Member
0 Kudos

@Mike...that is what I used.  @Ning, the DB is offline so I won't get anywhere with HANA studio.

former_member182967
Active Contributor
0 Kudos

Hi Polokego,

Just log on to the corresponding system with HANA studio (it will ask for <sid>adm and its password) and right-click to perform the recovery, whether your DB is offline or online.

Regards,

Ning

Former Member
0 Kudos

Hi Michael,

He can't do that if the DB isn't coming up.

KR,

Amerjit

Former Member
0 Kudos

Hello,

Please try with the following syntax:

python recoverSys.py --password=<password> --wait --command="RECOVER DATABASE UNTIL TIMESTAMP '2015-05-15 17:00:00' CLEAR LOG USING DATA PATH ('location of the backup') USING LOG PATH ('location of the logs')"

KR,

Amerjit
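Editor's sketch: the only structural difference between the failing command and the one suggested above is where CLEAR LOG sits. It comes directly after the timestamp instead of after the USING DATA PATH clause. The helper below is a minimal sketch that assembles the statement in the order used in the reply above. The function name is illustrative, and the clause order is copied from that reply rather than checked against the SAP HANA SQL reference.

```python
def build_recover_command(timestamp, data_path, log_path=None):
    """Assemble a RECOVER DATABASE statement with CLEAR LOG right after the timestamp.

    Clause order follows the reply in this thread (an assumption, not a verified grammar).
    """
    parts = [
        f"RECOVER DATABASE UNTIL TIMESTAMP '{timestamp}'",
        "CLEAR LOG",
        f"USING DATA PATH ('{data_path}')",
    ]
    if log_path is not None:
        parts.append(f"USING LOG PATH ('{log_path}')")
    return " ".join(parts)


if __name__ == "__main__":
    # The path strings are the same placeholders used in the thread.
    print(build_recover_command("2015-05-15 17:00:00",
                                "location of the backup",
                                "location of the logs"))
```

The resulting string is what would be passed to recoverSys.py via --command.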

Former Member
0 Kudos

Hello,

1. What do you mean by "I moved the files" ? If you moved/copied the data files to another location with more space, did you also change the basepath to point to the new location ?

2. top showing 100% doesn't have to be a concern. I've had a system show >= 3400% during startup. Look with "top -H" to see what threads are running when the system is starting up (wouldn't be surprised if it's the LogReplay). This will give you a much better idea of what is going on with the indexserver during startup. At the same time do a tail on the indexserver trace file.

3. As someone already asked, did you identify what caused this sudden data growth on the DB ?

4. I see that you've had this problem for a week or so. Have you opened a message at SAP ?

KR,

Amerjit
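Editor's sketch for point 2: `top` reports process CPU as a sum over cores, so a value above 100% simply means that more than one core is busy. The snippet below is a minimal sketch for Linux that converts such a percentage into a core count and lists the thread names of a process from `/proc`, which is roughly what `top -H` displays. The function names and the `proc_root` parameter are illustrative, and whether the thread names are meaningful depends on the process setting them.

```python
import glob
import os


def cores_busy(top_cpu_percent):
    """Convert a per-process %CPU value from top into an equivalent number of busy cores."""
    return top_cpu_percent / 100.0


def thread_names(pid, proc_root="/proc"):
    """Return the names of all threads of `pid`, read from <proc_root>/<pid>/task/*/comm."""
    names = []
    pattern = os.path.join(proc_root, str(pid), "task", "*", "comm")
    for comm_file in sorted(glob.glob(pattern)):
        try:
            with open(comm_file) as fh:
                names.append(fh.read().strip())
        except OSError:
            # The thread may have exited between listing and reading.
            continue
    return names


if __name__ == "__main__":
    print(cores_busy(3400))            # the >= 3400% startup figure mentioned above
    print(thread_names(os.getpid()))   # replace with the hdbindexserver PID
```

Combined with a tail on the indexserver trace file, this gives a rough picture of what the process is doing during startup.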

Former Member
0 Kudos

Thanks everyone for your contribution. Managed to resolve the issue by reinstalling the HANA DB & restoring an old DB backup. Thanks for your input.

Former Member
0 Kudos

You are actually right, because the DB needs to be down before you can initiate a recovery anyway! Don't know why I did not think of that. Thanks Ning!

davidebruno
Participant
0 Kudos

can you attach the indexserver trace?

jitendra_kumar01
Explorer
0 Kudos

Can you attach the traces in order to inspect the issue?

Former Member
0 Kudos

Thanks for the prompt response.  Attached find the index server trace file.