on 06-14-2016 11:00 AM
Good morning guys,
About a week ago, our database (512GB RAM) "crashed". Upon investigation, we realized that the data volume had reached 100% (750GB). We subsequently increased it by 500GB and then tried to start the DB. The indexserver would not start, and neither would the other services that depend on it, complaining that the "Master index server is not available yet". We left this to run overnight... this morning, we found that our new 1.3TB has also filled up. The index server and other services still will not start up. Is there a reason for this? Where can one look? The index server trace file raises a DISKFULL event. Even if we increase this by another 1TB... will it not fill up again? Assistance will be appreciated.
Index server shows:
733]{-1}[-1/-1] 2016-06-14 08:02:03.353555 i EventHandler EventManagerImpl.cpp(00689) : --event not handled: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= HANDLING]
[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353652 i EventHandler EventManagerImpl.cpp(00683) : handleEvents(1)
[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353661 i EventHandler EventManagerImpl.cpp(00686) : --handleEvent: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= NEW]
[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353664 i EventHandler LocalFileCallback.cpp(00426) : [DISKFULL] restarting queue with 10751 requests
[9289]{-1}[-1/-1] 2016-06-14 08:03:03.354360 i EventHandler EventManagerImpl.cpp(00689) : --event not handled: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= HANDLING]
[9640]{-1}[-1/-1] 2016-06-14 08:04:03.354451 i EventHandler EventManagerImpl.cpp(00683) : handleEvents(1)
[9640]{-1}[-1/-1] 2016-06-14 08:04:03.354458 i EventHandler EventManagerImpl.cpp(00686) : --handleEvent: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= NEW]
[9640]{-1}[-1/-1] 2016-06-14 08:04:03.354462 i EventHandler LocalFileCallback.cpp(00426) : [DISKFULL] restarting queue with 14326 requests
[9640]{-1}[-1/-1] 2016-06-14 08:04:03.355172 i EventHandler EventManagerImpl.cpp(00689) : --event not handled: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= HANDLING]
[9316]{-1}[-1/-1] 2016-06-14 08:05:03.355261 i EventHandler EventManagerImpl.cpp(00683) : handleEvents(1)
[9
Please see SAP Note 1679938
So we can see from this that the disk is full: there is no space left in the data volume under /hana/data/.
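To make that reading of the trace concrete, here is a minimal Python sketch that pulls the affected path and the size of the pending write queue out of the DISKFULL lines. The sample lines are copied from the trace in the question; the function name `summarize` and the two regular expressions are assumptions about the line layout, not anything shipped with HANA.

```python
import re

# Sample lines copied from the indexserver trace quoted in the question.
TRACE = """\
[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353661 i EventHandler EventManagerImpl.cpp(00686) : --handleEvent: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= NEW]
[9289]{-1}[-1/-1] 2016-06-14 08:03:03.353664 i EventHandler LocalFileCallback.cpp(00426) : [DISKFULL] restarting queue with 10751 requests
[9640]{-1}[-1/-1] 2016-06-14 08:04:03.354458 i EventHandler EventManagerImpl.cpp(00686) : --handleEvent: DiskFullEvent[id= 1, path= /hana/data/FMD/mnt00001/hdb00003/, state= NEW]
[9640]{-1}[-1/-1] 2016-06-14 08:04:03.354462 i EventHandler LocalFileCallback.cpp(00426) : [DISKFULL] restarting queue with 14326 requests
"""

# Assumed layout of a DiskFullEvent entry: "DiskFullEvent[id= N, path= P, state= S]".
EVENT_RE = re.compile(
    r"DiskFullEvent\[id=\s*\d+, path=\s*(?P<path>\S+), state=\s*(?P<state>\w+)\]"
)
# Assumed layout of a queue restart entry, with the timestamp earlier on the same line.
QUEUE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ .*"
    r"\[DISKFULL\] restarting queue with (?P<n>\d+) requests"
)


def summarize(trace_text):
    """Return (set of full paths, list of (timestamp, queued requests))."""
    paths = set()
    queue = []
    for line in trace_text.splitlines():
        event = EVENT_RE.search(line)
        if event:
            paths.add(event.group("path"))
        restart = QUEUE_RE.search(line)
        if restart:
            queue.append((restart.group("ts"), int(restart.group("n"))))
    return paths, queue


paths, queue = summarize(TRACE)
print("full paths:", sorted(paths))
for ts, n in queue:
    print(ts, n)
```

On the quoted lines this reports a single path, /hana/data/FMD/mnt00001/hdb00003/, and a queue that grows from 10751 to 14326 requests within a minute, which suggests that writes keep piling up rather than draining.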
The only way to free space in /hana/data/ is to back up your data. So you need to run a successful backup and back up what is in /hana/data/ to some other location. If the backup takes time, you may be able to free space by following the workaround in note 1679938, applied in this case to the data volume rather than the log volume. What happens when you try this workaround?
Were any changes made in the system lately, such as large data loads? Please see note 2146989; could it explain the large disk size in your case? Please attach the output of SYS.M_UNIFIED_TABLE_PERSISTENCE_STATISTICS.
Also, please try ALTER SYSTEM RECLAIM DATAVOLUME.
Please see Question 33 in note 2000003 and also the information in note 1870858.
BR
Michael
Thanks Mike.
Note 1679938 - Log Volume is full talks about the log volume. However, in our case, it is the data volume which has filled up. We have subsequently increased the space by an additional 500GB. However, the DB still does not come up.
We have copied the data in /hana/data to another location; however, we are unable to run a DB backup with studio because the DB is down.
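Since SQL-level checks are out of reach while the DB is down, a file-level look at the data volume is one option. The following is a minimal Python sketch, standard library only, that reports how full the filesystem is and which files under a directory are the largest. The function name `volume_report` is hypothetical, and /hana/data/FMD in the usage comment is taken from the path in the trace above; adjust it to your own layout.

```python
import os
import shutil


def volume_report(path, top_n=5):
    """Return (used fraction of the filesystem, top_n largest files under path)."""
    # Totals for the filesystem that holds `path`.
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total if usage.total else 0.0

    files = []
    for root, _dirs, names in os.walk(path):
        for name in names:
            full = os.path.join(root, name)
            try:
                # Apparent file size in bytes; sparse files may occupy less on disk.
                files.append((os.path.getsize(full), full))
            except OSError:
                continue  # file vanished or is unreadable; skip it

    files.sort(reverse=True)  # largest first
    return used_fraction, files[:top_n]


# Example usage (path is an assumption, see above):
# fraction, largest = volume_report("/hana/data/FMD")
# print(f"{fraction:.1%} used")
# for size, name in largest:
#     print(f"{size / 1024**3:10.1f} GiB  {name}")
```

Running it before and after a start attempt would show which volume files are growing, which may help narrow down the cause before more disk is added.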
Can you please check if the THP is deactivated:
SAP Note 2131662
Don't I need to execute that in hdbsql? The DB is offline, so I cannot even connect to hdbsql. We have decided that it is better to restore from a DB backup. I am also facing another strange issue there. We have an initial backup (taken immediately after installation) and want to restore that. Here is what I do and the error I receive. Kindly correct me if I am wrong.
HDBSettings.sh recoverSys.py --command="RECOVER DATABASE UNTIL TIMESTAMP '2015-05-15 17:00:00' USING DATA PATH ('location of the backup') CLEAR LOG"
After I run that command...the output I receive is this...
own pid: 40416
recoverSys started: 2016-06-15 02:16:05
testing master: jssfmd0
jssfmd0 is master
shutdown database, timeout is 120
stop system
stop system: jssfmd0
stopping system: 2016-06-15 02:16:05
stopped system: 2016-06-15 02:16:05
creating file recoverInstance.sql
restart database
restart master nameserver: 2016-06-15 02:16:20
start system: jssfmd0
2016-06-15T02:16:23+02:00 P041741 155516a8602 ERROR RECOVERY RECOVER DATA finished with error: [448] recovery could not be completed, [71000257] incorrect syntax near "CLEAR"
2016-06-15T02:16:23+02:00 P041741 155516a8602 ERROR RECOVERY RECOVER DATA finished with error: [448] recovery could not be completed, [71000257] incorrect syntax near "CLEAR"
start system: jssfmd0 failed: start of nameserver failed
recoverSys failed: 2016-06-15 02:16:24
But where is the incorrect syntax near CLEAR in my command???
Hello,
1. What do you mean by "I moved the files"? If you moved/copied the data files to another location with more space, did you also change the basepath to point to the new location?
2. top showing 100% doesn't have to be a concern. I've had a system show >= 3400% during startup. Look with "top -H" to see which threads are running while the system is starting up (I wouldn't be surprised if it's the LogReplay). This will give you a much better idea of what is going on with the indexserver during startup. At the same time, do a tail on the indexserver trace file.
3. As someone already asked, did you identify what caused this sudden data growth on the DB?
4. I see that you've had this problem for a week or so. Have you opened a message with SAP?
KR,
Amerjit
Can you attach the indexserver trace?