SAP HANA database is taking a long time to restart

Former Member
0 Kudos

Hi All,

Following a performance issue last night, the application team killed the long-running sessions, which then hung in a savepoint. We bounced the database to clear the sessions, and now it is not coming up. Startup has been running for almost 8 hours and the database is still in the startup state.

Below are the entries from the indexserver trace:

[138168]{-1}[-1/-1] 2016-07-06 02:29:04.973228 i Logger           RecoveryHandlerImpl.cpp(00663) : Redo done up to position: 362231595456 and time: 2016-07-05 15:40:35.577132-05:00 (8%)

[138168]{-1}[-1/-1] 2016-07-06 02:39:14.978135 i Logger           RecoveryHandlerImpl.cpp(00663) : Redo done up to position: 362231595456 and time: 2016-07-05 15:40:35.577132-05:00 (8%)

[138168]{-1}[-1/-1] 2016-07-06 02:49:24.983491 i Logger           RecoveryHandlerImpl.cpp(00663) : Redo done up to position: 362231595456 and time: 2016-07-05 15:40:35.

Could someone give me an idea of how long it will actually take to come up? My guess is that it is taking a long time to redo the data.
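The three trace lines above report the same redo position, 362231595456, at 8% across roughly 20 minutes, which suggests the replay is not advancing rather than merely slow. A minimal sketch of checking this automatically is shown here; the function names `redo_samples` and `stalled_minutes` are hypothetical, and the regular expression only assumes the line layout visible in the excerpt.

```python
import re
from datetime import datetime

# Trace timestamp, then the redo position, as laid out in the quoted trace lines.
REDO_LINE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ .*Redo done up to position: (\d+)"
)

def redo_samples(lines):
    """Return (timestamp, position) tuples, one per redo progress line."""
    samples = []
    for line in lines:
        match = REDO_LINE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            samples.append((stamp, int(match.group(2))))
    return samples

def stalled_minutes(samples):
    """Minutes for which the most recent redo position has stayed unchanged."""
    if not samples:
        return 0.0
    last_stamp, last_pos = samples[-1]
    first_seen = last_stamp
    for stamp, pos in reversed(samples):
        if pos != last_pos:
            break
        first_seen = stamp
    return (last_stamp - first_seen).total_seconds() / 60

# The three lines quoted in the post (the last one is truncated in the source).
TRACE = [
    "[138168]{-1}[-1/-1] 2016-07-06 02:29:04.973228 i Logger           RecoveryHandlerImpl.cpp(00663) : Redo done up to position: 362231595456 and time: 2016-07-05 15:40:35.577132-05:00 (8%)",
    "[138168]{-1}[-1/-1] 2016-07-06 02:39:14.978135 i Logger           RecoveryHandlerImpl.cpp(00663) : Redo done up to position: 362231595456 and time: 2016-07-05 15:40:35.577132-05:00 (8%)",
    "[138168]{-1}[-1/-1] 2016-07-06 02:49:24.983491 i Logger           RecoveryHandlerImpl.cpp(00663) : Redo done up to position: 362231595456 and time: 2016-07-05 15:40:35.",
]

samples = redo_samples(TRACE)
print(len(samples), "progress lines; position unchanged for",
      round(stalled_minutes(samples), 1), "minutes")
```

While the position is not moving, no time-to-completion estimate can be extrapolated from these lines; an estimate only becomes meaningful once consecutive samples show different positions.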

Thanks

Hakar

Accepted Solutions (0)

Answers (3)

Former Member
0 Kudos

Hello Hakar,

As a starting point:

Q) What revision are you on?

A)

Q) Size of DB (single node/multi node) ?

A)

Q) What was the "performance issue" that necessitated a restart ?

A)

1. Run "top -H" and this will show you the thread(s) that the indexserver is executing (quite likely LogReplay)

2. Check your disk i/o with "sar"

3. Generate a runtime dump as per SAP Note 1813020 - How to generate a runtime dump on SAP HANA

KR,

Amerjit

Former Member
0 Kudos

Amerjit,

1) Rev 85

2) 500+ GB

3) The application was running slowly due to blocking, and the application user killed the sessions that had been running for a long time. Even after being killed, the sessions were not cleared and went into a savepoint.

4) Output of "top -H":

   PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  COMMAND
140030 mmpadm    20   0 45.0g 8.8g 2.0g D     11  0.4 247:19.43 LogRecoveryQueu
 87376 root      20   0 57380 3640 3048 S      3  0.0   0:00.10 sapuxuserchk
 87097 mmpadm    20   0 21800 3204 1180 R      1  0.0   0:00.28 top
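In the top output above, the LogRecoveryQueu thread is in state D, which on Linux means uninterruptible sleep, usually a thread waiting on I/O. A rough sketch for picking such threads out of batch-mode output (for example from `top -H -b -n 1`) could look like this; the function name `threads_in_state` is hypothetical and the parser assumes the column header shown in the post.

```python
def threads_in_state(top_output, state="D"):
    """Return (pid, command) pairs for rows whose S column equals `state`."""
    rows = []
    columns = None
    for line in top_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "PID":
            # Header row: remember the column names for the rows that follow.
            columns = fields
            continue
        if columns is None or len(fields) < len(columns):
            continue
        row = dict(zip(columns, fields))
        if row["S"] == state:
            rows.append((int(row["PID"]), row["COMMAND"]))
    return rows

# The rows quoted in the post.
TOP_OUTPUT = """
   PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM    TIME+  COMMAND
140030 mmpadm    20   0 45.0g 8.8g 2.0g D     11  0.4 247:19.43 LogRecoveryQueu
 87376 root      20   0 57380 3640 3048 S      3  0.0   0:00.10 sapuxuserchk
 87097 mmpadm    20   0 21800 3204 1180 R      1  0.0   0:00.28 top
"""

blocked = threads_in_state(TOP_OUTPUT)
print(blocked)
```

A thread that stays in state D across repeated samples points toward the storage or network path rather than CPU, which is why the disk and NFS checks in this thread are the natural next step.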

5) Output of "sar":

12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all      0.81      0.00      0.13      0.32      0.00     98.73
12:20:01 AM     all      0.69      0.00      0.13      0.36      0.00     98.81
12:30:01 AM     all      0.51      0.00      0.13      0.60      0.00     98.76
12:40:01 AM     all      0.39      0.00      0.11      0.62      0.00     98.88
12:50:01 AM     all      0.38      0.00      0.10      0.57      0.00     98.94
01:00:01 AM     all      0.52      0.00      0.13      0.48      0.00     98.87
01:10:01 AM     all      0.50      0.00      0.13      0.49      0.00     98.89
01:20:01 AM     all      0.52      0.00      0.13      0.47      0.00     98.86
01:30:01 AM     all      0.50      0.00      0.13      0.53      0.00     98.84
01:40:01 AM     all      0.53      0.00      0.13      0.45      0.00     98.88
01:50:01 AM     all      0.69      0.00      0.13      0.35      0.00     98.83
02:00:01 AM     all      0.60      0.00      0.13      0.46      0.00     98.80
02:10:01 AM     all      0.58      0.00      0.13      0.51      0.00     98.78
02:20:01 AM     all      0.60      0.00      0.13      0.44      0.00     98.83
02:30:01 AM     all      0.58      0.00      0.13      0.55      0.00     98.74
02:40:01 AM     all      0.54      0.00      0.13      0.54      0.00     98.78
02:50:01 AM     all      0.56      0.00      0.13      0.41      0.00     98.90
03:00:01 AM     all      0.57      0.00      0.13      0.45      0.00     98.85
03:10:01 AM     all      0.50      0.00      0.13      0.49      0.00     98.88
03:20:01 AM     all      0.47      0.00      0.13      0.53      0.00     98.86
03:30:01 AM     all      0.61      0.00      0.13      0.43      0.00     98.82
03:40:01 AM     all      0.50      0.00      0.13      0.53      0.00     98.84
03:50:01 AM     all      0.47      0.00      0.13      0.60      0.00     98.80
04:00:01 AM     all      0.46      0.00      0.13      0.54      0.00     98.86
Average:        all      0.54      0.00      0.13      0.49      0.00     98.84
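The sar samples above show the host almost entirely idle, with %iowait well below 1% throughout, so the restart is not CPU-bound at the host level. As a small sketch, the CPU report can be summarised programmatically as follows; `parse_sar_cpu` is a hypothetical helper and it assumes the default CPU utilisation layout shown in the post. Only the first three sample rows are reproduced here to keep the example short.

```python
def parse_sar_cpu(text):
    """Return one dict of metric -> float per sample row, skipping header and Average."""
    columns = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if "%iowait" in fields:
            # Header row: keep only the metric names, which start with '%'.
            columns = [f for f in fields if f.startswith("%")]
            continue
        if columns is None or len(fields) < len(columns) or fields[0] == "Average:":
            continue
        try:
            values = [float(v) for v in fields[-len(columns):]]
        except ValueError:
            continue  # not a data row
        rows.append(dict(zip(columns, values)))
    return rows

# First three samples and the Average line from the post.
SAR_OUTPUT = """
12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all      0.81      0.00      0.13      0.32      0.00     98.73
12:20:01 AM     all      0.69      0.00      0.13      0.36      0.00     98.81
12:30:01 AM     all      0.51      0.00      0.13      0.60      0.00     98.76
Average:        all      0.54      0.00      0.13      0.49      0.00     98.84
"""

rows = parse_sar_cpu(SAR_OUTPUT)
peak_iowait = max(r["%iowait"] for r in rows)
lowest_idle = min(r["%idle"] for r in rows)
print("peak %iowait:", peak_iowait, "lowest %idle:", lowest_idle)
```

These figures are averaged over all CPUs, so a single thread blocked on I/O barely registers; the CPU report alone does not rule out a storage problem.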

Former Member
0 Kudos

Hello Hakar,

So LogReplay as suspected.

1. Please upload (zip if necessary) the indexserver trace file.

2. Are your log volumes on an NFS share? If so, run nfsstat and check whether the retrans counter is incrementing.

3. If this is your production system, log a call with SAP ASAP.
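The nfsstat check in step 2 amounts to comparing the client RPC retrans counter between two samples taken some time apart. A minimal sketch follows, assuming the usual `nfsstat -c` layout in which a header line (calls, retrans, authrefrsh) is followed by a line of values; the helper names and both sample outputs are hypothetical and are not taken from the poster's system.

```python
def rpc_retrans(nfsstat_output):
    """Extract the client RPC 'retrans' counter from nfsstat text, or None if absent."""
    lines = nfsstat_output.splitlines()
    for header, values in zip(lines, lines[1:]):
        names = header.split()
        if "retrans" in names:
            fields = values.split()
            idx = names.index("retrans")
            if idx < len(fields) and fields[idx].isdigit():
                return int(fields[idx])
    return None

def retrans_increase(before, after):
    """Difference in retrans between two nfsstat samples, or None if either is unparsable."""
    first, second = rpc_retrans(before), rpc_retrans(after)
    if first is None or second is None:
        return None
    return second - first

# Hypothetical samples, for illustration only.
SAMPLE_1 = """Client rpc stats:
calls      retrans    authrefrsh
1000       4          1000
"""
SAMPLE_2 = """Client rpc stats:
calls      retrans    authrefrsh
1500       19         1500
"""

print("retrans increased by", retrans_increase(SAMPLE_1, SAMPLE_2))
```

A counter that keeps growing between samples would suggest the NFS client is resending requests, which fits a replay thread that sits waiting on I/O.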

As you have already picked up, restarting the system/server will just mean that the rollback will start all over again.

I encountered a similar situation on rev102 but I'd like to see the trace file(s) before jumping to any conclusions.

KR,

Amerjit

Amit_Tewatia
Active Participant
0 Kudos

Hi Sudhakar,

Please check for any hung processes and kill them if there are any. I would suggest a complete server restart, if possible.

Then use the commands below to restart your HANA system.

By using the sapcontrol program:

  1. Log on to the HANA server as root.
  2. Execute the following commands for start/stop:

     Start the SAP HANA system by entering the following command:

     /usr/sap/hostctrl/exe/sapcontrol -nr <instance number> -function Start

     Stop the SAP HANA system by entering the following command:

     /usr/sap/hostctrl/exe/sapcontrol -nr <instance number> -function Stop

By using the HDB program:

  1. Log on to the HANA server as <sid>adm.
  2. Execute the following commands for start/stop:

     Start the SAP HANA system by entering the following command:

     /usr/sap/<SID>/HDB<instance number>/HDB start

     Stop the SAP HANA system by entering the following command:

     /usr/sap/<SID>/HDB<instance number>/HDB stop

Regards,

Amit T

Former Member
0 Kudos

Amit,

The system is not coming up because the rollback is taking a long time. If I bounce the server, will it start rolling back the data all over again?

Thanks

Hakar

Amit_Tewatia
Active Participant
0 Kudos

I guess not. I have never been stuck in such a situation, so I am not 100% sure, but it is worth a shot as you are still waiting 8 hours after starting your system.

Regards,

Amit T

former_member183326
Active Contributor
0 Kudos

Can you see what services are currently started? Can you see any information in sapstartsrv log?
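One way to answer the question about which services are started is `sapcontrol -nr <instance number> -function GetProcessList`, which prints one comma-separated row per service. The sketch below parses that text and lists the services that are not yet GREEN; the helper names are hypothetical, and the sample output is made up for illustration (it is not the poster's actual process list).

```python
def parse_process_list(output):
    """Parse GetProcessList text into a list of dicts keyed by column name."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    for i, line in enumerate(lines):
        if line.startswith("name,"):
            columns = [c.strip() for c in line.split(",")]
            return [
                dict(zip(columns, (v.strip() for v in row.split(","))))
                for row in lines[i + 1:]
            ]
    return []

def not_green(processes):
    """Return (name, textstatus) for every service whose dispstatus is not GREEN."""
    return [
        (p.get("name"), p.get("textstatus"))
        for p in processes
        if p.get("dispstatus") != "GREEN"
    ]

# Hypothetical output, for illustration only.
SAMPLE = """
GetProcessList
OK
name, description, dispstatus, textstatus, starttime, elapsedtime, pid
hdbdaemon, HDB Daemon, YELLOW, Initializing, 2016 01 01 10:00:00, 1:00:00, 1001
hdbnameserver, HDB Nameserver, GREEN, Running, 2016 01 01 10:00:05, 0:59:55, 1002
hdbindexserver, HDB Indexserver, YELLOW, Initializing, 2016 01 01 10:00:10, 0:59:50, 1003
"""

pending = not_green(parse_process_list(SAMPLE))
print(pending)
```

Polling this periodically gives a simple view of whether the indexserver ever leaves the initializing state while the log replay runs.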

Former Member
0 Kudos

Michael,

Currently the HDB nameserver, preprocessor, and compileserver are running; all the remaining services are in the initializing state.

A few lines from sapstart.log:

Starting Programs
-----------------
(137385) Starting: local /usr/sap/MMP/HDB00/cnumm1hp/trace/hdb.sapMMP_HDB00 -d -nw -f /usr/sap/MMP/HDB00/cnumm1hp/daemon.ini pf=/usr/sap/MMP/SYS/profile/MMP_HDB00_cnumm1hp
(137385) New Child Process created.
(137385) Starting local Command:
Command:  /usr/sap/MMP/HDB00/cnumm1hp/trace/hdb.sapMMP_HDB00
           -d
           -nw
           -f
           /usr/sap/MMP/HDB00/cnumm1hp/daemon.ini
           pf=/usr/sap/MMP/SYS/profile/MMP_HDB00_cnumm1hp
(137296) Waiting for Child Processes to terminate.

Thanks

Hakar