cancel
Showing results for 
Search instead for 
Did you mean: 

Oracle hands for 12 min during "alter database open"

former_member84399
Participant
0 Kudos

Dear all,

After a disk failure/replacement in a SAN disk array that we are using for storage, we had to restore our Oracle database. After the restore the database does not start properly: It takes around 12 min to move from the state "Mount" to the state "Open". It finally opens but even after this, when I start SAP, only 5 workprocesses connect and logging into SAP is not possible, the gui just hangs

There are no errors in the alert_RSP.log. I attach an extract here to show the time difference between the mount state and the open state

Thu Apr 19 11:16:07 2007

Starting ORACLE instance (normal)

Thu Apr 19 11:16:29 2007

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0

SCN scheme 3

LICENSE_MAX_USERS = 0

SYS auditing is disabled

Starting up ORACLE RDBMS Version: 9.2.0.7.0.

System parameters with non-default values:

processes = 240

sessions = 288

timed_statistics = TRUE

event = 38068 trace name context forever, level 100

shared_pool_size = 620756992

sga_max_size = 1159695232

shared_pool_reserved_size= 20401094

java_pool_size = 167772160

enqueue_resources = 8000

filesystemio_options = SETALL

control_files = /oracle/RSP/sapdata1/cntrl/cntrlRSP.dbf, /oracle/RSP/origlogA/cntrl/cntrlRSP.dbf, /oracle/RSP/sa

parch/cntrl/cntrlRSP.dbf

db_block_checksum = FALSE

db_block_size = 8192

compatible = 9.2.0

log_archive_start = TRUE

log_archive_dest = /oracle/RSP/saparch/RSParch

log_buffer = 1048576

log_checkpoint_interval = 0

log_checkpoint_timeout = 0

db_files = 254

db_file_multiblock_read_count= 8

log_checkpoints_to_alert = TRUE

control_file_record_keep_time= 30

dml_locks = 5000

replication_dependency_tracking= FALSE

transaction_auditing = FALSE

undo_management = AUTO

undo_tablespace = PSAPUNDO

undo_suppress_errors = TRUE

undo_retention = 43200

remote_os_authent = TRUE

db_domain = WORLD

session_cached_cursors = 800

hash_join_enabled = FALSE

background_dump_dest = /oracle/RSP/saptrace/background

user_dump_dest = /oracle/RSP/saptrace/usertrace

max_dump_file_size = 20000

core_dump_dest = /oracle/RSP/saptrace/background

optimizer_features_enable= 9.2.0

sort_area_size = 2097152

sort_area_retained_size = 0

db_name = RSP

open_cursors = 800

optimizer_mode = choose

sortelimination_cost_ratio= 10

btree_bitmap_plans = FALSE

pushjoin_predicate = FALSE

optimizerjoin_sel_sanity_check= TRUE

optimizer_index_cost_adj = 10

optimpeek_user_binds = FALSE

optimizeror_expansion = DEPTH

PMON started with pid=2

DBW0 started with pid=3

DBW1 started with pid=4

DBW2 started with pid=5

DBW3 started with pid=6

LGWR started with pid=7

CKPT started with pid=8

SMON started with pid=9

RECO started with pid=10

Thu Apr 19 11:16:31 2007

ARCH: STARTING ARCH PROCESSES

ARC0 started with pid=11

ARC0: Archival started

ARC1 started with pid=12

ARC1: Archival started

Thu Apr 19 11:16:31 2007

ARCH: STARTING ARCH PROCESSES COMPLETE

Thu Apr 19 11:16:31 2007

ARC0: Becoming the 'no FAL' ARCH

ARC0: Becoming the 'no FAL' ARCHARC0: Thread not mounted

Thu Apr 19 11:16:31 2007

ARC1: Becoming the heartbeat ARCH

ARC1: Becoming the heartbeat ARCHARC1: Thread not mounted

Thu Apr 19 11:16:31 2007

ALTER DATABASE MOUNT

Thu Apr 19 11:16:36 2007

Successful mount of redo thread 1, with mount id 1209727119

Thu Apr 19 11:16:36 2007

Database mounted in Exclusive Mode.

Completed: ALTER DATABASE MOUNT

Thu Apr 19 11:16:36 2007

ALTER DATABASE OPEN

Thu Apr 19 11:16:37 2007

LGWR: Primary database is in CLUSTER CONSISTENT mode

Thread 1 opened at log sequence 6727

Current log# 12 seq# 6727 mem# 0: /oracle/RSP/origlogB/log_g12m1.dbf

Current log# 12 seq# 6727 mem# 1: /oracle/RSP/mirrlogB/log_g12m2.dbf

Successful open of redo thread 1

Thu Apr 19 11:16:37 2007

MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set

Thu Apr 19 11:16:38 2007

SMON: enabling cache recovery

Thu Apr 19 11:16:49 2007

Successfully onlined Undo Tablespace 27.

Thu Apr 19 11:16:49 2007

SMON: enabling tx recovery

Thu Apr 19 11:16:50 2007

Database Characterset is WE8DEC

Thu Apr 19 11:29:09 2007

Completed: ALTER DATABASE OPEN

Please note the time difference between 11:16:50 and 11:29:09, when the database is mounted and we are just waiting for it to open

I run dbverify, all ok

I reviewed note 521264 and my issue is described in case 2c. However, after reviewing all 127 datafiles and online redologs, I do not have any files with an "S" in their permissions

We run Solaris 10, Oracle 9.2.0.7 and WAS 6.40 SP16

Any help will be very much appreciated

Many thanks

Andreas

Accepted Solutions (0)

Answers (4)

Answers (4)

former_member84399
Participant
0 Kudos

Hello,

I will check about the patches, thanks. I do not think it is a recovery issue, I mean we tried to start it several times, restored it from a backup, restored the binaries, still have the same issue

At the moment, we managed to restore the system to other hardware, using the same disks and it worked fine. This denotes that it is an OS problem , we opened a call to SUN about this. I will update this thread when we have some news from them

Many thanks

Andreas

former_member84399
Participant
0 Kudos

For your information, the issue with Oracle taking too much time to move from state "mount" to state "open" after being recovered successfully

was a problem with memory fragmentation in Sun T2000 servers when running Oracle

It was solved by setting the following parameter

set disable_ism_large_pages=0x34

in /etc/system

Regards

Andreas

former_member84399
Participant
0 Kudos

Thanks for your replies so far

My Unix admin has tried various things to solve this. We just managed to prove that the disks were not part of the problem, by recreating the system to another hardware and keep the data in the same disks as before. The system started ok without any issues there

Now, we are not sure why the current hardware (it is a Solaris 10 zone actually ) can not open Oracle ok and we are investigating on this. Neither a Solaris zone reboot nor a reboot of the whole physical server has helped

Andreas

former_member204746
Active Contributor
0 Kudos

check if you installed ALL needed SOLARIS patches for Oracle.

Former Member
0 Kudos

Recovery is probably the correct answer for the time. Don't forget, you need to read all the archive logs generated since the backup. This is what takes the largest amount of time, the redo is read after the archives, and generally only involves 1 log that is the same size (or smaller ) than the archive logs.

Vinod

former_member84399
Participant
0 Kudos

Hello,

Here are the details:

R3trans -d returns 0000 instantly. I am not including trans.log here as the connection to the database works . I have no space issues neither, have around 70GB left under /oracle

The startup log is as follows

Trace of system startup/check of SAP System RSP on Fri Apr 20 08:38:10 BST 2007

Called command: /usr/sap/RSP/SYS/exe/run/startsap

Starting SAP Instance DVEBMGS00

-


SAP-R/3-Startup Program Rel 640 V1.8 (2003/04/24)

-


Starting at 2007/04/20 08:38:10

Startup Profile: "/usr/sap/RSP/SYS/profile/START_DVEBMGS00_rspda002"

Execute Pre-Startup Commands

-


(6285) Local: /usr/sap/RSP/SYS/exe/run/sapmscsa -n pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

/usr/sap/RSP/SYS/exe/run/sapmscsa: SCSA is attached and useable.

/usr/sap/RSP/SYS/exe/run/sapmscsa: finished.

(6285) Local: rm -f ms.sapRSP_DVEBMGS00

(6285) Local: ln -s -f /usr/sap/RSP/SYS/exe/run/msg_server ms.sapRSP_DVEBMGS00

(6285) Local: rm -f dw.sapRSP_DVEBMGS00

(6285) Local: ln -s -f /usr/sap/RSP/SYS/exe/run/disp+work dw.sapRSP_DVEBMGS00

(6285) Local: rm -f co.sapRSP_DVEBMGS00

(6285) Local: ln -s -f /usr/sap/RSP/SYS/exe/run/rslgcoll co.sapRSP_DVEBMGS00

(6285) Local: rm -f se.sapRSP_DVEBMGS00

(6285) Local: ln -s -f /usr/sap/RSP/SYS/exe/run/rslgsend se.sapRSP_DVEBMGS00

Starting Programs

-


(6304) Starting: local ms.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

(6305) Starting: local dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

(6306) Starting: local co.sapRSP_DVEBMGS00 -F pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

(6307) Starting: local se.sapRSP_DVEBMGS00 -F pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

(6285) Waiting for Child Processes to terminate.

Instance on host rspda002 started

ps -ef | grep dw is shown below: It only starts 5 -6 dialog processes. All the other processes are not started (normally I have around 40 )

rspda002:rspadm 33% ps -ef | grep dw

rspadm 6317 6305 0 08:38:17 ? 0:00 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6310 6305 0 08:38:16 ? 0:01 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6311 6305 0 08:38:16 ? 0:00 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6624 5523 0 08:43:35 pts/1 0:00 grep dw

rspadm 6305 6285 0 08:38:11 ? 0:04 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6312 6305 0 08:38:17 ? 0:01 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6313 6305 0 08:38:17 ? 0:00 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

That is ps -ef | grep sap

rspda002:rspadm 32% ps -ef | grep sap

rspadm 6317 6305 0 08:38:17 ? 0:00 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6310 6305 0 08:38:16 ? 0:01 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

root 12375 650 0 11:15:58 ? 3:56 /usr/sap/RSP/SYS/exe/run/saposcol

rspadm 6311 6305 0 08:38:16 ? 0:00 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6309 6305 0 08:38:16 ? 0:01 icman -attach pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6306 6285 0 08:38:11 ? 0:00 co.sapRSP_DVEBMGS00 -F pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6307 6285 0 08:38:11 ? 0:00 se.sapRSP_DVEBMGS00 -F pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6304 6285 0 08:38:11 ? 0:00 ms.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6308 6305 0 08:38:16 ? 0:02 gwrd -dp pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6622 5523 0 08:43:29 pts/1 0:00 grep sap

rspadm 6305 6285 0 08:38:11 ? 0:04 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 20243 650 0 15:11:34 ? 13:11 sapccm4x -DCCMS pf==/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6285 650 0 08:38:11 ? 0:00 /usr/sap/RSP/SYS/exe/run/sapstart pf=/usr/sap/RSP/SYS/profile/START_DVEBMGS00_r

rspadm 6312 6305 0 08:38:17 ? 0:01 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

rspadm 6313 6305 0 08:38:17 ? 0:00 dw.sapRSP_DVEBMGS00 pf=/usr/sap/RSP/SYS/profile/RSP_DVEBMGS00_rspda002

Many thanks

Andreas

Former Member
0 Kudos

Hi Andreas,

Can't it be that your new disk is not correctly assigned from the SAN?

Just to test access time, you can run:

<i>time dd if=<your_physical_disk_path> of=/dev/null count=1000 bs=10k</i>

...and repeat it with other Oracle disk from the SAN. Compare the results.

Hope it could help you!

Regards,

JC Llanes.

former_member618758
Active Participant
0 Kudos

Hi,

Do you have the response or log from startsap?

Can you do a ps -ef|grep dw and send response to me?

Can you do a R3trans -d and send reply?

Thanks,

George