RAC database hung analysis
If rac database instance is slow due to deadlock or other blocking ,we can use oradebug hang analyzer to deep dive into the root of cause.
Lets troubleshoot with a simulation

I create a sample table for testing purpose
SQL> CREATE TABLE regions
( region_id NUMBER
CONSTRAINT region_id_nn NOT NULL
, region_name VARCHAR2(25)
);
CREATE UNIQUE INDEX reg_id_pk
ON regions (region_id);
ALTER TABLE regions
ADD ( CONSTRAINT reg_id_pk
PRIMARY KEY (region_id)
) ; 2 3 4 5
Table created.
SQL> 2
Index created.
SQL> 2 3 4
Table altered.
I insert rows on node 1 and donot commit the changes
SQL> SQL> SQL> SQL>
SQL>
SQL> INSERT INTO regions VALUES
( 1
, 'Europe'
);
INSERT INTO regions VALUES
( 2
, 'Americas'
);
INSERT INTO regions VALUES
( 3
, 'Asia'
);
INSERT INTO regions VALUES
( 4
, 'Middle East and Africa'
); 2 3 4
1 row created.
SQL> 2 3 4
1 row created.
SQL> 2 3 4
1 row created.
SQL> 2 3 4
1 row created.
At the same time i insert the same rows on the same table with different user but from node 2!
SQL> select name from v$database;
NAME
---------
RACDB
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
RACDB2
SQL> show user
USER is "TEST"
SQL> INSERT INTO sys.regions VALUES
( 1
, 'Europe'
);
INSERT INTO sys.regions VALUES
( 2
, 'Americas'
);
INSERT INTO sys.regions VALUES
( 3
, 'Asia'
);
INSERT INTO sys.regions VALUES
( 4
, 'Middle East and Africa'
); 2 3 4
The database is hung as node 1 blocks node 2 to insert rows
Enable the hang analyzer on one of the RAC nodes to generate trace files for analyzing
SQL> oradebug setmypid
oStatement processed.
SQL> radebug unlimit
oradebugStatement processed.
SQL> oradebug setmypid
SP2-0734: unknown command beginning "oradebugor..." - rest of line ignored.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit
Statement processed.
SQL> oradebug setinst all
Statement processed.
SQL> oradebug -g all hanganalyze 3
Hang Analysis in /grid/base/diag/rdbms/racdb/RACDB1/trace/RACDB1_diag_3917.trc
SQL> oradebug -g all hanganalyze 3
Hang Analysis in /grid/base/diag/rdbms/racdb/RACDB1/trace/RACDB1_diag_3917.trc
SQL> oradebug -g all hanganalyze 3
Hang Analysis in /grid/base/diag/rdbms/racdb/RACDB1/trace/RACDB1_diag_3917.trc
In the hang trace file ,we can find the diagnostic details of RAC1 blocking RAC2 with wait event ,blocking id,waiting time etc which is pretty much valuable information!!
Chains most likely to have caused the hang:
[a] Chain 1 Signature: <not in a wait><='rdbms ipc reply'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x7ba3a1e1
===============================================================================
Non-intersecting chains:
-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 2 (racdb.racdb2)
os id: 5922
process id: 40, oracle@RAC2 (TNS V1-V3)
session id: 60
session serial #: 67
}
is waiting for 'enq: TX - row lock contention' with wait info:
{
p1: 'name|mode'=0x54580004
p2: 'usn<<16 | slot'=0x6000e
p3: 'sequence'=0x32b
time in wait: 5 min 50 sec
timeout after: never
wait id: 97
blocking: 0 sessions
current sql: INSERT INTO sys.regions VALUES
( 1
, 'Europe'
)
short stack: ksedsts()+265<-ksdxfstk()+19<-ksdxcb()+1646<-sspuser()+100<-semtimedop()+36<-sskgpwwait()+202<-skgpwwait()+138<-ksliwat()+1602<-kslwaitctx()+138<-kjusuc()+6900<-ksipgetctxi()+1641<-ksqcmi()+23174<-ksqgtlctx()+3204<-ksqgelctx()+469<-ktuGetTxForXid()+95<-ktcwit1()+272<-ktbgtl0()+958<-kdiins0()+46577<-kdiinsp()+61<-kauxsin()+1634<-qesltcLoadIndexList()+718<-qesltcLoadIndexes()+43<-qerltcSimpleSingleInsRowCBK()+65<-qerltcSingleRowLoad()+228<-qerltcFetch()+301<-insexe()+567<-opiexe()+4953<-kpoal8()+1811<-opiodr()+962<-
wait history:
* time between current wait and wait #1: 0.002054 sec
1. event: 'SQL*Net message from client'
time waited: 26.277790 sec
wait id: 96 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
* time between wait #1 and #2: 0.000002 sec
2. event: 'SQL*Net message to client'
time waited: 0.000001 sec
wait id: 95 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
* time between wait #2 and #3: 0.000031 sec
3. event: 'SQL*Net message from client'
time waited: 0.000146 sec
wait id: 94 p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
}
and is blocked by
=> Oracle session identified by:
{
instance: 1 (racdb.racdb1)
os id: 11456
oradebug is the advisable tool in times of performance issues!!
Until the session 1 in node 1 releases the lock from the rows in the table,the other session couldnot modify the rows in the same table.So if the user doesnot commit and flush the block changes to the disk ,other sessions have to sit idle and wait in the queue!!