DOES ONLINE INDEX REBUILD NEED DOWNTIME AND LOCK? WHAT IS THE DIFFERENCE BETWEEN OFFLINE AND ONLINE?

DOES ONLINE INDEX REBUILD NEED DOWNTIME AND LOCK? WHAT IS THE DIFFERENCE BETWEEN OFFLINE AND ONLINE?

Does index rebuild using online clause works without any locks or require a lock to process the rebuild? Most of the DBA’s has this misconception that, index rebuilding in online mode execute without any blockings between DML. Online mode gives freedom for the database with near or zero downtime which means any DML transactions can acquire exclusive lock on the table even after an index rebuild is executed without interruptions but DDL(Online rebuild) cannot progress during modification of data in the table until commit. So, if the index is holding many records, then calculate the maintenance duration accordingly. If the DBA is concerned about faster index maintenance rather than table downtime, then offline rebuild is efficient and if it is about table downtime, then online rebuild is efficient.

Note: Table downtime means prevention of application from modifying the rows in the table using DML(Insert, Update, Delete)

The only independence that is available in online mode is that, online rebuild operation waits and cope up with the other DML transaction to complete its work and then start the rebuild process in between when getting a chance post DML commit.

In offline mode, below error is reported immediately if other DML clutch the table lock.

ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

Both offline and online index rebuild require lock at some point but the difference is,

  • Online rebuild require “Row share mode” lock and offline rebuild require “Share mode nowait”
  • Online rebuild is slower and offline rebuild is faster
  • Online rebuild may cause temporary performance issues in heavy DML application and offline rebuild require complete table downtime

Online method works with these steps

  • Create a snapshot table in the form of “SYS_JOURNAL_N”
  • Create an unique IOT index for “SYS_JOURNAL_N” table
  • Populate the newly cloned index with the defragmented records with sorted key
  • Populate the original index with the defragmented records with sorted key
  • Drop the “SYS_JOURNAL_N” table after rebuild
  • Above all, if PGA is not sufficient to sort the records, then processing spill to temporary tablespace.
[oracle@staxhost ~]$ grep -a "SYS_JOURNAL" staxdb_ora_30481_IRO_tkp.trc
--Table creation phase
create table "SYS"."SYS_JOURNAL_76683" (C0 NUMBER,  opcode char(1), partno
  number,  rid rowid, primary key( C0 , rid )) organization index TABLESPACE
  "SYSTEM"

  "SYS"."SYS_JOURNAL_76683"("C0","RID") INDEX ONLY TOPLEVEL TABLESPACE
         0          0          0    TABLE ACCESS FULL SYS_JOURNAL_76683 (cr=0 pr=0 pw=0 time=3 us starts=1)
--Index creation phase
CREATE UNIQUE INDEX "SYS"."SYS_IOT_TOP_76710" on
  "SYS"."SYS_JOURNAL_76683"("C0","RID") INDEX ONLY TOPLEVEL TABLESPACE
  "SYSTEM" STORAGE( BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE
  DEFAULT) NOPARALLEL
--Populate the clone index
insert into ind_online$ (obj#,type#,flags)
values
(:1,:2,:3)
--Populate the original index
INSERT INTO indrebuild$(obj#, dataobj#, ts#, file#, block#, pctfree$,
  initrans, maxtrans, compcols, flags)
VALUES
(:1,:2,:3,:4,:5,:6,:7,:8,:9,:10)
--Update the clone index

UPDATE ind_online$ SET flags=:2
WHERE
 obj#=:1
--Update the original index

UPDATE indrebuild$ set dataobj#=:2, ts#=:3, file#=:4, block#=:5, pctfree$=:6,
  initrans=:7, maxtrans=:8, compcols=:9, flags=:10
WHERE
 obj#=:1

--Drop table phase
drop table "SYS"."SYS_JOURNAL_76683" purge

First session does the following and does not commit the transaction.

kIsH@STAX<>select sid from v$mystat where rownum < 2;

       SID
----------
        94

kIsH@STAX<>create table dbobj as select * from dba_objects;

Table created.

kIsH@STAX<>insert into dbobj select * from dba_objects;

73462 rows created.

kIsH@STAX<>create index idxobj on dbobj(object_id);

Index created.

kIsH@STAX<>insert into dbobj select * from dba_objects;

73463 rows created.

Second session rebuild the index online and waiting for concurrent access to the table.

kIsH@STAX<>select sid from v$mystat where rownum < 2;

       SID
----------
         1
kIsH@STAX<>alter index idxobj rebuild online;

<======== waiting for lock on the table dbobj

First session acquired TX lock. User left the command line and gone for a coffee.

Here session 2 waits on enq: TX – row lock contention due to uncommitted INSERT statement

kIsH@STAX<>col sid for 99999
kIsH@STAX<>col event for a20
kIsH@STAX<>col blocking_session for 99999999
kIsH@STAX<>col sql_id for a20
kIsH@STAX<>select sid,event,sql_id,blocking_session from v$session where sid=1;

   SID EVENT                SQL_ID               BLOCKING_SESSION
------ -------------------- -------------------- ----------------
     1 enq: TX - row lock c ffnqahhcsy8g0                      94
       ontention


kIsH@STAX<>select sid,event,sql_id,blocking_session from v$session where sid=94;

   SID EVENT                SQL_ID               BLOCKING_SESSION
------ -------------------- -------------------- ----------------
    94 SQL*Net message to c 2nshzv6a0h4p2
       lient

Since the DML statement ‘INSERT’ acquired Row-X (exclusive) lock on the table which prevents concurrent sessions to access the object on shared mode and exclusive mode. Though the column BLOCKING_OTHERS shows “Not Blocking”, still DDL commands are not allowed to proceed due to change in the structure of the index.

kIsH@STAX<>set lines 200 pages 1000
kIsH@STAX<>col OWNER for a20
kIsH@STAX<>col NAME for a20
kIsH@STAX<>select SESSION_ID,OWNER,NAME,MODE_HELD,MODE_REQUESTED,BLOCKING_OTHERS from dba_dml_locks;

SESSION_ID OWNER                NAME                 MODE_HELD     MODE_REQUESTE BLOCKING_OTHERS
---------- -------------------- -------------------- ------------- ------------- ----------------------------------------
         1 SYS                  DBOBJ                Row-S (SS)    None          Not Blocking
         1 SYS                  SYS_JOURNAL_76683    Share         None          Not Blocking
        94 SYS                  DBOBJ                Row-X (SX)    None          Not Blocking

Lock type shows that “INSERT” statement has acquired exclusive lock with enqueue and prevents the SMON process from cleaning up the sort segments which need to be cleaned up for index rebuild.

kIsH@STAX<>select TYPE,NAME,DESCRIPTION from v$lock_type where type='SS';

TYPE                 NAME                           DESCRIPTION
-------------------- ------------------------------ --------------------------------------------------------------------------------
SS                   Sort Segment                   Ensures that sort segments created during parallel DML operations aren't prematu
                                                    rely cleaned up


kIsH@STAX<>select TYPE,NAME,DESCRIPTION from v$lock_type where type='SX';

TYPE                 NAME                           DESCRIPTION
-------------------- ------------------------------ --------------------------------------------------------------------------------
SX                   Spare Enqueue 1                Spare enqueue 1 for one-off backports

Use oradebug tool to trace the hang progress.

kIsH@STAX<>oradebug setospid 3944
Oracle pid: 34, Unix process pid: 3944, image: oracle@staxhost (TNS V1-V3)
kIsH@STAX<>oradebug unlimit
Statement processed.
kIsH@STAX<>oradebug hanganalyze 3
Hang Analysis in /u01/app/oracle/diag/rdbms/staxdb/staxdb/trace/staxdb_ora_4290.trc
kIsH@STAX<>--wait for some time
kIsH@STAX<>oradebug hanganalyze 3
Hang Analysis in /u01/app/oracle/diag/rdbms/staxdb/staxdb/trace/staxdb_ora_4290.trc
kIsH@STAX<>--wait for some time
kIsH@STAX<>oradebug hanganalyze 3
Hang Analysis in /u01/app/oracle/diag/rdbms/staxdb/staxdb/trace/staxdb_ora_4290.trc
kIsH@STAX<>exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.16.0.0.0

Wait chains show that index rebuild is waiting on enq: TX – row lock contention, even though index rebuild is executed with online clause. Until, the insert statement is commited, the index rebuild wait to acquire the lock

===============================================================================

Chains most likely to have caused the hang:
 [a] Chain 1 Signature: 'CPU or Wait CPU'<='enq: TX - row lock contention'
     Chain 1 Signature Hash: 0xcc111f32

===============================================================================
Non-intersecting chains:

-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (staxdb.staxdb)
                   os id: 3944
              process id: 34, oracle@staxhost (TNS V1-V3)
              session id: 1
        session serial #: 39412
             module name: 1 (sqlplus@staxhost (TNS V1-V3))
                  pdb id: 1 (CDB$ROOT)
    }
    is waiting for 'enq: TX - row lock contention' with wait info:
    {
                      p1: 'name|mode'=0x54580004
                      p2: 'usn<<16 | slot'=0x20007
                      p3: 'sequence'=0x7e3
            time in wait: 227 min 17 sec
           timeout after: never
                 wait id: 4460
                blocking: 0 sessions
          current sql_id: 3880962692
             current sql: alter index idxobj rebuild online
             short stack: ksedsts()+426<-ksdxfstk()+58<-ksdxcb()+872<-sspuser()+223<-__sighandler()<-semtimedop()+14<-skgpwwait()+187<-ksliwat()+2192<-kslwaitctx()+200<-ksqcmi()+27366<-ksqgtlctx()+3424<-ksqgelctx()+838<-ktuGetTxForXid()+324<-ktaddlvisprocres()+828<-ktaddlvis()+1222<-kdicrws()+29979<-kdicdrv()+681<-opiexe()+25075<-opiosq0()+4494<-kpooprx()+287<-kpoal8()+824<-opiodr()+1202<-ttcpip()+1218<-opitsk()+1900<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main
            wait history:
              * time between current wait and wait #1: 0.003004 sec
              1.       event: 'index (re)build lock or pin object'
                 time waited: 0.000006 sec
                     wait id: 4459             p1: 'namespace'=0x1
                                               p2: 'lock_mode'=0x3
                                               p3: 'pin_mode'=0x3
              * time between wait #1 and #2: 0.000215 sec
              2.       event: 'reliable message'
                 time waited: 0.000638 sec
                     wait id: 4458             p1: 'channel context'=0x82ab5518
                                               p2: 'channel handle'=0x82a9f038
                                               p3: 'broadcast message'=0x82c54638
              * time between wait #2 and #3: 0.001611 sec
              3.       event: 'index (re)build lock or pin object'
                 time waited: 0.000021 sec
                     wait id: 4457             p1: 'namespace'=0x4
                                               p2: 'lock_mode'=0x3
                                               p3: 'pin_mode'=0x3
    }
    and is blocked by
 => Oracle session identified by:
    {
                instance: 1 (staxdb.staxdb)
                   os id: 4290
              process id: 70, oracle@staxhost (TNS V1-V3)
              session id: 94
        session serial #: 13032
             module name: 1 (sqlplus@staxhost (TNS V1-V3))
                  pdb id: 1 (CDB$ROOT)
    }
    which is on CPU or Wait CPU:
    {
               last wait: 0.000197 sec ago
                blocking: 1 session
          current sql_id: 0
             current sql: <none>
             short stack: ksedsts()+426<-ksdxfstk()+58<-ksdxdocmdmultex()+5961<-ksdhng_request_msg_process_phase_diagn()+1534<-ksdhng_request_msg_process()+933<-ksdhng_graph_build_local()+361<-ksdhng()+961<-ksdxfhng()+676<-ksdxen_int()+3381<-opiodr()+1202<-ttcpip()+1218<-opitsk()+1900<-opiino()+936<-opiodr()+1202<-opidrv()+1094<-sou2o()+165<-opimai_real()+422<-ssthrdmain()+417<-main()+256<-__libc_start_main()+243<-0x5541F689495641D7
            wait history:
              1.       event: 'Disk file operations I/O'
                 time waited: 0.000031 sec
                     wait id: 1759             p1: 'FileOperation'=0x8
                                               p2: 'fileno'=0x0
                                               p3: 'filetype'=0x8
              * time between wait #1 and #2: 0.000036 sec
              2.       event: 'SQL*Net message from client'
                 time waited: 6 min 54 sec
                     wait id: 1758             p1: 'driver id'=0x62657100
                                               p2: '#bytes'=0x1
              * time between wait #2 and #3: 0.000006 sec
              3.       event: 'SQL*Net message to client'
                 time waited: 0.000005 sec
                     wait id: 1757             p1: 'driver id'=0x62657100
                                               p2: '#bytes'=0x1
    }

If index is rebuild offline, then an error is directly thrown with ORA-00054 because the table need to be locked in SHARE MODE NOWAIT mode because of NOWAIT clause used internally.

kIsH@STAX<>alter index idxobj rebuild;
alter index idxobj rebuild
            *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

To summarize,

  • Both offline and online index rebuild need to lock the table no matter the method used.

[oracle@staxhost ~]$ egrep 'LOCK TABLE' /u01/app/oracle/diag/rdbms/staxdb/staxdb/trace/staxdb_ora_30460_IR.trc
LOCK TABLE  FOR INDEX "IDXOBJ" IN SHARE MODE  NOWAIT
[oracle@staxhost ~]$ egrep 'LOCK TABLE' /u01/app/oracle/diag/rdbms/staxdb/staxdb/trace/staxdb_ora_30481_IRO.trc
LOCK TABLE  FOR INDEX "IDXOBJ" IN ROW SHARE MODE
[oracle@staxhost ~]$
  • Index Rebuild offline uses Index Fast Full scan and Index Rebuild online uses Full table scan

--Index Rebuild Offline
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  INDEX BUILD NON UNIQUE IDXOBJ (cr=293873 pr=651 pw=651 time=250883 us starts=1)(object id 0)
    293833     293833     293833   SORT CREATE INDEX (cr=293845 pr=651 pw=0 time=140768 us starts=1)
    293833     293833     293833    INDEX FAST FULL SCAN IDXOBJ (cr=293845 pr=651 pw=0 time=105728 us starts=1)(object id 76683)

--Index Rebuild Online
Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         1          1          1  INDEX BUILD NON UNIQUE IDXOBJ (cr=5740 pr=0 pw=651 time=180561 us starts=1)(object id 0)
    293849     293849     293849   SORT CREATE INDEX (cr=5693 pr=0 pw=0 time=76729 us starts=1)
    293849     293849     293849    TABLE ACCESS FULL DBOBJ (cr=5693 pr=0 pw=0 time=14613 us starts=1 cost=1544 size=1101930 card=220386)

  • Online rebuild is slow
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.00       0.00          0          2          0           0
Execute      2      0.18       0.25        652     293867       1040           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.18       0.25        652     293869       1040           0

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        3        0.00          0.00
  SQL*Net message to client                       2        0.00          0.00
  SQL*Net message from client                     2        5.07          9.29
  PGA memory operation                          130        0.00          0.00
  index (re)build lock or pin object              4        0.00          0.00
  asynch descriptor resize                        1        0.00          0.00
  db file scattered read                         25        0.00          0.00
  direct path write                             166        0.00          0.06
  reliable message                                3        0.00          0.00
  enq: RO - fast object reuse                     1        0.00          0.00
  db file sequential read                         1        0.00          0.00
  enq: CR - block range reuse ckpt                1        0.00          0.00
  log file sync                                   1        0.00          0.00

Offline rebuild is fast

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.00       0.00          0          1          0           0
Execute      2      0.11       0.19          2       5739       1130           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.11       0.19          2       5740       1130           0

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        2        0.00          0.00
  SQL*Net message to client                       2        0.00          0.00
  SQL*Net message from client                     2        4.73          7.34
  PGA memory operation                          136        0.00          0.00
  index (re)build lock or pin object             10        0.00          0.00
  reliable message                                5        0.00          0.00
  direct path write                             166        0.00          0.08
  enq: RO - fast object reuse                     2        0.00          0.00
  db file sequential read                         2        0.00          0.00
  enq: CR - block range reuse ckpt                2        0.00          0.00
  log file sync                                   1        0.00          0.00
  • Do not schedule online rebuild activity during heavy DML hours of database since rebuild operation should wait for a long time until the DML commit.
  • If good machine power is available like CPU and MEMORY , then go for parallel online index creation and also make sure that other critical DML operations are not impacted due to resource exhaustion. Use parallel_degree_limit to a less value say CPU / 3 if CPU_COUNT > 3 to limit the parallel or use resource manager plan. By default, parallel_degree_limit is set to CPU capacity which is not always good.
kIsH@STAX<>show parameter parallel_degree_limit

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_degree_limit                string      CPU

Since offline rebuild require downtime, this activity is too easy without any hassles except that if it is difficult to get business approval.

Leave a Reply

%d bloggers like this: