This message show when MMON process utilize undo tablespace heavily during high load in database. The minact-scn feature is introduced in 11g to enhance the block clean out of the tasks. When this error trigger, AWR statistics may not be gathered during the particular time because of large waits observed.
In order to troubleshoot sysaux tablespace growth issues, MMON auto purging activity should be monitored and traced to further identify the root cause of auto purge fault.
To enable trace for MMON process for monitoring the auto purging mechanism of cleaning up SYSAUX tablespace, first execute dbms_monitor package
Execute the install.sql to install SQLT in the database
kish@exdbx<>START /home/oracle/sqlt/sqlt/install/install.sql
zip warning: name not matched: *_sq*.log
zip error: Nothing to do! (SQLT_installation_logs_archive.zip)
zip warning: name not matched: *_ta*.log
zip error: Nothing to do! (SQLT_installation_logs_archive.zip)
Ignore errors from here until @@@@@ marker as this is to test for NATIVE PLSQL Code Type
@@@@ marker . You may ignore prior errors about NATIVE PLSQL Code Type
old 1: ALTER SESSION SET PLSQL_CODE_TYPE = &&plsql_code_type
new 1: ALTER SESSION SET PLSQL_CODE_TYPE = NATIVE
Session altered.
Elapsed: 00:00:00.00
Specify optional Connect Identifier (as per Oracle Net)
Include "@" symbol, ie. @PROD
If not applicable, enter nothing and hit the "Enter" key.
You *MUST* provide a connect identifier when installing
SQLT in a Pluggable Database in 12c
This connect identifier is only used while exporting SQLT
repository everytime you execute one of the main methods.
Optional Connect Identifier (ie: @PROD):
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
Define SQLTXPLAIN password (hidden and case sensitive).
Elapsed: 00:00:00.00
Password for user SQLTXPLAIN:
Re-enter password:
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
The next step is to choose the tablespaces to be used by SQLTXPLAIN
The Tablespace name is case sensitive.
Do you want to see the free space of each tablespace [YES]
or is it ok just to show the list of tablespace [NO]?
Type YES or NO [Default NO]:
... please wait
TABLESPACE FREE_SPACE_MB
------------------------------ ----------------------------------------
EXAMPLE
USERS
Elapsed: 00:00:00.01
Specify PERMANENT tablespace to be used by SQLTXPLAIN.
Tablespace name is case sensitive.
Default tablespace [UNKNOWN]: EXAMPLE
Elapsed: 00:00:00.00
Password for user SQLTXPLAIN:
Re-enter password:
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
The next step is to choose the tablespaces to be used by SQLTXPLAIN
The Tablespace name is case sensitive.
Do you want to see the free space of each tablespace [YES]
or is it ok just to show the list of tablespace [NO]?
Type YES or NO [Default NO]:
... please wait
TABLESPACE FREE_SPACE_MB
------------------------------ ----------------------------------------
EXAMPLE
USERS
Elapsed: 00:00:00.01
Specify PERMANENT tablespace to be used by SQLTXPLAIN.
Tablespace name is case sensitive.
Default tablespace [UNKNOWN]: EXAMPLE
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
... please wait
TABLESPACE
------------------------------
TEMP
Elapsed: 00:00:00.01
Specify TEMPORARY tablespace to be used by SQLTXPLAIN.
Tablespace name is case sensitive.
Temporary tablespace [UNKNOWN]: TEMP
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
The main application user of SQLT is the schema
owner that issued the SQL to be analyzed.
For example, on an EBS application you would
enter APPS.
You will not be asked to enter its password.
To add more SQLT users after this installation
is completed simply grant them the SQLT_USER_ROLE
role.
Main application user of SQLT: TEST
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
SQLT can make extensive use of licensed features
provided by the Oracle Diagnostic and the Oracle
Tuning Packs, including SQL Tuning Advisor (STA),
SQL Monitoring and Automatic Workload Repository
(AWR).
To enable or disable access to these features
from the SQLT tool enter one of the following
values when asked:
"T" if you have license for Diagnostic and Tuning
"D" if you have license only for Oracle Diagnostic
"N" if you do not have these two licenses
Oracle Pack license [T]:
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.01
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
PL/SQL procedure successfully completed.
Elapsed: 00:00:00.00
TADOBJ completed.
SQDOLD completed. Ignore errors from this script
adding: 211225001426_01_sqcreate.log (deflated 91%)
Elapsed: 00:00:00.00
TAUTLTEST completed.
adding: 211225001918_09_tautltest.log (deflated 63%)
Elapsed: 00:00:00.00
SQUTLTEST completed.
adding: 211225001918_10_squtltest.log (deflated 63%)
SQLT users must be granted SQLT_USER_ROLE before using this tool.
SQCREATE completed. Installation completed successfully.
Use this query to identify the objects which are not owned by SYS or SYSTEM user respect to icol$ in system tablespace
select b.owner,
b.object_name,
b.object_type,
b.status
from
sys.icol$ a
inner join dba_objects b on ( a.obj# = b.object_id )
and b.owner not in ('SYS','SYSTEM')
order by 1 desc;
In AWR , SQL ORDERED BY USER I/O WAIT TIME section has %IO, %CPU and %TOTAL
We will see how to calculate the percentage of these metrics in AWR
Generally, % disk usage is calculated by (Time spent by disk in IO / elapsed time ) * 100
Example: If disk spend 0.30 seconds in I/O operation in one second interval, then,
% Disk usage = (0.30 / 1) * 100
Similarly,
%Total – % Time spent by specific SQL in I/O operation out of Total user I/O wait time
%CPU – % Time spent by specific SQL burning CPU out of the wall clock time(elapsed time) which means total duration of AWR report.
%IO – % Time spent by specific SQL in I/O operation out of wall clock time
User I/O wait time (s) means the I/O time spent by user (application class) like SQL accessing blocks from datafile
Calculate the total user IO wait time by adding up all the service name in Service Wait Class Stats
Calculating the value for top SQL.
%IO of SQL is calculated by,
%IO = ( User I/O wait time / Elapsed time ) * 100
= ( 1.01 / 6.07 ) * 100
~ 16.6 %
%CPU is calculated by,
%CPU = ( CPU Time / Elapsed time ) * 100
= ( 4.96 / 6.07 ) * 100
~ 81.71 %
%TOTAL of SQL is calculated by,
%TOTAL = ( User I/O wait time / Total User I/O wait time ) * 100
= ( 1.01 / ( Service Wait Class Stats -> User I/O Wt Time ) * 100
= ( 1.01 / 3 ) * 100
~ 33.7 %
Use this query to find top segments occupied in system tablespace
select * from
(select segment_name,
bytes/1073741824 GB,
segment_type
from
dba_segments
where tablespace_name='SYSTEM'
order by 2 desc)
where rownum < 6;
ORA-00060: deadlock detected while waiting for resource
Cause:
Deadlock is not an oracle error but an application design issue which happen due to two way block also known as mutual embrace
Here is a small example which may help understand deadlock
Either we need to break the barrier or construct the design of the road correctly. Similarly in technical terms, either the application sessions which cause the deadlock should be identified and killed (or) the application design should be properly constructed.
Deadlock illustration
Another simple diagram which can help understand deadlock
exclusive lock X
SID 1 ==============>TABLE1
◥ (blocked)
(1 need to ╱
acquire lock ╲ ╱
on TABLE1) ╲ ╱
╱ ╲ Deadlock
╱ ╲
(54 need to ╲
acquire lock ◢
on TABLE2 ) (blocked)
SID 54 ==============> TABLE2
exclusive lock X
Symptoms:
Whenever a deadlock occur, there occur enqueues in AWR report
A simple simulation to cause a deadlock
--Create user t1
SQL> create user t1 identified by password;
User created.
--Create another user t2
SQL> create user t2 identified by password;
User created.
--Grant privileges to t1
SQL> grant connect,resource,unlimited tablespace,create session to t1;
Grant succeeded.
--Grant privileges to t2
SQL> grant connect,resource,unlimited tablespace,create session to t2;
Grant succeeded.
SQL> conn t1
Enter password:
Connected.
--create table t1
SQL> create table t1.table1 (name varchar(20),id number(10),primary key (id));
Table created.
SQL> conn t2
Enter password:
Connected.
--create table t2
SQL> create table t2.table2 (name varchar(20),id number(10),primary key (id));
Table created.
SQL> conn / as sysdba
Connected.
SQL> grant select,insert,update,delete,references on t2.table2 to t1;
Grant succeeded.
SQL> grant select,insert,update,delete,references on t1.table1 to t2;
Grant succeeded.
--Insert rows to TABLE1
SQL> insert all into t1.table1 values('abc',1)
into t1.table1 values('bcd',2)
into t1.table1 values('cde',3)
select * from dual;
3 rows created.
SQL> commit;
Commit complete.
--Insert rows to TABLE2
SQL> insert all into t2.table2 values('abc',1)
into t2.table2 values('bcd',2)
into t2.table2 values('cde',3)
select * from dual;
3 rows created.
SQL> commit;
Commit complete.
-Add foreign key with reference to id column in TABLE2
SQL> alter table t2.table2 add constraint fk_id1 foreign key(id) references t1.table1(id);
Table altered.
--Gather stats on TABLE1 and TABLE2
SQL> exec dbms_stats.gather_table_stats('T1','TABLE1');
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats('T2','TABLE2');
PL/SQL procedure successfully completed.
--Connect to t1 and update a row in TABLE1
SQL> connect t1
Enter password:
Connected.
SQL>
SQL> update t1.table1 set name='opn' where id=1;
1 row updated.
--Connect to t2 and update a row in TABLE2
SQL> connect t2
Enter password:
Connected.
SQL> update t2.table2 set name='xyz' where id=1;
1 row updated.
--Check the sid and serial# of the two sessions t1 and t2
SQL> connect / as sysdba
SQL> select sid,serial#,program,module from v$session where program like '%sqlplus%';
SID SERIAL# PROGRAM MODULE
---------- ---------- -------------------- --------------------
1 48611 sqlplus@dptest (TNS SQL*Plus
V1-V3)
54 36555 sqlplus@dptest (TNS SQL*Plus
V1-V3)
--Here both TABLE1 and TABLE2 have exclusive locks by session t1 and t2
SQL> select OBJECT_ID,SESSION_ID,PROCESS,LOCKED_MODE from v$locked_object where SESSION_ID in ('1','54');
OBJECT_ID SESSION_ID PROCESS LOCKED_MODE
---------- ---------- ------------------------ -----------
20657 1 14965 3
20655 54 15378 3
--Using object_id, get the object name
SQL> select owner,object_id,object_name,object_type from dba_objects where object_id in ('20657','20655');
OWNER OBJECT_ID OBJECT_NAME OBJECT_TYPE
-------------------- ---------- -------------------- -----------------------
T1 20655 TABLE1 TABLE
T2 20657 TABLE2 TABLE
--Again connect to t1 and update TABLE2 from session t1
--Here we can see the message deadlock detected
SQL> connect t1
Enter password:
Connected.
SQL> update t2.table2 set name='xyz' where id=1;
update t2.table2 set name='xyz' where id=1
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
connect to t2 and update TABLE1 from session t2
SQL> connect t2
Enter password:
Connected.
SQL> update t1.table1 set name='opn' where id=1;
Interpretation of deadlock:
Always look for the trace file from alertlog by searching with the error ‘ora-00060’
ORA-00060: Deadlock detected. See Note 60.1 at My Oracle Support for Troubleshooting ORA-60 Errors. More info in file /u01/app/oracle/diag/rdbms/mtdb/mtdb/trace/mtdb_ora_15561.trc.
Almost all trace file generated during deadlock contains deadlock graph which has the information such as blocking and waiting sessions involved , SQL statement, process id from OS etc..
It is important to note the “Rows waited on:” session which contains useful information on locked rows during deadlock which can be useful for application developers for investigation
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-00070010-000002f2 25 161 X 34 19 X
TX-0006000a-0001ebb4 34 19 X 25 161 X
session 161: DID 0001-0019-0000009F session 19: DID 0001-0022-00000108
session 19: DID 0001-0022-00000108 session 161: DID 0001-0019-0000009F
Rows waited on:
Session 161: obj - rowid = 00015A6C - AAAVpsAAEAAAAILAAA
(dictionary objn - 88684, file - 4, block - 523, slot - 0)
Session 19: obj - rowid = 00015A6E - AAAVpuAAEAAAAIbAAA
(dictionary objn - 88686, file - 4, block - 539, slot - 0)
Identify the rowid and block number from the victim tables with the help of above deadlock trace information.
SQL> select object_name,object_type from dba_objects where object_id in ('88686','88684');
OBJECT_NAME OBJECT_TYPE
-------------------- --------------------
TABLE1 TABLE
TABLE2 TABLE
SQL> select rowid,dbms_rowid.rowid_block_number(rowid) "BLOCK" from t2.table2 where dbms_rowid.rowid_block_number(rowid)=539;
ROWID BLOCK
------------------ ----------
AAAVpuAAEAAAAIbAAA 539
AAAVpuAAEAAAAIbAAB 539
AAAVpuAAEAAAAIbAAC 539
SQL> select rowid,dbms_rowid.rowid_block_number(rowid) "block" from t1.table1 where dbms_rowid.rowid_block_number(rowid)=523;
ROWID BLOCK
------------------ ----------
AAAVpsAAEAAAAILAAA 523
AAAVpsAAEAAAAILAAB 523
AAAVpsAAEAAAAILAAC 523
Below are information related to the deadlock sessions and the SQL
----- Information for the OTHER waiting sessions -----
Session 1:
sid: 1 ser: 48611 audsid: 320008 user: 66/T2
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
pid: 6 O/S info: user: oracle, term: UNKNOWN, ospid: 15503
image: oracle@dptest (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/0, ospid: 14965
machine: dptest program: sqlplus@dptest (TNS V1-V3)
application name: SQL*Plus, hash value=3669949024
current SQL:
update t1.table1 set name='opn' where id=1
----- End of information for the OTHER waiting sessions -----
Information for THIS session:
----- Current SQL Statement for this session (sql_id=ft7mktbm8vs76) -----
update t2.table2 set name='xyz' where id=1
===================================================
PROCESS STATE
-------------
Process global information:
process: 0xb9a5f458, call: 0xb9fb9ac0, xact: 0xb7e35360, curses: 0xb9e942a8, usrses: 0xb9e942a8
in_exception_handler: no
----------------------------------------
SO: 0xb9a5f458, type: 2, owner: (nil), flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0xb9a5f458, name=process, file=ksu.h LINE:13949, pg=0 conuid=0
(process) Oracle pid:39, ser:176, calls cur/top: 0xb9fb9ac0/0xb9fb9ac0
flags : (0x0) - icon_uid:0
Traces can be manually generated for deadlock using 10027 code which is an undocumented event
SQL> update t2.table2 set name='xyz' where id=1;
1 row updated.
SQL> alter session set tracefile_identifier='deadlock60';
Session altered.
SQL> alter session set events '10027 trace name context forever,level 12';
Session altered.
SQL> update t1.table1 set name='opn' where id=1;
update t1.table1 set name='opn' where id=1
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
SQL> alter session set events '10027 trace name context off';
Session altered.
SQL> exit
Check the type of enqueue which the session wait on. Whenever a deadlock is observed, always there should be an enqueue waits behind error. It is DBA’s responsibility to diagnose the enqueue type and proceed accordingly
Below wait stack from trace file shows enq: TX – row lock contention which is a common enqueue during multiple update operations to safeguard the transactions from concurrency. This TX lock is an application code issue or overlapping job schedule at same time.
This query can be used to identify the top occupants in sysaux
col OCCUPANT_NAME for a20
col OCCUPANT_DESC for a40
select *
from
(
select OCCUPANT_NAME,
OCCUPANT_DESC,
SPACE_USAGE_KBYTES/1048576 OCCUPIED_MB
from
v$sysaux_occupants
order by 3 desc
)
where rownum < 6;
Find the top segments lying on sysaux
col SEGMENT_NAME for a20
select * from
(select segment_name,
bytes/1073741824 GB,
segment_type
from
dba_segments
where tablespace_name='SYSAUX'
order by 2 desc)
where rownum < 6;
Something similar to the below query come into picture which cause high CPU usage,memory usage and generate big trace files in the BACKGROUND DUMP DEST location and continues to grow frequently.
The SQL text itself fills almost 5 pages and its execution plan length is almost 362 lines
This is due to optimizer_adaptive_plan internal job which checks for the costly SQL statements in the database
After tons of research, came to find out 12c feature “Automatic Report Capturing Feature” where MMON slaves monitor the costly SQL statements from the database and generate automated SQL monitoring report. This query uses parallel to pull the report from database
WITH MONITOR_DATA AS (SELECT * FROM TABLE(GV$(CURSOR( SELECT USERENV('instance') AS INST_ID, KEY, NVL2(PX_QCSID, NULL, STATUS) STATUS, FIRST_REFRESH_TIME, LAST_REFRESH_TIME, REFRESH_COUNT, PROCESS_NAME, SID, SQL_ID, SQL_EXEC_START, SQL_EXEC_ID, DBOP_NAME, DBOP_EXEC_ID, SQL_PLAN_HASH_VALUE, SQL_FULL_PLAN_HASH_VALUE, SESSION_SERIAL#, SQL_TEXT, PX_SERVER#, PX_SERVER_GROUP, PX_SERVER_SET, PX_QCINST_ID, PX_QCSID, CASE WHEN
..............................................................
XMLELEMENT( "stat", XMLATTRIBUTES('write_bytes' AS "name"), MA.SUM_WRITE_BYTES)), DECODE(MA.SUM_IO_UNC_BYTES, 0, NULL, XMLELEMENT( "stat", XMLATTRIBUTES('unc_bytes' AS "name"), MA.SUM_IO_UNC_BYTES)), DECODE(MA.SUM_IO_ELIG_BYTES, 0, NULL, XMLELEMENT( "stat", XMLATTRIBUTES('elig_bytes' AS "name"), MA.SUM_IO_ELIG_BYTES)), DECODE(MA.SUM_IO_RET_BYTES, 0, NULL, XMLELEMENT( "stat", XMLATTRIBUTES('ret_bytes' AS "name"), MA.SUM_IO_RET_BYTES)), CASE WHEN MA.SUM_IO_INTER_BYTES IS NULL OR MA.SUM_IO_INTER_BYTES = MA.SUM_IO_BYTES OR MA.SUM_IO_BYTES = 0 OR NVL(MA.SUM_IO_ELIG_BYTES, 0) = 0 THEN NULL ELSE XMLELEMENT( "stat", XMLATTRIBUTES('cell_offload_efficiency' AS "name"), ROUND(MA.SUM_IO_BYTES / DECODE(MA.SUM_IO_INTER_BYTES, 0, 1, MA.SUM_IO_INTER_BYTES), 2)) END, CASE WHEN NVL(MA.SUM_IO_ELIG_BYTES, 0) = 0 OR MA.SUM_IO_BYTES = 0 OR MA.SUM_IO_UNC_BYTES = MA.SUM_IO_RET_BYTES THEN NULL ELSE XMLELEMENT( "stat", XMLATTRIBUTES('cell_offload_efficiency2' AS "name"), GREATEST(0, ROUND(100 * (GREATEST(MA.SUM_IO_UNC_BYTES, MA.SUM_IO_ELIG_BYTES) - MA.SUM_IO_RET_BYTES) / MA.SUM_IO_ELIG_BYTES, 2))) END) ) END) FROM MONITOR_AGG MA) ELSE NULL END) FROM (SELECT CASE WHEN V1.XPLAN_XML IS NULL OR V1.XPLAN_XML.EXISTSNODE('/error') > 0 THEN NULL ELSE V1.XPLAN_XML END XPLAN_XML FROM (SELECT CASE WHEN :B36 = 1 THEN DBMS_XPLAN.BUILD_PLAN_XML( TABLE_NAME=>'gv$all_sql_plan', PLAN_TAG=>'plan', FILTER_PREDS=>:B35 , FORMAT=>'-PROJECTION +ALIAS +ADAPTIVE') ELSE NULL END XPLAN_XML FROM DUAL) V1) CONST_VIEW
At last we can see this weird message related to pq slaves
WATSON_PRAGMA BEGINAPPEND file=pq.wat
com='soft_assert' sql_id=1zd9dv1f2whjj fi='#FILE' li=#LINE mesg=qkaParallelizePost: PARSE1 plansig is DIFFERENT
WATSON_PRAGMA ENDAPPEND
The information related to the SQL can be seen in either in GV$SQLSTATS or DBA_HIST_SQLSTAT. High CPU and PGA consumption can be observed as symptoms
SQL> desc v$sqlstats;
Name Null? Type
----------------------------------------- -------- ----------------------------
SQL_TEXT VARCHAR2(1000)
SQL_FULLTEXT CLOB
DBA_HIST_SQLSTAT results showing significant sorts and cpu usage
Disable MMON monitoring with hidden parameter ‘_report_capture_cycle_time’. This will not negatively impact any performance or database plans. This is a dynamic parameter as no bounce required.Default value of the parameter is 60 sec
SQL> set lines 200 pages 1000
col KSPPINM for a20
col KSPPSTVL for a20
select x.ksppinm,
y.ksppstvl
from
x$ksppi x
inner join x$ksppcv y on (x.indx = y.indx)
and
x.ksppinm like '%_report_capture_cycle_time%'
order by x.ksppinm;SQL> SQL> SQL> 2 3 4 5 6 7 8
KSPPINM KSPPSTVL
-------------------- --------------------
_report_capture_cycl 60
e_time
Modify the value to 0
alter system set "_report_capture_cycle_time"=0;
or
Kill the MMON session from database level using sid and serial# or Kill all the MMON slave processes from OS using “kill” command to prevent rebirth of the process
This will stop high CPU usage, PGA usage and also big trace generation