BIG PPA TRACE FILES FILL UP QUICKLY IN BACKGROUND DUMP DEST IN ORACLE RAC WITH “WATSON_PRAGMA BEGINAPPEND file=pq.wat “
Cause:
Something similar to the below query come into picture which cause high CPU usage,memory usage and generate big trace files in the BACKGROUND DUMP DEST location and continues to grow frequently.
The SQL text itself fills almost 5 pages and its execution plan length is almost 362 lines
This is due to optimizer_adaptive_plan internal job which checks for the costly SQL statements in the database
After tons of research, came to find out 12c feature “Automatic Report Capturing Feature” where MMON slaves monitor the costly SQL statements from the database and generate automated SQL monitoring report. This query uses parallel to pull the report from database
WITH MONITOR_DATA AS (SELECT * FROM TABLE(GV$(CURSOR( SELECT USERENV('instance') AS INST_ID, KEY, NVL2(PX_QCSID, NULL, STATUS) STATUS, FIRST_REFRESH_TIME, LAST_REFRESH_TIME, REFRESH_COUNT, PROCESS_NAME, SID, SQL_ID, SQL_EXEC_START, SQL_EXEC_ID, DBOP_NAME, DBOP_EXEC_ID, SQL_PLAN_HASH_VALUE, SQL_FULL_PLAN_HASH_VALUE, SESSION_SERIAL#, SQL_TEXT, PX_SERVER#, PX_SERVER_GROUP, PX_SERVER_SET, PX_QCINST_ID, PX_QCSID, CASE WHEN
..............................................................
XMLELEMENT( "stat", XMLATTRIBUTES('write_bytes' AS "name"), MA.SUM_WRITE_BYTES)), DECODE(MA.SUM_IO_UNC_BYTES, 0, NULL, XMLELEMENT( "stat", XMLATTRIBUTES('unc_bytes' AS "name"), MA.SUM_IO_UNC_BYTES)), DECODE(MA.SUM_IO_ELIG_BYTES, 0, NULL, XMLELEMENT( "stat", XMLATTRIBUTES('elig_bytes' AS "name"), MA.SUM_IO_ELIG_BYTES)), DECODE(MA.SUM_IO_RET_BYTES, 0, NULL, XMLELEMENT( "stat", XMLATTRIBUTES('ret_bytes' AS "name"), MA.SUM_IO_RET_BYTES)), CASE WHEN MA.SUM_IO_INTER_BYTES IS NULL OR MA.SUM_IO_INTER_BYTES = MA.SUM_IO_BYTES OR MA.SUM_IO_BYTES = 0 OR NVL(MA.SUM_IO_ELIG_BYTES, 0) = 0 THEN NULL ELSE XMLELEMENT( "stat", XMLATTRIBUTES('cell_offload_efficiency' AS "name"), ROUND(MA.SUM_IO_BYTES / DECODE(MA.SUM_IO_INTER_BYTES, 0, 1, MA.SUM_IO_INTER_BYTES), 2)) END, CASE WHEN NVL(MA.SUM_IO_ELIG_BYTES, 0) = 0 OR MA.SUM_IO_BYTES = 0 OR MA.SUM_IO_UNC_BYTES = MA.SUM_IO_RET_BYTES THEN NULL ELSE XMLELEMENT( "stat", XMLATTRIBUTES('cell_offload_efficiency2' AS "name"), GREATEST(0, ROUND(100 * (GREATEST(MA.SUM_IO_UNC_BYTES, MA.SUM_IO_ELIG_BYTES) - MA.SUM_IO_RET_BYTES) / MA.SUM_IO_ELIG_BYTES, 2))) END) ) END) FROM MONITOR_AGG MA) ELSE NULL END) FROM (SELECT CASE WHEN V1.XPLAN_XML IS NULL OR V1.XPLAN_XML.EXISTSNODE('/error') > 0 THEN NULL ELSE V1.XPLAN_XML END XPLAN_XML FROM (SELECT CASE WHEN :B36 = 1 THEN DBMS_XPLAN.BUILD_PLAN_XML( TABLE_NAME=>'gv$all_sql_plan', PLAN_TAG=>'plan', FILTER_PREDS=>:B35 , FORMAT=>'-PROJECTION +ALIAS +ADAPTIVE') ELSE NULL END XPLAN_XML FROM DUAL) V1) CONST_VIEW
At last we can see this weird message related to pq slaves
WATSON_PRAGMA BEGINAPPEND file=pq.wat
com='soft_assert' sql_id=1zd9dv1f2whjj fi='#FILE' li=#LINE mesg=qkaParallelizePost: PARSE1 plansig is DIFFERENT
WATSON_PRAGMA ENDAPPEND
The information related to the SQL can be seen in either in GV$SQLSTATS or DBA_HIST_SQLSTAT. High CPU and PGA consumption can be observed as symptoms
SQL> desc v$sqlstats;
Name Null? Type
----------------------------------------- -------- ----------------------------
SQL_TEXT VARCHAR2(1000)
SQL_FULLTEXT CLOB
DBA_HIST_SQLSTAT results showing significant sorts and cpu usage
EXECUTIONS_TOTAL PX_SERVERS_EXECS_TOTAL CPU_TIME_TOTAL SORTS_TOTAL CPU_TIME_DELTA ELAPSED_TIME_TOTAL
---------------- ---------------------- -------------- ----------- -------------- ------------------
0 623 53034379 48 40730071 53269066
31 88 18513940 4 8898557 113436029
0 608 83269701 0 50561093 83868172
148 280 45490931 0 13996843 491552201
0 1380 110711355 22 36645628 111192028
172 204 46078799 0 4516104 565271800
0 1626 125725433 26 43793914 126342285
198 153 43952141 0 4482204 640999849
0 1499 112356334 24 42413703 112885548
184 166 47226834 0 6499419 602899488
0 761 106623645 0 48854081 106975202
113 273 41893345 0 17663494 382788576
0 1245 131872236 46 62460804 132846903
0 1056 111428168 0 44131328 112206218
161 319 52306125 0 6992650 538329063
74 197 32978691 0 11366575 256941792
Workaround:
Disable MMON monitoring with hidden parameter ‘_report_capture_cycle_time’. This will not negatively impact any performance or database plans. This is a dynamic parameter as no bounce required.Default value of the parameter is 60 sec
SQL> set lines 200 pages 1000
col KSPPINM for a20
col KSPPSTVL for a20
select x.ksppinm,
y.ksppstvl
from
x$ksppi x
inner join x$ksppcv y on (x.indx = y.indx)
and
x.ksppinm like '%_report_capture_cycle_time%'
order by x.ksppinm;SQL> SQL> SQL> 2 3 4 5 6 7 8
KSPPINM KSPPSTVL
-------------------- --------------------
_report_capture_cycl 60
e_time
Modify the value to 0
alter system set "_report_capture_cycle_time"=0;
or
Kill the MMON session from database level using sid and serial# or Kill all the MMON slave processes from OS using “kill” command to prevent rebirth of the process
This will stop high CPU usage, PGA usage and also big trace generation