BIG PPA TRACE FILES FILL UP QUICKLY IN BACKGROUND DUMP DEST IN ORACLE RAC WITH “WATSON_PRAGMA BEGINAPPEND file=pq.wat “

Cause:

Something similar to the below query come into picture which cause high CPU usage,memory usage and generate big trace files in the BACKGROUND DUMP DEST location and continues to grow frequently.

The SQL text itself fills almost 5 pages and its execution plan length is almost 362 lines

This is due to optimizer_adaptive_plan internal job which checks for the costly SQL statements in the database

After tons of research, came to find out 12c feature “Automatic Report Capturing Feature” where MMON slaves monitor the costly SQL statements from the database and generate automated SQL monitoring report. This query uses parallel to pull the report from database

WITH MONITOR_DATA AS (SELECT * FROM TABLE(GV$(CURSOR( SELECT USERENV('instance') AS INST_ID, KEY, NVL2(PX_QCSID, NULL, STATUS) STATUS, FIRST_REFRESH_TIME, LAST_REFRESH_TIME, REFRESH_COUNT, PROCESS_NAME, SID, SQL_ID, SQL_EXEC_START, SQL_EXEC_ID, DBOP_NAME, DBOP_EXEC_ID, SQL_PLAN_HASH_VALUE, SQL_FULL_PLAN_HASH_VALUE, SESSION_SERIAL#, SQL_TEXT, PX_SERVER#, PX_SERVER_GROUP, PX_SERVER_SET, PX_QCINST_ID, PX_QCSID, CASE WHEN 
..............................................................
XMLELEMENT( "stat", XMLATTRIBUTES('write_bytes' AS "name"), MA.SUM_WRITE_BYTES)), DECODE(MA.SUM_IO_UNC_BYTES, 0, NULL, XMLELEMENT( "stat", XMLATTRIBUTES('unc_bytes' AS "name"), MA.SUM_IO_UNC_BYTES)), DECODE(MA.SUM_IO_ELIG_BYTES, 0, NULL, XMLELEMENT( "stat", XMLATTRIBUTES('elig_bytes' AS "name"), MA.SUM_IO_ELIG_BYTES)), DECODE(MA.SUM_IO_RET_BYTES, 0, NULL, XMLELEMENT( "stat", XMLATTRIBUTES('ret_bytes' AS "name"), MA.SUM_IO_RET_BYTES)), CASE WHEN MA.SUM_IO_INTER_BYTES IS NULL OR MA.SUM_IO_INTER_BYTES = MA.SUM_IO_BYTES OR MA.SUM_IO_BYTES = 0 OR NVL(MA.SUM_IO_ELIG_BYTES, 0) = 0 THEN NULL ELSE XMLELEMENT( "stat", XMLATTRIBUTES('cell_offload_efficiency' AS "name"), ROUND(MA.SUM_IO_BYTES / DECODE(MA.SUM_IO_INTER_BYTES, 0, 1, MA.SUM_IO_INTER_BYTES), 2)) END, CASE WHEN NVL(MA.SUM_IO_ELIG_BYTES, 0) = 0 OR MA.SUM_IO_BYTES = 0 OR MA.SUM_IO_UNC_BYTES = MA.SUM_IO_RET_BYTES THEN NULL ELSE XMLELEMENT( "stat", XMLATTRIBUTES('cell_offload_efficiency2' AS "name"), GREATEST(0, ROUND(100 * (GREATEST(MA.SUM_IO_UNC_BYTES, MA.SUM_IO_ELIG_BYTES) - MA.SUM_IO_RET_BYTES) / MA.SUM_IO_ELIG_BYTES, 2))) END) ) END) FROM MONITOR_AGG MA) ELSE NULL END) FROM (SELECT CASE WHEN V1.XPLAN_XML IS NULL OR V1.XPLAN_XML.EXISTSNODE('/error') > 0 THEN NULL ELSE V1.XPLAN_XML END XPLAN_XML FROM (SELECT CASE WHEN :B36 = 1 THEN DBMS_XPLAN.BUILD_PLAN_XML( TABLE_NAME=>'gv$all_sql_plan', PLAN_TAG=>'plan', FILTER_PREDS=>:B35 , FORMAT=>'-PROJECTION +ALIAS +ADAPTIVE') ELSE NULL END XPLAN_XML FROM DUAL) V1) CONST_VIEW

At last we can see this weird message related to pq slaves

WATSON_PRAGMA BEGINAPPEND file=pq.wat
com='soft_assert' sql_id=1zd9dv1f2whjj fi='#FILE' li=#LINE mesg=qkaParallelizePost: PARSE1 plansig is DIFFERENT
WATSON_PRAGMA ENDAPPEND

The information related to the SQL can be seen in either in GV$SQLSTATS or DBA_HIST_SQLSTAT. High CPU and PGA consumption can be observed as symptoms

SQL>  desc v$sqlstats;
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 SQL_TEXT                                           VARCHAR2(1000)
 SQL_FULLTEXT                                       CLOB

DBA_HIST_SQLSTAT results showing significant sorts and cpu usage

EXECUTIONS_TOTAL PX_SERVERS_EXECS_TOTAL CPU_TIME_TOTAL SORTS_TOTAL CPU_TIME_DELTA ELAPSED_TIME_TOTAL
---------------- ---------------------- -------------- ----------- -------------- ------------------
               0                    623       53034379          48       40730071           53269066
              31                     88       18513940           4        8898557          113436029
               0                    608       83269701           0       50561093           83868172
             148                    280       45490931           0       13996843          491552201
               0                   1380      110711355          22       36645628          111192028
             172                    204       46078799           0        4516104          565271800
               0                   1626      125725433          26       43793914          126342285
             198                    153       43952141           0        4482204          640999849
               0                   1499      112356334          24       42413703          112885548
             184                    166       47226834           0        6499419          602899488
               0                    761      106623645           0       48854081          106975202
             113                    273       41893345           0       17663494          382788576
               0                   1245      131872236          46       62460804          132846903
               0                   1056      111428168           0       44131328          112206218
             161                    319       52306125           0        6992650          538329063
              74                    197       32978691           0       11366575          256941792

Workaround:

Disable MMON monitoring with hidden parameter ‘_report_capture_cycle_time’. This will not negatively impact any performance or database plans. This is a dynamic parameter as no bounce required.Default value of the parameter is 60 sec

SQL> set lines 200 pages 1000
col KSPPINM for a20
col KSPPSTVL for a20
select x.ksppinm,
            y.ksppstvl
     from
  x$ksppi x
  inner join x$ksppcv y on (x.indx = y.indx)
and
  x.ksppinm like '%_report_capture_cycle_time%'
order by x.ksppinm;SQL> SQL> SQL>   2    3    4    5    6    7    8

KSPPINM              KSPPSTVL
-------------------- --------------------
_report_capture_cycl 60
e_time

Modify the value to 0

alter system set "_report_capture_cycle_time"=0;

or

Kill the MMON session from database level using sid and serial# or Kill all the MMON slave processes from OS using “kill” command to prevent rebirth of the process

This will stop high CPU usage, PGA usage and also big trace generation

Leave a Reply