QUERY TRANSFORMER IN SQL EXECUTION ORACLE

During execution of an SQL statement in oracle database by a client or user, query has to go into certain phases.

  • SYNTAX CHECK
  • SEMANTIC CHECK
  • SHARED POOL CHECK

Above three steps altogether considered as parsing

After these checks we directly go to optimizers execution plan to access the blocks from the database. This phase is called as optimization.

But before that, we have a phase called query transformation

The sql execution goes into query transformation phase with a set of query blocks which is defined implicitly by SELECT

Each query block is either nested within another or interrelated

Query transformer determines the optimal query transformation to generate a better execution plan by rewrite of query before optimizer generate the plan

Post all these steps , we have row source generation phase called generation of query plan.

In this article , we take a peek on query transformer

Example:

Single query block:

select * from afterdrop where cid='6697053';

Dual query block:

select * from afterdrop where cid in (select cid from afterdrop);

The dual query block can be rewritten in way by using sub plan separately for the sub query or considering a rewrite for join for the inner query. But the result set is same.

Types of query transformation:

  • View merging
  • Subquery unnesting
  • Predicate pushing
  • MV query rewrite

Let us see the query transformation techniques a bit more in detail!

View merging:

Basically, view merging is merging views with the main query block with single execution. A view can also be executed with a sub plan from the main query plan. View merging is method of query transformation which transforms a view either in line or stored views that can be analyzed into separate query blocks or as a single execution

The SQL query can be rewritten without the view. Below are the scenarios when view merging occur,

  • column used in index within another query block
  • column used for partition pruning within another query block
  • condition that limit the rows returned from tables in joined view

Below are the scenarios when view merging less likely to occur,

  • ORDER BY
  • ROWNUM
  • UNION,INTERSECT,MINUS

In the below example query, view merge take place by not considering the view STB1 as a view itself with sub plan.

If you see the second step, the view is directly used as a sub plan with a child step ‘ VIEW PUSHED PREDICATE ‘. Here the view is correlated and is evaluated for each row of outer query block

1) WITH VIEW MERGING:

kish@exdbx<>select * from sales_tab stb,
(select order_id from STB1) st
where stb.order_id = st.order_id(+)
and
stb.TOTAL_PROFIT > 3000;
  2    3    4    5  Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 920325918

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |    69 |  3272   (1)| 00:00:40 |
|*  1 |  HASH JOIN OUTER             |           |     1 |    69 |  3272   (1)| 00:00:40 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SALES_TAB |     1 |    63 |     0   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | TP_STB1   |     1 |       |     0   (0)| 00:00:01 |
|   4 |   TABLE ACCESS STORAGE FULL  | SALES_T   |  1048K|  6144K|  3269   (1)| 00:00:40 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("STB"."ORDER_ID"="ORDER_ID"(+))
   3 - access("STB"."TOTAL_PROFIT">3000)

========================================================================================================
2) WITHOUT VIEW MERGING:

kish@exdbx<>select * from sales_tab stb,
(select /*+NO_MERGE*/ order_id from STB1) st
where stb.order_id = st.order_id(+)
and
stb.TOTAL_PROFIT > 3000;  2    3    4    5
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3052241770

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |    76 |  3271   (1)| 00:00:40 |
|   1 |  NESTED LOOPS OUTER          |           |     1 |    76 |  3271   (1)| 00:00:40 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SALES_TAB |     1 |    63 |     0   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | TP_STB1   |     1 |       |     0   (0)| 00:00:01 |
|   4 |   VIEW PUSHED PREDICATE      |           |     1 |    13 |  3271   (1)| 00:00:40 |
|*  5 |    TABLE ACCESS STORAGE FULL | SALES_T   |     1 |     6 |  3271   (1)| 00:00:40 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("STB"."TOTAL_PROFIT">3000)
   5 - filter("ORDER_ID"="STB"."ORDER_ID")

These are the hidden parameters which influence view merging. Complex view merging is enabled to TRUE by default

kish@exdbx<>select KSPPINM from x$ksppi where KSPPINM like '%merging%';

KSPPINM
--------------------------------------------------------------------------------
_complex_view_merging
_simple_view_merging
optimizer_secure_view_merging

Subquery unnesting:

Subquery unnesting is the method to convert a subquery into a join. The query transformer looks for ways to convert a subquery to join which is faster than the sub plan. Below is a simple example.

Here, the first step shows a single plan step for the both outer query and sub query with subquery unnesting in place.

Second step shows the query without unnesting the inner query which has a sub plan in it and uses a filter operation which is not efficient while joining tables. Why this happen because, the optimizer scans SALES_TAB table 1048k times up and down which is less better than table joins

1) USING UNNESTING:

kish@exdbx<>set autot traceonly explain
kish@exdbx<>select * from sales_t where order_id in (select order_id from SALES_TAB);
Elapsed: 00:00:00.09
Execution Plan
----------------------------------------------------------
Plan hash value: 522758219

------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |  1048K|    81M|       | 12117   (1)| 00:02:26 |
|*  1 |  HASH JOIN RIGHT SEMI      |           |  1048K|    81M|    17M| 12117   (1)| 00:02:26 |
|*  2 |   TABLE ACCESS STORAGE FULL| SALES_TAB |  1048K|  5120K|       |  3629   (1)| 00:00:44 |
|   3 |   TABLE ACCESS STORAGE FULL| SALES_T   |  1048K|    76M|       |  3274   (1)| 00:00:40 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ORDER_ID"="ORDER_ID")
   2 - storage("ORDER_ID" IS NOT NULL)
       filter("ORDER_ID" IS NOT NULL)

======================================================================================================
2) USING NO_UNNESTING:

kish@exdbx<>set autot traceonly explain
kish@exdbx<>select order_id,order_priority from sales_t where order_id in (select /*+NO_UNNEST*/ order_id from SALES_TAB where order_id > 369);
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 3081409736

----------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |           |     1 |     8 |  3772M  (1)|999:59:59 |
|*  1 |  FILTER                                |           |       |       |            |          |
|   2 |   TABLE ACCESS STORAGE FULL            | SALES_T   |  1048K|  8192K|  3270   (1)| 00:00:40 |
|*  3 |   FILTER                               |           |       |       |            |          |
|*  4 |    TABLE ACCESS STORAGE FULL FIRST ROWS| SALES_TAB |     1 |     5 |  3630   (1)| 00:00:44 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "SALES_TAB" "SALES_TAB" WHERE 369<:B1
              AND "ORDER_ID"=:B2 AND "ORDER_ID">369))
   3 - filter(369<:B1)
   4 - filter("ORDER_ID"=:B1 AND "ORDER_ID">369)

Subquery unnesting is also a faster way to execute a subquery by removing unnecessary processing of sub plan in the query.

Things which restrict subquery unnesting

  • Disjunctive linking
  • Disjunctive correlation
  • ROWNUM(where rownum > 1)
  • CONNECT BY in subquery
  • Nested aggregate functions like MAX(COUNT(*))
  • set operator in subquery(UNION,, UNION ALL)
  • Correlated subquery block reference not in the outer query

Below is another example for subquery unnesting with correlated subquery. Here the subquery is unnested in the form of an inline view, merged with the outer query and joined atlast. The column TOTAL_PROFIT becomes the join condition and rest of the plan is spent on inline view itself.

1) SUBQUERY UNNESTING WITH CORRELATED SUBQUERY
kish@exdbx<>select x.order_id,x.order_priority,x.TOTAL_PROFIT
  2  from
  3  sales_t x
  4  where x.TOTAL_PROFIT >
  5  (select avg(y.TOTAL_PROFIT)
  6  from SALES_TAB y
  7  where y.order_id=x.order_id);
Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
Plan hash value: 2553163008

---------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |           | 39322 |  1804K|       | 10292   (1)| 00:02:04 |
|   1 |  NESTED LOOPS                 |           | 39322 |  1804K|       | 10292   (1)| 00:02:04 |
|   2 |   NESTED LOOPS                |           |  1043K|  1804K|       | 10292   (1)| 00:02:04 |
|   3 |    VIEW                       | VW_SQ_1   |  1043K|    25M|       | 10286   (1)| 00:02:04 |
|   4 |     HASH GROUP BY             |           |  1043K|    17M|    37M| 10286   (1)| 00:02:04 |
|   5 |      TABLE ACCESS STORAGE FULL| SALES_TAB |  1398K|    24M|       |  3633   (1)| 00:00:44 |
|*  6 |    INDEX RANGE SCAN           | TP_STB    |     1 |       |       |     0   (0)| 00:00:01 |
|*  7 |   TABLE ACCESS BY INDEX ROWID | SALES_T   |     1 |    21 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("X"."TOTAL_PROFIT">"AVG(Y.TOTAL_PROFIT)")
   7 - filter("ITEM_1"="X"."ORDER_ID")

==========================================================================================================
2) REWRITTEN QUERY TO PREVIOUS QUERY
kish@exdbx<>select x.order_id,x.order_priority,x.TOTAL_PROFIT
from
sales_t x,
(select order_id,avg(TOTAL_PROFIT)
from SALES_TAB y group by order_id) y
where x.order_id=y.order_id;  2    3    4    5    6
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 2094056384

--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |   786K|    25M|       | 16438   (1)| 00:03:18 |
|*  1 |  HASH JOIN                   |           |   786K|    25M|    24M| 16438   (1)| 00:03:18 |
|   2 |   VIEW                       |           |  1043K|    12M|       | 10286   (1)| 00:02:04 |
|   3 |    HASH GROUP BY             |           |  1043K|    17M|    37M| 10286   (1)| 00:02:04 |
|   4 |     TABLE ACCESS STORAGE FULL| SALES_TAB |  1398K|    24M|       |  3633   (1)| 00:00:44 |
|   5 |   TABLE ACCESS STORAGE FULL  | SALES_T   |  1048K|    21M|       |  3273   (1)| 00:00:40 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("X"."ORDER_ID"="Y"."ORDER_ID")

Hidden parameters which influence the subquery unnesting are below. ‘ _unnest_subquery ‘ is the parameter which is enabled by default which directly controls correlated subquery

kish@exdbx<>select KSPPINM from x$ksppi where KSPPINM like '%unnest%';

KSPPINM
--------------------------------------------------------------------------------
_unnest_subquery
_optimizer_unnest_all_subqueries
_distinct_view_unnesting
_optimizer_unnest_disjunctive_subq
_optimizer_unnest_corr_set_subq

Disjunctive linking:

kish@exdbx<>set autot traceonly explain
kish@exdbx<>select * from sales_t
  2  where ORDER_PRIORITY in (select order_id from SALES_TAB)
  3  or
  4  order_id=369;
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 3247245147

---------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |           | 52430 |  3891K|  3276   (1)| 00:00:40 |
|*  1 |  FILTER                               |           |       |       |            |          |
|   2 |   TABLE ACCESS STORAGE FULL           | SALES_T   |  1048K|    76M|  3276   (1)| 00:00:40 |
|*  3 |   TABLE ACCESS STORAGE FULL FIRST ROWS| SALES_TAB |     1 |     5 |  3634   (1)| 00:00:44 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("ORDER_ID"=369 OR  EXISTS (SELECT 0 FROM "SALES_TAB" "SALES_TAB" WHERE
              "ORDER_ID"=TO_NUMBER(:B1)))
   3 - filter("ORDER_ID"=TO_NUMBER(:B1))

Predicate pushing:

I would call Filter early faster execution(FEFE) for a good performance of SQL.

This method pushes the predicates first from an outer query before retrieving all rows in inner query block so that the predicates result sets can be calculated first to determine the filter before inner query execution.

Here optimizer may or may not use an index to filter the data.

Things which prevent predicate pushing

Example: If we want to travel to paris , then we need to know the direction or shortest way to it. Without having a basic idea on where we are heading to and the cost estimate to travel , we will spend more money if we go around the place.

Similarly predicate pushing pushes the calculation of predicates before processing sub query to spend less processing with less time.

QUERY WITH PREDICATE PUSHING 
 
kish@exdbx<>select x.order_id,x.order_priority,x.TOTAL_PROFIT
from
sales_t x,
(select order_id,avg(TOTAL_PROFIT) avg_pft
from SALES_TAB y group by order_id) y
where x.order_id=y.order_id
and x.TOTAL_PROFIT > y.avg_pft
and x.order_id=369;  2    3    4    5    6    7    8
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3354926154

-------------------------------------------------------------------------------------------
| Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |           |     1 |    36 |  3631   (1)| 00:00:44 |
|   1 |  MERGE JOIN                   |           |     1 |    36 |  3631   (1)| 00:00:44 |
|   2 |   SORT JOIN                   |           |     1 |    21 |     0   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| SALES_T   |     1 |    21 |     0   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN           | TP_STB    |     1 |       |     0   (0)| 00:00:01 |
|*  5 |   SORT JOIN                   |           |     1 |    15 |  3631   (1)| 00:00:44 |
|   6 |    VIEW                       |           |     1 |    15 |  3630   (1)| 00:00:44 |
|   7 |     HASH GROUP BY             |           |     1 |    18 |  3630   (1)| 00:00:44 |
|*  8 |      TABLE ACCESS STORAGE FULL| SALES_TAB |     1 |    18 |  3630   (1)| 00:00:44 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("X"."ORDER_ID"=369)
   5 - access(INTERNAL_FUNCTION("X"."TOTAL_PROFIT")>INTERNAL_FUNCTION("Y"."AVG_PFT"
              ))
       filter(INTERNAL_FUNCTION("X"."TOTAL_PROFIT")>INTERNAL_FUNCTION("Y"."AVG_PFT"
              ))
   8 - storage("ORDER_ID"=369)
       filter("ORDER_ID"=369)

===================================================================================================
QUERY WITH NO PREDICATE PUSHING

kish@exdbx<>select x.order_id,x.order_priority,x.TOTAL_PROFIT
from
sales_t x,
(select order_id,avg(TOTAL_PROFIT) avg_pft
from SALES_TAB y
where rownum > 1
group by order_id) y
where x.order_id=y.order_id
and x.TOTAL_PROFIT > y.avg_pft
and x.order_id=369;  2    3    4    5    6    7    8    9   10
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 2267306622

----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           | 52429 |  2406K|       | 13557   (1)| 00:02:43 |
|   1 |  NESTED LOOPS                  |           | 52429 |  2406K|       | 13557   (1)| 00:02:43 |
|*  2 |   TABLE ACCESS STORAGE FULL    | SALES_T   |     1 |    21 |       |  3271   (1)| 00:00:40 |
|*  3 |   VIEW                         |           | 52160 |  1324K|       | 10286   (1)| 00:02:04 |
|   4 |    SORT GROUP BY               |           |  1043K|    17M|    37M| 10286   (1)| 00:02:04 |
|   5 |     COUNT                      |           |       |       |       |            |          |
|*  6 |      FILTER                    |           |       |       |       |            |          |
|   7 |       TABLE ACCESS STORAGE FULL| SALES_TAB |  1398K|    24M|       |  3633   (1)| 00:00:44 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage("X"."ORDER_ID"=369)
       filter("X"."ORDER_ID"=369)
   3 - filter("X"."TOTAL_PROFIT">"Y"."AVG_PFT" AND "Y"."ORDER_ID"=369)
   6 - filter(ROWNUM>1)

Query rewrite with MV:

Materialized view are similar to a view which store the precomputed result set in table and transformer can rewrite the query before the excution of query. This method can be efficient for large result sets with aggregations

Query transformer rewrite the query by comparing the query with the materialized views and retrieve data from the matched materialized view result set

Below is an example of a query with inner joins with multiple plan steps

kish@exdbx<>set autot traceonly explain
kish@exdbx<>select x.order_id,x.order_priority,x.TOTAL_PROFIT
from
sales_t x
inner join sales_tab y on x.order_id = y.order_id
inner join tabx z on y.order_id = z.order_id;  2    3    4    5
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 972519073

----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           |  1059K|    31M|       | 11960   (1)| 00:02:24 |
|*  1 |  HASH JOIN                     |           |  1059K|    31M|    22M| 11960   (1)| 00:02:24 |
|*  2 |   HASH JOIN                    |           |  1053K|    10M|    17M|  5944   (1)| 00:01:12 |
|*  3 |    TABLE ACCESS STORAGE FULL   | SALES_TAB |  1048K|  5120K|       |  3629   (1)| 00:00:44 |
|   4 |    INDEX STORAGE FAST FULL SCAN| IDX_ID    |  1048K|  5120K|       |   620   (1)| 00:00:08 |
|   5 |   TABLE ACCESS STORAGE FULL    | SALES_T   |  1048K|    21M|       |  3273   (1)| 00:00:40 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("X"."ORDER_ID"="Y"."ORDER_ID")
   2 - access("Y"."ORDER_ID"="Z"."ORDER_ID")
   3 - storage("Y"."ORDER_ID" IS NOT NULL)
       filter("Y"."ORDER_ID" IS NOT NULL)

I create a materialized view(MV) for the select query

kish@exdbx<>create materialized view sales_mv
  2  enable query rewrite as
  3  select x.order_id,x.order_priority,x.TOTAL_PROFIT
  4  from
  5  sales_t x
  6  inner join sales_tab y on x.order_id = y.order_id
  7  inner join tabx z on y.order_id = z.order_id;

Materialized view created.

Optimizer chooses the MV without a hint most probably due to its speedness. But still i force the rewrite with a hint to use MV.

kish@exdbx<>select /*+rewrite(sales_mv)*/ x.order_id,x.order_priority,x.TOTAL_PROFIT
  2  from
  3  sales_t x
  4  inner join sales_tab y on x.order_id = y.order_id
  5  inner join tabx z on y.order_id = z.order_id;
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1420257564

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |   946K|    42M|   469   (2)| 00:00:06 |
|   1 |  MAT_VIEW REWRITE ACCESS STORAGE FULL| SALES_MV |   946K|    42M|   469   (2)| 00:00:06 |
-------------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

Thus using MV reduce high amount of response time and reduces cost by providing the mirror copy of the query directly without a need to process it from scratch

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s