Oracle

Tuning for Time- CTAS and Views

Inspecting work area usage for memory is an important aspect of my job when I’m performing a tuning exercise.  This is especially true when we are talking about specific processes vs. overall database level.  Often, along with SQL enhancements, choices in source objects to a process can improve performance drastically.  Most developers are focused on the results of query, especially with tight deadlines vs. performance and the DBA can often assist them in succeeding with both.

The process for our example today is a CTAS, (create table as select) that has a number of joins , a union, along with aggregation on a number of large tables, one of which is also CTAS in earlier processing, (WRK_TBL3  in the example below) that as part of the build process, has no stats collected post creation.  The areas in bold will be explained as we go further into the post:

create table T1 nologging as
select  a11.S_ID,
    sum(a11.SCOL3_QTY) WJXBFS1,
    sum(a11.SCOL4) WJXBFS2,
    sum(a11.SCOL7_AMT) WJXBFS3
from    TBL1 a11
    join    TBL2    a12
     on    (a11.WEEK_ID = a12.WEEK_ID)
    join    TBL2_RM_TBL_VW1  a13
     on    (a11.S_ID = a13.S_ID)
where   ((a11.S_ID)
in (((select   ps21.S_ID
    from    WRK_TBL3    ps21
        join    TBL1_VW2    s22
         on    (ps21.S_ID = s22.S_ID)
    where   (s22.STAT in ('A', 'D', 'H')
     and ps21.FLG3_1 = 1))
union (select  ps21.S_ID
    from    WRK_TBL3    ps21
        join   TBL2    s22
         on    (ps21.S_ID = s22.S_ID)
    where   (s22.STAT in ('I')
     and ps21.FLG6_1 = 1))))
and a13.COL7 in (9382)
and a12.W_ID in (201319, 201320, 201321, 201322, 201323, 201324, 201325, 201326, 201327, 201328, 201329, 201330, 201331))
group by    a11.S_ID;

 

Looking at the execution plan for the query, there are a few things you will note.

1.  A call to a remote table through a view exists.

2.  The ‘group hash by’ that is using temp also shows most time elapsed in process, but it’s on the view joined to the remote table.

3.  Dynamic sampling is shown, as it was required on the WRK_TBL3 for the CBO to make some type of intelligent decision, as stats are not gathered as part of the CTAS process.

SQL> select * from table(dbms_xplan.display_awr('4d7bz2bmscqz4'));

IdOperationNameRowsBytesTempSpcCost (%CPU)Time
0CREATE TABLE STATEMENT14M(100)
1 LOAD AS SELECT
2 HASH GROUP BY442014M (1)48:45:27
3 FILTER
4 HASH JOIN OUTER442014M (1)48:45:27
5 NESTED LOOPS3839 296K14M (1)48:45:26
6 HASH JOIN3839 273K14M (1)48:45:26
7 VIEWVW_NSO_16888478 (3)00:00:01
8 SORT UNIQUE682312 78(52)00:00:01
9 UNION-ALL
10 NESTED LOOPS
11 NESTED LOOPS341156 37 (0)00:00:01
12 TABLE ACCESS FULLWRK_TBL334884 3 (0)00:00:01
13 INDEX UNIQUE SCANTBL2_PK1 0(0)
14 TABLE ACCESS BY INDEX ROWIDTBL218 1 (0)00:00:01
15 NESTED LOOPS
16 NESTED LOOPS341156 37 (0)00:00:01
17 TABLE ACCESS FULLWRK_TBL334884 3 (0)00:00:01
18 INDEX UNIQUE SCANTBL2_PK1 0 (0)
19 TABLE ACCESS BY INDEX ROWIDTBL218 1 (0)00:00:01
20 MERGE JOIN1357K 77M14M (1)48:45:25
21 SORT JOIN 76M 3641M 14M (1)48:45:25
22 VIEWTBL1_VW2 76M 3641M 14M (1)48:45:25
23 HASH GROUP BYÿ 76M 53G 64G 12M (1)42:34:04
24 VIEWTBL1_VW176M 53G 770K (1)02:34:08
25 UNION-ALL
26 PARTITION RANGE ALL 1667K 254M10762 (1)00:02:10
27 TABLE ACCESS FULLTBL1 1667K 254M10762 (1)00:02:10
28 PARTITION RANGE ALL 74M 12G 342Kÿ(2)01:08:36
29 TABLE ACCESS FULLTBL1 74M 12G342K (2)01:08:36
30 SORT JOIN1111012 (9)00:00:01
31 INLIST ITERATOR
32 TABLE ACCESS BY INDEX ROWIDTBL211110 11 (0)00:00:01
33 INDEX UNIQUE SCANTBL2_IDX113 2 (0)00:00:01
34 INDEX UNIQUE SCANTBL2_PK160 (0)
35 REMOTERM_TBL_FRM_VW17916454K67 (0)00:00:01
 

If we inspect the work area while the process is running, we can also view how many resources are being used, even if it’s not a complete picture until the process is complete:

select vst.sql_text, swa.sql_id, swa.sid, swa.tablespace
, swa.operation_type
, trunc(swa.work_area_size/1024/1024) "PGA MB"
, trunc(swa.max_mem_used/1024/1024)"Mem MB"
, trunc(swa.tempseg_size/1024/1024)"Temp MB"
from v$sql_workarea_active swa, v$session vs, v$sqltext vst
where swa.sid=vs.sid
and swa.sql_id=vs.sql_id
and vs. sql_id='4d7bz2bmscqz4'
and vs.sql_id=vst.sql_id
and vst.piece=0
order by swa.sql_id;
SQL_TEXT                             SQL_ID        SID OPERATION_TYPE       PGA MB Mem MB Temp MB
------------------------------- ------------- ---------- ---------------------------
create table T1 nologging as select  4d7bz2bmscqz4 1135 GROUP BY (HASH)       3    0
create table T1 nologging as select  4d7bz2bmscqz4 1135 HASH-JOIN             3    1
create table T1 nologging as select  4d7bz2bmscqz4 1135 LOAD WRITE BUFFERS    0    0
create table T1 nologging as select  4d7bz2bmscqz4 1135 SORT (v1)             8   97       100
create table T1 nologging as select  4d7bz2bmscqz4 1135 HASH-JOIN             3    0
create table T1 nologging as select  4d7bz2bmscqz4 1135 GROUP BY (HASH)      87 1051     21842

We already know the Group Hash By is our challenge- it’s temp usage, it’s the elapsed time in the process, but how can we minimize this in a build process?

SQL> select count(*) from dba_tab_columns
2 where table_name=’TBL1_VW1′;

COUNT(*)
———-
61

Now the view that we’re calling is coming from a remote table and combined with the local view we just queried columns on.  So this is a view, upon a view to a table.  I often see this kind of call “confuse” the optimizer and I often find myself moving to mviews or simplify the design to ensure the CBO isn’t left in this position.  The CBO is currently pulling all 61 columns of this data into the group hash by.

Look at the query above-  What columns are we really interested in of those total 61 columns in the view?

The A13.col7 is what is important from the table, then it will need to join on the S_ID to complete the process.  

First, we look to see what other tables contain this column:

1 select table_name, column_name from dba_tab_columns
 2 where column_name='COL7'
 3 and table_name in (select table_name from dba_tab_columns
 4* where column_name='S_ID')
1222 rows selected.

Well, looks like we have a lot of opportunities to find another choice in the join that could result in better performance just be changing where we are sourcing from.

After inspecting the view with the remote table, a new view was created based off the following existing view that also satisfied multiple processes:

 

SQL> desc TBL1_VW4;
 D_ID NOT NULL DATE
 S_ID NOT NULL NUMBER(8)
 S_NO  NOT NULL NUMBER(4)
 S_ID NUMBER
 S_CNT NUMBER
 S_I_CNT NUMBER

SIX rows is a tenth of the rows the other view possessed.  The hash group by on a full scan of this view would provide much better performance, again, we’re only interested in two columns.  A new view was then created, joining to the remote table, then tested:

 

SQL_TEXT                             SQL_ID        SID OPERATION_TYPE       PGA MB Mem MB Temp MB
------------------------------- ------------- ---------- ---------------------------
create table T1 nologging as select  78fqd7xtxpybv 4162 GROUP BY (HASH)      18    1005      3295
create table T1 nologging as select  78fqd7xtxpybv 4162 HASH-JOIN             3       1
create table T1 nologging as select  78fqd7xtxpybv 4162 LOAD WRITE BUFFERS    0       0
create table T1 nologging as select  78fqd7xtxpybv 4162 SORT (v1)            34      97       697
create table T1 nologging as select  78fqd7xtxpybv 4162 GROUP BY (HASH)      12       0

Upon inspection of the new execution plan, we can see an almost 10 times improvement, elapsed time decreasing from 48 minutes to just 5 1/2 minutes.  There is no other change at this time, but note the difference in temp usage and the introduction of bloom filters:

SQL> select * from table(dbms_xplan.display_awr('78fqd7xtxpybv'));

 

IdOperationNameRowsBytesTempSpcCost (%CPU)Time
0CREATE TABLE STATEMENT 1643K(100)
1 LOAD AS SELECT
2 HASH GROUP BY4290 385K 8272K 1643K (1)05:28:44
3 FILTER
4 HASH JOIN83635 7514K 1642Kÿ (1)05:28:34
5 JOIN FILTER CREATE:BF00006266260 15 (0)00:00:01
6 TABLE ACCESS FULLTBL1_VW6266260 15 (0)00:00:01
7 FILTER
8 JOIN FILTER USE:BF000083635 6697K 1642Kÿ(1)05:28:33
9MERGE JOIN OUTER83635 6697K 1642K (1)05:28:33
10 MERGE JOIN 76M 4078M 1642K (1)05:28:33
11 SORT JOIN 76M 3641M 1642K (1)05:28:32
12 VIEWTBL1_VW2 76M 3641M 1642K (1)05:28:32
13 HASH GROUP BY 76M 4223M5575M 1502K(1)05:00:27
14 VIEWTBL_VW1_NW 76M 4223M 432K (1)01:26:30
15 UNION-ALL
16 PARTITION RANGE ALL 1667K 49M10721 (1)00:02:09
17 TABLE ACCESS FULLTBL1_VW2 1667K 49M10721 (1)00:02:09
18 PARTITION RANGE ALL 74M 2279M 341K (1)01:08:13
19 TABLE ACCESS FULLTBL1_VW2 74M 2279M 341K (1)01:08:13
20 SORT JOIN39362 230K 31 (7)00:00:01
21 INDEX FAST FULL SCANTBL2_PK39362 230K 29 (0)00:00:01
22 SORT JOIN17916 454K69 (3)00:00:01
23 REMOTERM_TBL_FRM_VW17916 454K 67 (0)00:00:01
24 SORT UNIQUE268 14 (58)00:00:01
25 UNION-ALL
26 NESTED LOOPS134 5 (0)00:00:01
27 TABLE ACCESS BY INDEX ROWIDTBL218 2 (0)00:00:01
28 INDEX UNIQUE SCANTBL2_PK11 (0)00:00:01
29TABLE ACCESS FULLWRK_TBL31263 (0)00:00:01
30 NESTED LOOPS134 5 (0)00:00:01
31TABLE ACCESS BY INDEX ROWIDTBL2_VW2182 (0)00:00:01
32 INDEX UNIQUE SCANTBL2_PK11 (0)00:00:01
33TABLE ACCESS FULLWRK_TBL31263 (0)00:00:01
Remote SQL Information (identified by operation id):
  23 – SELECT “S_ID” FROM “RM_TBL_FRM_VW” “B” (accessing ‘RM_DBLINK’ )
   – dynamic sampling used for this statement (level=2)

Here’s a clear example of how a small change can make a great difference in performance.  The view available to the developer didn’t support performance and now we can continue to work on improvements on statistics and coding.

 

 

 

Kellyn

http://about.me/dbakevlar