Tuning for Time- CTAS and Views

October 9, 2013October 10, 2013 Kellyn

Inspecting work area usage for memory is an important aspect of my job when I’m performing a tuning exercise. This is especially true when we are talking about specific processes vs. overall database level. Often, along with SQL enhancements, choices in source objects to a process can improve performance drastically. Most developers are focused on the results of query, especially with tight deadlines vs. performance and the DBA can often assist them in succeeding with both.

The process for our example today is a CTAS, (create table as select) that has a number of joins , a union, along with aggregation on a number of large tables, one of which is also CTAS in earlier processing, (WRK_TBL3 in the example below) that as part of the build process, has no stats collected post creation. The areas in bold will be explained as we go further into the post:

create table T1 nologging as
select  a11.S_ID,
    sum(a11.SCOL3_QTY) WJXBFS1,
    sum(a11.SCOL4) WJXBFS2,
    sum(a11.SCOL7_AMT) WJXBFS3
from    TBL1 a11
    join    TBL2    a12
     on    (a11.WEEK_ID = a12.WEEK_ID)
    join    TBL2_RM_TBL_VW1  a13
     on    (a11.S_ID = a13.S_ID)
where   ((a11.S_ID)
in (((select   ps21.S_ID
    from    WRK_TBL3    ps21
        join    TBL1_VW2    s22
         on    (ps21.S_ID = s22.S_ID)
    where   (s22.STAT in ('A', 'D', 'H')
     and ps21.FLG3_1 = 1))
union (select  ps21.S_ID
    from    WRK_TBL3    ps21
        join   TBL2    s22
         on    (ps21.S_ID = s22.S_ID)
    where   (s22.STAT in ('I')
     and ps21.FLG6_1 = 1))))
and a13.COL7 in (9382)
and a12.W_ID in (201319, 201320, 201321, 201322, 201323, 201324, 201325, 201326, 201327, 201328, 201329, 201330, 201331))
group by    a11.S_ID;

Looking at the execution plan for the query, there are a few things you will note.

1. A call to a remote table through a view exists.

2. The ‘group hash by’ that is using temp also shows most time elapsed in process, but it’s on the view joined to the remote table.

3. Dynamic sampling is shown, as it was required on the WRK_TBL3 for the CBO to make some type of intelligent decision, as stats are not gathered as part of the CTAS process.

SQL> select * from table(dbms_xplan.display_awr('4d7bz2bmscqz4'));





	Id Operation Name Rows Bytes TempSpc Cost (%CPU) Time




	0 CREATE TABLE STATEMENT 14M(100) 


	1  LOAD AS SELECT 


	2  HASH GROUP BY 4 420 14M (1) 48:45:27


	3  FILTER 


	4  HASH JOIN OUTER 4 420 14M (1) 48:45:27


	5  NESTED LOOPS 3839  296K 14M (1) 48:45:26


	6  HASH JOIN 3839  273K 14M (1) 48:45:26


	7  VIEW VW_NSO_1 68 884 78 (3) 00:00:01


	8  SORT UNIQUE 68 2312  78(52) 00:00:01


	9  UNION-ALL 


	10  NESTED LOOPS 


	11  NESTED LOOPS 34 1156  37 (0) 00:00:01


	12  TABLE ACCESS FULL WRK_TBL3 34 884  3 (0) 00:00:01


	13  INDEX UNIQUE SCAN TBL2_PK 1  0(0) 


	14  TABLE ACCESS BY INDEX ROWID TBL2 1 8  1 (0) 00:00:01


	15  NESTED LOOPS 


	16  NESTED LOOPS 34 1156  37 (0) 00:00:01


	17  TABLE ACCESS FULL WRK_TBL3 34 884  3 (0) 00:00:01


	18  INDEX UNIQUE SCAN TBL2_PK 1  0 (0) 


	19  TABLE ACCESS BY INDEX ROWID TBL2 1 8  1 (0) 00:00:01


	20  MERGE JOIN 1357K  77M 14M (1) 48:45:25


	21  SORT JOIN  76M  3641M  14M (1) 48:45:25


	22  VIEW TBL1_VW2  76M  3641M  14M (1) 48:45:25


	23  HASH GROUP BYÿ  76M  53G  64G  12M (1) 42:34:04


	24  VIEW TBL1_VW1 76M  53G  770K (1) 02:34:08


	25  UNION-ALL 


	26  PARTITION RANGE ALL  1667K  254M 10762 (1) 00:02:10


	27  TABLE ACCESS FULL TBL1  1667K  254M 10762 (1) 00:02:10


	28  PARTITION RANGE ALL  74M  12G  342Kÿ(2) 01:08:36


	29  TABLE ACCESS FULL TBL1  74M  12G 342K (2) 01:08:36


	30  SORT JOIN 11 110 12 (9) 00:00:01


	31  INLIST ITERATOR 


	32  TABLE ACCESS BY INDEX ROWID TBL2 11 110  11 (0) 00:00:01


	33  INDEX UNIQUE SCAN TBL2_IDX1 13  2 (0) 00:00:01


	34  INDEX UNIQUE SCAN TBL2_PK 1 6 0 (0) 


	35  REMOTE RM_TBL_FRM_VW 17916 454K 67 (0) 00:00:01

Id	Operation	Name	Rows	Bytes	TempSpc	Cost (%CPU)	Time
0	CREATE TABLE STATEMENT					14M(100)
1	LOAD AS SELECT
2	HASH GROUP BY		4	420		14M (1)	48:45:27
3	FILTER
4	HASH JOIN OUTER		4	420		14M (1)	48:45:27
5	NESTED LOOPS		3839	296K		14M (1)	48:45:26
6	HASH JOIN		3839	273K		14M (1)	48:45:26
7	VIEW	VW_NSO_1	68	884		78 (3)	00:00:01
8	SORT UNIQUE		68	2312		78(52)	00:00:01
9	UNION-ALL
10	NESTED LOOPS
11	NESTED LOOPS		34	1156		37 (0)	00:00:01
12	TABLE ACCESS FULL	WRK_TBL3	34	884		3 (0)	00:00:01
13	INDEX UNIQUE SCAN	TBL2_PK	1			0(0)
14	TABLE ACCESS BY INDEX ROWID	TBL2	1	8		1 (0)	00:00:01
15	NESTED LOOPS
16	NESTED LOOPS		34	1156		37 (0)	00:00:01
17	TABLE ACCESS FULL	WRK_TBL3	34	884		3 (0)	00:00:01
18	INDEX UNIQUE SCAN	TBL2_PK	1			0 (0)
19	TABLE ACCESS BY INDEX ROWID	TBL2	1	8		1 (0)	00:00:01
20	MERGE JOIN		1357K	77M		14M (1)	48:45:25
21	SORT JOIN		76M	3641M		14M (1)	48:45:25
22	VIEW	TBL1_VW2	76M	3641M		14M (1)	48:45:25
23	HASH GROUP BYÿ		76M	53G	64G	12M (1)	42:34:04
24	VIEW	TBL1_VW1	76M	53G		770K (1)	02:34:08
25	UNION-ALL
26	PARTITION RANGE ALL		1667K	254M		10762 (1)	00:02:10
27	TABLE ACCESS FULL	TBL1	1667K	254M		10762 (1)	00:02:10
28	PARTITION RANGE ALL		74M	12G		342Kÿ(2)	01:08:36
29	TABLE ACCESS FULL	TBL1	74M	12G		342K (2)	01:08:36
30	SORT JOIN		11	110		12 (9)	00:00:01
31	INLIST ITERATOR
32	TABLE ACCESS BY INDEX ROWID	TBL2	11	110		11 (0)	00:00:01
33	INDEX UNIQUE SCAN	TBL2_IDX1	13			2 (0)	00:00:01
34	INDEX UNIQUE SCAN	TBL2_PK	1	6		0 (0)
35	REMOTE	RM_TBL_FRM_VW	17916	454K		67 (0)	00:00:01

If we inspect the work area while the process is running, we can also view how many resources are being used, even if it’s not a complete picture until the process is complete:

select vst.sql_text, swa.sql_id, swa.sid, swa.tablespace
, swa.operation_type
, trunc(swa.work_area_size/1024/1024) "PGA MB"
, trunc(swa.max_mem_used/1024/1024)"Mem MB"
, trunc(swa.tempseg_size/1024/1024)"Temp MB"
from v$sql_workarea_active swa, v$session vs, v$sqltext vst
where swa.sid=vs.sid
and swa.sql_id=vs.sql_id
and vs. sql_id='4d7bz2bmscqz4'
and vs.sql_id=vst.sql_id
and vst.piece=0
order by swa.sql_id;

SQL_TEXT                             SQL_ID        SID OPERATION_TYPE       PGA MB Mem MB Temp MB
------------------------------- ------------- ---------- ---------------------------
create table T1 nologging as select  4d7bz2bmscqz4 1135 GROUP BY (HASH)       3    0
create table T1 nologging as select  4d7bz2bmscqz4 1135 HASH-JOIN             3    1
create table T1 nologging as select  4d7bz2bmscqz4 1135 LOAD WRITE BUFFERS    0    0
create table T1 nologging as select  4d7bz2bmscqz4 1135 SORT (v1)             8   97       100
create table T1 nologging as select  4d7bz2bmscqz4 1135 HASH-JOIN             3    0
create table T1 nologging as select  4d7bz2bmscqz4 1135 GROUP BY (HASH)      87 1051     21842

We already know the Group Hash By is our challenge- it’s temp usage, it’s the elapsed time in the process, but how can we minimize this in a build process?

SQL> select count(*) from dba_tab_columns
2 where table_name=’TBL1_VW1′;

COUNT(*)
———-
61

Now the view that we’re calling is coming from a remote table and combined with the local view we just queried columns on. So this is a view, upon a view to a table. I often see this kind of call “confuse” the optimizer and I often find myself moving to mviews or simplify the design to ensure the CBO isn’t left in this position. The CBO is currently pulling all 61 columns of this data into the group hash by.

Look at the query above- What columns are we really interested in of those total 61 columns in the view?

The A13.col7 is what is important from the table, then it will need to join on the S_ID to complete the process.

First, we look to see what other tables contain this column:

1 select table_name, column_name from dba_tab_columns
 2 where column_name='COL7'
 3 and table_name in (select table_name from dba_tab_columns
 4* where column_name='S_ID')

1222 rows selected.

Well, looks like we have a lot of opportunities to find another choice in the join that could result in better performance just be changing where we are sourcing from.

After inspecting the view with the remote table, a new view was created based off the following existing view that also satisfied multiple processes:

SQL> desc TBL1_VW4;
 D_ID NOT NULL DATE
 S_ID NOT NULL NUMBER(8)
 S_NO  NOT NULL NUMBER(4)
 S_ID NUMBER
 S_CNT NUMBER
 S_I_CNT NUMBER

SIX rows is a tenth of the rows the other view possessed. The hash group by on a full scan of this view would provide much better performance, again, we’re only interested in two columns. A new view was then created, joining to the remote table, then tested:

SQL_TEXT                             SQL_ID        SID OPERATION_TYPE       PGA MB Mem MB Temp MB
------------------------------- ------------- ---------- ---------------------------

create table T1 nologging as select  78fqd7xtxpybv 4162 GROUP BY (HASH)      18    1005      3295
create table T1 nologging as select  78fqd7xtxpybv 4162 HASH-JOIN             3       1
create table T1 nologging as select  78fqd7xtxpybv 4162 LOAD WRITE BUFFERS    0       0
create table T1 nologging as select  78fqd7xtxpybv 4162 SORT (v1)            34      97       697
create table T1 nologging as select  78fqd7xtxpybv 4162 GROUP BY (HASH)      12       0

Upon inspection of the new execution plan, we can see an almost 10 times improvement, elapsed time decreasing from 48 minutes to just 5 1/2 minutes. There is no other change at this time, but note the difference in temp usage and the introduction of bloom filters:

SQL> select * from table(dbms_xplan.display_awr('78fqd7xtxpybv'));

Id	Operation	Name	Rows	Bytes	TempSpc	Cost (%CPU)	Time
0	CREATE TABLE STATEMENT					1643K(100)
1	LOAD AS SELECT
2	HASH GROUP BY		4290	385K	8272K	1643K (1)	05:28:44
3	FILTER
4	HASH JOIN		83635	7514K		1642Kÿ (1)	05:28:34
5	JOIN FILTER CREATE	:BF0000	626	6260		15 (0)	00:00:01
6	TABLE ACCESS FULL	TBL1_VW	626	6260		15 (0)	00:00:01
7	FILTER
8	JOIN FILTER USE	:BF0000	83635	6697K		1642Kÿ(1)	05:28:33
9	MERGE JOIN OUTER		83635	6697K		1642K (1)	05:28:33
10	MERGE JOIN		76M	4078M		1642K (1)	05:28:33
11	SORT JOIN		76M	3641M		1642K (1)	05:28:32
12	VIEW	TBL1_VW2	76M	3641M		1642K (1)	05:28:32
13	HASH GROUP BY		76M	4223M	5575M	1502K(1)	05:00:27
14	VIEW	TBL_VW1_NW	76M	4223M		432K (1)	01:26:30
15	UNION-ALL
16	PARTITION RANGE ALL		1667K	49M		10721 (1)	00:02:09
17	TABLE ACCESS FULL	TBL1_VW2	1667K	49M		10721 (1)	00:02:09
18	PARTITION RANGE ALL		74M	2279M		341K (1)	01:08:13
19	TABLE ACCESS FULL	TBL1_VW2	74M	2279M		341K (1)	01:08:13
20	SORT JOIN		39362	230K		31 (7)	00:00:01
21	INDEX FAST FULL SCAN	TBL2_PK	39362	230K		29 (0)	00:00:01
22	SORT JOIN		17916	454K		69 (3)	00:00:01
23	REMOTE	RM_TBL_FRM_VW	17916	454K		67 (0)	00:00:01
24	SORT UNIQUE		2	68		14 (58)	00:00:01
25	UNION-ALL
26	NESTED LOOPS		1	34		5 (0)	00:00:01
27	TABLE ACCESS BY INDEX ROWID	TBL2	1	8		2 (0)	00:00:01
28	INDEX UNIQUE SCAN	TBL2_PK	1			1 (0)	00:00:01
29	TABLE ACCESS FULL	WRK_TBL3	1	26		3 (0)	00:00:01
30	NESTED LOOPS		1	34		5 (0)	00:00:01
31	TABLE ACCESS BY INDEX ROWID	TBL2_VW2	1	8		2 (0)	00:00:01
32	INDEX UNIQUE SCAN	TBL2_PK	1			1 (0)	00:00:01
33	TABLE ACCESS FULL	WRK_TBL3	1	26		3 (0)	00:00:01

Remote SQL Information (identified by operation id):

23 – SELECT “S_ID” FROM “RM_TBL_FRM_VW” “B” (accessing ‘RM_DBLINK’ )

– dynamic sampling used for this statement (level=2)

Here’s a clear example of how a small change can make a great difference in performance. The view available to the developer didn’t support performance and now we can continue to work on improvements on statistics and coding.

Kellyn

You May Also Like

EM12c Release 4, Private Roles

Refreshing VDB With Sales History Data

SQLd360 with the AWR Warehouse

2 thoughts on “Tuning for Time- CTAS and Views”