Tuning for Time- CTAS and Views
Inspecting work area usage for memory is an important aspect of my job when I’m performing a tuning exercise. This is especially true when we are talking about specific processes vs. overall database level. Often, along with SQL enhancements, choices in source objects to a process can improve performance drastically. Most developers are focused on the results of query, especially with tight deadlines vs. performance and the DBA can often assist them in succeeding with both.
The process for our example today is a CTAS, (create table as select) that has a number of joins , a union, along with aggregation on a number of large tables, one of which is also CTAS in earlier processing, (WRK_TBL3 in the example below) that as part of the build process, has no stats collected post creation. The areas in bold will be explained as we go further into the post:
create table T1 nologging as select a11.S_ID, sum(a11.SCOL3_QTY) WJXBFS1, sum(a11.SCOL4) WJXBFS2, sum(a11.SCOL7_AMT) WJXBFS3 from TBL1 a11 join TBL2 a12 on (a11.WEEK_ID = a12.WEEK_ID) join TBL2_RM_TBL_VW1 a13 on (a11.S_ID = a13.S_ID) where ((a11.S_ID) in (((select ps21.S_ID from WRK_TBL3 ps21 join TBL1_VW2 s22 on (ps21.S_ID = s22.S_ID) where (s22.STAT in ('A', 'D', 'H') and ps21.FLG3_1 = 1)) union (select ps21.S_ID from WRK_TBL3 ps21 join TBL2 s22 on (ps21.S_ID = s22.S_ID) where (s22.STAT in ('I') and ps21.FLG6_1 = 1)))) and a13.COL7 in (9382) and a12.W_ID in (201319, 201320, 201321, 201322, 201323, 201324, 201325, 201326, 201327, 201328, 201329, 201330, 201331)) group by a11.S_ID;
Looking at the execution plan for the query, there are a few things you will note.
1. A call to a remote table through a view exists.
2. The ‘group hash by’ that is using temp also shows most time elapsed in process, but it’s on the view joined to the remote table.
3. Dynamic sampling is shown, as it was required on the WRK_TBL3 for the CBO to make some type of intelligent decision, as stats are not gathered as part of the CTAS process.
SQL> select * from table(dbms_xplan.display_awr('4d7bz2bmscqz4'));
Id Operation Name Rows Bytes TempSpc Cost (%CPU) Time
0 CREATE TABLE STATEMENT 14M(100)
1 LOAD AS SELECT
2 HASH GROUP BY 4 420 14M (1) 48:45:27
3 FILTER
4 HASH JOIN OUTER 4 420 14M (1) 48:45:27
5 NESTED LOOPS 3839 296K 14M (1) 48:45:26
6 HASH JOIN 3839 273K 14M (1) 48:45:26
7 VIEW VW_NSO_1 68 884 78 (3) 00:00:01
8 SORT UNIQUE 68 2312 78(52) 00:00:01
9 UNION-ALL
10 NESTED LOOPS
11 NESTED LOOPS 34 1156 37 (0) 00:00:01
12 TABLE ACCESS FULL WRK_TBL3 34 884 3 (0) 00:00:01
13 INDEX UNIQUE SCAN TBL2_PK 1 0(0)
14 TABLE ACCESS BY INDEX ROWID TBL2 1 8 1 (0) 00:00:01
15 NESTED LOOPS
16 NESTED LOOPS 34 1156 37 (0) 00:00:01
17 TABLE ACCESS FULL WRK_TBL3 34 884 3 (0) 00:00:01
18 INDEX UNIQUE SCAN TBL2_PK 1 0 (0)
19 TABLE ACCESS BY INDEX ROWID TBL2 1 8 1 (0) 00:00:01
20 MERGE JOIN 1357K 77M 14M (1) 48:45:25
21 SORT JOIN 76M 3641M 14M (1) 48:45:25
22 VIEW TBL1_VW2 76M 3641M 14M (1) 48:45:25
23 HASH GROUP BYÿ 76M 53G 64G 12M (1) 42:34:04
24 VIEW TBL1_VW1 76M 53G 770K (1) 02:34:08
25 UNION-ALL
26 PARTITION RANGE ALL 1667K 254M 10762 (1) 00:02:10
27 TABLE ACCESS FULL TBL1 1667K 254M 10762 (1) 00:02:10
28 PARTITION RANGE ALL 74M 12G 342Kÿ(2) 01:08:36
29 TABLE ACCESS FULL TBL1 74M 12G 342K (2) 01:08:36
30 SORT JOIN 11 110 12 (9) 00:00:01
31 INLIST ITERATOR
32 TABLE ACCESS BY INDEX ROWID TBL2 11 110 11 (0) 00:00:01
33 INDEX UNIQUE SCAN TBL2_IDX1 13 2 (0) 00:00:01
34 INDEX UNIQUE SCAN TBL2_PK 1 6 0 (0)
35 REMOTE RM_TBL_FRM_VW 17916 454K 67 (0) 00:00:01
If we inspect the work area while the process is running, we can also view how many resources are being used, even if it’s not a complete picture until the process is complete:
select vst.sql_text, swa.sql_id, swa.sid, swa.tablespace , swa.operation_type , trunc(swa.work_area_size/1024/1024) "PGA MB" , trunc(swa.max_mem_used/1024/1024)"Mem MB" , trunc(swa.tempseg_size/1024/1024)"Temp MB" from v$sql_workarea_active swa, v$session vs, v$sqltext vst where swa.sid=vs.sid and swa.sql_id=vs.sql_id and vs. sql_id='4d7bz2bmscqz4' and vs.sql_id=vst.sql_id and vst.piece=0 order by swa.sql_id;
SQL_TEXT SQL_ID SID OPERATION_TYPE PGA MB Mem MB Temp MB
------------------------------- ------------- ---------- ---------------------------
create table T1 nologging as select 4d7bz2bmscqz4 1135 GROUP BY (HASH) 3 0
create table T1 nologging as select 4d7bz2bmscqz4 1135 HASH-JOIN 3 1
create table T1 nologging as select 4d7bz2bmscqz4 1135 LOAD WRITE BUFFERS 0 0
create table T1 nologging as select 4d7bz2bmscqz4 1135 SORT (v1) 8 97 100
create table T1 nologging as select 4d7bz2bmscqz4 1135 HASH-JOIN 3 0
create table T1 nologging as select 4d7bz2bmscqz4 1135 GROUP BY (HASH) 87 1051 21842
We already know the Group Hash By is our challenge- it’s temp usage, it’s the elapsed time in the process, but how can we minimize this in a build process?
SQL> select count(*) from dba_tab_columns
2 where table_name=’TBL1_VW1′;
COUNT(*)
———-
61
Now the view that we’re calling is coming from a remote table and combined with the local view we just queried columns on. So this is a view, upon a view to a table. I often see this kind of call “confuse” the optimizer and I often find myself moving to mviews or simplify the design to ensure the CBO isn’t left in this position. The CBO is currently pulling all 61 columns of this data into the group hash by.
Look at the query above- What columns are we really interested in of those total 61 columns in the view?
The A13.col7 is what is important from the table, then it will need to join on the S_ID to complete the process.
First, we look to see what other tables contain this column:
1 select table_name, column_name from dba_tab_columns 2 where column_name='COL7' 3 and table_name in (select table_name from dba_tab_columns 4* where column_name='S_ID')
1222 rows selected.
Well, looks like we have a lot of opportunities to find another choice in the join that could result in better performance just be changing where we are sourcing from.
After inspecting the view with the remote table, a new view was created based off the following existing view that also satisfied multiple processes:
SQL> desc TBL1_VW4; D_ID NOT NULL DATE S_ID NOT NULL NUMBER(8) S_NO NOT NULL NUMBER(4) S_ID NUMBER S_CNT NUMBER S_I_CNT NUMBER
SIX rows is a tenth of the rows the other view possessed. The hash group by on a full scan of this view would provide much better performance, again, we’re only interested in two columns. A new view was then created, joining to the remote table, then tested:
SQL_TEXT SQL_ID SID OPERATION_TYPE PGA MB Mem MB Temp MB ------------------------------- ------------- ---------- ---------------------------
create table T1 nologging as select 78fqd7xtxpybv 4162 GROUP BY (HASH) 18 1005 3295 create table T1 nologging as select 78fqd7xtxpybv 4162 HASH-JOIN 3 1 create table T1 nologging as select 78fqd7xtxpybv 4162 LOAD WRITE BUFFERS 0 0 create table T1 nologging as select 78fqd7xtxpybv 4162 SORT (v1) 34 97 697 create table T1 nologging as select 78fqd7xtxpybv 4162 GROUP BY (HASH) 12 0
Upon inspection of the new execution plan, we can see an almost 10 times improvement, elapsed time decreasing from 48 minutes to just 5 1/2 minutes. There is no other change at this time, but note the difference in temp usage and the introduction of bloom filters:
SQL> select * from table(dbms_xplan.display_awr('78fqd7xtxpybv'));
Id | Operation | Name | Rows | Bytes | TempSpc | Cost (%CPU) | Time |
---|---|---|---|---|---|---|---|
0 | CREATE TABLE STATEMENT | 1643K(100) | |||||
1 | LOAD AS SELECT | ||||||
2 | HASH GROUP BY | 4290 | 385K | 8272K | 1643K (1) | 05:28:44 | |
3 | FILTER | ||||||
4 | HASH JOIN | 83635 | 7514K | 1642Kÿ (1) | 05:28:34 | ||
5 | JOIN FILTER CREATE | :BF0000 | 626 | 6260 | 15 (0) | 00:00:01 | |
6 | TABLE ACCESS FULL | TBL1_VW | 626 | 6260 | 15 (0) | 00:00:01 | |
7 | FILTER | ||||||
8 | JOIN FILTER USE | :BF0000 | 83635 | 6697K | 1642Kÿ(1) | 05:28:33 | |
9 | MERGE JOIN OUTER | 83635 | 6697K | 1642K (1) | 05:28:33 | ||
10 | MERGE JOIN | 76M | 4078M | 1642K (1) | 05:28:33 | ||
11 | SORT JOIN | 76M | 3641M | 1642K (1) | 05:28:32 | ||
12 | VIEW | TBL1_VW2 | 76M | 3641M | 1642K (1) | 05:28:32 | |
13 | HASH GROUP BY | 76M | 4223M | 5575M | 1502K(1) | 05:00:27 | |
14 | VIEW | TBL_VW1_NW | 76M | 4223M | 432K (1) | 01:26:30 | |
15 | UNION-ALL | ||||||
16 | PARTITION RANGE ALL | 1667K | 49M | 10721 (1) | 00:02:09 | ||
17 | TABLE ACCESS FULL | TBL1_VW2 | 1667K | 49M | 10721 (1) | 00:02:09 | |
18 | PARTITION RANGE ALL | 74M | 2279M | 341K (1) | 01:08:13 | ||
19 | TABLE ACCESS FULL | TBL1_VW2 | 74M | 2279M | 341K (1) | 01:08:13 | |
20 | SORT JOIN | 39362 | 230K | 31 (7) | 00:00:01 | ||
21 | INDEX FAST FULL SCAN | TBL2_PK | 39362 | 230K | 29 (0) | 00:00:01 | |
22 | SORT JOIN | 17916 | 454K | 69 (3) | 00:00:01 | ||
23 | REMOTE | RM_TBL_FRM_VW | 17916 | 454K | 67 (0) | 00:00:01 | |
24 | SORT UNIQUE | 2 | 68 | 14 (58) | 00:00:01 | ||
25 | UNION-ALL | ||||||
26 | NESTED LOOPS | 1 | 34 | 5 (0) | 00:00:01 | ||
27 | TABLE ACCESS BY INDEX ROWID | TBL2 | 1 | 8 | 2 (0) | 00:00:01 | |
28 | INDEX UNIQUE SCAN | TBL2_PK | 1 | 1 (0) | 00:00:01 | ||
29 | TABLE ACCESS FULL | WRK_TBL3 | 1 | 26 | 3 (0) | 00:00:01 | |
30 | NESTED LOOPS | 1 | 34 | 5 (0) | 00:00:01 | ||
31 | TABLE ACCESS BY INDEX ROWID | TBL2_VW2 | 1 | 8 | 2 (0) | 00:00:01 | |
32 | INDEX UNIQUE SCAN | TBL2_PK | 1 | 1 (0) | 00:00:01 | ||
33 | TABLE ACCESS FULL | WRK_TBL3 | 1 | 26 | 3 (0) | 00:00:01 |
Remote SQL Information (identified by operation id): |
23 – SELECT “S_ID” FROM “RM_TBL_FRM_VW” “B” (accessing ‘RM_DBLINK’ ) |
– dynamic sampling used for this statement (level=2) |
Here’s a clear example of how a small change can make a great difference in performance. The view available to the developer didn’t support performance and now we can continue to work on improvements on statistics and coding.
Pingback: AWR Warehouse in EM12c Rel. 4 - Oracle - Oracle - Toad World
Pingback: Kickstarting After a Failed Addition to the AWR Warehouse - Oracle - Oracle - Toad World