Yes, I love AWR and ASH reports when performing database detective work.  They are an essential tool in my performance tuning arsenal, along with the logs, trace files/profiling, Oracle Enterprise Manager and a suite of personally collect queries from over the years.  As complex as databases are these days, the more data that I have at hand to backup what I am tuning or reporting on, the more comfortable I am, (OK, so I’m not one for “theories”… :))

Just as I would not use a screw driver to hammer in a nail, all DBA’s should understand and not confuse the tools in their arsenal and when best to utilize them.

AWR and ASH reports, when licensed by the client/environment are commonly a mid-stream step I go through when trouble-shooting or analyzing an issue.  I believe that all DBA’s should understand how to perform all tasks from SQLPlus.  Being able to address issues without these “gifts from the DBA Gods” is essential, too, but when they are available, utilize them for what they are best at-  a aggregated snapshot in time of the database that can often shed light on what was going on at a high level and with some detail into the database.

Running the reports is a simple task.  From the database server, as the Oracle user, there are a number of different reports in the $ORACLE_HOME/rdbms/admin directory for both awrrpt.sql and ashrpt.sql.  Although these are the main reports, you will notice if you run:

ls awr*.sql

that there are a number of different AWR reports, but we will stick with the main ones today.

The AWR Report

Running the report is quite easy from here.  Change to a directory that you would like to write your report to and log into SQLPlus as a user with privileges to run the reports.

@$ORACLE_HOME/rdbms/admin/awrrpt.sql;

  • It will then ask you for the type of report you would prefer, text or html, enter your preference.
  • How many days of snapshot ID’s would you like to view?  Enter a number of days to go back.
  • Enter a beginning snapshot ID.
  • Enter an ending snapshot ID.
  • Enter a name for your report.  I prefer the naming method of:  <awr/ash/addm>_<sid>_<ending snapshot>.<txt/html>
  • The report will run and it will produce the output in the directory you are currently in.

Once complete, you can view the report.  The report contains the same main information at the top for almost all database environments:



WORKLOAD REPOSITORY report for
DB Name         DB Id    Instance     Inst Num Release     RAC Host
------------ ----------- ------------ -------- ----------- --- ------------
<SID>          1192134473 <Instance>         1 10.2.0.4.0  NO  <HOST>
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:     84745 06-Feb-11 13:30:25     1,687      44.2
  End Snap:     84747 06-Feb-11 14:30:26     1,683      44.4
   Elapsed:               60.02 (mins)
   DB Time:            1,455.64 (mins)

The above tells you what database, what type of database, database version, when the snapshots are from and how many sessions were active at the time of the snapshot.

Instance efficiency is always an important, high level indicator of the overall health of a database, inspect this quickly and ensure the percentages are within acceptable limits:

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   99.40       Redo NoWait %:  100.00
            Buffer  Hit   %:   97.33    In-memory Sort %:  100.00
            Library Hit   %:   99.93        Soft Parse %:   99.88
         Execute to Parse %:   86.13         Latch Hit %:   99.61
Parse CPU to Parse Elapsd %:   99.08     % Non-Parse CPU:   99.54

From here, it is most effective to move to the top five wait events to get a solid feel for what data is going to be relevant in your research:

Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
enq: TX - allocate ITL entry          9,799      28,698   2929   32.9 Configurat
db file sequential read           4,827,509      25,964      5   29.7   User I/O
read by other session             2,998,307      18,118      6   20.7   User I/O
CPU time                                          6,872           7.9
direct path read                    222,425       4,782     21    5.5   User I/O
          -------------------------------------------------------------

As the wait events are shown clearly as enq: TX on ITL, (Initial Transaction),  db file sequential read, (single block reads, most likely involving indexes) and read by other session, (concurrency/buffer busy waits) this gives us a solid foundation for our next step into the AWR report-  Segment Statistics

This section is towards the bottom of the report, but it very helpful for us to see exactly where the physical pressure is on the database.  There will be three areas we will be interested in:

  • Buffer busy waits
  • ITL Waits
  • Physical Reads
Segments by Buffer Busy Waits           DB/Inst: ORC1/ORC1  Snaps: 84745-84747
-> % of Capture shows % of Buffer Busy Waits for each top segment compared
-> with total Buffer Busy Waits for all segments captured by the Snapshot
                                                                  Buffer
           Tablespace                      Subobject  Obj.          Busy    % of
Owner         Name    Object Name            Name     Type         Waits Capture
---------- ---------- -------------------- ---------- ----- ------------ -------
USER1       USER_LG    PTBL1_DT_IDX                 INDEX        1,877   48.79
USER2       USERS      TBL2            P201102      TABLE          806   20.95
USER2       USERS      PTBL4           P201102      INDEX          443   11.52
USER1       USER_LG    PTBL1                        TABLE          260    6.76
USER2       USERS      PTBL2_AU_IDX    P201102      INDEX          206    5.35
          -------------------------------------------------------------

By reviewing the data above, we can see that almost 50% of our buffer busy waits on are the USER1.PTBL1_DT_IDX.

This will most likely show much of our db file sequential read waits and our concurrency, so this is a good object to make note of.

Segments by Physical Reads              DB/Inst: ORC1/ORC1  Snaps: 84745-84747
-> Total Physical Reads:      29,381,297
-> Captured Segments account for    34.9% of Total
           Tablespace                      Subobject  Obj.      Physical
Owner         Name    Object Name            Name     Type         Reads  %Total
---------- ---------- -------------------- ---------- ----- ------------ -------
USER1       USER_LG    TBL2                            TABLE    2,947,780   10.03
USER1       USER_LG    PTBL1                           TABLE    2,101,923    7.15
USER2       USERS      TBL_16_AUD            P201102   TABLE    1,334,319    4.54
USER2       USERS      TBL_12_AUD            P201012   TABLE      884,322    3.01
USER2       USERS      PTBL_AUD              P201012   INDEX      711,388    2.42
          -------------------------------------------------------------

We see another clue above to what we should be looking for next. The USER1.TBL2 has large physical reads on it. We should keep an eye out for statements involving the first three objects

due to the quantity of physical reads, in my opinion.

Segments by ITL Waits                   DB/Inst: ORC1/ORC1  Snaps: 84745-84747
-> % of Capture shows % of ITL waits for each top segment compared
-> with total ITL waits for all segments captured by the Snapshot
           Tablespace                      Subobject  Obj.           ITL    % of
Owner         Name    Object Name            Name     Type         Waits Capture
---------- ---------- -------------------- ---------- ----- ------------ -------
USER2       USERS      PK_PTBL1_AUD        P201102    INDEX            9   26.47
USER2       USERS      PTBL1_AUD           P201102    INDEX            7   20.59
USER2       USERS      UK_TBL16_IDX        P201102    INDEX            5   14.71
USER1       USER_LG    PK_TBL16_IDX                   INDEX            2    5.88
USER1       USER_LG    PTBL1                          TABLE            2    5.88
          -------------------------------------------------------------

The actual waits may be misleading in this report, as there are a number of partitions that might be hiding small numbers individually that are adding up together.

We can tell that we can address most of them just be changing the initial transactions on the three objects in the top of the list shown though.

We now have the data from the Segment Statistics section to now move into specific SQL reporting data.  We will use what we have gathered so far to help us zero in on the statements that can be tuned to offer

us the most “bang for the buck”.  If we want to tune to remove impact, we will be looking at what is taking the most time in the database to complete.  If we are looking for a historical culprit for a performance issue at a given time, we will then be looking for a similar issue, but not take the secondary steps of categorizing complexity of resolution unless we are going to address it.

In the elapsed time is the statement that goes with much of our data:

  Elapsed      CPU                  Elap per  % Total
  Time (s)   Time (s)  Executions   Exec (s)  DB Time    SQL Id
---------- ---------- ------------ ---------- ------- -------------
    29,053         54       28,705        1.0    33.3 3m3pjyfj29dpv
Module: user@hostname (TNS V1-V3)
UPDATE USER1.PTBL1 SET COL1 = :1, COL2 = :2, COL3 = :3, COL4=
 :4 where COL6=:6 and COL7= :7

The statement above involves the following from the top event waits:

  • Concurrency
  • Buffer busy waits
  • a third of the total database time
  • ITL waits

After running a quick explain plan on the SQL Statement, we realize the following-  It is using  objects from each section:

  • USER1.PTBL1 from physical reads, buffer busy waits and ITL waits.
  • In the explain plan, it uses the USER1.PTBL1_DT_IDX for the where clause in the update statement.

From this type of research, you should be able to create a top five set of statements that are causing high performance hits or were the cause of a performance degradation incident in a database that any manager or developer can appreciate, with data that can back it up.

The ASH Report

As with the AWR report, the ASH report is executed in a very similar manner from SQLPlus.  Ash is a 1 second snapshot of trace level data from the database, this can give you a very solid view of the database, especially if you do not have a snapshot ID to rely on due to an outage or need more granule data, but still are required to look historically.

From SQL Plus:

 @$ORACLE_HOME/rdbms/admin/ashrpt.sql;

  • Specify what type of report you would like, text or html.
  • The report will then request a beginning time for the report- Note just above how up to the minute the data is available to:
Latest ASH sample available:     01-Mar-11 01:57:42       <--Database in question is on GMT, (no, I'm not working in the future... :))
[                                     0 mins in the past]
  • You are then shown an example of how to enter in time for an ASH report:
Specify the timeframe to generate the ASH report
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter begin time for report:
--    Valid input formats:
--      To specify absolute begin time:
--        [MM/DD[/YY]] HH24:MI[:SS]
--        Examples: 02/23/03 14:30:15
--                  02/23 14:30:15
--                  14:30:15
--                  14:30
--      To specify relative begin time: (start with '-' sign)
--        -[HH24:]MI
--        Examples: -1:15  (SYSDATE - 1 Hr 15 Mins)
--                  -25    (SYSDATE - 25 Mins)
  • I prefer 15 minute snapshots max,  as aggregation in these reports can miss a lot of valuable data.
  • Enter in your beginning time:

02/28/11 12:30:00

  • Put in your duration, either forward or negative in minutes, see examples above.
  • Name your report, I prefer ash_<sid>_timestamp.<txt/html>

You can then review your ASH report:

The first section will look similar for all databases:

ASH Report For <sid>/<inst>
DB Name         DB Id    Instance     Inst Num Release     RAC Host
------------ ----------- ------------ -------- ----------- --- ------------
<sid>          11921373  <inst>                1 10.2.0.4.0  NO  <host>
CPUs           SGA Size       Buffer Cache        Shared Pool    ASH Buffer Size
---- ------------------ ------------------ ------------------ ------------------
  16     47,104M (100%)    40,960M (87.0%)      4,121M (8.7%)              M (%)
          Analysis Begin Time:   28-Feb-11 12:30:00
            Analysis End Time:   28-Feb-11 13:00:00
                 Elapsed Time:        30.0 (mins)
                 Sample Count:       1,122
      Average Active Sessions:        6.23
  Avg. Active Session per CPU:        0.39
                Report Target:   None specified

The Top User Events are a solid place to view first:

Event                               Event Class     % Activity   Sessions
----------------------------------- --------------- ---------- ----------
db file sequential read             User I/O             52.67       3.28
CPU + Wait for CPU                  CPU                  25.85       1.61
db file scattered read              User I/O             15.33       0.96
          -------------------------------------------------------------

It is broken down by wait events, which allow us to look up “what” object is causing the db file sequential reads:

Event                          % Event  P1 Value, P2 Value, P3 Value % Activity
------------------------------ ------- ----------------------------- ----------
Parameter 1                Parameter 2                Parameter 3
-------------------------- -------------------------- --------------------------
db file sequential read          52.76             "2","2500882","1"       0.09
file#                      block#                     blocks
db file scattered read           15.33             "5","10380","128"       0.09
file#                      block#                     blocks

          ————————————————————-

The query to inspect the object involved in the blocks would look like this:

select
   owner,
   segment_name,
   segment_type
from
   dba_extents
where
   file_id = 2
and
  2500882 between block_id and block_id + blocks -1;
OWNER SEGMENT_NAME SEGMENT_TYPE
------------------
USER1 PTBL1        TABLE

We now know that 52% of the db file sequential reads came from this object. As this can be single block reads, not just index reads, we know we are not looking for a full table scan on this object.

It will also tell us the types of commands utilzing the database most:

Top SQL Command Types              DB/Inst: ORC1/ORC1  (Feb 28 12:30 to 13:00)
-> 'Distinct SQLIDs' is the count of the distinct number of SQLIDs
with the given SQL Command Type found over all the ASH samples
      in the analysis period
                                                 Distinct            Avg Active
SQL Command Type                             SQLIDs % Activity   Sessions
---------------------------------------- ---------- ---------- ----------
SELECT                                          111      66.49       4.14
INSERT                                           11      17.38       1.08
UPDATE                                           10       9.54       0.59
          -------------------------------------------------------------

And the exact statements that were in the “TOP”, note that it will break it down by percentage of wait type per statement. If there is more than one that the statement falls under,

it will show show each wait with percentage:

       SQL ID    Planhash % Activity Event                             % Event
------------- ----------- ---------- ------------------------------ ----------
2b4794ntzxabs  2477665643      14.62 db file scattered read              10.61
select rpad(e.COL1,:"SYS_B_0"), count(*),
 sum(b.COL_8) from TBL_89 b,TBL_65 e
 where b.COL2 in (select x.COL1 from TBL_45 x where
3mqu6p3gwmtvc         N/A       8.29 db file sequential read              8.29
INSERT INTO USER.TBL1(COL1,COL2,COL3,COL4,COL5,COL6,COL7) VALUES ( :1
, :2, :3, :4, :5, :6, :7)
a30x4fqw20590  3391073822       5.70 db file sequential read              5.53
SELECT /*+ index (T PK_PTBL1) */ SMALL.COL1, SMALL.COL2, SMA
LL.COL3, SMALL.COL4, SMALL.COL5, SMALL.COL6 SM
ALL.COL7, ... 

Another valuable area is PLSQL Subprograms:

PLSQL Entry Subprogram                                            % Activity
----------------------------------------------------------------- ----------
PLSQL Current Subprogram                                           % Current
----------------------------------------------------------------- ----------
USER1.PROC_d                                                        5.61
USER1.PROC_h                                                        5.61
USER2.PROC_b                                                        3.39
USER1.TRIGGER_a                                                     1.69

Top Sessions, Types of Waits with Parallel!  If you were looking for how certain processes impacted the system, this is a good place to look:

   Sid, Serial# % Activity Event                             % Event
--------------- ---------- ------------------------------ ----------
User                 Program                          # Samples Active     XIDs
-------------------- ------------------------------ ------------------ --------
     2433,33697      15.78 db file sequential read             15.69
USER1               user@host (TNS V1-V3)                176/180 [ 98%]      8     
3358,53811      15.78 db file scattered read              10.61
USER3                  ?  @host (TNS V1-V3)              119/180 [ 66%]      0                          
 CPU + Wait for CPU                   4.46
                                                         50/180 [ 28%]       0    
3383, 7232       5.61 db file sequential read              5.44
USER2                 sqlplus#host (TNS V1-V3)           61/180 [ 34%]       0

What are the top objects during the small window?

Top DB Objects                     DB/Inst: ORC1/ORC1  (Feb 28 12:30 to 13:00)
-> With respect to Application, Cluster, User I/O and buffer busy waits only.
      Object ID % Activity Event                             % Event
--------------- ---------- ------------------------------ ----------
Object Name (Type)                                    Tablespace
----------------------------------------------------- -------------------------
          13932      11.76 db file sequential read             11.76
USER1.PTBL1 (TABLE)                                  USER_LG
          13181      10.70 db file scattered read              10.61
USER1.TBL2.PART_MAX (TABLE PARTITION)                USER_DATA
          13391       5.53 db file sequential read              5.53
USER1.TBL1_IDX1 (INDEX)                              USER_LG
          13220       5.44 db file sequential read              5.44
 

From here you can pull specific datafile waits, and even 5 minute interval top waits. 

ASH and AWR reports are very powerful tools in a DBA’s arsenal, but they should be used in the right context and for the right reason.