DBA Kevlar

Tips, tricks, (and maybe a few rants) so more DBAs become bulletproof!

For the Love of AWR and ASH…

Yes, I love AWR and ASH reports when performing database detective work.  They are an essential tool in my performance tuning arsenal, along with the logs, trace files/profiling, Oracle Enterprise Manager and a suite of personally collect queries from over the years.  As complex as databases are these days, the more data that I have at hand to backup what I am tuning or reporting on, the more comfortable I am, (OK, so I’m not one for “theories”… :))

Just as I would not use a screw driver to hammer in a nail, all DBA’s should understand and not confuse the tools in their arsenal and when best to utilize them.

AWR and ASH reports, when licensed by the client/environment are commonly a mid-stream step I go through when trouble-shooting or analyzing an issue.  I believe that all DBA’s should understand how to perform all tasks from SQLPlus.  Being able to address issues without these “gifts from the DBA Gods” is essential, too, but when they are available, utilize them for what they are best at-  a aggregated snapshot in time of the database that can often shed light on what was going on at a high level and with some detail into the database.

Running the reports is a simple task.  From the database server, as the Oracle user, there are a number of different reports in the $ORACLE_HOME/rdbms/admin directory for both awrrpt.sql and ashrpt.sql.  Although these are the main reports, you will notice if you run:

ls awr*.sql

that there are a number of different AWR reports, but we will stick with the main ones today.

The AWR Report

Running the report is quite easy from here.  Change to a directory that you would like to write your report to and log into SQLPlus as a user with privileges to run the reports.

@$ORACLE_HOME/rdbms/admin/awrrpt.sql;

  • It will then ask you for the type of report you would prefer, text or html, enter your preference.
  • How many days of snapshot ID’s would you like to view?  Enter a number of days to go back.
  • Enter a beginning snapshot ID.
  • Enter an ending snapshot ID.
  • Enter a name for your report.  I prefer the naming method of:  <awr/ash/addm>_<sid>_<ending snapshot>.<txt/html>
  • The report will run and it will produce the output in the directory you are currently in.

Once complete, you can view the report.  The report contains the same main information at the top for almost all database environments:



WORKLOAD REPOSITORY report for
DB Name         DB Id    Instance     Inst Num Release     RAC Host
------------ ----------- ------------ -------- ----------- --- ------------
<SID>          1192134473 <Instance>         1 10.2.0.4.0  NO  <HOST>
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:     84745 06-Feb-11 13:30:25     1,687      44.2
  End Snap:     84747 06-Feb-11 14:30:26     1,683      44.4
   Elapsed:               60.02 (mins)
   DB Time:            1,455.64 (mins)

The above tells you what database, what type of database, database version, when the snapshots are from and how many sessions were active at the time of the snapshot.

Instance efficiency is always an important, high level indicator of the overall health of a database, inspect this quickly and ensure the percentages are within acceptable limits:

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   99.40       Redo NoWait %:  100.00
            Buffer  Hit   %:   97.33    In-memory Sort %:  100.00
            Library Hit   %:   99.93        Soft Parse %:   99.88
         Execute to Parse %:   86.13         Latch Hit %:   99.61
Parse CPU to Parse Elapsd %:   99.08     % Non-Parse CPU:   99.54

From here, it is most effective to move to the top five wait events to get a solid feel for what data is going to be relevant in your research:

Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
enq: TX - allocate ITL entry          9,799      28,698   2929   32.9 Configurat
db file sequential read           4,827,509      25,964      5   29.7   User I/O
read by other session             2,998,307      18,118      6   20.7   User I/O
CPU time                                          6,872           7.9
direct path read                    222,425       4,782     21    5.5   User I/O
          -------------------------------------------------------------

As the wait events are shown clearly as enq: TX on ITL, (Initial Transaction),  db file sequential read, (single block reads, most likely involving indexes) and read by other session, (concurrency/buffer busy waits) this gives us a solid foundation for our next step into the AWR report-  Segment Statistics

This section is towards the bottom of the report, but it very helpful for us to see exactly where the physical pressure is on the database.  There will be three areas we will be interested in:

  • Buffer busy waits
  • ITL Waits
  • Physical Reads
Segments by Buffer Busy Waits           DB/Inst: ORC1/ORC1  Snaps: 84745-84747
-> % of Capture shows % of Buffer Busy Waits for each top segment compared
-> with total Buffer Busy Waits for all segments captured by the Snapshot
                                                                  Buffer
           Tablespace                      Subobject  Obj.          Busy    % of
Owner         Name    Object Name            Name     Type         Waits Capture
---------- ---------- -------------------- ---------- ----- ------------ -------
USER1       USER_LG    PTBL1_DT_IDX                 INDEX        1,877   48.79
USER2       USERS      TBL2            P201102      TABLE          806   20.95
USER2       USERS      PTBL4           P201102      INDEX          443   11.52
USER1       USER_LG    PTBL1                        TABLE          260    6.76
USER2       USERS      PTBL2_AU_IDX    P201102      INDEX          206    5.35
          -------------------------------------------------------------

By reviewing the data above, we can see that almost 50% of our buffer busy waits on are the USER1.PTBL1_DT_IDX.

This will most likely show much of our db file sequential read waits and our concurrency, so this is a good object to make note of.

Segments by Physical Reads              DB/Inst: ORC1/ORC1  Snaps: 84745-84747
-> Total Physical Reads:      29,381,297
-> Captured Segments account for    34.9% of Total
           Tablespace                      Subobject  Obj.      Physical
Owner         Name    Object Name            Name     Type         Reads  %Total
---------- ---------- -------------------- ---------- ----- ------------ -------
USER1       USER_LG    TBL2                            TABLE    2,947,780   10.03
USER1       USER_LG    PTBL1                           TABLE    2,101,923    7.15
USER2       USERS      TBL_16_AUD            P201102   TABLE    1,334,319    4.54
USER2       USERS      TBL_12_AUD            P201012   TABLE      884,322    3.01
USER2       USERS      PTBL_AUD              P201012   INDEX      711,388    2.42
          -------------------------------------------------------------

We see another clue above to what we should be looking for next. The USER1.TBL2 has large physical reads on it. We should keep an eye out for statements involving the first three objects

due to the quantity of physical reads, in my opinion.

Segments by ITL Waits                   DB/Inst: ORC1/ORC1  Snaps: 84745-84747
-> % of Capture shows % of ITL waits for each top segment compared
-> with total ITL waits for all segments captured by the Snapshot
           Tablespace                      Subobject  Obj.           ITL    % of
Owner         Name    Object Name            Name     Type         Waits Capture
---------- ---------- -------------------- ---------- ----- ------------ -------
USER2       USERS      PK_PTBL1_AUD        P201102    INDEX            9   26.47
USER2       USERS      PTBL1_AUD           P201102    INDEX            7   20.59
USER2       USERS      UK_TBL16_IDX        P201102    INDEX            5   14.71
USER1       USER_LG    PK_TBL16_IDX                   INDEX            2    5.88
USER1       USER_LG    PTBL1                          TABLE            2    5.88
          -------------------------------------------------------------

The actual waits may be misleading in this report, as there are a number of partitions that might be hiding small numbers individually that are adding up together.

We can tell that we can address most of them just be changing the initial transactions on the three objects in the top of the list shown though.

We now have the data from the Segment Statistics section to now move into specific SQL reporting data.  We will use what we have gathered so far to help us zero in on the statements that can be tuned to offer

us the most “bang for the buck”.  If we want to tune to remove impact, we will be looking at what is taking the most time in the database to complete.  If we are looking for a historical culprit for a performance issue at a given time, we will then be looking for a similar issue, but not take the secondary steps of categorizing complexity of resolution unless we are going to address it.

In the elapsed time is the statement that goes with much of our data:

  Elapsed      CPU                  Elap per  % Total
  Time (s)   Time (s)  Executions   Exec (s)  DB Time    SQL Id
---------- ---------- ------------ ---------- ------- -------------
    29,053         54       28,705        1.0    33.3 3m3pjyfj29dpv
Module: user@hostname (TNS V1-V3)
UPDATE USER1.PTBL1 SET COL1 = :1, COL2 = :2, COL3 = :3, COL4=
 :4 where COL6=:6 and COL7= :7

The statement above involves the following from the top event waits:

  • Concurrency
  • Buffer busy waits
  • a third of the total database time
  • ITL waits

After running a quick explain plan on the SQL Statement, we realize the following-  It is using  objects from each section:

  • USER1.PTBL1 from physical reads, buffer busy waits and ITL waits.
  • In the explain plan, it uses the USER1.PTBL1_DT_IDX for the where clause in the update statement.

From this type of research, you should be able to create a top five set of statements that are causing high performance hits or were the cause of a performance degradation incident in a database that any manager or developer can appreciate, with data that can back it up.

The ASH Report

As with the AWR report, the ASH report is executed in a very similar manner from SQLPlus.  Ash is a 1 second snapshot of trace level data from the database, this can give you a very solid view of the database, especially if you do not have a snapshot ID to rely on due to an outage or need more granule data, but still are required to look historically.

From SQL Plus:

 @$ORACLE_HOME/rdbms/admin/ashrpt.sql;

  • Specify what type of report you would like, text or html.
  • The report will then request a beginning time for the report- Note just above how up to the minute the data is available to:
Latest ASH sample available:     01-Mar-11 01:57:42       <--Database in question is on GMT, (no, I'm not working in the future... :))
[                                     0 mins in the past]
  • You are then shown an example of how to enter in time for an ASH report:
Specify the timeframe to generate the ASH report
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter begin time for report:
--    Valid input formats:
--      To specify absolute begin time:
--        [MM/DD[/YY]] HH24:MI[:SS]
--        Examples: 02/23/03 14:30:15
--                  02/23 14:30:15
--                  14:30:15
--                  14:30
--      To specify relative begin time: (start with '-' sign)
--        -[HH24:]MI
--        Examples: -1:15  (SYSDATE - 1 Hr 15 Mins)
--                  -25    (SYSDATE - 25 Mins)
  • I prefer 15 minute snapshots max,  as aggregation in these reports can miss a lot of valuable data.
  • Enter in your beginning time:

02/28/11 12:30:00

  • Put in your duration, either forward or negative in minutes, see examples above.
  • Name your report, I prefer ash_<sid>_timestamp.<txt/html>

You can then review your ASH report:

The first section will look similar for all databases:

ASH Report For <sid>/<inst>
DB Name         DB Id    Instance     Inst Num Release     RAC Host
------------ ----------- ------------ -------- ----------- --- ------------
<sid>          11921373  <inst>                1 10.2.0.4.0  NO  <host>
CPUs           SGA Size       Buffer Cache        Shared Pool    ASH Buffer Size
---- ------------------ ------------------ ------------------ ------------------
  16     47,104M (100%)    40,960M (87.0%)      4,121M (8.7%)              M (%)
          Analysis Begin Time:   28-Feb-11 12:30:00
            Analysis End Time:   28-Feb-11 13:00:00
                 Elapsed Time:        30.0 (mins)
                 Sample Count:       1,122
      Average Active Sessions:        6.23
  Avg. Active Session per CPU:        0.39
                Report Target:   None specified

The Top User Events are a solid place to view first:

Event                               Event Class     % Activity   Sessions
----------------------------------- --------------- ---------- ----------
db file sequential read             User I/O             52.67       3.28
CPU + Wait for CPU                  CPU                  25.85       1.61
db file scattered read              User I/O             15.33       0.96
          -------------------------------------------------------------

It is broken down by wait events, which allow us to look up “what” object is causing the db file sequential reads:

Event                          % Event  P1 Value, P2 Value, P3 Value % Activity
------------------------------ ------- ----------------------------- ----------
Parameter 1                Parameter 2                Parameter 3
-------------------------- -------------------------- --------------------------
db file sequential read          52.76             "2","2500882","1"       0.09
file#                      block#                     blocks
db file scattered read           15.33             "5","10380","128"       0.09
file#                      block#                     blocks

          ————————————————————-

The query to inspect the object involved in the blocks would look like this:

select
   owner,
   segment_name,
   segment_type
from
   dba_extents
where
   file_id = 2
and
  2500882 between block_id and block_id + blocks -1;
OWNER SEGMENT_NAME SEGMENT_TYPE
------------------
USER1 PTBL1        TABLE

We now know that 52% of the db file sequential reads came from this object. As this can be single block reads, not just index reads, we know we are not looking for a full table scan on this object.

It will also tell us the types of commands utilzing the database most:

Top SQL Command Types              DB/Inst: ORC1/ORC1  (Feb 28 12:30 to 13:00)
-> 'Distinct SQLIDs' is the count of the distinct number of SQLIDs
with the given SQL Command Type found over all the ASH samples
      in the analysis period
                                                 Distinct            Avg Active
SQL Command Type                             SQLIDs % Activity   Sessions
---------------------------------------- ---------- ---------- ----------
SELECT                                          111      66.49       4.14
INSERT                                           11      17.38       1.08
UPDATE                                           10       9.54       0.59
          -------------------------------------------------------------

And the exact statements that were in the “TOP”, note that it will break it down by percentage of wait type per statement. If there is more than one that the statement falls under,

it will show show each wait with percentage:

       SQL ID    Planhash % Activity Event                             % Event
------------- ----------- ---------- ------------------------------ ----------
2b4794ntzxabs  2477665643      14.62 db file scattered read              10.61
select rpad(e.COL1,:"SYS_B_0"), count(*),
 sum(b.COL_8) from TBL_89 b,TBL_65 e
 where b.COL2 in (select x.COL1 from TBL_45 x where
3mqu6p3gwmtvc         N/A       8.29 db file sequential read              8.29
INSERT INTO USER.TBL1(COL1,COL2,COL3,COL4,COL5,COL6,COL7) VALUES ( :1
, :2, :3, :4, :5, :6, :7)
a30x4fqw20590  3391073822       5.70 db file sequential read              5.53
SELECT /*+ index (T PK_PTBL1) */ SMALL.COL1, SMALL.COL2, SMA
LL.COL3, SMALL.COL4, SMALL.COL5, SMALL.COL6 SM
ALL.COL7, ... 

Another valuable area is PLSQL Subprograms:

PLSQL Entry Subprogram                                            % Activity
----------------------------------------------------------------- ----------
PLSQL Current Subprogram                                           % Current
----------------------------------------------------------------- ----------
USER1.PROC_d                                                        5.61
USER1.PROC_h                                                        5.61
USER2.PROC_b                                                        3.39
USER1.TRIGGER_a                                                     1.69

Top Sessions, Types of Waits with Parallel!  If you were looking for how certain processes impacted the system, this is a good place to look:

   Sid, Serial# % Activity Event                             % Event
--------------- ---------- ------------------------------ ----------
User                 Program                          # Samples Active     XIDs
-------------------- ------------------------------ ------------------ --------
     2433,33697      15.78 db file sequential read             15.69
USER1               user@host (TNS V1-V3)                176/180 [ 98%]      8     
3358,53811      15.78 db file scattered read              10.61
USER3                  ?  @host (TNS V1-V3)              119/180 [ 66%]      0                          
 CPU + Wait for CPU                   4.46
                                                         50/180 [ 28%]       0    
3383, 7232       5.61 db file sequential read              5.44
USER2                 sqlplus#host (TNS V1-V3)           61/180 [ 34%]       0

What are the top objects during the small window?

Top DB Objects                     DB/Inst: ORC1/ORC1  (Feb 28 12:30 to 13:00)
-> With respect to Application, Cluster, User I/O and buffer busy waits only.
      Object ID % Activity Event                             % Event
--------------- ---------- ------------------------------ ----------
Object Name (Type)                                    Tablespace
----------------------------------------------------- -------------------------
          13932      11.76 db file sequential read             11.76
USER1.PTBL1 (TABLE)                                  USER_LG
          13181      10.70 db file scattered read              10.61
USER1.TBL2.PART_MAX (TABLE PARTITION)                USER_DATA
          13391       5.53 db file sequential read              5.53
USER1.TBL1_IDX1 (INDEX)                              USER_LG
          13220       5.44 db file sequential read              5.44
 

From here you can pull specific datafile waits, and even 5 minute interval top waits. 

ASH and AWR reports are very powerful tools in a DBA’s arsenal, but they should be used in the right context and for the right reason.

Print Friendly

facebook comments:

9 thoughts on “For the Love of AWR and ASH…
  • newbie01.oracle says:

    Love AWR – absolutely … but doesn’t it cost money to generate the AWR report? Lots of my customers are dumping Oracle and moving to SQLServer due to the lowdown on Oracle Licensing

  • KPedersen says:

    AWR does cost in the licensing, but most DBA’s are pushing their managers to realize the benefit to that cost. As for SQL Server, I’ve been in many re-ups on licensing and SQL Server is not THAT much less than Oracle when you get into the “get what you paid for” arena. I love SQL Server, I work in both mediums, have only one year less experience in SQL Server as Oracle, love the DMV’s in SQL Server, (which brought down my time in SQL Profiler by literally HOURS! :)) and Microsoft is leveraging this product in, what I feel, is the correct direction for the market. With that said, I don’t feel they are competing with Oracle at this venture in the game. Oracle, once it brought Exadata and other VLDB features to the table, put them into a different direction than SQL Server that focuses more in smaller, OLTP environments.
    If anyone needs to be concerned, I would say it’s SQL Server and the demand I’m seeing for MySQL DBA’s in the market. That has got to be telling us all something about the attraction to the MySQL platform… :)

  • Tim Gorman says:

    Licensing AWR/ASH is not as expensive as one might think. Looking at “www.oracle.com/store” and looking up “Diagnostics Pack”, it turns out that it can be licensed for as little as $20/named-user for 1 year, with a minimum of 25 named-users per processor on Enterprise Edition. On a four-core system, this works out to US$2,000, plus first-year support of $22/named-user, which is another US$2,200. Obviously, these amounts are less for Standard Edition and other products.

    While $4,200 for one year is not something for which I’d like to flourish my own personal credit card, it is peanuts when someone has already paid for Enterprise Edition and well worth the investment considering the valuable data that Kellyn has described. And these are list prices of the store website and not discounted the way most Oracle products are actually licensed in real life.

    Furthermore, there is even more valuable data in the AWR and ASH that the standard reports don’t display, so if you treat the AWR and ASH views like a little performance-tuning data warehouse using BI techniques, the rewards are even greater…

    But, even if your organization won’t cough up the fees for Diagnostic Pack by itself, there’s always STATSPACK, which doesn’t capture everything that AWR does, but it beats having nothing…

  • newbie01.oracle says:

    Hi Tim,

    I agree totally with your assessment. I do agree with the importance of the AWR reports and all.

    Unfortunately, for the customers that we support, we cannot justify that cost. So it is either we cough up the cost or include that cost hidden somewhere in the maintenance contract and then we deal the licensing cost with Oracle on their behalf.

    When you go to an Oracle training and they show you all these gadgets and stuff and how you can use them but they never mention the cost and so some dude will go back to where they come in and ended up facing the wall that they cannot use any of those OEM mouse tricks that they are shown in the course :-)

    In most cases, some company I work for ended up with having to use CA Unicenter which is absolutely useless. Am looking at using ashmon although I got a colleague who is smart enough to pull out the STATSPACK information and make it look like OEM’s diagnostic pack using PHP and Javascript complete with the time dragging scrollees :-)

  • wenmei says:

    Great article !
    keep on with this nice blog

  • goutham says:

    Hi
    Good doc.
    I am new to dba .when you need to go for AWR,ASH reports.can you give any issues where dba needs to query this reports for analysing the performance issues.

  • Prabhjot says:

    Hi Kevlar,

    I read your blogs whenever i get time. Thanks for sharing information.
    I would like to know which schema owns the AWR and ASH report. I know that these reports are stored in SYSAUX tablespace .

  • dbakevlar says:

    Hello Prabhjot,
    Most of the AWR/ASH are views based on tables owned by SYS.
    Hope this helps!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

  • Facebook
  • Google+
  • LinkedIn
  • Twitter