Extensive Elapsed Time for ASH Report

The other day I had a customer who was going to send me an ASH report, after a challenge arose in providing me EM Diagnostics from their repository database.  When the report hadn’t arrived after a number of days, the customer admitted that each time they executed the ASH report request, they would end up killing the session after eight hours or so of non-completion.


Although I hadn’t experienced this myself, there are a number of causes that can keep an ASH report from completing, and troubleshooting them is a straightforward process.

Analysis

The first thing to do is gather some information on the cause of the issue.  Running the report repeatedly without looking into the cause isn’t going to make the end result any better.

Upon executing the ASH report, (either from Enterprise Manager or SQL*Plus, using the $ORACLE_HOME/rdbms/admin/ashrpt.sql script) completion should take less than a minute or so.  If it doesn’t complete in that time, inspect the session details and pull the execution plan for the SQL that is currently running in the session.
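As an illustrative sketch, (not part of the original steps, and the MODULE predicate is an assumption — adjust it for however your client identifies itself) this is one way to locate the report’s session and display its runtime plan:

-- Hypothetical sketch: find the session running the ASH report.
-- The MODULE filter is an assumption; widen or change it as needed.
SELECT inst_id, sid, serial#, sql_id, sql_child_number
  FROM gv$session
 WHERE status = 'ACTIVE'
   AND module LIKE '%ashrpt%';

-- With the sql_id in hand, pull the execution plan it is running:
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id', NULL, 'ALLSTATS LAST'));

If the plan shows steps churning through far more rows than your ASH data could plausibly produce, that is the thread to pull on.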

In the customer’s case, the ASH report was “hung up” on a step that captures the top SQL using a join between GV$ACTIVE_SESSION_HISTORY and a number of X$ tables.  The plan contained a number of merge join Cartesian operations, ending with a result of 3.501 million rows.

The amount of ASH data involved could in no way produce that kind of row count, and while the X$ tables aren’t “static” objects you can work off of, it became apparent that the statistics were definitely awry — just not on the ASH objects.  To update the optimizer statistics on the X$ tables, I had the user collect fixed object stats and then run the ASH report again.

exec DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;
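As a sanity check before and after the gather, (my own illustrative query, not one of the customer’s steps) you can confirm whether fixed object stats have ever been collected, and when:

-- Illustrative: fixed objects with missing or stale stats float to the top.
SELECT table_name, last_analyzed
  FROM dba_tab_statistics
 WHERE object_type = 'FIXED'
 ORDER BY last_analyzed NULLS FIRST;

A NULL in LAST_ANALYZED means that fixed object has never had stats gathered at all, which leaves the optimizer guessing on those X$ joins.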

Afterwards, the report ran in a matter of a few minutes and I finally had the original report I had requested, but I also asked them to run one more for me:

$ORACLE_HOME/rdbms/admin/awrinfo.sql;

This is the General AWR Information report.  I often recommend this report for sizing an AWR Warehouse, but for a very specific reason: many earlier releases of 11g experienced issues with cleanup of ASH data.
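If you just want a quick read on the same question without the full report, (a hedged companion check of my own, not from the post’s steps) the documented V$SYSAUX_OCCUPANTS view breaks down SYSAUX space by occupant:

-- Illustrative: which occupants are eating SYSAUX; SM/AWR covers the
-- AWR and ASH repository objects.
SELECT occupant_name, ROUND(space_usage_kbytes / 1024) AS mb
  FROM v$sysaux_occupants
 ORDER BY space_usage_kbytes DESC;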

There is a section in the report that states the size of individual components in the SYSAUX tablespace.

Here’s the output from a “healthy” AWR in a clustered repository, DB Version 11.2.0.4.0:

COMPONENT         MB  % AWR  KB_PER_SNAP  MB_PER_DAY  MB_PER_WEEK  TABLE% : INDEX%
---------  ---------  -----  -----------  ----------  -----------  ---------------
SQLPLAN        697.0   30.2        1,028        24.1        168.7        69% : 31%
FIXED          618.5   26.8          913        21.4        149.7        50% : 50%
EVENTS         180.5    7.8          266         6.2         43.7        47% : 53%
SQL             52.1    2.3           77         1.8         12.6        73% : 27%
ASH             48.2    2.1           71         1.7         11.7        89% : 11%
SPACE           41.4    1.8           61         1.4         10.0        72% : 28%
SQLTEXT         16.3    0.7           24         0.6          3.9        98% :  2%
RAC             12.2    0.5           18         0.4          3.0        52% : 48%
SQLBIND          1.3    0.1            2         0.0          0.3        55% : 45%

Here’s the output from the customer’s AWR in their repository, DB version 11.2.0.2:

COMPONENT         MB  % AWR  KB_PER_SNAP  MB_PER_DAY  MB_PER_WEEK  TABLE% : INDEX%
---------  ---------  -----  -----------  ----------  -----------  ---------------
ASH        #########   67.0      222,497     5,214.8     36,503.5        87% : 13%
FIXED       28,382.3   10.9       36,149       847.2      5,930.6        48% : 52%
EVENTS      19,503.7    7.5       24,841       582.2      4,075.4        40% : 60%
SQLPLAN     16,093.0    6.2       20,497       480.4      3,362.7        70% : 30%
SPACE        1,460.4    0.6        1,860        43.6        305.2        67% : 33%
RAC            552.6    0.2          704        16.5        115.5        59% : 41%
SQL            100.6    0.0          128         3.0         21.0        69% : 31%
SQLBIND          6.0    0.0            8         0.2          1.3        50% : 50%
SQLTEXT          4.2    0.0            5         0.1          0.9        96% :  4%

Although both Enterprise Manager installations are 12.1.0.5, both are on RAC, and both have the same AWR interval and retention settings, the customer’s 11.2.0.2 repository is retaining much larger amounts of data, and ASH is by far the largest contributor to space allocated, (to the point that it’s more than the report was formatted to display…)
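If you want to verify the interval and retention settings on each repository yourself, they live in the documented DBA_HIST_WR_CONTROL view; a quick sketch of my own:

-- Illustrative: AWR snapshot interval and retention, per database.
SELECT dbid,
       EXTRACT(DAY FROM snap_interval) * 24 * 60
         + EXTRACT(HOUR FROM snap_interval) * 60
         + EXTRACT(MINUTE FROM snap_interval) AS interval_minutes,
       EXTRACT(DAY FROM retention)            AS retention_days
  FROM dba_hist_wr_control;

With identical settings on both sides, any large difference in retained volume points at cleanup, not configuration.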

There are a number of MOS notes covering what may be the cause, but there are helpful steps to follow to diagnose the specific issue that’s driving the unbridled space demands for AWR and/or ASH in your own environment.

One of the most important is to follow the appropriate steps to isolate what is consuming space, often in the form of partitions not splitting properly, and then to manually purge data, as described in the following MOS document:

Doc 1919268.1
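In the spirit of that note, (the authoritative steps are in the document itself; this query is just my own sketch) one quick way to see whether ASH partition maintenance has stalled is to look at the partition sizes under the underlying WRH$ table:

-- Illustrative: if one enormous partition dominates instead of many
-- modest ones, the automatic partition splits have likely stalled and
-- the MOS note's manual split/purge steps apply.
SELECT partition_name, ROUND(bytes / 1024 / 1024) AS mb
  FROM dba_segments
 WHERE segment_name = 'WRH$_ACTIVE_SESSION_HISTORY'
 ORDER BY bytes DESC;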

The Oracle Management Repository, (OMR) is a busy database, having to serve the Oracle Management Service, the web tier, the targets and anything else that EM expects of it.  If undue pressure is added by other features, as we see here with ASH and AWR data, it is important to dig down into the cause and address it.  This issue was impacting our ability to even pull diagnostics data to offer an intelligent recommendation on upgrade options.  By investigating the problem in a logical manner, we were able to quickly diagnose the cause of the problem and correct the issue.

March 10th, 2016
