Oracle

REPVFY Diagnostics, Part I of…Well, A Zillion

There are a number of Verification Utilities for Enterprise Manager 12c, (EM12c) and I’ve written about them before, but today I’m going to start on the Repository Verification Utility, (REPVFY).  This will be an ongoing series, as there are so many valuable features rolled into the utility and new ones that will be added as new patches and releases happen.

For this post, I’m going to be reviewing a level 2 diagnostics report from a new repository database.  This is going to offer us all kinds of valuable data about the environment and should be considered by any EM12c administrator when doing a health check of their EM environment.

Inspecting the files from the output, you’ll note that they don’t require a lot of space, even when unzipped and this allows the admin to retain historical information about the EM12c environment.

Note that the files are broken up by type of review and marked with the timestamp for many of these files.

em_blog_post_0418

For this first blog post, I’m going to jump right into the first log alphabetically, the advisors log file, which as you might guess, is advising on performance.  The interesting thing is as small as this file is, it is actually comprised of a number of reports that one would be surprised how much work to run all of it manually.

The Advisors Log is comprised of the following reports:

  • ADDM Report
    • Space Advisor Report
    • ASH Report
  • AWR Report
  • SGA Resizing Report

The report then ends with the CBO Repository information before closing up this one of many log files generated by a simple and single execution from the command line.  I would definitely read through to this final section to understand the stats collection in the EM Repository database a bit more clearly, as I do touch on it in this post.

Advisor Log Header

When viewing the log file, the first thing it tells you is where the source SQL for the report comes from.  This is helpful if you want to inspect the SQL used.

The next is information about the release versions and such involved.  This is standard for most of the log files in the repvfy output:

 

COMPONENT          INFO
------------------ ----------------------------------------
EMDIAG Version     2014.0318
Repository Version 12.1.0.3.0
Database Version   11.2.0.3.0
Test Version       2014.0402
Repository Type    CENTRAL

The next sections of the report will go over the repository database and report on the type of database, (single instance or RAC, protected, version, etc.) and other pertinent information.  This is multiple query output I would like to see rewritten and compiled into one result set, but it’s still good to have this data, no matter what the output.

We then jump into SGA information and init/sp parameters.  What surprised me is that next section was dedicated to System statistics and the OS stats data.  This data is often only updated when a change to hardware has been made or if parallel is implemented and it’s set up in an upper level of the report indicating that the author of the report found this information valuable to the performance of the repository database, (not agreeing or disagreeing here, found it more curious than anything as I find it can be important in large databases, which the EM repository rarely is…)

--- Database OS statistics
SNAME                          PNAME                               PVAL1 PVAL2
------------------------------ ------------------------------ ---------- ----------------------------------------
SYSSTATS_INFO                  STATUS                                    COMPLETED
SYSSTATS_INFO                  DSTART                                    03-29-2014 02:00
SYSSTATS_INFO                  DSTOP                                     03-29-2014 02:00
SYSSTATS_INFO                  FLAGS                                   1
SYSSTATS_MAIN                  CPUSPEEDNW                           1405
SYSSTATS_MAIN                  IOSEEKTIM                              18
SYSSTATS_MAIN                  IOTFRSPEED                           4096

 

ADDM Report

Next the utility runs an ADDM report for the previous hour’s snapshot.  Now ADDM reports rarely receive a lot of love from DBAs.  It’s not the top info it includes or the findings, it’s the recommendations that commonly leave them less than thrilled.  How often do we read one that doesn’t recommend some type of hardware solution to a problem that we know with hardware applied, will only rear its ugly head again a few months to a year later?  If there was one recommendation I could make to this valuable report is to take those recommendations out and I think you’d see a lot more DBAs embracing ADDM reports.

The report for the repository database is true to form, only this time it’s Oracle’s objects being offered hardware solutions!

ECMAM_PK                                  0          0          0          0
Move object SYSMAN.ECMAM_PK to flash storage, estimated I/O wait reduction is 10 millisec

EM_CS_SCORE_HIST                          0          0          0          0
Move object SYSMAN.EM_CS_SCORE_HIST partition 2014-03-25 00:00 to flash storage, estimated I/O wait reduction is 125 millisec.

EM_EVENT_MSGS_PK                          0          0          0          0
Move object SYSMAN.EM_EVENT_MSGS_PK partition OPEN to flash storage, estimated I/O wait reduction is 5 millisec.

EM_METRIC_KEY_GUID_IDX                    0          0          0          0
Move object SYSMAN.EM_METRIC_KEY_GUID_IDX to flash storage, estimated I/O wait reduction is 336 millisec

Now who is to say that a savings of 10-336ms is worth the cost of flash storage?  I’m not, but this is just more proof to me that the recommendations section is still not where most DBAs would like to see the ADDM report.

Further sections of the report do live up to the hype and offer similar value to what we see in AWR for the EM repository database:

Top Background Events            DB/Inst: OEM/OEM  (Apr 02 14:00 to 15:00)

Avg Active
Event                               Event Class     % Activity   Sessions
----------------------------------- --------------- ---------- ----------
os thread startup                   Concurrency          47.91       0.03
CPU + Wait for CPU                  CPU                   9.77       0.01

 

To clearly show that it is Concurrency that is causing the OS Thread Startup % of activity is important.  This helps keep the administrator on the right track if they were to investigate the wait event.

The following section goes through the code and action involved in the last hour’s ADDM activity.  You can see if there is anything causing an issue, but as we can see in our repository, not much of anything-  it’s nice an quiet…. J

Top Service/Module               DB/Inst: OEM/OEM  (Apr 02 14:00 to 15:00)

Service        Module                   % Activity Action               % Action
-------------- ------------------------ ---------- ------------------ ----------
SYS$USERS      DBMS_SCHEDULER                 8.84 EM_REPOS_SEV_EVAL        2.79
EM_SYSTEM_MEMBER_S       1.40
EM_TASK_WORKER_3         1.40
MGMT_COLLECTION.COLLECTI       8.37 Task:881 Repositor       4.19
Task:37 Performanc       1.40
OEM.PbsSystemPool              6.51 PingHeartBeatWork:       2.33
LoaderWorker             1.40
GATHER_SYSMAN_STATS            2.33 UNNAMED                  0.93

 

Next it reports on sessions and then on percentage of activity was allocated to different SQL types and the percentage allocated to different phases of the SQL, (what percentage of activity was consumed by hard parsing vs. SQL or PL/SQL execution….)

Just like an ADDM report, the top SQL statements, including SQL_ID and percentage of activity is reported:

Sampled #
SQL ID             Planhash        of Executions     % Activity
----------------------- -------------------- -------------------- --------------
Event                          % Event Top Row Source                    % RwSrc
------------------------------ ------- --------------------------------- -------
7mqykaw68fkhb           2691428406                    6           4.19
CPU + Wait for CPU                4.19 TABLE ACCESS - FULL                  2.79
SELECT COUNT(*) FROM DBA_SCHEDULER_JOB_RUN_DETAILS WHERE OWNER = MGMT_USER.GET_R
EPOSITORY_OWNER AND JOB_NAME = UPPER(:B3 ) AND STATUS = 'SUCCEEDED' AND ACTUAL_S
TART_DATE >= :B2 AND ACTUAL_START_DATE < :B1

 

The report will go through this for top row source, literals, actions, etc.  It will then jump into the top procedures that were called:

PLSQL Entry Subprogram                                            % Activity
----------------------------------------------------------------- ----------
PLSQL Current Subprogram                                           % Current
----------------------------------------------------------------- ----------
SYSMAN.EM_TASK.WORKER                                                   8.84
SQL                                                                  6.05

SYS.DBMS_ASSERT.SIMPLE_SQL_NAME                                      1.40
SYSMAN.EM_PING.RECORD_BATCH_HEARTBEAT                                   3.26
SQL                                                                  1.86

 

This can help you quickly find what procedures to search for the SQL shown in the sections before.

Blocking sessions, top sessions and top latches are just some of the areas, just as in a standard AWR report that are covered, but this is all for the EM repository database!

This section of the report is finished up with information about the AWR retention so if you need to go back and investigate further, you know up front what you have to work with in the AWR, (Automatic Workload Repository.)

--- AWR retention

INTERVAL  RETENTION
---------- ----------
60      11520

 

No, the report is not even close to being finished for the Advisor log at this point.  It jumps promptly into a workload report to give you a good view of what kind of processing this repository database is handling daily.

FULL AWR Report

Just as with a standard AWR report, the basic information about the repository for the previous hour is included with the Advisor log.  Keep this in mind as you review this report.  If this is not a heavy use time for your EM repository database, the data may not be as valid as if we were looking at a larger timeframe.

If you are more comfortable with statspack or AWR reports, then you will want to spend more time here than the ADDM section, but both provide their own view on the data provided.

The IO Statistics by Function Summary

Reads:   Reqs   Data    Writes:  Reqs   Data    Waits:    Avg
Function Name   Data    per sec per sec Data    per sec per sec Count    Tm(ms)
--------------- ------- ------- ------- ------- ------- ------- ------- -------
Buffer Cache Re    818M     2.1 .227007      0M     0.0      0M    7628     0.1
Others              62M     1.1 .017205     44M     0.8 .012210    6643     0.8
DBWR                 0M     0.0      0M     94M     1.7 .026086    5938     0.5
LGWR                 1M     0.0 .000277     40M     2.2 .011100   15.3K     0.5
Direct Reads         2M     0.0 .000555      0M     0.0      0M      17     0.0
Direct Writes        0M     0.0      0M      1M     0.0 .000277      11     0.7
TOTAL:             883M     3.2 .245045    179M     4.6 .049675   35.5K     0.5
------------------------------------------------------------

Direct Reads
2M     0.0 .000555      0M     0.0      0M       0     N/A
Direct Reads (Data File)
2M     0.0 .000555      0M     0.0      0M       0     N/A
Direct Writes
0M     0.0      0M      1M     0.0 .000277       0     N/A
Direct Writes (Data File)
0M     0.0      0M      1M     0.0 .000277       0     N/A
TOTAL:
883M     3.2 .245045    178M     4.6 .049397   11.6K     0.1
------------------------------------------------------------

Enqueue Activity

Enqueue Type (Request Reason)
------------------------------------------------------------------------------
Requests    Succ Gets Failed Gets       Waits  Wt Time (s) Av Wt Time(ms)
------------ ------------ ----------- ----------- ------------ --------------
JS-Job Scheduler (queue lock)
25,724       25,724           0          59            0           1.19
KO-Multiple Object Checkpoint (fast object checkpoint)
30           30           0           3            0           6.67
SQ-Sequence Cache
75           75           0           3            0           3.33
------------------------------------------------------------

Also spend some time inspecting the Latch and Mutex sections.  These are areas of memory that many Oracle databases are experiencing challenges with from time to time.  These allocations in memory are good to understand and even the EM repository database experiences issues with both when concurrency is commonly involved.

I’ve been a long time proponent of the Segment reporting section in the AWR.  This report can often seal up remaining questions that DBAs have after seeing the top SQL by elapsed time that were answered upon viewing the execution plan.

Segments by Physical Reads
-> Total Physical Reads:         105,015
-> Captured Segments account for   98.1% of Total

Tablespace                      Subobject  Obj.      Physical
Owner         Name    Object Name            Name     Type         Reads  %Total
---------- ---------- -------------------- ---------- ----- ------------ -------
SYS        SYSAUX     SCHEDULER$_JOB_RUN_D            TABLE       59,463   56.62
SYS        SYSAUX     SCHEDULER$_EVENT_LOG            TABLE       43,048   40.99
SYS        SYSTEM     OBJ$                            TABLE          266     .25
SYS        SYSAUX     I_WRI$_OPTSTAT_H_OBJ            INDEX          107     .10
SYSMAN     MGMT_TABLE EM_EV_METRIC_ALERT   P201404    TABLE           47     .04

 

There will be sections for logical, physical and unoptimized reads, along with row locks, table scans, writes and even ITL waits.  The report continues with library cache and all variations of memory information.  This is an area that I feel more than an hour could be beneficial for the REPVFY utility output and a future enhancement might be to follow with an argument for time frame of snapshots to be utilized for the report.

SGA Resizing Report

This report provides not just SGA information, but also PGA and caches.  It starts out by giving the reader a view of any specially sized buffer caches and each of the pools, (large, java, ASM, etc.)

-- SGA Dynamic Components
--  Cache: D:DEFAULT, K:KEEP, R:RECYCLE
--   Mode: DEF:DEFerred mode, IMM:IMMediate mode

Last Oper  Last
Component              Size (M) Type/Mode  Operation Time
---------------------- -------- ---------- --------------------
D: 2K buffer cache            0 STATIC
D: 4K buffer cache            0 STATIC
D: 8K buffer cache            0 STATIC
D:16K buffer cache            0 STATIC
D:32K buffer cache            0 STATIC
D:buffer cache              448 INITIALIZI

From there, it steps into resize operations for SGA and then PGA.

-- PGA Resize Operations overview

I# SNAP_TIME            PGA_TARGET PGA_ALLOC    PROCS        AUTO      MANUAL
---- -------------------- ---------- --------- -------- ----------- -----------
1 02-APR-2014 15:00:00        493       281       60        0.00        0.00
1 02-APR-2014 14:00:00        493       290       66        0.00        0.00
1 02-APR-2014 13:00:00        493       311       70        0.00        0.00
1 02-APR-2014 12:00:00        493       286       66        0.00        0.00

 

This part of the report is not held to the 1 hour snapshot as we experience in the ADDM and AWR section of the Advisor log.  The data for these sections went back for one week and provided a clear picture of resizing activities for the SGA and PGA.

Repository CBO Report

This section is built out of SQL added to the advisor.sql in the REPVFY utility.  Its only found in the Advisor report and is essential to understanding specific information in terms to CBO, (cost based optimizer) information to the repository database.  To think of this as schema specific is more in line with the value provided from the information.

If you remember back to when you installed the EM12c on the repository database, it has you disable the automated dbms_stats job?  This is now providing you with the internal stats job information that provides the correct statistical information to the CBO-

TABLE_NAME                       INCREMENTAL GRANULARITY     STALE_PCT
-------------------------------- ----------- --------------- ---------
ADP_METRIC0                      TRUE
ADP_METRIC1                      TRUE
ADP_METRIC2                      TRUE
EMCT_CBA_CHARGE_1DAY             TRUE
EMCT_CBA_CHARGE_1HOUR            TRUE
EM_EVENTS_INCIDENT               TRUE                        10
EM_EVENT_ANNOTATIONS             TRUE                        10
EM_EVENT_CONTEXT                 TRUE
EM_EVENT_MSGS                    TRUE                        10
EM_EVENT_PROCESSING_STATE        TRUE
EM_EVENT_RAW                     TRUE                        10
EM_EVENT_SEQUENCES               TRUE                        10
EM_EVENT_SEQ_STATE                                           10
EM_METRIC_VALUES                             AUTO            10
EM_METRIC_VALUES_DAILY                       AUTO            10
EM_METRIC_VALUES_HOURLY                      AUTO            10

 

As the repository database scales, the EM stats feature can be scaled to support more extensive statistics information as well.  I am cautious to say this and will say it with a grain of salt, but it backs up my experience that 95% of databases are served successfully with the current automated stats job, but that there are types that require a DBA with knowledge of the schema, data and code to ensure the CBO has what it needs to perform efficiently.  The biggest challenge is that there aren’t a lot of DBAs with that knowledge or time to invest in this task, (I know, I did this in two companies and it must be a true passion.)  EM12c has removed this challenge from the DBAs plate by setting up the one-off stats collection requirement for them as part of the repository installation.

They’ve enhanced that in the REPVFY by having a utility step that checks to see what is currently in place and then the ability to review it as part of this output.

TABLE_NAME                       RUN_TYPE                         INCREMENTAL GRANULARITY     STALE_PCT
-------------------------------- -------------------------------- ----------- --------------- ---------
EM_EVENTS_INCIDENT               DAILY                                                        20
EM_EVENT_ANNOTATIONS             DAILY                                                        20
EM_EVENT_MSGS                    DAILY                                                        20
EM_EVENT_RAW                     DAILY                                                        20

 

Curious about when and how long the jobs that do this work run?  This is included as well-

[----- CBO statistics gathering log (last 3 days) ---------------------------]

LOG_DATE             RUN_TYPE                           DURATION OBJECTS_ANALYZED OBJECTS_STALE OBJECTS_FILTER VALUE
-------------------- -------------------------------- ---------- ---------------- ------------- -------------- --------------------------------------------------------------------------------
02-APR-2014 14:24:16 PROACTIVE                                 5                3            69             24 P:1.86:EM_EVENT_ANNOTATIONS;1.13:MGMT_JOB_EXECUTION;.71:EM_EV_METRIC_ALERT;
02-APR-2014 12:24:16 PROACTIVE                                 9                5            71             24 P:2.47:EM_EVENT_RAW;1.97:EM_EVENT_SEQUENCES;1.34:MGMT_JOB_EXECUTION;.87:EM_EVENT
02-APR-2014 10:24:16 DAILY                                   123              272           457                D:6.91:EM_JOB_METRICS;5.66:MGMT_SYSTEM_PERFORMANCE_LOG;4.81:MGMT_ARU_PATCH_FIXES
02-APR-2014 08:24:16 PROACTIVE                                13                7            74             24

 

Table stats in the repository?  Yep, that’s included, too!

[----- Table statistics -----------------------------------------------------]

TABLE_NAME                       PARTITION_NAME                 LAST_ANALYZED          NUM_ROWS     BLOCKS AVG_ROW_LEN GLO USE STATT STA
-------------------------------- ------------------------------ -------------------- ---------- ---------- ----------- --- --- ----- ---
EM_METRIC_VALUES_DAILY                                          29-MAR-2014 10:25:19      95257       2040          16 YES NO  ALL   YES
EM_METRIC_VALUES_HOURLY                                         01-APR-2014 10:26:49     681018      15820          16 YES NO  ALL   YES
MGMT_DB_FEATUREUSAGE                                            02-APR-2014 10:25:13       5713        244         149 YES NO  ALL   YES

 

And with all of that,  (pages and pages…) we’re done with the first of many reports.  I’ll keep to the most essential data in each, but hopefully you’ve learned something you hadn’t known before about the REPVFY utility and I’ll update with the next log report soon!

Kellyn

http://about.me/dbakevlar

One thought on “REPVFY Diagnostics, Part I of…Well, A Zillion

Comments are closed.