Category: ASH and AWR

May 10th, 2016 by dbakevlar

There are two ways to compare one database to another in the AWR Warehouse.  I covered the ADDM Comparison Report here, and now we'll go through the second one, which is more involved and has us leveraging the AWR Warehouse to take the AWR data from two databases and compare them to each other.

Once the AWR Warehouse is set up, databases that are already monitored as targets by your EM12c or EM13c environment can be added as sources and will upload all of their AWR snapshots to this central repository.

Comparison Period Report

The AWR Warehouse's second comparison reporting option is accessible from the drop-down menu on the AWR Warehouse dashboard:

addm_compare1

Once you click on Compare Period Report, you're prompted to choose a baseline or snapshots from the list for the databases you wish to compare:

compare1

In my example, I simply chose the DNT database with a one-hour snapshot window, compared to an OMR (Oracle Management Repository) database for another one-hour snapshot interval.  Clicking on Generate Report will then create an HTML-formatted report.
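
If you ever want to script this comparison outside the console, the same style of report can be generated with the DBMS_WORKLOAD_REPOSITORY package. A minimal sketch, assuming you look up each database's DBID, instance number and snapshot IDs first (the numbers below are placeholders, not values from this example):

SET LONG 1000000 PAGESIZE 0
SELECT output
  FROM TABLE(DBMS_WORKLOAD_REPOSITORY.awr_diff_report_html(
         1111111111, 1, 100, 101,    -- base: dbid, instance, begin snap, end snap
         2222222222, 1, 200, 201));  -- comparison: dbid, instance, begin snap, end snap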

The Actual Report

awrc1

In the report summary, not only does the report show that I'm comparing two different databases from two different hosts, but any differences in the main configuration are displayed.  We can see that although I'm comparing the same amount of time, the average number of active users is roughly double and the DB Time is vastly different for the two databases.

The report will then start comparing the high-level information, including the host, memory and I/O configuration:

awrc2

The Top Ten Foreground Events are displayed for each environment, ensuring nothing is missed that could skew the comparison.  Between more similar databases (say, test against production, or old production vs. a newly consolidated environment) there will be more overlap, and you'd be able to see how the workload had changed between systems.

awrc3

Each section contains values for each database and then the differences, saving the DBA considerable time manually calculating what has changed.  Once you get to the Top SQL, the report changes its format again to display the SQL ranked overall by elapsed time, CPU and so on, then broken down by the times for each environment (or noted if it wasn't run there) and the difference.

awrc4

After breaking down the SQL in every way possible, as commonly seen in an AWR report, but with the added benefit of comparisons between two different AWR reports and databases, the report digs into each of the Activity Stats and compares all of those:

awrc5

The report then does comparisons for SGA, PGA, interconnects and even I/O:

awrc6

Once finished with these, it then digs into the objects and tablespaces to see if there are any outliers or odd differences in the objects being called by either or both databases.

awrc7

As with all AWR reports, it also pulls up all initialization parameters and performs a clear comparison of what is set for each database, so you can see if anything is amiss that would impact performance.

awrc8

This is an incredibly valuable report for those who want to perform a deep comparison analysis between two databases across time periods for performance, workload, migration or consolidation work.  The comparison reports are one of the top features of the AWR Warehouse and are so infrequently considered a selling point of the product, (and if you already have the Diagnostics and Tuning Packs, heck, it comes with its own limited EE license like the RMAN catalog and Enterprise Manager repository database) so what are you waiting for?

 

Posted in ASH and AWR, AWR Warehouse, EM13c, Enterprise Manager

May 9th, 2016 by dbakevlar

A lot of my ideas for blog posts come from questions emailed to me or asked via Twitter.  Today's blog is no different, as I was asked by someone in the community what the best method is for comparing databases, using features within AWR, when migrating from one host and OS to another.

There is a lot of planning that must go into a project to migrate a database to another host or consolidate onto another server, and when we introduce added changes, such as a different OS, new applications, workload or other demands, these need to be taken into consideration.  How do you plan for this, and what kind of testing can you perform to eliminate risk to performance and the user experience once you migrate over?

AWR Warehouse

I won’t lie to any of you, this is where the AWR Warehouse just puts it all to shame.  The ability to compare AWR data is the cornerstone of this product and it’s about to shine here again.  For a project of this type, it may very well be a consideration to deploy one and load the AWR data into the warehouse, especially if you’re taking on a consolidation.

There are two main comparison reports, one focused on AWR (Automatic Workload Repository) data and the other on ADDM (Automatic Database Diagnostic Monitor).

addm_compare1

From the AWR Warehouse, once you highlight a database from the main dashboard, you’ll have the option to run either report and the coolest part of these reports is that you don’t just get to compare time snapshots from the same database, but you can compare one snapshot from a database source in the AWR Warehouse to ANOTHER database source that resides in the warehouse!

ADDM Comparison Period

This report is incredibly valuable and offers the comparisons to pinpoint many of the issues that are going to create the pain-points of a migration.  The “just the facts” and crucial information about what is different, what has changed and what doesn’t match the “base” for the comparison will be displayed very effectively.

When you choose this report, you're offered the option to compare any snapshot interval for the current database, but you can also click on the magnifying glass icon for the database to compare to and change it to any database that is loaded into the AWR Warehouse-

 

compare2

For our example, we're going to use the same one-hour window from one day earlier as our Base Period.  Once we fill in these options, we can click Run to request the report.

The report is broken down into three sections:

  • A side by side comparison of activity by wait event.
  • Details of differences via tabs and tables
  • Resource usage graphs, separated by tabs.

compare3

Comparing the two activity graphs, we can clearly see that there were more commit waits during the base period, along with more user I/O in the comparison period.  During a crisis situation, these graphs can be very beneficial when you need to show waits to less technical team members.

compare4

The Configuration tab below the activity graphs will quickly display any differences in OS, initialization parameters, host and other external influences on the database.  The Findings tab then goes into the differences in performance.  Did the SQL perform better or degrade?  In the table below, the SQL_ID, along with detailed information about the performance change, is displayed.

Resources is the last tab, displaying graphs for the important area of resource usage.  Was there a difference in impact to CPU usage between one host and the other?

compare6

Was there swapping or other memory issues?

compare7

In our example, we can clearly see the extended data reads, and for Exadata consolidations, the ever-valuable single block read latency is shown-

compare8

Now, for those on engineered systems and in RAC environments, you're going to want to know the interconnect waits.  Again, these are simply and clearly compared, then displayed in graph form.

compare9

This report will offer very quick answers to

“What Changed?”

“What’s different?”

“What Happened at XXpm?”

The value this report provides is easy to see, but when you're able to compare one database to another, even on different hosts, you can see how valuable the AWR Warehouse becomes, offering something that even the Consolidation Planner can't.

Next post, I'll go over the AWR Warehouse's AWR Comparison Period Report.

Posted in ASH and AWR, AWR Warehouse, Oracle

April 7th, 2016 by dbakevlar

It’s that time of year again and the massive undertaking of the Collaborate conference is upon us.  This yearly conference, a collaboration between Quest, Oracle Applications User Group, (OAUG) and Independent Oracle User Group, (IOUG) is one of the largest conferences in the world for those that specialize in all areas of the Oracle database.

The conference is held in different cities, but recently it's been sticking to the great destination of Las Vegas, NV.  We'll be at the Mandalay Bay, which, like many casinos, is its own little self-contained city within a city.

The week will be crazy and I’ll start right out, quite busy with my partners in crime, Courtney Llamas and Werner De Gruyter with Sunday’s pre-conference hands on lab. “Everything I Ever Wanted to Know About Enterprise Manager I Learned at Collaborate” was a huge hit last year, so we’re repeating it this year, but we’ve updated it to the newest release, EM13c.  For those that are able to gain a coveted spot in this HOL, it will be a choice event.  We’re going to not just cover the new user interface, but some of the coolest need-to-know features of the new release.

Sunday evening is the Welcome Reception and Awards Ceremony.  This year I’m receiving the Ken Jacobs award for my contributions to the user community as an Oracle employee.  I’m very honored to be receiving this and thank everyone at IOUG for recognizing the importance that even as an Oracle employee, you can do a lot to help make the community great!

Throughout the week, I’ll have a number of technical sessions:

Monday

My Database as a Service session is up first for the week on Monday, 9:15am in Palm B, but I'm going to warn you: since this abstract was submitted very early on, it isn't as descriptive as I wanted.  Know that this is a DBaaS session and I'll be covering on-premises, private cloud and even Oracle Public Cloud!  Come learn how easy it can be and forget all those Data Pump, transportable tablespace and other silly commands people are telling you you have to run to provision… ☺

Right after my DBaaS session, at 10:15 in the same room (Palm B), we'll have a special session covering the new product that so many of us have put so much energy, time and vision into: the Oracle Management Cloud (OMC)!  Read more about this session here.

The Welcome Reception in the Exhibit Hall is from 5:30-8pm.  Don’t miss out on getting there first and see all the cool exhibitors.  I’ll be at the EM13c booth, so come say hi!

Tuesday

So Tuesday morning, the 12th, I’m back in Palm B at noon for the first of my certification sessions, covering 30 minutes of Enterprise Manager 13c New Features.

Wednesday

Wednesday, at noon, I’m back in my favorite room, Palm B to finish the second part of the certification sessions on new features with Enterprise Manager 13c.

I'll be presenting at Oak Table World at Collaborate at 2pm in the Mandalay Bay Ballroom.  I'll be doing my newest session on Enterprise Manager 13c and DB12c.  It's always a great venue when we have Oakies at conferences, and I almost squeaked out of it this year, but got dragged back in at the last minute!

The Enterprise Manager SIG is right afterwards, from 3-4, in the South Seas Ballroom E.  This is where we meet and geek out over everything Enterprise Manager, so don't miss out on that!

Thursday

For the last day, Thursday at 9:45am, I’ll be in- wait for it….  Palm B!  Yes, I know it’s a surprise for both of us, but I’ll be using my experience helping customers Upgrade to Enterprise Manager 13c and sharing it with everyone at Collaborate.  This is another certification session, so collect those certificates and get the most out of your conference!

I’ve made a lot of updates with new material to my slides recently, so I promise to upload my slides to SlideShare after the conference, too!

See you next week in Las Vegas!

Posted in ASH and AWR, EM13c, Oracle, Oracle Management Cloud

March 10th, 2016 by dbakevlar

The other day I had a customer who was supposed to send me an ASH report after there was a challenge in providing me EM diagnostics from their repository database.  When the report hadn't arrived after a number of days, the customer admitted that they would execute an ASH report request and then kill the session after eight hours or so of non-completion.

Although I hadn't experienced this myself, there are a number of causes behind an ASH report not completing, and troubleshooting it is a straightforward process.

Analysis

The first thing to do is to gather some information on the cause of the issue.  Running the report repeatedly without looking into the cause isn't going to make the end result any better.

When executing the ASH report (either from Enterprise Manager or from SQL*Plus, using the $ORACLE_HOME/rdbms/admin/ashrpt.sql script), it should complete in less than a minute or so.  If it doesn't complete in that time, inspect the session details, pulling the execution plan for the SQL that is currently running in the session.
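
A quick way to do that from SQL*Plus is to find the active session and feed its SQL_ID to DBMS_XPLAN. A rough sketch (the filter is only an example; adjust it to pick out your own session):

-- Find the long-running session and the SQL it is currently executing
SELECT sid, serial#, sql_id, event, last_call_et
  FROM v$session
 WHERE status = 'ACTIVE'
   AND sql_id IS NOT NULL;

-- Pull the execution plan for the SQL_ID returned above
SELECT * FROM TABLE(DBMS_XPLAN.display_cursor('&sql_id', NULL, 'TYPICAL'));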

In the customer's case, the ASH report was "hung up" on a step that captures the top SQL using a join between GV$ACTIVE_SESSION_HISTORY and a number of X$ tables.  The plan had a number of merge join Cartesians, ending with a result of 3.501 million rows.

The amount of ASH data involved could in no way come up to that kind of row count, and the X$ tables aren't "static" values you can work off of, so it became apparent that the stats were definitely awry, though not on the ASH object.  To update the statistics on the fixed (X$) tables, I had the user gather fixed object stats and then run the ASH report again.

exec DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;

When the report was executed afterwards, it ran in a matter of a few minutes, and I now had the original report I had requested, but I also asked them to run one more for me:

@$ORACLE_HOME/rdbms/admin/awrinfo.sql

This is the General AWR Information report.  I often recommend this report for sizing an AWR Warehouse, but for a very specific reason: many earlier releases of 11g experienced issues with cleanup of ASH data.

There is a section in the report that states the size of individual components in the SYSAUX tablespace.
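
If you just want a quick look at the same breakdown without running the full report, the SYSAUX occupant view gives a rough equivalent; a simple sketch (SM/AWR is the Automatic Workload Repository occupant):

SELECT occupant_name, schema_name,
       ROUND(space_usage_kbytes/1024) mb
  FROM v$sysaux_occupants
 ORDER BY space_usage_kbytes DESC;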

Here’s the output from a “healthy” AWR in a clustered repository, DB Version 11.2.0.4.0:

COMPONENT MB % AWR KB_PER_SNAP MB_PER_DAY MB_PER_WEEK TABLE% : INDEX% 
--------- --------- ------ ------------ ---------- ----------- ---------------- 
SQLPLAN 697.0 30.2 1,028 24.1 168.7 69% : 31% 
FIXED 618.5 26.8 913 21.4 149.7 50% : 50% 
EVENTS 180.5 7.8 266 6.2 43.7 47% : 53% 
SQL 52.1 2.3 77 1.8 12.6 73% : 27% 
ASH 48.2 2.1 71 1.7 11.7 89% : 11% 
SPACE 41.4 1.8 61 1.4 10.0 72% : 28% 
SQLTEXT 16.3 0.7 24 0.6 3.9 98% : 2% 
RAC 12.2 0.5 18 0.4 3.0 52% : 48% 
SQLBIND 1.3 0.1 2 0.0 0.3 55% : 45%

 

Here's the output from the customer's AWR in their repository, DB version 11.2.0.2:

COMPONENT MB % AWR KB_PER_SNAP MB_PER_DAY MB_PER_WEEK TABLE% : INDEX%
--------- --------- ------ ------------ ---------- ----------- ----------------
ASH ######### 67.0 222,497 5,214.8 36,503.5 87% : 13%
FIXED 28,382.3 10.9 36,149 847.2 5,930.6 48% : 52%
EVENTS 19,503.7 7.5 24,841 582.2 4,075.4 40% : 60%
SQLPLAN 16,093.0 6.2 20,497 480.4 3,362.7 70% : 30%
SPACE 1,460.4 0.6 1,860 43.6 305.2 67% : 33%
RAC 552.6 0.2 704 16.5 115.5 59% : 41%
SQL 100.6 0.0 128 3.0 21.0 69% : 31%
SQLBIND 6.0 0.0 8 0.2 1.3 50% : 50%
SQLTEXT 4.2 0.0 5 0.1 0.9 96% : 4%

 

Although both Enterprise Manager environments are 12.1.0.5, both are on RAC, and both have the same interval and retention time set for AWR, the customer's 11.2.0.2 database is retaining much larger amounts of data, and ASH is by far the largest contributor to allocated space, (to the point that it's more than the report was formatted to display…)

There are a number of MOS notes covering what may be the cause, but there are helpful steps to follow to diagnose the specific issue that's causing the unbridled size demands for AWR and/or ASH in your own environment.

One of the most important is to follow the appropriate steps to isolate what is consuming space, often in the way of partitions not splitting properly and manually purging data, as seen in the following MOS Doc:

Doc 1919268.1

The Oracle Management Repository, (OMR) is a busy database, having to serve the Oracle Management Service, the web tier, the targets and anything else that EM expects of it.  If undue pressure is added by other features, as we see here with ASH and AWR data, it is important to dig down into the cause and address it.  This issue was impacting our ability to even pull diagnostics data to offer an intelligent recommendation on upgrade options.  By investigating the problem in a logical manner, we were able to quickly diagnose the cause of the problem and correct the issue.

Posted in ASH and AWR, EM12c Performance

February 18th, 2016 by dbakevlar

This question was posted in Twitter from @matvarsh30, who asked, “How can I display CPU usage over different periods of time for databases in Enterprise Manager?”

Everyone loves their trusty Top Activity, but the product’s functionality is limited when it comes to custom views and this is what our user had run into. There are numerous ways to display this data, but I’m going to focus on one of my favorite features in the product that was created to replace Top Activity, ASH Analytics.

Retaining AWR Data for ASH Analytics

Note: This process to display CPU graphs will work for EM12c and EM13c.  Other than the location of the target menu, not much else has changed.

The default display is for one hour, and ASH Analytics is dependent upon AWR data, so although 8 days of detailed information is easy to get, it is important that you set your retention in the source (target) databases appropriately to ensure you're able to view and research past the default 8-day AWR retention.  I am a firm believer that if you have the Diagnostics and Tuning Packs for your EE databases, you should be getting the most out of these tools, and up the retention time from the default by running the following command via SQL*Plus with the appropriate privileges:

BEGIN
  DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings(
    retention => 86400,        -- In minutes; 86400 is 60 days
    interval  => 30);          -- In minutes; the default of 60 is usually sufficient, shorten only for workload tests, etc.
END;
/
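
To confirm the new settings took effect, a quick check against the AWR control view is all that's needed:

SELECT snap_interval, retention
  FROM dba_hist_wr_control;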

Now you have not just the EM metric data that rolls up, but the AWR data for ASH Analytics to do deep analysis and reporting.

With this in place, let’s create some graphs that answer the question – “How do I display CPU usage for a database over one week, 10 days, 30 days, 60 days?”

Default to Power User View

Once logged into any database, you can access ASH Analytics from the Performance drop-down.  If you have an 11g or earlier database, you may have to install the package that creates the EMCC views, but this needs to be done to utilize this powerful tool in Enterprise Manager.  ASH Analytics works on database versions 10.2.0.4.0 and above.

Logging into ASH Analytics will display data for the instance based on the last hour, but to change to a one-week view, you simply click on "Week" and then stretch the displayed window in the bottom section graph out to encompass the entire week:

ashan1

Using this example, you can see that I'm now showing a graph similar to Top Activity, but for a whole week and without the aggregation that Top Activity sometimes suffers from.

ashan2

We're not going to stick with this view though.  Leaving it on "Activity", click on the Wait Class selector, go to Resource Consumption and click on Wait Event, (it's on Wait Class by default.)

As you can see on the right side, there is an overlap in the legend that needs to be fixed, (I’ll submit an ER for it, I promise!)  but luckily, we’re focused on CPU and we all know, in EM, CPU is green!

ashan4

When we hover over the green, it turns a bright yellow.  Once you have highlighted it with your cursor, double-click to select it.  The graph will now update to display CPU:

ashan5

You now possess a graph that displays all CPU usage over the last week vs. total activity across wait classes.  You can see the overall percentage of activity in the table on the left-hand side, and the bottom right even displays it by top user sessions.  You can also see the total CPU cores for the machine, which offers a clear perspective on how CPU resources are being used.

Now you may want to see this data without the “white noise”.  We can uncheck the “Total Activity” box to remove this information and only display CPU:

ashan6

We could also choose to remove the CPU Cores and just display what is being used:

ashan7

By removing the top core info, we see the patterns of usage and just the cores used much more clearly.  We could also decide we want to view all the wait classes again, without the CPU Cores or Total Activity.  The only drawback is the overlap in the legend, (I so hate this bug in the browser display….)

ashan8

Now, as requested, how would you do this for 10, 30 and 60 days?  As noted in the top view, you're offered a view by hour, day, week, month and custom.  As many months have 31 days in them, you may choose to use the custom view for all three of those requests, and a custom request is quite simple:

ashan9

Yep, just put in the dates you want and then click OK.  If you have already stretched your window from beginning to end on the lower view, don't be surprised if it retains this view and shows you all the data; all of it will display, that is, if your example database was active during that time… 🙂  The database I chose, as it's from one of our Oracle test environments, was pretty inactive during the Christmas and January time period.

ashan10

And that's how you create custom CPU activity reports in ASH Analytics in Enterprise Manager!

 

Posted in ASH and AWR, EM12c Performance, EM13c, Enterprise Manager

February 17th, 2016 by dbakevlar

There was a question posted on the Oracle-l forum today that deserves a blog post for easy lookup.  It regards your Enterprise Manager repository database (aka the OMR).  This database has a restricted-use license, which means you can use it for the Enterprise Manager repository, but you can't add partitioning to it, or RAC or Data Guard features, without licensing those features.  You also can't use the Diagnostics and Tuning Pack features available in Enterprise Manager on the repository database without licensing them, outside of the EMDiagnostics tool.  You can view information about the license that comes with the OMR here.

No one wants to be open to an audit or have a surprise when inspecting what management packs they’re using.

To view what management packs you’re using for any given EMCC page, you can use the console and access it from the Setup menu from EM12c or EM13c:

mgmt_pack

With that said, Hans Forbrich made a very valuable addition to the thread and explained how to disable management pack access in your OMR database.

Run the following to disable it via SQL*Plus as SYSDBA:

ALTER SYSTEM SET CONTROL_MANAGEMENT_PACK_ACCESS='NONE' scope=BOTH;
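
You can confirm the change with a quick check of the parameter afterwards:

SELECT name, value
  FROM v$parameter
 WHERE name = 'control_management_pack_access';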

Other packs are disabled in EM Cloud Control, with the appropriate privileges, using the Setup menu in 12.1.0.4 (with a patch) or higher:

mgnt_pck_acs

The view can be changed from licensed databases to all databases, and then you can go through, adjust the management packs as licensed, and apply.

mngt_acs_2

Don’t make yourself open to an audit when Enterprise Manager can make it really easy to manage the management packs you are accessing.

Posted in ASH and AWR, Database, EM13c, Enterprise Manager, Oracle

May 18th, 2015 by dbakevlar

When you need information about one SQL_ID and don't need everything and the kitchen sink, there are a few different ways to collect this via Oracle.  I'm surprised how rarely this is covered in performance tuning/optimization (whatever the current "acceptable" term is for fixing a database when performance issues arise… 🙂) classes, manuals and documentation.  I'd like to cover two of my favorites here; know that there are other options I won't be digging into (SQLd360, SQLTXPLAIN, tracing, etc.)

The Workload Repository SQL Report

I also refer to this as the "SQL_ID-specific AWR Report".  Many of you may have come across this report via Cloud Control: when you go into a unique SQL_ID in the console and look at the details, you notice there are two options below the wait event activity, one to run an AWR SQL Report and one for an ASH report.  What many don't realize is that this report is also available from the command line on the database host.

awrsq1

Report Location:  $ORACLE_HOME/rdbms/admin/awrsqrpt.sql

Report Format:  HTML from Cloud Control and the command line; TEXT is also an option from SQL*Plus.

Requirements:  Execute privileges on DBMS_WORKLOAD_REPOSITORY and privileges to run reports and store them on the host; if you run it via Cloud Control, you can simply save the HTML report to your desktop.

Licensing:  Yes, you do need the Diagnostics Pack for this feature.
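
For reference, a command-line run looks roughly like the following; the prompts are paraphrased, so treat this as a sketch:

SQL> @?/rdbms/admin/awrsqrpt.sql
-- You will be prompted for:
--   the report type (html or text)
--   the number of days of snapshots to list
--   the begin and end snapshot IDs
--   the SQL_ID of interest
--   a report name (or accept the default)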

What’s so great about this report?

This report, like "Search SQL", which we'll go over soon enough, allows you to go through all of the AWR history you possess and lets you know the minimum and maximum runs, the execution plans for each plan hash value that exists, along with elapsed time and other pertinent statistical information about the SQL_ID in question.

Notice that the version of the database, that it’s a RAC instance, along with which node in the RAC environment the statement was run in, is quickly stated, so if anyone questions this, you can quickly locate this information at the top of the report.

awrsq2

The report will display the SQL_ID in question and identify whether there was more than one plan hash value; the report numbers each PHV (plan hash value) and shows the plan hash value ID, making it easy to know which execution plan and events belong to which if there is more than one.  For our example, we only have one here.

awrsq3

Remember that the elapsed time shown is the total for the snapshot timeline involved, and you need to look at the per-execution value to see how long each one actually took to complete.  These statistics are very helpful when identifying the wait event(s) involved in the performance challenge.

awrsq4

The plan is nicely formatted and you can step into the time used for each step in the process.  I wish it defaulted to including temp usage, but alas, that is not to be in the HTML report.  This is the only area where I prefer the HTML report over the text, as there is a considerable size difference when storing the report, and I hate giving up any data…. 🙂

If you've had an outage during your snapshot timeline, then this report, just like AWR, is dependent upon uptime and will notify you of the issue, requiring you to pick snapshots with no downtime involved.

Search SQL

Search SQL has been around since EM10g, but it’s one of the least known features.  I like it as it gives a clean, high level view of performance over a large period of time by snapshot and then you can dig down to do research from there.

Report:  None really from the command line-  it’s a compilation of report features.

Report Format:  None-  this is via the console, but some of the reports can be saved off (such as ADDM and other reports), and you can also copy and paste or screenshot pages.

Licensing:  Due to the way we are using it here, both the diagnostics and the tuning pack will be required.

Using Search SQL

Using our same run as we used above, we can identify a SQL_ID that is causing us concern and we want to investigate further-

awrss1

Copy the SQL_ID so you can paste it in the proper field in the Search SQL console.  You’re now going to leave the Top Activity, (or ASH Analytics, whichever you are using, both work great for this type of initial identification…)

awrss2

Each Search SQL console is database-specific, so click on the Performance drop-down, then SQL, then Search SQL for EM12c (in EM10g and EM11g, this is directly under the Performance menu).

awrss3

Update the following sections:

  • Add AWR Snapshots and Baselines.
  • Choose All snapshots to verify that you are seeing the entire performance data contained in your AWR.  If your database is set with the default time for retaining AWR, this is 8 days for 11.2 and above.
  • Enter in or paste the SQL ID into the SQL ID value section.
  • Click on Search, (don’t click on Show SQL, as I know this is a bit misleading, but it will just show you the SQL statement to be run by Cloud Control…)

awrss4

Four tabs will be shown; don't be surprised if you don't notice when the search completes, as the cursor cache often only returns one row:

  • Cursor Cache, (in the cache currently)
  • AWR Snapshot data, (where we’ll spend most of our time…)
  • AWR Baselines, (good data to have to back up changes in plan hash values if you have your database set to update automatically or if someone’s put in a plan from a baseline run.)
  • SQL Tuning Set if you’ve chosen to examine those, too.

You CAN examine the existing run, and looking at it this way really shows you how it's currently running vs. how it has run, but if you're just looking for when it really ran well, then skip the cursor section and jump to the AWR snapshots to see this SQL_ID for ALL snapshots in the AWR-

awrss5

You're seeing all of this sorted from the largest elapsed time to the least, so although we're able to view any patterns in how the SQL may be degrading over time, note if there are plan hash value changes (which in our example there are, and we didn't see that for this statement in our AWR SQL report) and click on the "Next 25" link at the bottom to get to the lesser elapsed run times.

awrss6

There are some runs with 0.00 for the run times.  You can investigate these, but we'd find that they hadn't completed or no rows were returned.  More important are the runs where we see overall run times that are efficient in elapsed time and where rows were returned.  I've marked them in our report.

awrss7

Upon closer inspection, you'll also notice that they all belong to the same plan hash value except for one run.  This is a good indicator that this plan hash value should be identified as the best overall for this query.
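
If you'd rather confirm this straight from SQL, the same history can be summarized by plan hash value from the AWR; a sketch (the SQL_ID is a substitution placeholder):

SELECT plan_hash_value,
       SUM(executions_delta) execs,
       ROUND(SUM(elapsed_time_delta) / NULLIF(SUM(executions_delta), 0) / 1e6, 2) avg_elapsed_sec
  FROM dba_hist_sqlstat
 WHERE sql_id = '&sql_id'
 GROUP BY plan_hash_value
 ORDER BY avg_elapsed_sec;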

Now as we all know, it’s not just about the one query.  The database could impact the run as well.

  • What resource limitations were in play?
  • Were there any concurrency issues?
  • Were there any other factors that affect overall performance?

On the right, you'll notice that the snapshots are links.  You can use these to inspect the overall performance of the database during the snapshot interval in which the execution occurred.

awrss9

Here’s all that high level statistics info at the database level.  You also have the opportunity to run an ADDM report from here and see that “all that’s important” view of your database.

awrss8

The second tab offers you the workload repository report for this database, including the summary that tells you whether there was anything going on in the database at that time that you should be aware of.

You can go back to the Search SQL report and then look at the individual executions listed, too.

awrss12

This will take you to the SQL Details page for whichever one you click on-

awrss10

Just as with any SQL Details, you can see the activity, the plan, plan control, etc.  The example above is for an execution from the day before, so it’s kind of an easy way to get to it vs. going and searching historically through Top Activity.

Baselines

Now, why did we choose the baselines as part of our search?  Baselines, especially when they are automatically implemented, can impact performance at times, and knowing this can quickly rule a baseline in or out as the culprit.

awrss11

We can see which plan hash value is attributed to which baseline and the overall elapsed time, and you can click on the Baseline Name link to see when and if it was implemented, to diagnose whether it's part of your problem.

Hopefully this gives you some more tools for your arsenal when that boss or peer is standing at your desk and demanding an answer to a challenge-  what more can we ask for but answers! 🙂

Posted in ASH and AWR, EM12c Performance, Oracle

April 28th, 2015 by dbakevlar

When sizing the AWR Warehouse, one of my requirements is to have certain reports for a sampling of the databases that will source the AWR Warehouse.  This report provides the right information to create correct sizing requirements vs. any assumptions made with other approaches.  The report is the General AWR Information report and can be found in the $ORACLE_HOME/rdbms/admin directory.  It provides the following information:

  • Time of Report
  • AWR Retention and interval settings, including if non-default setting.
  • Schema percentage of space used in the SYSAUX tablespace.
  • Objects and space consumption in the SYSAUX tablespace, including breakdown of the AWR space usage by type.
  • Automated tasks scheduled.
  • ASH Usage Info
  • Foreground vs. Background processing info.

I've now run into a few projects where questions were raised on why so much space was required.  There is sometimes an issue with the data being retained in the AWR that must be identified or addressed, as it will impact the long-term size demand on the warehouse, ideally before the data is brought over from the source database (target) via the ETL to the AWR Warehouse.  Knowing how to identify this is crucial, but many folks aren't looking at AWR data space usage regularly, so how would you know?  Well, that's what this post is about, and hopefully it will save me time with how much is on my plate these days… 🙂

The General AWR Information Report

To run the AWR Info report, log into the host of the database in question as a user that has rights to run AWR reports.  From the directory where you would like the report housed, run the following:

SQL> @$ORACLE_HOME/rdbms/admin/awrinfo.sql

Once the report is generated, simply open it in a text editor on the host or FTP it to your workstation.

We'll start with what a standard AWR Info report with "normal" space usage looks like and use it as a baseline.  This will help you, as a DBA, understand what is common for AWR space consumption.

Our standard retention and interval are 8 days and 60-minute snapshots, and the space usage for the SYSAUX tablespace looks like this:

*****************************************************
(1a) SYSAUX usage - Schema breakdown (dba_segments)
*****************************************************
| 
| Total SYSAUX size 2,993.9 MB ( 74% of 4,046.0 MB MAX with AUTOEXTEND OFF ) 
| 
| Schema SYS occupies 2,750.9 MB ( 91.9% ) 
| Schema XDB occupies 67.3 MB ( 2.2% ) 
| Schema AUDSYS occupies 65.4 MB ( 2.2% ) 
| Schema MDSYS occupies 61.7 MB ( 2.1% ) 
| Schema ORDDATA occupies 16.1 MB ( 0.5% ) 
| Schema SYSTEM occupies 15.7 MB ( 0.5% ) 
| Schema WMSYS occupies 7.1 MB ( 0.2% ) 
| Schema EXFSYS occupies 3.7 MB ( 0.1% ) 
| Schema CTXSYS occupies 3.7 MB ( 0.1% )

Non Default AWR Settings

If the settings have been changed from the default, the AWR info report will display the settings and let you know they aren’t the default. Space consumption will change vs. what you see in our first example, too.

Warning: Non Default AWR Setting!
——————————————————————————–
Snapshot interval is 30 minutes and Retention is 8 days    (5.2 GB, vs. 3GB for a 60 min. interval.)

OR

Snapshot interval is 60 minutes and Retention is 42 days  (10.5 GB)

With an increase in interval frequency or retention, an increase in space consumption will result, but it won't be 1:1.  There are two features that impact the space required: rollup, which saves space, and partitioning, which requires a bit more, so expect approximately a 70% increase on average when moving the interval to every 30 minutes.

When an increase in retention of AWR data is implemented, you should calculate about 2.5GB of data for each week of AWR data retained at a 1-hour snapshot interval (this depends on the version of the database, too; AWR in 10g is much smaller than in 11g, which is in turn smaller than in 12c).

AWR Components

We also need to look at the consumption used by a standard AWR schema breakdown to understand WHAT components are using the space:

**********************************
(3a) Space usage by AWR components (per database)
**********************************
COMPONENT MB % AWR KB_PER_SNAP MB_PER_DAY MB_PER_WEEK TABLE% : INDEX% 
--------- --------- ------ ------------ ---------- ----------- ---------------- 
FIXED 1,559.1 63.3 7,750 181.6 1,271.5 45% : 55% 
EVENTS 489.9 19.9 2,435 57.1 399.5 40% : 60% 
SQL 238.3 9.7 1,184 27.8 194.3 64% : 36% 
SPACE 111.1 4.5 552 12.9 90.6 63% : 37% 
ASH 35.3 1.4 175 4.1 28.7 83% : 17% 
SQLPLAN 11.0 0.4 55 1.3 9.0 64% : 36% 
SQLTEXT 0.9 0.0 5 0.1 0.8 87% : 13% 
SQLBIND 0.6 0.0 3 0.1 0.5 50% : 50% 
RAC 0.6 0.0 3 0.1 0.5 50% : 50%

Note that fixed objects are at the top of the list, followed by events, SQL, space and then ASH.  This is the greatest-to-least order that should commonly be displayed.

Now let's look at an AWR Info report where the data consumption is experiencing an issue:

********************************** 
(3a) Space usage by AWR components (per database) 
********************************** 
 
COMPONENT MB % AWR KB_PER_SNAP MB_PER_DAY MB_PER_WEEK TABLE% : INDEX% 
--------- --------- ------ ------------ ---------- ----------- ---------------- 
ASH 2,410.3 42.5 1,494 70.0 490.2 89% : 11% 
FIXED 2,149.7 37.9 1,332 62.5 437.2 48% : 52% 
EVENTS 489.7 8.6 304 14.2 99.6 43% : 57% 
SPACE 224.4 4.0 139 6.5 45.6 58% : 42% 
SQL 160.6 2.8 100 4.7 32.7 55% : 45% 
SQLPLAN 82.0 1.4 51 2.4 16.7 67% : 33% 
RAC 58.3 1.0 36 1.7 11.8 70% : 30% 
SQLTEXT 7.3 0.1 5 0.2 1.5 96% : 4% 
SQLBIND 6.0 0.1 4 0.2 1.2 33% : 67%

Note that the ASH data is the first component listed and its size is vastly larger than FIXED, EVENTS, etc.  There are numerous reasons for this to have occurred, so we'll investigate what could have caused the increase in space consumption, as over time the extended retention in the AWR Warehouse will consume more space on the destination side, increasing the requirements for the AWR Warehouse.

Why Is ASH Using So Much?

First, we'll compare the minimum and maximum snap_ids in dba_hist_snapshot with those in the underlying AWR ASH table:

select min(snap_id),MAX(snap_id) from dba_hist_snapshot;

MIN(SNAP_ID) MAX(SNAP_ID)
------------ ------------
       15027        15189

select min(snap_id),MAX(snap_id) from WRH$_ACTIVE_SESSION_HISTORY;

MIN(SNAP_ID) MAX(SNAP_ID)
------------ ------------
           1        15189

As you can see, the AWR contains ASH data from the first snap_id, while dba_hist_snapshot shows that only data from 15027 onward should exist.

We’ll next check for orphaned rows of ASH data in the AWR:

SELECT COUNT(*) FROM wrh$_active_session_history a
  WHERE NOT EXISTS
  (SELECT 1
  FROM wrm$_snapshot
  WHERE snap_id       = a.snap_id
  AND dbid            = a.dbid
  AND instance_number = a.instance_number
  );

If orphaned rows exist, follow the steps from Oracle in Manually Purge the Optimizer Statistics & AWR Snapshots to Reduce Space Usage of SYSAUX Tablespace (Doc ID 1965061.1) to split the partitions and purge the data manually from the AWR on the SOURCE DATABASE (target), addressing the issue before the ETL extracts and loads the data into the AWR Warehouse.
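
For the supported, non-orphaned case, the standard purge call is DBMS_WORKLOAD_REPOSITORY.DROP_SNAPSHOT_RANGE; the MOS note layers the manual partition-split and orphan-row steps on top of this.  A minimal sketch, using the snap_id boundary we found above:

BEGIN
  -- Purge snapshots 1 through 15026; dbid is optional and defaults to the local database
  DBMS_WORKLOAD_REPOSITORY.drop_snapshot_range(
    low_snap_id  => 1,
    high_snap_id => 15026);
END;
/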

ASH Parameters

The next possible reason for extensive ASH data in the AWR is a change to the parameters controlling how ASH data is written to the AWR.  I've only recently heard that some shops are doing this as a way to "audit" the SQL happening in their databases.  I have to admit, I would prefer to see DBAs use auditing features rather than ASH samples to track this, but it is happening, and those shops should expect the following:

1.  Extensive space usage by the AWR

2.  Inaccurate results in ASH and AWR reports, since Oracle expects only 1-in-10 samples to exist in the AWR, and having every sample present will skew the results.

The parameters controlling this feature are underscore parameters and should only be changed under the guidance of Oracle.

_ash_sampling_interval = 100  Lowers the interval at which ASH samples, causing samples to be created more often than the default of 1000.

_ash_sample_all = TRUE  Results in samples of even inactive sessions being created, increasing the amount of ASH data by 10X or more.

_ash_disk_filter_ratio = 1  Results in ASH writing all samples to the AWR instead of 1 in 10.
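
To check whether any of these have been altered in your own environment, hidden parameter values can be queried as SYS; a sketch against the undocumented X$ views (use with care, and only with Oracle's blessing):

SELECT i.ksppinm parameter, v.ksppstvl value
  FROM x$ksppi i, x$ksppcv v
 WHERE i.indx = v.indx
   AND i.ksppinm IN ('_ash_sampling_interval', '_ash_sample_all', '_ash_disk_filter_ratio');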

Once you've addressed any issues in storage of the AWR and loaded all snapshots to your new AWR Warehouse, also remember to "dial down" the retention in the source database to the default of 8 days (or something close) and shrink the SYSAUX tablespace to give the space back to the database, since it no longer needs the space it once consumed.

The AWR Warehouse does require considerable storage for the performance data housed within the Automatic Workload Repository, but with the right process to inspect what is kept in your AWR before building out your environment, you can avoid having to allocate more storage than you really need to.

Posted in ASH and AWR, AWR Warehouse, Oracle

March 31st, 2015 by dbakevlar

Mauro decided to give me another challenge: run SQLd360 against the AWR Warehouse and let him know how it does straight "out of the box".  It's a simpler installation process than SQLTXPLAIN; you simply unzip and run (no installation of any packages).

I chose a SQL_ID from one of the source databases loaded and ran it for a 31 day history:

SQL> @sqld360.sql 8ymbm9h6ndphq T 31

I saw a few errors scroll by on the screen as the process to collect the data and zip it up ran.

There isn't the same log file on successful runs to view parameters, but the SQL is very straightforward and right in the sql directory, so it's easy enough to see where the DBID is populated as part of sqld360_0b_pre.sql:

-- get dbid
COL sqld360_dbid NEW_V sqld360_dbid;
SELECT TRIM(TO_CHAR(dbid)) sqld360_dbid FROM v$database;

As this is pulled from the V$ objects, it's going to populate incorrectly for the AWR Warehouse, which requires a query of the AWR, preferably DBA_HIST_DATABASE_INSTANCE, for this information.

SQL> desc dba_hist_database_instance;
 Name Null? Type
 ----------------------------------------- -------- --------------
 DBID NOT NULL NUMBER
 INSTANCE_NUMBER NOT NULL NUMBER
 STARTUP_TIME NOT NULL TIMESTAMP(3)
 PARALLEL NOT NULL VARCHAR2(3)
 VERSION NOT NULL VARCHAR2(17)
 DB_NAME VARCHAR2(9)
 INSTANCE_NAME VARCHAR2(16)
 HOST_NAME VARCHAR2(64)
 LAST_ASH_SAMPLE_ID NOT NULL NUMBER
 PLATFORM_NAME VARCHAR2(101)
 CON_ID NUMBER

Then the pre run script also queries the plan_table for the SQL_ID:

-- check if SQLD360 is getting called by EDB360
COL from_edb360 NEW_V from_edb360;
SELECT CASE WHEN count(*) > 0 THEN '--' END from_edb360
 FROM plan_table
 WHERE statement_id = 'SQLD360_SQLID' -- SQL IDs list flag
 AND operation = '&&sqld360_sqlid.'
 AND rownum = 1;

We would need it to call DBA_HIST_SQL_PLAN if we wanted this data, but it does have a flag so that it can continue on as a standalone run, so it's not a game stopper.

We then get tripped up again when it looks for the instance number:

-- get instance number
COL connect_instance_number NEW_V connect_instance_number;
SELECT TO_CHAR(instance_number) connect_instance_number FROM v$instance;

Again, with the AWR Warehouse, we need to call upon the DBA_HIST_DATABASE_INSTANCE view after adding in a where clause for the dbname.

SELECT TO_CHAR(instance_number)
from dba_hist_database_instance where db_name=p_dbname;

As with SQLTXPLAIN, we are also out of luck with database_name_short, host_name_short and the RDBMS version, too:

-- get database name (up to 10, stop before first '.', no special characters)
COL database_name_short NEW_V database_name_short FOR A10;
SELECT LOWER(SUBSTR(SYS_CONTEXT('USERENV', 'DB_NAME'), 1, 10)) database_name_short FROM DUAL;
SELECT SUBSTR('&&database_name_short.', 1, INSTR('&&database_name_short..', '.') - 1) database_name_short FROM DUAL;
SELECT TRANSLATE('&&database_name_short.',
'abcdefghijklmnopqrstuvwxyz0123456789-_ ''`~!@#$%&*()=+[]{}\|;:",.<>/?'||CHR(0)||CHR(9)||CHR(10)||CHR(13)||CHR(38),
'abcdefghijklmnopqrstuvwxyz0123456789-_') database_name_short FROM DUAL;
-- get host name (up to 30, stop before first '.', no special characters)
COL host_name_short NEW_V host_name_short FOR A30;
SELECT LOWER(SUBSTR(SYS_CONTEXT('USERENV', 'SERVER_HOST'), 1, 30)) host_name_short FROM DUAL;
SELECT SUBSTR('&&host_name_short.', 1, INSTR('&&host_name_short..', '.') - 1) host_name_short FROM DUAL;
SELECT TRANSLATE('&&host_name_short.',
'abcdefghijklmnopqrstuvwxyz0123456789-_ ''`~!@#$%&*()=+[]{}\|;:",.<>/?'||CHR(0)||CHR(9)||CHR(10)||CHR(13)||CHR(38),
'abcdefghijklmnopqrstuvwxyz0123456789-_') host_name_short FROM DUAL;
-- get rdbms version
COL db_version NEW_V db_version;
SELECT version db_version FROM v$instance;
DEF skip_10g = '';
COL skip_10g NEW_V skip_10g;
SELECT '--' skip_10g FROM v$instance WHERE version LIKE '10%';
DEF skip_11r1 = '';
COL skip_11r1 NEW_V skip_11r1;

You can understand how all of this is going to return information on the AWR Warehouse repository (the actual database it's run on) instead of the source database that the SQL_ID is from.

CPU and RAC Info

We have the same issue when it goes to gather the information on CPU, and RAC-

-- get average number of CPUs
COL avg_cpu_count NEW_V avg_cpu_count FOR A3;
SELECT ROUND(AVG(TO_NUMBER(value))) avg_cpu_count FROM gv$system_parameter2 WHERE name = 'cpu_count';
-- get total number of CPUs
COL sum_cpu_count NEW_V sum_cpu_count FOR A3;
SELECT SUM(TO_NUMBER(value)) sum_cpu_count FROM gv$system_parameter2 WHERE name = 'cpu_count';
-- determine if rac or single instance (null means rac)
COL is_single_instance NEW_V is_single_instance FOR A1;
SELECT CASE COUNT(*) WHEN 1 THEN 'Y' END is_single_instance FROM gv$instance;

The first two queries just need to be updated to use WRI$_DBU_CPU_USAGE.CPU_COUNT and WRI$_DBU_CPU_USAGE.CPU_CORE_COUNT.

And the final one can easily be updated to use DBA_HIST_DATABASE_INSTANCE after passing in the db_name.
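
For instance, the single-instance check could be rewritten against the AWR view; a rough sketch (the &&sqld360_dbname. substitution variable is my own assumption for how the database name would be passed in, it is not part of the current script):

-- determine if rac or single instance from AWR data (null means rac)
COL is_single_instance NEW_V is_single_instance FOR A1;
SELECT CASE COUNT(DISTINCT instance_number) WHEN 1 THEN 'Y' END is_single_instance
  FROM dba_hist_database_instance
 WHERE dbid = &&sqld360_dbid.
   AND db_name = '&&sqld360_dbname.';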

What makes SQLd360 great is that, upon quick inspection, a change to the "pre" script to pass in the DBID and query the DBA_HIST views instead of the V$/GV$ views is all that is really required to make most of SQLd360 AWR Warehouse compliant.

SQLd360 Wins

Without any changes, as-is, the product runs and simply provides some wrong info when it comes to stats data, environment and such.  The actual SQL is viable, and the code disregards the DBID when it queries the AWR objects.  It will fail on the ASH pull, but that is to be expected unless they want to hit DBA_HIST_ACTIVE_SESSION_HISTORY at some point… 🙂

Thanks to Mauro for the great idea to try SQLd360 against the AWR Warehouse and review it for changes to see what would be required.  Hope this helps, guys!

Posted in ASH and AWR, AWR Warehouse, Oracle

March 31st, 2015 by dbakevlar

I finally have a moment to look into my "hacked" SQLT XPREXT/XTRACT runs and see if the changes I made were enough to run properly with the AWR Warehouse.

The answer is yes and no… and maybe “it depends”… 🙂

The Results

Pulling the corresponding data for the appropriate SQL_ID from the correct source database in the AWR Warehouse repository worked, but the environment information still populates incorrectly, as I didn't update anything outside of the get_database_id function in the SQLT$A package and the executables that call it, as documented in my earlier blog post.

My assumption that additional functions would need to be updated was confirmed when I noted the environment values in the main HTML page, the main entry point into the zip file provided as output from any SQLT run:

sqlt8

 

The SQL_ID was run against the SH database, but the report instead shows the information for the AWR Warehouse, displaying incorrectly.  The issue is it's populating the database name, DBID and the short name from the local views (v$instance, v$database, etc.).

The CBO environment and system statistics are also wrong, as SQLT pulls this information from the local database, which is the AWR Warehouse repository, not the source database the data came from:

sqlt10

 

Yes, the statistics and initialization parameters currently have no way to pull from the DBA_HIST tables, which means that we have the wrong data for most of the higher-level info on the main HTML page.

Where It’s Right

Where I made changes, which were minimal vs. what is required, is where it displays correctly.  The cursor sharing, the SQL information, along with the SQL_ID data, are all correct for the statement at hand.

sqlt11

The bind peek info is correct, too-

sqlt12

You can quickly view that the data is pulled from the DBA_HIST_SQL_PLAN table and since I’m pushing the correct SQL_ID and DBID to the query, it responds with the correct information in return.

I have correct results in any section that sources from DBA_HIST_SQL_PLAN and distinct failures when anything queries GV$SQL_PLAN.  This should be expected: the AWR Warehouse is only going to house the SH database's information for this SQL_ID in the DBA_HIST objects.  The GV$ objects aren't going to know anything about the SH source database.

Success Rate

So how successful was I?  About 30%… 🙂

Any objects, including the r_rows objects populated as part of the SQLT runs, were successful if they sourced from the Automatic Workload Repository (the WRH$, WRI$ and WRM$ objects, which feed the DBA_HIST* views) and failed when they sourced from a GV$ or V$ object.  I also had to ensure that the appropriate DBID was passed, along with the SQL_ID.  Without this added element, failures also occurred.

Recommendations for Change

Additional Changes that need to be made to complete the transition to AWR Warehouse compliance would include:

At the time of installation, add an additional option for not just the Tuning Pack, but also "A" for AWR Warehouse.  If this option is chosen, a second version of the SQLT$A package is then deployed, along with the executables for the XPREXT and XTRACT scripts.

SQLT$A Package would require updates for the following functions:

  1. get_database_id
  2. get_database_name
  3. get_sid
  4. get_instance_number
  5. get_instance_name
  6. get_host_name
  7. get_host_name_short

The XPREXT and XTRACT scripts would then require an additional parameter: dbname.

So instead of:

>START /u01/home/sqlt/run/sqltxprext.sql aycb49d3343xq <sqlt password>

it would be:

>START /u01/home/oraha/kellyn/sqlt/run/sqltxprext.sql aycb49d3343xq  <dbname> <sqltpassword>

Carlos and Mauro were wonderful and introduced me to Abel, so I'll be passing this along to him.  Cross your fingers and let's see if we can get this valuable tool working with the AWR Warehouse… and soon! 🙂

Posted in ASH and AWR, AWR Warehouse, Oracle

March 18th, 2015 by dbakevlar

Yeah, so I did it-  I installed SQLTXPLAIN, (SQLT) on the AWR Warehouse!  From previous experience with this fantastic tool, I was pretty sure I’d have some great new adventures with the AWR Warehouse data and I wanted to try and test out the AWR pieces of this tool with the Warehouse. This is a work in progress, so I’m going to do this blog in multiple parts to ensure we don’t miss anything.

Installation

I didn’t expect SQLT AWR features to work out of the box.  I still needed to install it, run a few reports and see what would trip it up from using the AWR Warehouse repository data.

The installation was pretty much standard: no differences from the requirements on any other database, including a default tablespace, connection information for the AWR Warehouse repository and other pertinent data.  The one thing I did do, and you should do to work with the AWR Warehouse, is use the "T" value for having the Tuning and Diagnostics Packs, so you can use their features with the AWR Warehouse.

  1. Licensed Oracle Pack. (T, D or N)

You can specify T for Oracle Tuning, D for Oracle Diagnostic or N for none. If T or D is selected, SQLT may include licensed content within the diagnostics files it produces. The default is T. If N is selected, SQLT installs with limited functionality.

The features that are currently supported with the AWR Warehouse from the command line are the following, per the Oracle Support document (Doc ID 1614107.1):

SQLT may use the Oracle Diagnostic and/or the Oracle Tuning Packs if your site has a license for them. These two provide enhanced functionality to the SQLT tool. During SQLT installation you can specify if one of these two packages is licensed by your site. If none, SQLT still provides some basic information that can be used for initial SQL diagnostics.

sqlt1

With the installation complete, now the fun starts, or that’s what I thought!

The DBID Conundrum

For my AWR Warehouse environment (my tiny one that suffices for my testing currently… :)), I can gather information on the databases whose performance data I'm consolidating and identify the "source database" of interest.  I want to see if I can run AWR extracts and other processes from the warehouse, or if they will fail.  My initial query to look at what exists in my AWR Warehouse is going to take me to the DBSNMP.CAW_DBID_MAPPING table:

SQL> select new_dbid, target_name from dbsnmp.caw_dbid_mapping;

 

  NEW_DBID
----------
TARGET_NAME
-----------------------------------------------------------------
3017376167
cawr
2710603395  <-- This is the one we want to work with!
SH
2973036624
repo

Which we can match up to the distinct DBIDs in one of the AWR objects that also contain SQL_IDs and PLAN_HASH_VALUE data:

SQL> select distinct DBID from <insert AWR Object Here>; 
      DBID
----------
2710603395
2973036624
3017376167

As this is a 12.1.0.2 database (expected with an AWR Warehouse repository; we recommend 12.1.0.2), I've granted the inherit privileges that are required for DB12c execution of SQLT by SYS, but I'm still having issues even running the AWR-centric SQLT features.  I'm curious why, but I have some ideas where the problems might lie.

If you run a SQL_ID-specific report (or any AWR report), it will display which instance the AWR report will be run against:

sqlt2

 

As we can see, we have multiple DBIDs, Instance name and host information, (although I’ve hidden that… :))

SQLT And The Quest For the DBID Value

SQLT, I first assumed, must do something very similar, bypassing the ability to utilize the other DBIDs and coming back with an error stating that the SQL_ID doesn't exist when you attempt to run reports against it:

SQL> START /u01/home/kellyn/sqlt/run/sqltxprext.sql 8ymbm9h6ndphq  <SQT Password>

sqlt3

To verify the issue, I ran the XPREXT with a SQL_ID belonging to the repository DBID, which would show up in the local AWR:

SQL> START /u01/home/kellyn/sqlt/run/sqltxprext.sql aycb49d3343xq <SQLT Password>

From the successful SQL_ID run, I was able to view in the sqlxtract.log the step that sets the DBID:

sqlt4

We can see in the above log that DEFINE 2 passes in the DBID.  For me to take advantage of SQLT, I need to find a way around the code that is setting the DBID and other instance-level information.  The objects exist to support an AWR Warehouse design; just as with a local AWR, the DBID is then populated into the SQLT objects to produce reports:

sqlt5

Now if I can update the code that populates these tables to produce the reports, then SQLT becomes AWR Warehouse compliant.

Making SQLT AWR Warehouse Compliant

The code that needs to be updated is all coming from one location, and due to the "clean" coding practices of Carlos, Mauro and others, this is quite simple to perform.

We'll be working with the package body, sqcpkga.pkb, and the package spec, sqcpkga.pks.  We'll identify the following areas that the AWR Warehouse will be dependent upon.

You'll see the calls that are populated with the values:

  FUNCTION awr_report_html (
   p_dbid        IN NUMBER,
   p_inst_num    IN NUMBER,
   p_bid         IN NUMBER,
   p_eid         IN NUMBER,
   p_rpt_options IN NUMBER DEFAULT 0 )

These values are then part of the package body:

  /* -------------------------
   *
   * public get_database_id
   *
   * ------------------------- */
  FUNCTION get_database_id
  RETURN NUMBER
  IS
    l_dbid v$database.dbid%TYPE;
  BEGIN
    EXECUTE IMMEDIATE 'SELECT dbid FROM v$database'||s_db_link INTO l_dbid;
    RETURN l_dbid;
  END get_database_id; 
  /*************************************************************************************/

Now this isn't going to do me any good "as is" with the AWR Warehouse, where we have multiple DBIDs and instance IDs; we need to pass the value in properly.

We'll go through the changes step by step.  The code is well written but involved in what it produces, so we'll take each piece into consideration before updating it and making it AWR Warehouse compliant.

I foresee this as part of the installation someday, (this is for you, SQLT guys)-  if the installer states, as we demonstrated earlier, that the site has the tuning pack, then the install will know you are licensed to use the AWR and can switch from the existing code to the version I propose below to make it AWR Warehouse compliant.  The AWR Warehouse, as we've discussed, retains the DBID and instance allocation for all databases added to the AWR Warehouse repository, so we just need to make sure we use it if our licensing allows.

I focused on the dba_hist_database_instance object, as it contains about 95% of the pertinent data that SQLT was getting from the v$database, v$instance, gv$** objects and so on.

SQL> desc dba_hist_database_instance;
 Name                                      Null?    Type
 ----------------------------------------- -------- --------------
 DBID                                      NOT NULL NUMBER
 INSTANCE_NUMBER                           NOT NULL NUMBER
 STARTUP_TIME                              NOT NULL TIMESTAMP(3)
 PARALLEL                                  NOT NULL VARCHAR2(3)
 VERSION                                   NOT NULL VARCHAR2(17)
 DB_NAME                                            VARCHAR2(9)
 INSTANCE_NAME                                      VARCHAR2(16)
 HOST_NAME                                          VARCHAR2(64)
 LAST_ASH_SAMPLE_ID                        NOT NULL NUMBER
 PLATFORM_NAME                                      VARCHAR2(101)
 CON_ID                                             NUMBER
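
As a quick, hedged illustration of why this view is so handy here, (SH is the source database name we saw in the mapping table above) a single query returns the DBID, instance and host details that SQLT otherwise collects from v$database and v$instance:

select distinct dbid, instance_number, instance_name, host_name, version
from   dba_hist_database_instance
where  db_name = 'SH';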

Armed with this information, we can then make the first of many necessary changes-

  /* -------------------------
   *
   * public get_database_id
   *
   * Enhancement by KPGorman
   * AWRW compliant, now passes in DBNAME
   * 03/09/2015
   * ------------------------- */
  FUNCTION get_database_id (p_dbname IN VARCHAR2)
  RETURN NUMBER
  IS
    l_dbid dba_hist_database_instance.dbid%TYPE;
  BEGIN
    -- resolve the DBID from the AWR history instead of v$database,
    -- binding the source database name passed in by the caller
    EXECUTE IMMEDIATE 'SELECT DISTINCT dbid FROM dba_hist_database_instance'||s_db_link||
                      ' WHERE db_name = :1'
      INTO l_dbid USING p_dbname;
    RETURN l_dbid;
  END get_database_id;

  /*************************************************************************************/

There were seven changes required to this code to get started, (a hedged sketch of one of them follows the list)-

  1. get_database_id
  2. get_database_name
  3. get_sid
  4. get_instance_number
  5. get_instance_name
  6. get_host_name
  7. get_host_name_short
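
As a hedged sketch, (this is my own illustration, not the shipped SQLT code- the p_dbid/p_inst_num parameters and the reuse of the package's s_db_link variable are my assumptions) one of the remaining functions, get_instance_name, could be made AWR Warehouse aware the same way:

  FUNCTION get_instance_name (p_dbid IN NUMBER, p_inst_num IN NUMBER)
  RETURN VARCHAR2
  IS
    l_instance_name dba_hist_database_instance.instance_name%TYPE;
  BEGIN
    -- MAX() collapses the one-row-per-startup history into a single value
    EXECUTE IMMEDIATE 'SELECT MAX(instance_name) FROM dba_hist_database_instance'||s_db_link||
                      ' WHERE dbid = :1 AND instance_number = :2'
      INTO l_instance_name USING p_dbid, p_inst_num;
    RETURN l_instance_name;
  END get_instance_name;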

Now there are more areas that need attention, like the code that populates the database version, the OS platform, database properties, etc.  These are currently pulled from the instance level rather than from the AWR tables.

Luckily for me, this is all set in one place and not scattered throughout the code, (great development work is wonderful to see!)  There is some other code using the gv$ views to ensure global data is captured for RAC environments, too.   Again, I have to stress how well this is written and how easy it makes my job for this Proof of Concept, (POC).

After making the required changes, I recompile the one package specification and package body, along with the sqltxprext.sql in the SQLT/run directory that is involved.  The objects inside the database that I'm working with are in the SQLTXADMIN schema-  not the SQLTXPLAIN schema.  Keep in mind, this is just for the POC; for a proper installation, I would expect the SQLTXPLAIN installer to check whether the tuning pack is licensed and, with that verification, switch to the corrected SQLT$A package and executables so that the AWR data is used instead of the v$ objects.

SQL> alter package SQLTXADMIN.SQLT$A compile body;
Package body altered.

SQL> alter package SQLTXADMIN.SQLT$A compile;
Package altered.

Moment of Truth

Now that I've made the changes and everything has compiled successfully, it's time to test it out with a SQL_ID from one of my source databases.  One thing to keep in mind: I left for the OUGN Norway conference between the start of this POC and the end, so I had to pick up where I left off.  It took me some time to verify that I'd left off in the right spot and I had to make a few more changes for the host data, etc., but we're ready to run this now!

sqlt6

I now pass in the DBID and was initially worried about a few of the data results on the screen, but after viewing the sqlxtract.log, I was less worried.  Most of the values that need to be passed to SQLT for proper handling are correct, but it appears I have a couple more changes to implement before I'm finished making EVERYTHING compliant.  The parameters for the call look very good upon first inspection though:

sqlt7

The DEFINE 2 now shows the correct DBID, as does the DEFINE_CONNECT_IDENTIFIER.  The zip file is created as part of the XPREXT output, and I just need to dig into it to see if there is anything more I missed and need to change, rather than making assumptions from the sqltxtract.log, which is a bit high level and jumbled with execution errors from the demands I'm making on this command line tool.

Thanks for staying with me on this one and stay tuned for the output to see how successful I was! 🙂

 

Posted in ASH and AWR, AWR Warehouse, Oracle

February 9th, 2015 by dbakevlar

Active Session History, (ASH) reports are one of my favorites when investigating a performance issue.  The biggest challenge facing many administrators, I think, is knowing when to turn to an ASH report and how to use it to diagnose an issue.

Today's example is brought to us thanks to a workload replay and should make for a simple, but clear, step-by-step example of when and why to use an ASH report.

Top Activity

In our EM12c environment, we note some high activity in a database and use Top Activity to view what’s going on:

0209_01

As a DBA, I'm not a fan of pink, red, brown or, worse, black in my Top Activity.  I'm displaying the information from the Top SQL and Top Sessions from just before 3pm, and we can see that "Other" is the wait event shown to be most impacting in our Top SQL.

SQL Details

We can then double click on the highlighted Top SQL session to dig down a bit more into that unique SQL_ID.

0209_02

Noting that this is an insert statement, I can then scan lower to see more detailed information about the SQL statement, waits and sessions involved:

0209_03

The top wait is buffer exterminate.  This wait event might not be familiar to everyone, so we'll work through the information provided by the ASH report to see the type of impact this wait event has on the database.

Report Options

You’ll notice that the SQL Details page offers you two reports that you can run-  AWR SQL Report and the ASH Report.  These reports can both be found in the $ORACLE_HOME/rdbms/admin directory.

AWR SQL_ID specific report is named awrsqrpt.sql and the ASH report is named ashrpt.sql if you’re interested in running them from the command line via SQLPlus instead of using EM12c.
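
If you'd rather kick these off yourself from SQL*Plus, both scripts prompt interactively for their inputs, (report type, time window or snapshot range, and report name) so nothing is passed on the command line:

SQL> @?/rdbms/admin/ashrpt.sql      -- ASH report
SQL> @?/rdbms/admin/awrsqrpt.sql    -- AWR SQL_ID specific report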

We are going to choose to use the ASH report instead of the SQL_ID specific AWR report OR a full AWR report for a few reasons:

  1. We are only interested in this specific SQL_ID, so the full AWR report would be overkill.
  2. The specific AWR SQL_ID report wouldn’t include the session, blocked session and other wait info we are interested in.
  3. We aren’t interested in execution plans, which the SQL_ID specific AWR Report includes.
  4. The ASH report allows us to use samples and drill down by time vs. snapshots, which comes in handy when we are inspecting a particular set of transactions.  We aren't concerned with the number of executions, which is offered in AWR but not in the samples from ASH.

Running an ASH Report

Running an ASH report after clicking on the button is quite easy.  Not only is the time pre-filled-

0209_04

we also have the SQL_ID filter to give us more accurate information about this one SQL_ID.  You can change the window for the beginning and end time if you'd like to extend it, but know that ASH is much more effective over smaller windows of time.

Once satisfied with the criteria for the report, click on Run Report and the HTML version of an ASH Report will display in short order.

The Actual Report via Cloud Control

The first thing to remember is that the report is generated by database node if you are in a RAC environment.  The left hand Target Navigation will show you what target node the report has been sourced from:

0209_05

 

The report starts out with basic information about the environment the report was run in:

0209_06

 

The first section we jump to, which we almost always jump to, is the top wait events:

0209_07

I find it difficult to spend much time on anything that comes up under 10%, (at least it's a pretty good rule of thumb to follow…) but as you can see, over 86% of the event waits are on buffer exterminate.

If we go down to the second section we go to most often, the top SQL, we’ll see more verification of this issue:

0209_09

 

For SQL_ID  67sjxrgv720p4, we can see that over 86% is spent on the wait event buffer exterminate.

0209_11

 

We can also view the sessions involved with this SQL_ID, and if there were any blocked sessions, they would be listed in their own section.  No blocked sessions here, so we can go on to more important research.

The objects most impacted by the waits for this statement are also displayed, and if this had been part of a stored procedure or other internally stored code, that would have been shown, again, in its own section.

0209_12

 

At the bottom of the report, we see what top waits happened in succession during the time we chose to examine the ASH samples:

0209_13

 

As you can see, the samples are about one minute each across an approximately five minute examination of the ASH sample time.  We can see at 3pm, (just as was displayed in Top Activity) the impact of the buffer exterminate wait, then the waits for CPU and cluster GC waits after the "hit".

Buffer Exterminate Waits

So what is buffer exterminate?  A lot of folks like to refer to this as "SGA thrashing"-  sounds lovely, doesn't it?  The wait event happens when a database uses Automatic Memory Management, (AMM) and the MMON background process decides to shrink the buffer cache and reallocate the memory to other areas of the SGA after advice from the memory advisors.  A lot of DBAs haven't been sold on AMM yet and I have to admit, I'm one of them.  I only use it in small, non-critical databases and often avoid setting it for any OLTP environment.  One thing to keep in mind though: I'm more often in the EM12c world these days vs. working with memory allocation issues in DB12c, so I don't have the experience with DB12c and multi-tenant to give an educated review of AMM on the new version… 🙂

The above ASH Report should be enough to let you know that this database is impacted by current AMM settings during resize, but if you need to provide more information, you can do this in two ways:

  1. Review the Memory Advisors in EM12c.
  2. Query the V$SGA_RESIZE_OPS and V$MEMORY_RESIZE_OPS directly to see how often the resize is happening.

Here’s a simple query to start:

SELECT parameter, initial_size, target_size, start_time
  FROM v$memory_resize_ops
 WHERE initial_size >= 0
   AND final_size >= 0
 ORDER BY parameter, start_time;

Another choice is to set DB_CACHE_ADVICE to off and then set the db_cache_size directly.
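
If you go that route, a minimal sketch of what it looks like, (the 2G value is purely illustrative-  size the cache floor for your own workload, and note that with AMM in place an explicit db_cache_size acts as a minimum):

SQL> alter system set db_cache_advice = OFF;
SQL> alter system set db_cache_size = 2G scope=both;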

And that concludes an example of how and why you would use an ASH report to investigate an issue in Oracle.  Have a great day and as always, do everything you can to be bulletproof!

Posted in ASH and AWR, EM12c Performance

January 30th, 2015 by dbakevlar

I’ve had a recent request to write an update on working with AWR reports, so as promised, here it is!

The Automatic Workload Repository

The Automatic Workload Repository, (AWR) was one of the best enhancements to Oracle back in release 10g.  There was quite a goal put in front of the development group when they were asked to develop a product that:

1.  Provided significant performance recommendation and wait event data enhancements over its predecessor statspack.

2.  Was always on, meaning that the data would continually collect without manual intervention from the database administrator.

3.  Wouldn’t impact the current processing, having its own background processes and memory buffer, designated tablespace, (SYSAUX).

4.  The memory buffer would be written to in the opposite direction from the direction users read it, eliminating concurrency issues.

Along with many other requirements, all of the above was offered with the Automatic Workload Repository and we end up with architecture that looks something like this:

pt5

 

Using AWR Data

The AWR data is identified by the DBID, (database identifier) and a SNAP_ID, (snapshot identifier, which has a begin_interval_time and end_interval_time to isolate the date and time of the data collection).  Information about what is currently retained in the database can be queried from DBA_HIST_SNAPSHOT.  AWR data also contains ASH, (Active Session History) samples along with the snapshot data-  by default, about 1 out of every 10 samples.
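
A quick, hedged example of checking what is retained, using only standard columns of that view:

select snap_id, dbid, instance_number, begin_interval_time, end_interval_time
from   dba_hist_snapshot
order  by snap_id;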

The goal to using AWR data effectively really has to do with the following:

1.  Have you identified a true performance issue as part of a performance review?

2.  Has there been a user complaint or a request to investigate a performance degradation?

3.  Is there a business challenge or question that needs to be answered that AWR can offer an answer to?  (we'll go over when to use AWR vs. other features…)

Performance Review

A performance review is where you have either identified a problem or have been assigned to investigate the environment for performance issues to solve.  I have a couple Enterprise Manager environments available to me, but I chose to go out to one in particular and cross my fingers hoping I would have some heavy processing to fit the requirements of this post.

The quickest way to see workload in your database environment from EM12c is to click on Targets –> Databases.  Choose to view by load map and you will then view databases by workload.  Upon going to a specific Enterprise Manager environment, I found out it was my lucky day!

pt1

I really don't know who Kurt is that has a database monitored on this EM12c cloud control environment, but boy, is he my favorite person today! 🙂

Hovering my cursor over the database name, (kurt) you can view the workload he has running on his test database currently:

pt2

 

Boy, is Kurt my favorite person today!

EM12c Database Home Page

Logging into the database, I can see the significant IO and resource usage for the database and host from the database home page:

pt3

 

If we move to Top Activity, (Performance menu, Top Activity) we begin to view more details about the processing and the different wait events:

pt4

 

Kurt is doing all KINDS of inserts, (seen by the different SQL_IDs, by SQL Type "INSERT").  I can drill down into the individual statements and investigate this, but really, there are a TON of statements and SQL_IDs here-  wouldn't it just be easier to view the workload with an AWR report?

Running the AWR Report

I choose to click on Performance, AWR, AWR Report.  Now I have a choice.  I could request a new snapshot to be performed immediately or I could wait till the top of the hour, as the interval is set hourly in this database.  I chose the latter for this demonstration, but if you want to create a snapshot immediately, you can do so easily from EM12c or request a snapshot by executing the following from SQL*Plus as a user with execute privileges on the DBMS_WORKLOAD_REPOSITORY package:

BEGIN
DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
END;
/
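
If you aren't sure what the snapshot interval and retention are currently set to before deciding whether to wait, a quick check against the standard settings view:

select dbid, snap_interval, retention from dba_hist_wr_control;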

For this example, I simply waited, as there was no hurry or concern here and requested the report for the previous hour and latest snapshot:

pt6

 

I always start at the Top Ten Foreground Events and commonly look at those with high wait percentages:

pt7

 

Direct Path Write, that’s it.  Nothing else to see here… 🙂

Direct path write involves the following:  inserts/updates, objects being written to, tablespaces being written to and those datafiles that make up the tablespace(s).

It’s also IO, which we quickly verify down in the Foreground Wait Class:

pt8

 

Looking at the Top SQL by Elapsed Time confirms that we are dealing with a workload consisting of all inserts:

pt9

 

Clicking on the SQL ID takes me to the Complete List of SQL Text and shows me just what Bad Boy Kurt is doing to produce his testing workload:

pt10

Wow, that Kurt is quite the rebel, eh? 🙂

Insert in a loop into one table from the same table, rollback and then end the loop, thanks for playing. He's kicking some tires and doing it with angst!  Don't worry people, like I said, Kurt is doing his job, using a module called "Load Generator".  I'd be a fool not to recognize this for what it is-  generating workload to test something.  I just get the added benefit of having a workload to do a blog post on using AWR data… 🙂
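
A minimal sketch of that kind of load generator, (my own reconstruction from the statement shown, not Kurt's actual code-  the loop count is arbitrary and large_block149 is simply one of the tables that shows up in the report below):

BEGIN
  FOR i IN 1 .. 10 LOOP
    -- read the table and insert its own rows back into it...
    INSERT INTO large_block149 SELECT * FROM large_block149;
    -- ...then throw the work away, leaving only the IO behind
    ROLLBACK;
  END LOOP;
END;
/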

Now, if this were a real issue and I was trying to find out what type of performance impact this kind of insert was creating on the environment, where would I go next in the AWR report?  The top SQL by elapsed time is important, as it should be where you focus your efforts.  The other sections broken down by SQL are nice to have, but always remember, "If you aren't tuning for time, you're wasting time."  Nothing can come of an optimization exercise if no time savings is seen after you've completed the work.  So by first taking the Top SQL by Elapsed Time, then looking at the statement, we can now see what objects are part of the statement, (large_block149, 191, 194, 145).

We also know that the problem is IO, so we should jump down from the SQL detailed information and go to the object level information.  These sections are identified by Segments by xxx.  

  • Segments by Logical Reads
  • Segments by Physical Reads
  • Segments by Read Requests
  • Segments by Table Scans

so on and so forth….

These all show a very similar pattern and percentage for the objects we see in our top SQL. Remember, Kurt was reading each of these tables, then inserting those same rows back into the table again, then rolling back.  As this is a workload scenario, unlike most performance issues I see, there is no outstanding object showing with an over 10% impact in any area.

pt11

As this is an Exadata, there is a ton of information to help you understand offloading, (smart scans) flash cache, etc. and to relay what you need to know to make sure you are achieving the performance you desire with an engineered system.  I'd like to save that for another post and just touch on a few of the IO reports here.  Since we were performing table scans, we do want to make sure those were being offloaded to the cell nodes, (smart scans) vs. being performed on a database node.

We can start by looking at Top Database IO Throughput:

pt14

And then view the Top Database Requests Per Cell Throughput, (sans the Cell node names) to see how they compare:

pt12

 

Then we add to it this last graph, Top IO Reasons by Request, which tells us whether we are offloading enough of the workload to each of the cell nodes:

pt15

 

This has been one scenario trip through an AWR Report.  Hopefully it’s been interesting and next time I’ll search for a different load type to take you through!

Happy Friday!

Posted in ASH and AWR

January 16th, 2015 by dbakevlar

I've had a few folks ask me similar questions about the AWR Warehouse, coming up numerous times this week.

“How can I limit what the user of the AWR Warehouse can view in the AWR Warehouse?”

“How can I add source databases to the AWR Warehouse without DBA privileges?”

This topic bridges into an area of confusion between use of the EM12c environment, (which I consistently promote for use by DBAs, developers and all of IT) and the AWR Warehouse, which shares its user interface as part of Cloud Control but currently is a DBA's deep analysis and research tool.

The request to limit privileges to add source databases and to limit view access to targets-  but also to bind values, full SQL statements and advanced performance data-  impacts the purpose of the AWR Warehouse.  I fully understand the security requirements for AWR Warehouse access as stated in Oracle's support documentation:

“You can add and remove source databases provided you have access to the database target and to database credentials with execute privileges on the sys.dbms_swrf_internal package and the DBA role.”

https://docs.oracle.com/database/121/TDPPT/tdppt_awr_warehouse.htm#CACEEAHD

Why is this important?

The first reason would be space considerations.  AWR data with long term retention could add up to considerable disk space if just any database were added without careful planning to ensure the right database information is retained for the right databases.  The power of the Automatic Workload Repository, (AWR) is that it's always on, but always on means it's always growing, and this is best left to a DBA resource to ensure that the allocation of space is used wisely.
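
A quick, hedged way to keep an eye on that growth is to check how much of SYSAUX the AWR occupant is consuming, (SM/AWR is the standard occupant name, but verify against your own version's V$SYSAUX_OCCUPANTS output):

select occupant_name, space_usage_kbytes
from   v$sysaux_occupants
where  occupant_name like 'SM/AWR%';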

Second, when discussing limiting the view of data in the AWR Warehouse-  when investigating an issue with any SQL execution, there are many factors to take into consideration.  One of the most significant and commonly important pieces of information for answering why a performance issue has occurred requires me to look into differences between the amount of data actually resulting from the where clause and objects vs. the data provided to the optimizer.

If we take the following, simple where clause into consideration:

where a.col1=b.col2
and a.col2=:b1
and b.col7=:b2;

When the optimizer uses the data provided to it from statistics, histograms and any dynamic sampling, there are a number of choices that can be made from the information provided.

  1. Hash join on table a and b to address the join or perform nested loop if…
  2. Column a.col2 is unique, making the join quite small or…
  3. Adding b.col7 to a.col2 to the join will make it so unique that a nested loop is sufficient.

Now, what if the optimizer decided to perform a nested loop when 16 million+ rows were returned?

To the untrained eye, some may assume that the optimizer had made a poor choice or that there was a bug and would walk away.  More often, if you have the data provided by the values passed to the bind variables, along with the data provided to the optimizer, assumptions would fall away and a very different story would present itself.

This is why Oracle requires DBA privileges to add a source database to the AWR Warehouse and to work with the data provided as part of the AWR Warehouse.  This feature provides an abundance of data that is most useful to the professional who knows how to work with the Automatic Workload Repository.  The professional with the access required to perform this type of analysis and research should be the database administrator, so the requirement for the AWR Warehouse now makes perfect sense.

Now to return to the query, adding in the values for the bind variables, a new picture develops to research:

where a.col1=b.col2
and a.col2=6002
and b.col7='YES';

We now can verify the statistics data behind the values for both a.col2 and b.col7 and accurately diagnose where the optimizer may have been mislead due to incorrect data provided to the Cost Based Optimizer.
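
As a hedged sketch of that verification, (the owner and table names here are hypothetical stand-ins for the tables aliased a and b above):

select table_name, column_name, num_distinct, num_nulls, density, histogram
from   dba_tab_col_statistics
where  owner = 'APP_OWNER'
and    table_name in ('A','B')
and    column_name in ('COL2','COL7');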

This may be a simple explanation of why I believe the DBA privilege policy was chosen for source database additions and view options in the AWR Warehouse, but hopefully it sheds a bit of light onto the topic.

 

Posted in ASH and AWR, Enterprise Manager

July 31st, 2014 by dbakevlar

When concurrency is the crippling factor in a database performance issue, I'm often told that viewing blocked sessions in Enterprise Manager is difficult.  The query behind the view, along with the Flash image generation in Enterprise Manager, can take considerable time to render, and no matter how valuable the view is, the wait is something DBAs just can't hold out for when they need the answer now.

Blocking Sessions View in OEM

If you’re wondering which feature I’m speaking of, once you log into any database, click on Performance, Blocking Sessions.

blocked_sessions

If there aren't any blocking sessions or any significant load on the database, it can return quite quickly.  If there is significant load and there are blocking sessions, well, you could be waiting quite some time….

Behind the Scenes

The query that is run behind the scenes will be executed as DBSNMP, (or whatever user you have configured on the target to communicate with the OEM) against the database in question and will look like the following:

select sid, username, serial#, process, nvl(sql_id,0), sql_address, blocking_session,
       wait_class, event, p1, p2, p3, seconds_in_wait
  from v$session
 where blocking_session_status = 'VALID' OR sid IN
       (select blocking_session
          from v$session
         where blocking_session_status = 'VALID')

So what do you do when you need blocking information quickly and can’t wait for the Enterprise Manager Blocking Sessions screen?  Use ASH Analytics to view blocking session information!

ASH Analytics View Options

Start out by telling me you have installed ASH Analytics in your databases, right?  If not, please do this-  it's well worth the short time it takes to install the support package, via an EM job, for this valuable feature.

Next, once it's installed, or if you've already installed it, then for any database target home page in EM12c, click on Performance, ASH Analytics.

ash_Access

The default timeline will come up for ASH Analytics.  If the blocking is occurring now, no change to the time window will be required and you’ll simply scroll down to the middle wait events graph.

ash_analytics1

Notice that no filters or session data is present on the current graph and it’s focused on the standard Wait Class data.  This can be updated to view blocking sessions and offer very clear info on the sessions and waits involved by doing the following quick changes:

Switch to

  • Load Map from Top Activity
  • Advanced Mode
  • The following dimensions of data to display:
             – Blocking Session
             – User Session

You will see the following data displayed instantly on the screen, without the wait.

blocked_sessions3

You will see the blocking sessions and, below them, the sessions blocked by each.  If there is more than one session blocked, they will show as a second, third, fourth box, etc. under the blocking session ID.

Advanced Dimensions for Blocking Sessions

If you want to build out and see what wait events are involved on the blocking session, this can be done as well.  Just move the Dimensions bar below the load map from two dimensions to three.  Then add another dimension to the load map.

ash_analytics_3

I now can see that I have a concurrency issue on one of the blocking sessions, (calling same objects) and the second blocking session is waiting on a commit.

The additional advantage of using this method to view blocking session data is that it's not just "current blocking data" that is available, as is the case when you use the Blocking Sessions view in OEM.

bar1

Using ASH Analytics allows you the added option to move the upper bar to display time in the past or move it to view newer data just refreshed.

If there is specific data that you are searching for, (username, SQL_ID, etc.)  change the dimensions to display what you are interested in isolating.  ASH Analytics supports a wide variety of data to answer questions about blocking sessions along with all other types of ASH data collected!

Posted in ASH and AWR, Enterprise Manager

July 24th, 2014 by dbakevlar

So the AWR Warehouse patches are out, but documentation has not officially caught up to them yet, so we appreciate your patience.  I thought I would post about what I went over in my webinar last week, when I had the wonderful opportunity to speak to the ODTUG members on this feature that everyone is so anxious to get their hands on.

Let’s start with some of the top questions:

1. Can I just load the AWR data into the OMR, (Enterprise Manager Repository) database?

A.  No, it is highly recommended that you do not do this-  use a separate database and server to house the AWR Warehouse.

2. What licensing requirements are there?

A.  The AWR Warehouse, (AWRW, my new acronym and hashtag, #AWRW) requires the Diagnostic Pack license and with this license, a limited use EE license is included for the AWR Warehouse.  This is subject to licensing changes in the future, but at this time, this is a great opportunity considering all the great features that will be included in the AWR Warehouse to house, partition, access and report on via EM12c, etc.

3.  Can I query the AWR Warehouse directly?

A.  It is another Oracle database, so yes, of course you can!

Behind the Scenes

The AWR Warehouse’s data is sourced from target databases in EM12c, providing the ability to retain AWR data indefinitely or for any period of time you choose.  For retail companies or those that do heavy once per year processing, this is gold.  The idea that you can do comparisons on performance a year ago vs. today is incredibly valuable.

This data is loaded via an ETL process using an agent-to-agent direct push of the data, initiated by Enterprise Manager.  The actual export on the source database and import on the AWR Warehouse are performed by a DBMS_SCHEDULER job local to each of those servers.

awrw_etl

The actual interval on the source database and AWR Warehouse depends on whether you've just added the database to the AWR Warehouse, (a back load of data requires "catch up") or whether the AWRW ETL load has been disabled for a period of time.  There is a built in "throttle" to ensure that no more than 500 snapshots are loaded at any given time, and intervals are chosen that cause very little to no network traffic in the environment.  During a catch up that required a full 500 snapshots to load on a VM test environment, I was thrilled to see it took a total maximum execution time of less than 12 minutes and 2GB of data.  The network latency was nominal, too.

For the next sections, you will notice the naming convention of "CAW" at the beginning or in the middle of job and object names.  CAW stands for Consolidated AWR Warehouse, and you can use %CAW% as a filter to locate anything AWRW related via queries, including on the source databases, (targets).
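
If you want to see them for yourself, a quick, hedged way to hunt them down on either a source target or the warehouse:

select owner, object_name, object_type
from   dba_objects
where  object_name like '%CAW%';

select owner, job_name, state
from   dba_scheduler_jobs
where  job_name like '%CAW%';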

Source ETL Job

This is the job on the source databases, (targets) that uses Data Pump to export the AWR data for the given snapshot(s) to an OS directory location, where it is then "pushed" agent to agent onto an AWR Warehouse OS directory location.

DBMS Scheduler Job Name: MGMT_CAW_EXTRACT

Exec Call: begin dbsnmp.mgmt_caw_extract.run_extract; end;

How Often: 3 Hour Intervals if “playing catch up”, otherwise, 24 hr interval.

AWR Warehouse Job

This is the job that loads the data from source targets, (databases) to the AWR Warehouse.

DBMS Scheduler Job Name: MGMT_CAW_LOAD

Exec Call: begin dbsnmp.mgmt_caw_load.run_master; end;

How Often: 5 Minute Intervals

Biggest Resource Demand from the “run_master”:

begin dbms_swrf_internal.move_to_awr(schname => :1); end;

EM Job Service

The EM12c comes into play with the ETL job process by performing a direct agent to agent push to the AWR Warehouse via a job submitted to the EM Job Service.  You can view the job in the Job Activity in the EM12c console:

em_etl_job

Under the Hood

The additions to the source database, (target) and to the AWR Warehouse, made when a database is added to the AWR Warehouse or when an AWR Warehouse is created, are done through the DBSNMP schema.  The objects currently begin with the CAW_, (Consolidated AWR Warehouse) naming convention, so they are easy to locate in the DBSNMP schema.

AWR Warehouse Objects

The additions to the DBSNMP schema are used to support the ETL jobs and ease mapping from the Enterprise Manager to the AWR Warehouse for AWR and ASH reporting.  The AWR schema objects that already exist in the standard Oracle database are updated to be partitioned on ETL loads by DBID, Snapshot ID or a combination of both, depending on what the AWR Warehouse developers found important to assist in performance.

There are a number of objects that are added to the DBSNMP schema to support the AWRW.  Note the object types and counts below:

caw_object_count

The table of particular interest to those of you with AWR queries that you'd like to update to be AWRW compliant is the CAW_DBID_MAPPING table:

dbsnmp_caw_dbid_mapping

You will primarily be joining the AWR objects' DBID column to CAW_DBID_MAPPING.NEW_DBID/OLD_DBID when updating those AWR scripts.

An example of changes required, would be like the following:

from   dba_hist_sys_time_model stm, dba_hist_snapshot s, gv$parameter p, dbsnmp.caw_dbid_mapping m
where  stm.stat_name in ('DB CPU','background cpu time')
and    LOWER(m.target_name) = '&dbname'
and    s.dbid = m.new_dbid
and    s.snap_id = stm.snap_id
and    s.dbid = stm.dbid
and    s.instance_number = stm.instance_number
and    p.name = 'cpu_count'
and    p.inst_id = s.instance_number)

Notice that the simple change of adding the mapping table and the addition to the where clause has resolved the requirement to query just the data for the database in question, by DBID.

I’ve included some updated scripts to use as examples and hopefully give everyone a quick idea on how to work forward with the AWR Warehouse if you so decide to jump headfirst into querying it directly.

Source Database Objects

There are only a couple additions to the Source Databases when they become part of the AWR Warehouse.

source_target_count

The objects are only used to manage the AWR extract jobs and track information about the tasks.

CAW_EXTRACT_PROPERTIES : Information on ETL job, dump location and intervals.

CAW_EXTRACT_METADATA : All data about extracts- times, failures, details.
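
If you want to peek at the extract history on a source target, both tables can simply be queried, (a hedged example-  describe them first and select only the columns you need):

select * from dbsnmp.caw_extract_properties;
select * from dbsnmp.caw_extract_metadata;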

So….

Do you feel educated?  Do you feel overwhelmed?  I hope this was helpful to go over some of the processes, objects and information for AWR queries regarding the AWR Warehouse and I’ll continue to blog about this topic as much as I can!   This feature is brand new and as impressed and excited as I am about it now, I can’t wait for all there is to come!

Posted in ASH and AWR, Enterprise Manager, Oracle

July 17th, 2014 by dbakevlar

The webinar is over, but you haven’t missed out on everything I presented on the console feature, under the hood and behind the scenes!

You can access the slides from today’s presentation, as I’ve uploaded them to slideshare and the scripts are easy to locate on the scripts page here on DBAKevlar.

Thanks to everyone who attended and a big thanks to GP for doing the introduction and ODTUG for hosting us! 🙂

Posted in ASH and AWR, Enterprise Manager, Oracle

June 5th, 2014 by dbakevlar

So, there is a lot to cover on this topic and I hope Part I got you ready for more!  In Part II, I'm going to cover the following areas of the AWR Warehouse:

  • Installation
  • Access
  • ETL Load Process

Installation

Let’s start with Prerequisites:

  • The AWR Warehouse Repository must be version 11.2.0.4 or higher.
  • The Targets, and I think this will get a few folks, need to be 12.1.0.2 or higher, (now folks know why some docs aren't available yet! :))
  • Just as with installing ASH Analytics, etc., make sure your preferred credentials are set up for the database you plan to use for the AWR Warehouse repository.
  • You should already have your database you plan on using for the AWR Warehouse repository discovered in your EM12c console.
  • Any database targets you wish to add at this time to the AWR Warehouse setup should also be pre-discovered.
  • For a RAC target or AWR Warehouse, ensure you’ve set up a shared location for the ETL load files.

Now the repository is going to require enough space to receive anywhere from 4-10M per day in data, (on average…) and Oracle highly recommends that you don’t run anything else on the AWR Warehouse database, so don’t use your current EM12c repository database or your RMAN catalog, etc…  Create a new repository database to do this work and manage the data.

To Install:

Click on Targets, Databases, which will take you to your list of databases.  Click on the Performance drop down and choose AWR Warehouse.  As no AWR Warehouse has been set up, it will take you promptly to the wizard to proceed with an initial workflow page.

Click on the Configure option and choose the database from your list of available databases to make it your AWR Warehouse repository database.  Notice that you next need to select Preferred Credentials for both the database target and the host it resides on.  As the ETL process does perform agent host commands, both of these credentials are necessary.

On the next page you'll set up the retention period for your AWR Warehouse.  You can set the value to a number of years or choose to retain the data indefinitely for you data hoarders… 🙂  You can then set the interval to upload data, (how often it processes the ETL load…) which defaults to 24hrs, but you can set it as often as once per hour.  Due to potential server load issues, considering the size of the environment, # of targets, etc., I would recommend using the default.

Next set up the location for the ETL dump files.  For RAC, this is where you will need to specify the shared location; otherwise, for non-RAC environments, the agent state directory will be the default.  I recommend setting up an exclusive directory for both RAC and non-RAC targets and the AWR Warehouse.

Click on Submit and monitor the progress of the deployment in the EM Job Activity.  The job will be found quickly if you search for CAW_LOAD_SETUP_*.

Accessing the AWR Warehouse

Once set up, the AWR Warehouse home can be accessed from the Targets menu-  click on Databases, then once you've entered the Databases home, (databases should be listed or shown in a load map on this screen) click on Performance and AWR Warehouse.

The following dashboard will then appear:

awr_ware_3

From this dashboard you can add or remove source targets and grant privileges to administrators to view the data in the AWR Warehouse.

You can also view the AWR data, run AWR reports, do ADDM Comparisons, ASH Analytics and even go to the Performance home for the source target highlighted in the list.

awr_ware_4

Note for each of the databases, you can easily see the Source target name, type, DB Name, (if not unique…) # of Incidents and Errors.  You can see if the ETL load is currently enabled, the dates of the newest and oldest snapshots in the AWR Warehouse and the # of AWR snapshots that have been uploaded to the repository.

Now how does this all load in?

The AWR Warehouse ETL

The process is actually very interesting.  Keep in mind, these are targets, source and destination, but what will drive the ETL job?  How will the job be run on the targets and then on to the destination AWR Warehouse?  Now, I was sure the repository used TARGET_GUID to keep everything in order, but since the ETL does have to push this data from a source target through to the destination repository via the host, there is definite use of the DBID, too.

To upload the AWR snapshots from the target to the AWR Warehouse, an extract is added with a DBMS job as part of a new collection on a regular interval.  There is first a check to verify the snapshot hasn't already been added to the warehouse, and then the extract is identified by DBID to ensure unique info is extracted.

The second step in the ETL process is the EM12c step.  Now we are onto the EM Job Service and it submits an EM Job on the host to transfer the data from the target to the warehouse host for the third part of the ETL process.  This is an agent to agent host process to transfer the dump files directly from target host to AWR warehouse host, so at no time do these dump files end up on the EM12c host unless you were housing your warehouse on the same server as your EM12c environment.

The last step of the ETL process is to complete the load into the AWR Warehouse.  This is another DBMS job that takes the dump files and imports them into the AWR Warehouse schema.  DBIDs are mapped and any duplicates are handled, (not loaded to the final warehouse objects…)  The ETL is able to handle multi-tenant data, and at no time is there a concern if more than one database has the same name.

Retention Period

Once the retention period has been reached for any AWR warehouse data, the data is purged via a separate job set up in EM12c at the time of creation, edited later on or set with the EM CLI command.  A unique setting can be made for one source target from another in the AWR Warehouse, too.  So let’s say, you only want to keep three months of one database, but 2 years of another, that is an option.

For Part III, I’ll start digging into more specifics and features of the AWR Warehouse, but that’s all for now!

Posted in ASH and AWR, Enterprise Manager
