The Layman’s Term Guide to AWR for Microsoft- Part I
As often as I use the Automatic Workload Repository(AWR) data in my day-to-day job at Microsoft, I think most can imagine how often I’m explaining its value, along with tracing to my Microsoft peeps. Its time to write a post dedicated to the Microsoft crowd who want to understand a bit more about AWR, so hopefully its helpful!
What is the Automatic Workload Repository?
The Automatic Workload Repository, (AWR) had been around since Oracle 10g and requires the diagnostic and tuning management pack licensing to use all of its features in Oracle’s Enterprise Edition database. Versions before 10.2.0.4 had limited collections vs. the modern reporting schema and every subsequent release of Oracle has added to it’s content, which explains the size increase stored in the objects/number of objects in the SYSAUX tablespace.
By default and since version 11.2.0.4, the AWR retention is 8 days and takes an automatic snapshot once per hour. It’s common for DBAs to up this retention to at least 31 days to capture a month of workload information and these snapshot identifiers can then be used to identify workload intervals for querying and reporting. Oracle can be also be configured to lessen the intervals between snapshots to change the granularity of the AWR reports, or my preference, the DBA or privileged user can take manual snapshots to identify an important beginning or ending of a period.
The AWR stores its cached data in it’s own buffer, which can be queried via the V$ views and then stores the values once the buffer is full or once an interval commit has been issued, writing essential data around performance and aggregated information to the DBA_HIST objects in the SYSAUX tablespace. Also retained in this repository are samples from Active Session History, (ASH). AWR retains one out of every ten ASH samples and also uses AWR data to produce Automatic Database Diagnostic Monitor, (ADDM) reports, The SQL Monitor and baseline reports.
When attempting a comparison to SQL Server or Azure SQL in regards to the AWR, it’s closest to the query store and Dynamic Management Views placed into a repository with extensive packaged reports and utilities built upon it. The data collects metrics, latency information, performance data and advisor recommendations, along with aggregates historically. The data is produced as part of Oracle, but then housed in the repository to provide incredible value to the DBA. If you want to get the low-down on the AWR, including architecture, this slide deck can provide some great history and insight into the repository and products around it.
So how are these reports used and how would I explain them in layman’s terms?
AWR- the 360 Panorama View
Although a one-hour report can provide incredible value- this report is used by my team to collect the overall workload of an Oracle database environment, commonly 7+ days. The amount of information packed in this report offers everything from the basic facts about the database to:
- Resource allocation, (Server memory, CPU, Database memory, IO, network, etc.)
- Top SQL by elapsed time, CPU, IO, etc.
- Wait event statistics
- Parameter settings, (think sp_configure)
- Advisor recommendations
There are specific report versions for Real Application Clusters, (RAC) databases, combining the data from each of the nodes to a single report, including Global Cache, (GC) information about the interconnectivity and performance between RAC nodes. There is a similar version for Exadata engineered systems that provides all this and information on the cell nodes, including IO and flash storage savings.
One of the biggest challenges for many is that the format of the report has changed over releases and how data is presented. As of Oracle 12c, they started limiting some of the output to the text version of the report, so it’s always best to now request the HTML version, which kills my viewing it from the Linux server command line. The biggest challenge is if you’re comfortable with an 11g single instance report, an Exadata 12c report can be a bit of an information overload, even for Oracle DBAs.
I have a few blog posts about digging into AWR reports, as they really are as much a science as an art, as certain data will result in you navigating the report a bit differently. I’ve tried to link some of those posts here and although they are from my previous work in Oracle, they’re still valuable for those new to the AWR.
So how should you think of an AWR report if there really isn’t anything like it in Azure SQL/SQL Server? Think of it like a panoramic picture.
You get a wide view of everything going on in the database. It’s great for looking at the overall health of the database, the resource usage and noting any red flags. I use them to size Oracle databases, including RAC and Exadata to Azure IaaS VMs. They give me the information that I need to perform this task correctly and accurately size the workload vs just shifting the current hardware over. I’m able to also locate any problems that should be addressed beforehand and know if any resources are currently inadequate for the migration.
Using the Panorama example, if you needed to zoom in on a specific area of the picture, would a panorama be the best way to photograph it? Probably not and that’s how the other parts of the repository and reporting come into play in the AWR. If you want to understand more about AWR reports, check out this earlier slide deck that goes into the history of AWR/ASH and know that since it was written, there are new features that have been added, which I’ll update soon!
ASH Report: IOS Live Pic or even a timelapse of Oracle
Active Session History, (ASH) unlike other parts of the AWR has its own background process collecting information. The session information is stored as samples and as samples can overlap in time, these should never be used to count executions, etc. What it does offer you is great insight into what was happening at that time that was impactful to the database at a granule level.
As ASH isn’t dependent upon beginning and ending snapshots, instead using those samples, reports can be created from small intervals of time, allowing for diagnosis of performance issues during the time they occurred. The reports work best when used this way, where an ASH report covering hours or days rarely if ever will provide value.
An ASH report will include the top sessions, SQL statements, modules and procedural calls, but also if there are blocking sessions, what parallel processing has occurred and a breakdown of the top event waits by the samples involved in the report.
If you consider an ASH report as an IOS Live Pic or time lapse vs. an AWR Report a panoramic view, the picture covers less area, but shows you exactly what is happening.
As with an IOS live pic, you can extend the window, but many times, just the small slice of time can provide a moving window of what the main sessions that occurred during the incident you need investigated.
If you want to know more about ASH, check out this post.
Oracle Trace: The GoPro of Oracle
There are a lot of different session traces levels that can be done, but the glory of a trace is that you get to experience the process from the session level and trace it through from beginning to end. It’s more similar to a gopro movie, where you get to experience how long the journey takes, what time is spent on each step of the trip and what happened on the journey from the process point of view. You get to view specifics that may be missed or transparent to the user:
This includes any triggers, logic built into procedures, etc. are all captured and time assessed to perform these actions, then written to the trace file. The file then can be turned into a report to trap each step that was performed as part of the database process from the login through the close of the session.
Traces have been my go-to when digging deep into code logic, when from the database level reports, wasn’t clear what was exactly being executed by the application or when there were other processes that are consuming time.
Need to know more about tracing, check out this great post from Tim Hall, since I only wrote one focused on EM13c- previous post.
AWR SQL ID Specific Report: Close up? Nah…
I have one other favorite report when a specific SQL Statement shows as a red flag in either an AWR or ASH report and I’d like a little more information on just what happened in the SQL during window of time. The AWR SQL ID specific report, (awrsqrpt.sql) can be run from the command line or gathered inside Cloud Control, (aka OEM). This report is like a mini AWR report, but just for the SQL statement. It requires the beginning and ending snapshot ID, just like an AWR, but it also wants the unique identifier for a SQL Statement, also called a SQL_ID.
It provides the execution plan and secondary performance information for the different executions, (by differing sql_ID/hash_plan_values, think transaction IDs in SQL Server) that were performed during the window of the report. I can’t decide what kind of picture it would be, as it isn’t a panoramic view, but it’s not an ASH, as it’s snap_id dependent and it’s not like a trace, because it’s outside the session, but focused on one SQL Statement and how it is performing. Maybe more like this:
Or, kind of like this- multiple executions, one SQL ID, but two different hash values:
No matter how you view it, it’s a great report and I have detailed information on it in this post here if you’re interested.
Hopefully this was helpful or at least amusing!