March 23rd, 2016 by dbakevlar

This issue can be seen in either EM12c or EM13c AWR Warehouse environments.  It occurs when there is an outage on the AWR Warehouse and/or the source database that uploads to it.


The first indication of the problem is that databases appear not to have uploaded once the environments are back up and running.

awrw5

The best way to see an upload from beginning to end is to highlight the database you want to load manually.  (Click in the center of the row; if you click on the database name, you’ll be taken from the AWR Warehouse to the source database’s performance home page.)  Then click on Actions, Upload Snapshots Now.

A job will be submitted and you’ll be aware of it by a notification at the top of the console:

awrw1

Click on View Job Details and you’ll be taken to the job that runs all steps of the AWR Warehouse ETL:

  1. Inspect which snapshots are required by comparing the metadata table against the snapshots present in the source database.
  2. Perform a datapump export of those snapshots from the AWR schema and update the metadata tables.
  3. Perform an agent to agent push of the file from the source database server to the AWR Warehouse server.
  4. Run the datapump import of the database data into the AWR Warehouse repository, partitioning by DBID, snapshot ID or a combination of both.
  5. Update support tables in the Warehouse showing status and success.

Note the steps where metadata and success status are updated.  Inspecting the job we just submitted, instead of success we see the following in the job logs:

awrw2

We can clearly see that the extract, (the ETL step on the source database that datapumps the AWR data out) has failed.

Scrolling down to the Output, we can see the detailed log to see the error that was returned on this initial step:

awrw3

ORA-20137: NO NEW SNAPSHOTS TO EXTRACT.

Per the source database, in step 1, where it compares the database snapshot information to the metadata table, it has returned no new snapshots to extract.  The problem is that we know on the AWR Warehouse side, (seen in the alerts in section 3 of the console) there are snapshots that haven’t been uploaded in a timely manner.

How to Troubleshoot

First, let’s verify what the AWR Warehouse believes is the last and latest snapshot that was loaded to the warehouse via the ETL:

Log into the AWR Warehouse via SQL*Plus or SQL Developer and run the following query against the CAW_DBID_MAPPING table, which resides in the DBSNMP schema:

SQL> select target_name, new_dbid from caw_dbid_mapping;
TARGET_NAME
--------------------------------------------------------------------------------
NEW_DBID
----------
DNT.oracle.com
3695123233
cawr
1054384982
emrep
4106115278

and what’s the max snapshot that I have for the database DNT, the one in question?

SQL> select max(dhs.snap_id) from dba_hist_snapshot dhs, caw_dbid_mapping cdm
2 where dhs.dbid=cdm.new_dbid
3 and cdm.target_name='DNT.oracle.com';
MAX(DHS.SNAP_ID)
----------------
501

The Source

These next steps require querying the source database, as we’ve already verified the latest snapshot in the AWR Warehouse, and the error occurred on the source environment at that step in the ETL process.

Log into the database using SQL*Plus or another query tool.

We will again need privileges to the DBSNMP schema and the DBA_HIST views.

SQL> select table_name from dba_tables
where owner='DBSNMP' and table_name like 'CAW%';
TABLE_NAME
--------------------------------------------------------------------------------
CAW_EXTRACT_PROPERTIES
CAW_EXTRACT_METADATA

These are the two tables that hold information about the AWR Warehouse ETL process in the source database.

There are a number of ways we could inspect the extract data, but the first thing we’ll do is get the last load information from the metadata table, which will tell us which snapshots were included in the most recent extract:

SQL> select begin_snap_id, end_snap_id, start_time, end_time, filename
from caw_extract_metadata 
where extract_id=(select max(extract_id) 
from caw_extract_metadata);
502 524
23-MAR-16 10.43.14.024255 AM
23-MAR-16 10.44.27.319536 AM
1_2EB95980AB33561DE053057AA8C04903_3695123233_502_524.dmp

So we can see that per the metadata table, the ETL BELIEVES it’s already loaded the snapshots from 502-524.

We’ll now query the PROPERTIES table that tells us where our dump files are EXTRACTED TO:

SQL> select * from caw_extract_properties
 2 where property_name='dump_dir_1';
dump_dir_1
/u01/app/oracle/product/agent12c/agent_inst
ls /u01/app/oracle/product/agent12c/agent_inst/*.dmp
1_2EB95980AB33561DE053057AA8C04903_3695123233_502_524.dmp

So here is our problem.  A dump file was created, but the agent to agent push and the load to the AWR Warehouse never happened.  Because the metadata table on the source was already updated with those rows, the ETL now believes there is nothing new to extract and fails to load them.
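If you want to confirm this on the source before cleaning anything up, the same metadata table can show the recent extract attempts.  This is just a sketch, assuming the table is visible to you under the DBSNMP schema and using only the columns from the query above:

-- Most recent extract attempts recorded on the source (sketch)
select *
  from (select extract_id, begin_snap_id, end_snap_id, start_time, filename
          from dbsnmp.caw_extract_metadata
         order by extract_id desc)
 where rownum <= 5;

Any row here whose snapshots never arrived in the warehouse, (like our 502-524 extract) is a candidate for the cleanup below.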

Steps to Correct

  1. Clean up the dump file from the datapump directory
  2. Remove the orphaned row from the METADATA table
  3. Rerun the job
cd /u01/app/oracle/product/agent12c/agent_inst
rm 1_2EB95980AB33561DE053057AA8C04903_3695123233_502_524.dmp

Note: You can also choose to rename the file’s extension if you wish to retain it until you are comfortable that everything is loading successfully, but be aware of size constraints in your $AGENT_HOME directory; I’ve seen issues caused by lack of space.

Log into the source database and remove the latest row from the metadata table:

select extract_id from caw_extract_metadata
where begin_snap_id=502 and end_snap_id=524;
101
delete from caw_extract_metadata where extract_id=101;
1 row deleted.
commit;

Log into your AWR Warehouse dashboard and run the manual Upload Snapshots Now for the database again.

awrw4

Posted in AWR Warehouse, EM13c, Enterprise Manager

February 18th, 2016 by dbakevlar

This question was posted in Twitter from @matvarsh30, who asked, “How can I display CPU usage over different periods of time for databases in Enterprise Manager?”

Everyone loves their trusty Top Activity, but the product’s functionality is limited when it comes to custom views and this is what our user had run into. There are numerous ways to display this data, but I’m going to focus on one of my favorite features in the product that was created to replace Top Activity, ASH Analytics.

Retaining AWR Data for ASH Analytics

Note: This process to display CPU graphs will work for EM12c and EM13c.  Other than the location of the target menu, not much else has changed.

The default display is for one hour, and because ASH Analytics is dependent upon AWR data, it is important to set the retention in the source, (target) databases appropriately if you want to view or research past the default 8 day AWR retention.  I am a firm believer that if you have the Diagnostics and Tuning Packs for your EE databases, you should be getting the most out of these tools, so increase the retention time from the default by running the following command via SQL*Plus with the appropriate privileges:

BEGIN
  DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings(
    retention => 86400,        -- In minutes, 86400 is 60 days
    interval  => 30);          -- In minutes, only change from 60 if doing workload tests, etc. 60 min interval is sufficient.
END;
/

Now you have not just the EM metric data that rolls up, but the AWR data for ASH Analytics to do deep analysis and reporting.
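If you’d like to double check what a database is currently set to before, (or after) making the change, the settings are exposed in a data dictionary view; a quick sketch:

-- Current AWR snapshot interval and retention for this database
select snap_interval, retention
  from dba_hist_wr_control;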

With this in place, let’s create some graphs that answer the question – “How do I display CPU usage for a database over one week, 10 days, 30 days, 60 days?”

Default to Power User View

Once logged into any database, you can access ASH Analytics from the Performance drop down.  If you have an 11g or earlier database, you may have to install the package to create the EMCC views, but this will need to be done to utilize this powerful tool in Enterprise Manager.  ASH Analytics works in databases version 10.2.0.4.0 and above.

Logging into ASH Analytics will display data for the instance based off one hour, but to change to a one week view, you’ll simply click on “Week” and then move the displayed view for the bottom section graph out to encompass the entire week:

ashan1

Using this example, you can see that I’m now showing a graph similar to Top Activity, but for a whole week and without the aggregation that Top Activity sometimes suffers from.

ashan2

We’re not going to stick with this view, though.  Leaving it on “Activity”, click on the Wait Class drop down, go to Resource Consumption and choose Wait Event, (it’s set to Wait Class by default.)

As you can see on the right side, there is an overlap in the legend that needs to be fixed, (I’ll submit an ER for it, I promise!)  but luckily, we’re focused on CPU and we all know, in EM, CPU is green!

ashan4

When we hover over the green, it turns a bright yellow.  Once you’ve highlighted it with your cursor, double click to select it.  The graph will now update to display CPU:

ashan5

You now possess a graph that displays all CPU usage over the last week vs. total wait classes.  You can see the overall percentage of activity in the table on the left hand side, and the bottom right even displays it by top user sessions.  You can also see the total CPU cores for the machine, which offers a clear perspective on how CPU resources are being used.
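If you ever want to pull roughly the same CPU-over-time information with SQL instead of the console, a sketch against the AWR ASH data looks something like this, (assuming the Diagnostics Pack, and that AWR ASH samples are roughly 10 seconds apart):

-- Approximate CPU seconds per hour over the last 7 days from AWR ASH samples
select trunc(sample_time, 'HH24')  as sample_hour,
       count(*) * 10               as approx_cpu_seconds
  from dba_hist_active_sess_history
 where session_state = 'ON CPU'
   and sample_time   > sysdate - 7
 group by trunc(sample_time, 'HH24')
 order by 1;

The console graph is still the easier read, but the query is handy when you need the numbers for a report.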

Now you may want to see this data without the “white noise”.  We can uncheck the “Total Activity” box to remove this information and only display CPU:

ashan6

We could also choose to remove the CPU Cores and just display what is being used:

ashan7

By removing the top core info, we see the patterns of usage and just the cores used much more clearly. We could also decide we want to view all the wait classes again, without the CPU Cores or Total Activity.  The only drawback is the overlap in the legend, (I so hate this bug in the browser display….)

ashan8

Now, as requested, how would you do this for 10, 30 and 60 days?  As noted in the top view, you’re offered a view by hour, day, week, month and custom.  Since many months have 31 days in them, you may choose to use a custom view for all three of those requests, and a custom request is quite simple:

ashan9

Yep, just put in the dates you want and then click OK.  If you have already stretched the lower view window from beginning to end, don’t be surprised if it retains this view and shows you all the data, that is, if your example database was active during that time… 🙂  The database I chose, as it’s from one of our Oracle test environments, was pretty inactive during the Christmas and January time period.

ashan10

And that’s how you create custom CPU activity reports in ASH Analytics in Enterprise Manager!

 

Posted in ASH and AWR, EM12c Performance, EM13c, Enterprise Manager

November 30th, 2015 by dbakevlar

SQL Monitor is one cool tool.  Via EM12c, it empowers everyone, (yes, that means the developer, too, so give them the access to view this data via Cloud Control, please!)  This is a top five list, but it doesn’t stop there; just remember, these are my favorite features when using Cloud Control with SQL Monitoring.

Once logged into the database, you can gain access to SQL Monitor by clicking on Performance –> SQL Monitoring.

1. High Level View

The main view of active and inactive processing, (just remember, SQL Monitor entries will age out, so keep in mind that this tool isn’t for looking at historical processing) is a great place to start when we talk about value to your day to day job.

sqlmon5

From the main view, you can easily see which sessions are active, the SQL ID, whether the statement is running in parallel and what DOP, (Degree of Parallelism) was granted, database time, (broken down by wait event percentage) and IO requests.  There is a ton of pertinent data here, including plan hash values, start and stop times, plus the SQL text, but everyone has to admit, this is what we all want and need to know.
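For reference, the same high level data the console shows is exposed through the V$SQL_MONITOR view, so a rough command line equivalent, (just a sketch, and it still requires the Tuning Pack license) would be:

-- High level view of monitored statements, most recently refreshed first
-- ELAPSED_TIME and CPU_TIME are in microseconds
select status, sql_id, username, sid,
       round(elapsed_time/1e6, 1) as elapsed_secs,
       round(cpu_time/1e6, 1)     as cpu_secs
  from v$sql_monitor
 order by last_refresh_time desc;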

2. SQL ID Details via SQL Monitor

By clicking on any of the SQLIDs displayed in the SQL Monitor main page, you can then dig into the details surrounding a SQL process.

sqlmon2.3

For the given SQL ID, you can view the duration and type of wait event resource consumption.  You can quickly see what the cost is and, with the graphs added, quickly identify which performance challenges a particular execution plan is most hindered by.  There are arrows to help the DBA or developer navigate the plan to understand which pain points are most evident and where the plan connects from bottom to top as the process filters through data to come up with its results.

3.  Offload Percent

Yes, you can use a query to see the offload percent for a process, but SQL Monitor is nice enough to show you right in Cloud Control-

Picture1

If you’re working on an Exadata and want to make sure you’re using all those great engineered system features you paid for, it’s nice to know you’re offloading to a cell node vs. doing the work on your database node that should be used for other work than scanning objects.
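If you do want the query version mentioned above, a sketch using the cell offload columns in V$SQL would look something like the following, (the &sql_id substitution variable is just a placeholder, and the ratio is only a rough measure of offload efficiency):

-- Rough offload efficiency for one statement on Exadata (sketch)
select sql_id,
       io_cell_offload_eligible_bytes as eligible_bytes,
       io_interconnect_bytes          as interconnect_bytes,
       round(100 * (io_cell_offload_eligible_bytes - io_interconnect_bytes)
                 / nullif(io_cell_offload_eligible_bytes, 0), 2) as offload_pct
  from v$sql
 where sql_id = '&sql_id'
   and io_cell_offload_eligible_bytes > 0;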

4. Resource Usage Details for a Distinct SQLID

Knowing how many resources are going to a process is essential to knowing how to eliminate extra resource usage.

sqlmon3.1

Under Metrics for each SQLID captured by SQL Monitor, you’ll be able to view the amount of CPU, PGA and IO used by the process.  You’ll even be able to tell what kind of waits you’re experiencing if you don’t use Google Chrome, (I’m joking…OK, maybe I’m not, but you understand browser challenges, I’m sure… :))

5.  Session Details

By clicking on the User from the main view, I’m able to drill into the session details.  The actual session ID won’t be displayed, (in this case, we just see SH for each session) but we can click on the link, thereby displaying vital info about the session.

sqlmon6

We can quickly link to the SQLID, view if there are any session blocking the one we are investigating and see pertinent wait information including the P1, P2 and P3 waits.

If I click on the bottom link in the WAIT section, it will take me to the object that is causing the most waits, which happens to be the SALES table.

sqlmon7

I can now view column and constraint info without having to go to the schema feature within Cloud Control.

I can also view information about indexes and statistics, which is no longer accessible from SQL Details in Cloud Control.

sqlmon8

6.  (Bonus) Viewing Waits and Pain Points

As a bonus, knowing how many of us are attached to tabular execution plans, it may be worthwhile to also view them in graphical form.  Below, as I hover my cursor over the sections highlighted with yellow lines from the SALES table and then red for the filtering, you can see that it clearly displays 13 million rows going through a nested loop out of a 26 million row table.  I think most of us know that a hash join would have performed much faster, and this is why the section has been highlighted in red as the pain point for the SQL in question.

screenshot_Mon_Nov_30_16.12.30

Just a little bonus for those of you still as hooked on tabular execution plans as I am… 🙂

Have a great week after the long weekend and hope your Thanksgiving was wonderful!

 

Posted in EM12c Performance, Enterprise Manager

August 28th, 2015 by dbakevlar

Creating a trace file from EM12c is quite easy and doesn’t require a DBA offering up the world to allow a developer or support person to perform this action.

Enterprise Manager also hides some of the syntax options and required knowledge that could get a developer or app support person who is less specialized in creating trace files into trouble, which gives everyone a path to be successful.

The proposed solution that I’ll post here is one where the EM12c user will be able to create trace files, but the DBA will control the OS level files, (delivering the file to the user after the trace file is created.)  The user will log into the database from Cloud Control with a defined database account and will have no target access granted previously.

Creating the Database User for Tracing

Connect to your PDB and create your user that you will be using for tracing….

create user dev_trace identified by <password>;
grant connect, select_catalog_role, create session, alter session to dev_trace;

Limit the size of trace files for the database to ensure that no trace file can be forgotten and grow unconstrained:

connect system/<password>@hostnm.us.oracle.com:1521/pdb1.us.oracle.com
alter system set max_dump_file_size =40960;

Once this is complete, you can create your database target credentials for your developer role that will now use this new login and limit their rights in the target as they login.

Grant the following privileges at the “Target Type Privilege” for the new user:

  • Monitor Enterprise Manager-  Monitor Enterprise Manager performance
  • Add any Target-  Add any target in Enterprise Manager
  • View any Target-  Ability to view all managed targets in Enterprise Manager

Yes, you do have the right to switch from “View any Target” to a list of specified targets.  I just figured I’d lessen the steps here, so do what fulfills your IT security model.

Add the new role that you’ve created for your tracing user and then complete the new user creation wizard in Cloud Control.

Creating a Trace File

Once the user logs in, (and updates their password… :)) they should be able to log into a target and view performance data.  For our example, we’ll use the following database scenario, where the developer or app support user is concerned about “Other” in Top Activity.  They’ve been given a high level overview of Top Activity wait events and know that green, (CPU) is expected, dark blue, (IO) often occurs and that colors of red, (application or concurrency) brown, (configuration or network) orange, (commit) and pink, (other) should be monitored more closely.

tracing1

 

Now tracing is a session level process that creates a trace file in an OS, (Operating System) directory.  For Oracle 11g and higher, this creates the file in the Automatic Diagnostic Repository, (ADR) home.  The base location can be derived from the parameter DIAGNOSTIC_DEST + /diag/rdbms/<DB_NM>/<INSTANCE_NM>/trace.

show parameter DIAGNOSTIC_DEST;
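On 11g and above you can also ask the database for the exact locations instead of assembling the path by hand; a small sketch:

-- Where session trace files land, and this session's own trace file name
select name, value
  from v$diag_info
 where name in ('Diag Trace', 'Default Trace File');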

In our above Top Activity example, we can see the sessions on the right hand side and see a couple that are of concern.  Since we’ve decided that we need to inspect the “Other” waits more, we will look in the Session ID column and double click on the 1107 Session ID.

This brings us to the details page for the 1107 session:

tracing2

 

We have a number of tabs in the Details page to inspect, but let’s say we want to create a trace file to dig deeper into the ENQUEUE waits instead.  To the right, at the upper and lower section, you can see the button “Enable SQL Trace”.  As the user has the privileges to “Alter Session”, they can click this and it will step into the creation steps for a trace file.

tracing3

 

This is like performing a 10046 trace.  As you can see, it’s very straightforward and simple vs. executing the SQL*Plus commands, which would require the following with both options set to “No”:

execute dbms_support.start_trace_in_session (1107, 39645, waits=>false, binds=>false);

To create the option with the wait information set to “Yes” we’d update our request to the following:

execute dbms_support.start_trace_in_session (1107,39645,waits=>true);

And for both to be set to “Yes”, we’d execute this command that has both:

execute dbms_support.start_trace_in_session (1107,39645,waits=>true, binds=>true);
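If you do go the SQL*Plus route, you’ll need the SID and SERIAL# used above, (1107 and 39645 in this example); a quick sketch to confirm them from V$SESSION:

-- Confirm the serial# and owner of the session you're about to trace
select sid, serial#, username, program
  from v$session
 where sid = 1107;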

Back to our Cloud Control option, (the easier option) once we decide to gather the wait and bind information, we can click “OK” and the trace will subsequently begin.

tracing4

Once we’ve finished tracing, we should remember to turn the trace back off, (which is the reason for the DBA setting the max dump file size to have a safety net for those that forget!)

The option will then show right in the session, even if you leave and come back to the session in Cloud Control.  You can click on it and disable the trace:

tracing5

 

If you were to forget, the DBA could easily locate the session being traced in the file and they can issue the following statement from the command line to disable it:

execute dbms_support.stop_trace_in_session (1107, 39645);

Getting a READABLE Trace File Report

Now this is where your DBA comes in and you need to make friends with them.  Most DBAs are not going to be comfortable granting access to OS level files and I don’t blame them.  It’s best if you don’t have to worry about this access and better if they just retrieve the files for you.  Before they do, you want them to convert the trace data into a readable report.  This is done via the TKPROF utility.

The trace file should be easy to locate on the server in the ADR trace directory, (the directory is called “trace”, duh… :))  and there are two files created as part of your trace-  the .trm trace mapping file, which you won’t need, and the .trc trace file, which is the one required to create the report.

The DBA will run the following to create the report:

tkprof <trace_file_nm>.trc <output_rpt_nm>.txt

They can SCP or FTP the output file to their workstation and this is the one that you’ll want them to send to you.

And there you have it-  creating a secure user that can create trace files and then how to create the trace files once you have the access.  Now you DBAs and developers make friends and bring each other donuts and maybe buy each other a drink… 🙂

Posted in EM12c Performance, Oracle

August 4th, 2015 by dbakevlar

I’ve had a lot of people email and message me asking me what I’m doing with the Raspberry Pi and I wanted to share before the actual RMOUG Quarterly Education Workshop, (QEW) this Friday at Elitches.

rmoug_summer_qew

So a little history-  When I was a kid, I was a real handful.  My mother used to come into my room and I’d have created “webs” out of yarn strung from every piece of furniture and twisted into patterns.  I used to take apart radios and use the parts to create new toys, figuring out how the electronics worked.  Dolls suddenly became cyborgs and we won’t talk about what I used to do to poor, defenseless stuffed animals.  My destruction and “recycling” used to exhaust her to no end, but the kind of parts I used to build new toys from rarely exists in today’s world, so when the Raspberry Pi was introduced, I was immediately smitten…:)

20150803_194756

As many know, I’m also a grand supporter of STEAM, (Science, Technology, Engineering, Art and Math) as part of filling the 1.4 million open positions that we are poised to have available in technology by 2020.  That number is only going to grow, and I don’t see our public education system building out core knowledge to embrace technology in a way that introduces kids to logical thinking and coding and entices them toward careers in technical fields.  Public education just doesn’t have the resources or the vision to do this, and I feel it’s up to us who are here in the technical arena now.

steam_logo

With that said, it took some convincing to get the QEW to include a family coding event.  We were already late getting the new board in place, so we’ve been scrambling ever since!  Devoxx4Kids will be onsite and they’ll be doing a session on how to build Minecraft modules.  I’ll be teaching the Raspberry Pi session with attendees.  The project will have all areas of STEAM covered, which is quite a challenge to do in one project with kids!

The attendees will bring their Raspberry Pi setups with their parents [hopefully] in tow, and I’ll be providing small motors, pre-soldered with connectors to attach to a Pibrella board that was part of the “shopping list” from the website.  The Pibrella board is a small add-on board to the Raspberry Pi that makes it easy to enhance projects, and it’s one of my favorite additions to my setup.  With the motor and a small lego base, the kids will then use ART supplies to build a small project; it can be an animal, a flower, a design from stickers, even a spinner or other flyer.  The art project can be ENGINEERED any way they want, to either spin, dance or fly off the motor.

Along with art supplies, I’ll have LED lights and small watch batteries that can be used to power the lights and add to their projects.  I also have magnets to use to “attach” flyers to bases to launch which all bring in some components of SCIENCE into the project.

raspberry-pi-led-lights1

Once they finish their creation, we’ll attach it to a motor and start up our Raspberry Pi’s.  Each attendee will open up a text editor and CODE some simple timing and spinning rotations, adding different enhancements to the code, depending on what they want their project to do, (I have three coded examples that they can edit with different times, start with the execution or use the button on the pibrella to commence, etc.)

They’ll then EXECUTE their code and see if the initial TIMES COMPUTED work with their project or if they need to adjust the MATH to make the project work.

Once they are satisfied with the project they built, we’ll discuss how they might imagine to enhance their project.  Would they add lights?  Would they add music?  Might they change the physical design or code?  I want them to use their imagination and their logical thinking skills to see how cool technology can be.

20150803_192233

My youngest son, Josh, seen above, has been helping me with this project, using his master soldering skills to assist me attaching the jumper wires to the motors, helping me test out different art projects and flyers to ensure that the code works like expected and that my new [non-brand name] Lego case secures my different card installations correctly.

Thanks for listening to me rant so I can take a break from everything.  Hopefully we’ll have a RAGING turnout for the family coding event, people will stick around and listen to me ramble on about Enterprise Manager Hybrid Cloning to the Cloud, and then we’ll have a great Hands on Lab using an AWR Warehouse I created for this event!

See you Friday!

 

 

 

Posted in DBA Life, Enterprise Manager, Oracle, Raspberry Pi

July 28th, 2015 by dbakevlar

More and more companies are consolidating environments.  Server sprawl has a high cost to any business and finding ways to consolidate to more powerful servers or to the cloud is a productive undertaking for any company.

The Consolidation planner in Enterprise Manager has been around for quite some time, but many still don’t utilize this great feature.

Creating a Consolidation Project

You can access the Consolidation Planner from the main Enterprise menu in Cloud Control in EM12c. There are two consolidation types currently available-  physical to physical, (P2P) or physical to virtual, (P2V) consolidation.  For our example, we’ll perform a P2P, (physical) consolidation project.

consol1

Once you choose the option for your consolidation project, choose the servers you wish to add to your project.

conol2

The planner lets you know the current resource usage of the hosts that you wish to include in your consolidation project.  This assists the administrator, giving a high level view of whether the servers are an option for consolidation.

If you scan to the right, you can view more information, including network and disk IO.

consol3

Note that the 07 host has high network I/O and capacity demands that will need to be taken into consideration if consolidating.  Once you are certain that you have all hosts that you wish to consolidate to a new host or engineered system, click on Next.

Choose the system you wish to consolidate to and the hosts will be displayed, along with current resource usage if already the home of databases.

consol4

As displayed to the right, we can see that the destination doesn’t have the network waits that the 07 host is facing, so the hosts look like viable candidates for consolidation.

We can then click Next and choose the amount of workload data we would like to use for our comparison and if we would like to run the job now or schedule it:

consol6

Next you’ll decide if you wish to go with the default to include no pre-configured scenarios or if you would like to add these to enhance the results.

So for my example, I’ve added pre-configured scenarios, using the existing servers that I wish to consolidate:

consol7

Once satisfied, I can then click Next to review my consolidation project:

consol8

Review the project and if satisfied, click on Submit.

The Consolidation Project

The project is now in place and collecting data to ensure the most valid information is known for consolidation.

consol9

 

You can see that the project is in the process of collecting the data in the third column for each of the advanced scenarios I requested.  The job that collects all this data can be displayed from the Job Activity screen:

consol11

 

Back on the Consolidation Project screen, there are four tabs at the bottom that offer us information about our project.  This data can be viewed at any time from the Consolidation console-

consol12

 

Notice that the number of days of review data requested hasn’t come close to being collected yet, which is shown in the General information tab.  We can also view the sources involved and see if any changes in resource usage have occurred:

 

consol10

 

 

We’ll need to return to this next tab after the data collections have occurred, but just to review, this is where we start to see how our consolidation data plays out:

consol14

We’ll stop here for now and revisit this in three weeks when the consolidation collection is finished and all the data is populated.  You now know how to create a project, and next time we’ll go over how to use this data.

 

Posted in Enterprise Manager

July 8th, 2015 by dbakevlar

Happy Birthday to me!  So for my birthday, I give a present to you…  As I want all DBAs to sleep better at night, here are the top ten features you can use in Enterprise Manager Cloud Control to get a good night’s rest at home instead of catching up on sleep during the day at your desk… 🙂

sleeping_pc

1.  Disable the Default Rule Sets Shipped with Cloud Control.

Yes, you heard me.  I believe you should use them as a starting point or an example, but don’t put them into production.  These were examples set by development to show all that you could be notified on, but what you should be woken up for is anything mission critical that will SUFFER an outage if you DON’T respond.  Anything that can wait till the morning SHOULD wait till the morning.

rulesets

Make copies of the default rules and disable the originals.  Plan on making as many copies and edits as necessary to ensure that you are only being notified on the appropriate targets, life cycle status and line of business that YOU are responsible for ensuring is up and available to the business.

2.  Implement Monitoring Templates and Default for Important Target Types.

Monitoring templates ensure that you are monitoring each target in the same way and for the same metric thresholds.  This ensures you start with metric thresholds that make sense for the target and should be applied to all targets of that target type.  Creating monitoring templates is easy when you configure one target as an example and use it as the source of your template.

3.  Use Metric Thresholds for Individual Targets and Set Them to Not Be Overridden by Monitoring Templates

Now this might sound like a complete 180 from #2 on this list, but it’s not.  This is just like #1: break down and specialize for unique targets that have unique challenges.  This means, if you have a target backup drive that fills up to 97% each night, you shouldn’t be woken up for it.  This is expected behavior and you can either set a static threshold specific to this target or an adaptive threshold that won’t be overridden by the monitoring template for this target ONLY.

4.  Utilize Administration Groups

admin_grp1

Administration Groups offer advanced features and scalability to your Cloud Control environment that standard groups, and to a lesser extent, Dynamic groups, do not.  Line of business and life cycle management let you break down notification groups, rule sets and more, and they tie into more advanced capabilities such as Database as a Service, allowing you to do more with less.  The natural life of a database environment is one of growth, so thinking ahead one, five and ten years is a great way to add value to the business as a database administrator.

5.  Create Metric Extensions for Unique Monitoring Scenarios

Enterprise Manager 12c is a self-service product.  Often there are unique situations that the business needs monitored, or the DBA notices something that creates an issue or outage, but isn’t covered by a metric that comes with EM12c by default.  It’s easy enough to create a metric extension and take the concern and worry out of the situation, creating more value for the business.

6.  Add Corrective Actions, (Jobs)

Often when a problem occurs, a DBA has a simple shell script or SQL they run and it corrects the problem.  If this is the case, why not have Cloud Control monitor for the issue, create an incident in the Incident Manager, send an email, then run the SQL or script as a Corrective Action?  The DBA will still know the next morning that the problem occurred, but no one needs to be woken up to do what can be automated in the system.

corr_act

7.  Use Patch Plans and Automate Patching

I understand, really.  Something could somehow, somewhere, some rare time go wrong, but the patch plans you can create in Enterprise Manager are surprisingly robust and full featured. If you’re still doing patching the old fashioned way and not patching environments in the more automated and global patch plan way, you’re wasting time and let’s face it-  DBAs rarely have time to waste.  You are a resource that could be utilized for more important tasks and quarterly PSU patching is just not one of those.

8.  Centralize Database Jobs to EM Jobs

The common environment is structured with multiple DBAs, often with one DBA as primary to a database environment and the others playing catch up to figure out how the primary has the database set up.  My favorite DBA to work with once told me, “Kellyn, love your shell scripts.  They make the world go ‘round.  I just don’t want to try to figure out how you write shell at 3am in the morning or what kind of scheduler is used on all the OS’s you support!”  I realized that I owed him to centralize all my environments with an interface that made it easy for ANYONE to manage it.  No one had to look at cron, the task scheduler or a third party scheduling tool anymore.  Everything was in Enterprise Manager and no matter what operating system, it all looked very similar with the logs in the same place, found in the same tab of the UI.  Think about it- this is one you do for the team, move those jobs to inside Enterprise Manager, too…

9.  Put Compliance Framework into Place

Compliance is one of those things that seems a mystery to most.  I’m often asked why environments really need it and whether it makes sense.  It can seem overwhelming at first, but the idea that you know which database environments, hosts and such are out of compliance helps to distinguish how to get your database environment all set up to ensure that business best practices are in place-  You have a baseline of compliance standards for configuration settings, installation and real-time monitoring to view globally via EM12c.

10.  Plug-ins to Offer a Single Pane of Glass View

A database is a database or that’s how the business sees it.  I have almost as many years in SQL Server as I do in Oracle.  I’ve worked in Sybase, Informix, Postgres and MySQL.  After being hired for my Oracle DBA skills in every job I’ve held, it never failed-  within 6 weeks, a mission critical database environment on a secondary database platform would be discovered, one that a group, often outside of IT, had implemented and that now needed critical support.  Enterprise Manager offers plug-ins to support all of the above database platforms and more.  It offers plug-ins for engineered systems, storage arrays and other hardware that the DBA is now expected to manage, too.  Why manage all of this from multiple systems when you can easily create a single pane to ensure you’re covered?

So there you have it, my top ten list.  There are, of course, 100’s of other great features in EM12c, but make sure you are taking advantage of these in the list!

morning

Posted in Enterprise Manager, Oracle

July 1st, 2015 by dbakevlar

I had a chance to take a closer look at the Zero Data Loss Recovery Appliance, (ZDLRA) via Cloud Control today and really liked the dashboard that’s included in the plugin for the product.  This post isn’t going to cover the installation of the plugin or configuration-  I was just intrigued by the dashboard and wanted to go over the great features that are part of it.

Accessing the ZDLRA

As with any Targets, the ZDLRA is going to show up in the All Targets menu and also has a target type filtered drop down in the Targets menu.

rec1

All ZDLRA targets discovered and configured will be listed under this view, (I’ve renamed mine to protect the innocent.)

rec2

The reason for more than one ZDLRA is to ensure there is redundancy for disaster recovery, which you’ll see in the upcoming overview of the environment we have set up for our High Availability Enterprise Manager environment.  If we double click on the first ZDLRA listed, we’ll come to the main dashboard, which we’ll dig into section by section, left to right.

ZDLRA Targets

As with all targets, there is the system target which can be expanded to show all the associated database type targets that make up the ZDLRA.

rec3

Below the database target associations are the associated hardware targets, which can also be expanded to show that the ZDLRA is based on engineered system architecture.

Each of the associated targets are links that can take you to the home pages for each target shown in the list.

Backup Information

To the right of the target and associated target information, we start to view the high level information about what the ZDLRA handles-  backups.

rec4

At the bottom of the above graphic, notice that the ZDLRA is replicating copies of all backups to the ZDLRA2 to ensure that more than one copy of backups exist for DR purposes.

Summary

If you switch from the Current Activity to the 24hr historical view, then you can see a lot more information about the backups the ZDLRA is responsible for:

sec31

As you’ll see above, there are 7 databases that are currently being backed up to the ZDLRA.  You can click on “Backup”, which will then display more details about the backups:

rec15

You’ll now see the database targets that are backed up, the backup pieces, the amount of data received in GB’s and if available, the transfer rate for the backup.  As the backups are being replicated to the ZDLRA2, there aren’t any copies being pushed to tape, which we can also verify in the Replication section at the bottom.

rec16

We can click on the link for any of the Backup Pieces and it will then display the job task ID, the target, the database, backup information, elapsed time and if there are any errors, along with completion state.

Back on the main dashboard page, we can then inspect some of the backup load to the ZDLRA.  As you move your cursor across the graph, more information will be shown:

rec7

The blue displayed in the graph is the Backup Data Received on the first ZDLRA.  Highlighting a section on the graph will show the amount of data and date received.

rec8

Highlighting the green section in the graph will bring up a quick highlight of the amount of data replicated from the ZDLRA to the ZDLRA2, including date.  Orange is also shown for backups copied to tape, (if any are performed.)

Performance and Space Information

On the right side of the pane, we start with compliance information, can view the rate at which data is received to the ZDLRA and the send rate for replication to the ZDLRA2.

rec5

At the lower part of this section, we can view the storage information, including the recovery and reserved space.

Incidents and Events

At the bottom right, we have any incidents and events connected to just the ZDLRA targets.  This filter assists the recovery administrator to zero in on the incidents that are important to their responsibility role.  There’s no reason to go to the Incident Manager view and filter there, the data is already presenting what is important to them from this dashboard.

rec6

If we take the second incident down the list, “The total space required to meet the recovery….”, we can select this incident and dig into the details.

rec9

The general information, including thresholds and ability to investigate metric threshold history, etc. is displayed, just as for any incident.

rec10

The Events tab shows the history and event sequence as the thresholds were reached during the incident.

rec11

In the All Updates tab, you can see which rule set fired and that it’s a system generated rule, nothing created by an admin; it came as part of the plugin for space consumption and requirements.

rec12

You can then take a look at Related Events and see if any associated targets may be related to the incident, may have caused the incident, etc.

The ZDLRA dashboard within EM12c is clean, provides the important information that you need to know about the backups, any DR configurations/processing, target health and filtered incident management that makes it easy to manage for any DBA.  We all know, a DBA is only as good as their last backup and this makes it easy to know you’re good…:)

Posted in Enterprise Manager, Oracle

June 23rd, 2015 by dbakevlar

The sales, support and technical teams were brought into the Denver Tech Center office to do some advanced training in Hybrid Cloud.  There were many take-aways from the days we spent in the office, (which is saying a lot-  most of you likely know how much I hate working anywhere but from home… :)) and I thought I would share a bit of this information with those that are aching for more details on this new and impressive offering from release 5.

If you’re new to Hybrid Cloud and want to know the high level info, please see my blog post on the topic.

Cloud Control Cloning Options

Cloud Control now includes a number of new options for database targets in EM12c.  These new drop down options include cloning, easing access to the new hybrid cloning.  Once you’ve logged into Cloud Control, go to Targets –> Databases and then choose a database you wish to implement cloning features for.  Right click on the target and the drop downs will take you to the cloning options under Oracle Database –> Cloning.

tm_creation

There will be the following choices from this drop down:

  • Clone to Oracle Cloud-  ability to directly clone to the cloud from an on-premise database.
  • Create Full Clone-  Full clone from an image copy or an RMAN backup.
  • Create Test Master- Create a read-only test master database from the source target.
  • Enable as a Test Master- Use the database target as a test master, which will render it read-only and it would rarely be an option for a production database.
  • Clone Management-  Manage existing cloning options.

Using a test master is essential for snap clones, which offer significant space savings and eliminate the time required for standard cloning processes.  The test master is in read only mode, so it will need to be refreshed or recreated with an up to date copy, (which will then be another option in the drop down, “Disable Test Master”) for new cloning procedures.

Snapshot Clone

For the example today, we’ll use the following production database:

tm1

We’ll use an existing test master database to perform our clone from:

tm4

We can right click on the database and choose to create a clone.  This is going to be created via a snap clone, so keep this in mind as we inspect the times and results of this process.

tm8

We choose to create a snapshot clone of the pluggable database.  Each snapshot clone is essentially just a copy of the file headers, with only the changed blocks consuming space on the read only or read-write clone.

Fill out the pertinent data for the clone: use the correct preferred credentials with SYSDBA privileges, name the new pluggable database and the name you’d like displayed in Cloud Control, then enter the PDB administration credentials, password and password confirmation.  Once that’s done, choose whether you’d like to clone it to a different container, (CDB) than the one the source resides in, and then add the database host and ASM credentials.

Once you click next, you have more advanced options to view and/or setup:

tm10

The advanced options allow you to change the sparse disk group to create the clone on, storage limits and database options.

tm11

You can then choose data masking to protect any sensitive data, but keep in mind, once you do so, you will no longer be using a snap clone due to the masking; still, the option to implement it is available at this step.  You can also set up any pre, post or SQL scripts that need to be run as part of the clone.  This could include resetting sequences, user passwords, etc.

The next step allows you to schedule the clone in the future or run immediately.

tm12

You can also choose the type of notifications, as this is simply an EM Job that is submitted to perform the cloning process.  Once you’ve reviewed the cloning steps chosen via the wizard, you can then submit.

tm13

Once the job’s been submitted, the submitter can monitor the job steps:

tm14

 

Success

Once the clone has completed, you can view each of the steps, including the time each took.

tm15

The source database was over 500 GB and was cloned in less than one minute!  You also will see the new cloned database in the targets list:

tm16

If curious, note that this is a fully cloned database that is on ASM, which you can view, just as you would for any other database.

Again, note the size and that this can be managed like any other database that you would have created via a DBCA template or through a standard creation process.

tm17

More to come soon and thanks to Oracle for letting us get our hands on the new 12.1.0.5 hybrid cloning!

Posted in Cloud, Enterprise Manager

June 18th, 2015 by dbakevlar

Last week’s release of 12.1.0.5 was a pleasant surprise for everyone out in the Oracle world.  This release hit the bulls-eye for another cloud target of Oracle’s, announcing Enterprise Manager 12c’s offering of single pane of glass management of the hybrid cloud.  The EM12c team has been trained on and testing out the new features of this release with great enthusiasm and I have to admit, pretty cool stuff, folks!

Why is the Hybrid Cloud Important?

Many companies are still a bit hesitant to embrace the cloud or due to sensitive data and security requirements, aren’t able to take advantage of cloud offerings for their production systems.  Possessing a powerful tool like Enterprise Manager to help guide them to the cloud could make all the difference-

hybcld1

 

You’re going to start hearing the EM folks use the term, “Single Pane of Glass” a lot in the upcoming months, as it’s part of the overall move, taking Enterprise Manager from the perception that EM is still a DBA tool and getting everyone to embrace the truth that EM12c has grown into an infrastructure tool.

What is the hybrid cloud?

With the baby-steps that many companies are taking toward the hybrid cloud, (vs. others that are jumping in, feet first! :)) a company can now uphold those requirements and maintain their production systems on-premise, while enforcing data masking and sub-setting so that sensitive data is never presented outside the production database, (including to the test master database that is used to track the changes in the snap clone copies…)  This then allows them, with Database as a Service, to clone development, test and QA environments to a less expensive cloud storage platform without exposing any sensitive data.

clone_datamask

Once the data masking or any other pre-clone data cleansing/subsetting is performed, the Test Master database is created and can be used to create as many snap clones as needed.  These snaps can be used for development, QA or testing.  The space savings continue to increase as snap clone copies are added, since each clone only consumes space for its changed blocks against the test master database.  This can add up to a 90% storage savings over traditional full database copies.

Hybrid Cloning

The power of hybrid cloning is the Hybrid Cloud Gateway, a secure SSH tunnel that allows seamless communication between on-premise systems and the cloud.

hybcld3

Types of Clones

There are four types of clones currently offered with Hybrid cloning-

  • On-premise source cloned to the cloud.
  • Cloud source, cloned to on-premise.
  • Cloud source cloned in the cloud.
  • Migrate from a schema in a database to a PDB in the cloud.

Simplicity is Key

The user interface is simple to use to create a clone or clones, save off templates, and build out a catalog for a self-service portal, and when cloning, the status dashboard is a great quick view of success on the cloning steps:

hybcld4

If deeper investigation of any single step needs to be performed, the logging is no different than inspecting an EM job log, (because an EM job is exactly what it is… :)):

hybcld5

I’ll be returning from Europe soon and hope to do more with the product, digging into this great new feature, but until then, here’s a great overview of 12.1.0.5’s brand new star!

 

Posted in Cloud, Enterprise Manager

May 18th, 2015 by dbakevlar

When you need information about one SQL_ID and don’t need everything and the kitchen sink, there are a few different ways to collect this via Oracle.  I’m surprised how rarely this is covered in performance tuning/optimization, (whatever the current “acceptable” term is for fixing a database when performance issues arise… 🙂)  classes, manuals and documentation.  I’d like to cover two of my favorites here and, yes, there are other options that I won’t be digging into, (SQLd360, SQLTXPLAIN, tracing, etc…)

The Workload Repository SQL Report

I also refer to this as the “SQL ID Specific AWR Report”.  Many of you may have come across this report via Cloud Control when you go into a unique SQL_ID in the console: upon looking at the details, you notice there are two options below the wait event activity, one to run an AWR SQL Report and one for an ASH report.  What many don’t realize is that this report is also available from the command line on the database host.

awrsq1

Report Location:  $ORACLE_HOME/rdbms/admin/awrsqrpt.sql

Report Format:  HTML from Cloud Control and the command line; TEXT is also an option from SQL*Plus.

Requirements:  Execute privileges on DBMS_WORKLOAD_REPOSITORY and privileges to run the report and store it on the host, or, if you run it via Cloud Control, you can just save off the HTML report to your desktop.

Licensing:  Yes, you do need the Diagnostics Pack for this feature.
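Running it from the command line is just a matter of calling the script from SQL*Plus on the database host; it prompts for the report format, the snapshot range, the SQL_ID and a report name:

-- Run from SQL*Plus with sufficient privileges; follow the prompts
@?/rdbms/admin/awrsqrpt.sql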

What’s so great about this report?

This report, like “Search SQL”, which we’ll go over soon enough, allows you to go through all of the AWR history you possess and lets you know the minimum and maximum runs, the execution plans for each plan hash value that exists, along with elapsed time and other pertinent statistical information about the SQL ID in question.

Notice that the version of the database, that it’s a RAC instance, along with which node in the RAC environment the statement was run in, is quickly stated, so if anyone questions this, you can quickly locate this information at the top of the report.

awrsq2

The report will display the SQL_ID in question, and identify whether there was more than one plan hash value, numbering each PHV, (plan hash value) along with its plan hash value ID.  This makes it easy to know which execution plan and events belong to which, if there is more than one.  For our example, we only have one here.

awrsq3

Remember that the elapsed time shown is the total for the snapshot range involved; you need to look at the per execution figures to see how long each run actually required to complete.  These statistics are very helpful when identifying the wait event(s) that are involved in the performance challenge.
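If you want the same per execution view across your whole AWR history without generating the report, a sketch against DBA_HIST_SQLSTAT, (Diagnostics Pack again, and &sql_id is just a placeholder) would be:

-- Elapsed seconds per execution for one SQL_ID across AWR snapshots
select s.snap_id,
       q.plan_hash_value,
       q.executions_delta,
       round(q.elapsed_time_delta / nullif(q.executions_delta, 0) / 1e6, 3) as elapsed_per_exec_secs
  from dba_hist_sqlstat  q,
       dba_hist_snapshot s
 where q.sql_id          = '&sql_id'
   and s.snap_id         = q.snap_id
   and s.dbid            = q.dbid
   and s.instance_number = q.instance_number
 order by s.snap_id;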

awrsq4

The plan is nicely formatted and you can step into the time used for each step in the process.  I wish it defaulted with temp usage, but alas, that is not to be in the HTML report.  This is the only area of the report where I prefer the HTML report over the text, as with this report, there is considerable size difference for storing the report and I hate giving up any data…. 🙂

If you’ve had an outage during your snapshot timeline, then this report, just like AWR, is dependent upon uptime and will notify you of the issue, requiring you to pick snapshots that have no downtime involved.

Search SQL

Search SQL has been around since EM10g, but it’s one of the least known features.  I like it as it gives a clean, high level view of performance over a large period of time by snapshot and then you can dig down to do research from there.

Report:  None really from the command line-  it’s a compilation of report features.

Report Format:  None-  this is via the console, but some of the reports can be saved off, (such as ADDM and other reports…)  you can also copy and paste or screenshot pages.

Licensing:  Due to the way we are using it here, both the diagnostics and the tuning pack will be required.

Using Search SQL

Using our same run as we used above, we can identify a SQL_ID that is causing us concern and we want to investigate further-

awrss1

Copy the SQL_ID so you can paste it in the proper field in the Search SQL console.  You’re now going to leave the Top Activity, (or ASH Analytics, whichever you are using, both work great for this type of initial identification…)

awrss2

Each Search SQL console is database specific, so click on the Performance drop down, SQL and then Search SQL for EM12c, (EM10g and EM11g, this is directly under the Performance menu….)

awrss3

Update the following sections:

  • Add AWR Snapshots and Baselines.
  • Choose All snapshots to verify that you are seeing the entire performance data contained in your AWR.  If your database is set with the default time for retaining AWR, this is 8 days for 11.2 and above.
  • Enter in or paste the SQL ID into the SQL ID value section.
  • Click on Search, (don’t click on Show SQL, as I know this is a bit misleading, but it will just show you the SQL statement to be run by Cloud Control…)

awrss4

Four tabs will be shown; don’t be surprised if you don’t notice when the search completes, as the cursor cache often only returns one row:

  • Cursor Cache, (in the cache currently)
  • AWR Snapshot data, (where we’ll spend most of our time…)
  • AWR Baselines, (good data to have to back up changes in plan hash values if you have your database set to update automatically or if someone’s put in a plan from a baseline run.)
  • SQL Tuning Set if you’ve chosen to examine those, too.

You CAN examine the existing run, and looking at it this way really shows you how it’s currently running vs. how it has run, but if you’re just looking for when it really ran well, then skip the cursor section and jump to the AWR snapshots to see this SQL ID for ALL snapshots in the AWR-

awrss5

You’re seeing all of this from the largest elapsed time to the least, descending, so although we’re able to view any patterns in how the SQL may be degrading over time, note if there are plan hash value changes, (which in our example, there are, and we didn’t see that for this statement in our AWR SQL Report) and click on the “Next 25” link at the bottom to get to the lesser elapsed run times.

awrss6

There are some runs with 0.00 on the run times.  You can investigate these, but we’d find that they hadn’t completed or no rows were returned.  What is more important are the runs where we see overall run times that are efficient in the elapsed time and that rows were returned.  I’ve marked them in our report.

awrss7

Upon closer inspection, you’ll also notice that they all belong to the same plan hash value except for one run.  This is a good indicator that this plan hash value should be identified as the best overall for this query.
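If you prefer to stay in SQL*Plus, a similar high-level comparison can be pulled straight from the AWR.  This is only a minimal sketch, assuming you substitute your own SQL_ID, (and on the AWR Warehouse, you’d add a DBID predicate as well):

SELECT plan_hash_value,
       COUNT(*) snaps,
       ROUND(SUM(elapsed_time_delta)/NULLIF(SUM(executions_delta),0)/1e6,3) avg_elapsed_secs,
       SUM(executions_delta) execs,
       SUM(rows_processed_delta) total_rows
  FROM dba_hist_sqlstat
 WHERE sql_id = '&&sql_id.'
 GROUP BY plan_hash_value
 ORDER BY avg_elapsed_secs;

Runs that returned rows with a low average elapsed time per execution point you at the same “best” plan hash value that the console view does.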

Now as we all know, it’s not just about the one query.  The database could impact the run as well.

  • What resource limitations were in play?
  • Were there any concurrency issues?
  • Were there any other factors that affect overall performance?

On the right, you’ll notice that the snapshots are links.  You can use these to inspect the overall performance of the database during the snapshot interval in which the execution occurred.

awrss9

Here’s all that high level statistics info at the database level.  You also have the opportunity to run an ADDM report from here and see that “all that’s important” view of your database.

awrss8

The second tab offers you the workload repository report for this database, including a summary that tells you whether anything was going on in the database at that time that you should be aware of.

You can go back to the Search SQL report and then look at the individual executions listed, too.

awrss12

This will take you to the SQL Details page for whichever one you click on-

awrss10

Just as with any SQL Details page, you can see the activity, the plan, plan control, etc.  The example above is for an execution from the day before, so this is an easy way to get to it vs. searching historically through Top Activity.

Baselines

Now why did we choose Baselines as part of our search?  Baselines, especially when they are automatically implemented, can impact performance at times, and knowing this can quickly rule a baseline in or out as the culprit.

awrss11

We can see which plan hash value is attributed to which baseline and the overall elapsed time, and you can click on the Baseline Name link to see when and whether it was implemented, to diagnose if it’s part of your problem.

Hopefully this gives you some more tools for your arsenal when that boss or peer is standing at your desk and demanding an answer to a challenge-  what more can we ask for but answers! 🙂

Posted in ASH and AWR, EM12c Performance, Oracle

May 4th, 2015 by dbakevlar

There may be a reason that one needs to re-add a database to the AWR Warehouse.  This is a new opportunity for me to learn and offer assistance, but I’m working off the grid to figure out a solution.  Luckily, it’s just a matter of creating a new process from pre-existing code and processes.

The process requires both the AWR Warehouse UI and a SQL*Plus connection.

1.  Disable the snapshots

2.  Extract the AWR for the Database

3.  Drop the database from the AWR Warehouse

4.  Add the database back into the AWR Warehouse with the retention time or any other configuration change.

5.  Update the configuration table in the source database to show the correct snapshots.

Disable Snapshot Uploads

Log into the AWR Warehouse console and locate the database that needs to be re-added:

awrw_rbld1

From the AWR Warehouse UI, click on the database and then choose to disable the snapshot uploads for the database in question.

awrw_rbld2

 

Extract AWR Data for the Database

From the AWR Warehouse host, log into the AWR Warehouse repository database and extract the AWR data.  This uses the formal AWR extract script, awrextr.sql, found in the $ORACLE_HOME/rdbms/admin directory, and yes, it works with the AWR Warehouse.

SQL> @$ORACLE_HOME/rdbms/admin/awrextr.sql
~~~~~~~~~~~~~
AWR EXTRACT
~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~ This script will extract the AWR data for a range of snapshots ~
~ into a dump file. The script will prompt users for the ~
~ following information: ~
~ (1) database id ~
~ (2) snapshot range to extract ~
~ (3) name of directory object ~
~ (4) name of dump file ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A list of databases will be displayed that you can choose from.  In the AWR Warehouse, all the source databases will be shown:

Databases in this Workload Repository schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DB Id DB Name Host
------------ ------------ ------------
* 1027807260 AWRWHSE AWRWHSE_HOST
9352681 APPSB1 HOST1
4212609618 AASDB  HOST2
4212608695 AASDB HOST3
so on and so forth....

Enter in the DBID for the database from the list.  We’ll use the first AASDB from the list for our example:

Enter value for dbid: 4212607231
Using 4212607231 for Database ID
Specify the number of days of snapshots to choose from
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed. Pressing <return> without
specifying a number lists all completed snapshots.
Enter value for num_days:
Listing all Completed Snapshots

As I didn’t enter a number for the number of days, you can see I want ALL of the AWR data for this database.  Now the extract script will ask for snapshot information.  Again, we want all of it, and the script will list all of the snapshots.  If there are so many that you can’t see the first one, use the AWR Warehouse UI to see the first and last snapshot for this step, (or query the warehouse directly, as sketched below.)
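A minimal query against the warehouse repository returns the same first and last snapshot IDs, assuming you plug in the DBID you chose above:

SELECT MIN(snap_id) begin_snap, MAX(snap_id) end_snap
  FROM dba_hist_snapshot
 WHERE dbid = &&dbid.;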

Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 1777
Begin Snapshot Id specified: 1777

Enter value for end_snap: 2303
End Snapshot Id specified: 2303
Specify the Directory Name
~~~~~~~~~~~~~~~~~~~~~~~~~~
Directory Name Directory Path
------------------------------ -----------------------------------
DATA_PUMP_DIR /u01/admin/AWRWHSE/dpdump/
OPATCH_INST_DIR /u01/home/oracle/12.1.0.2/OPatch
OPATCH_LOG_DIR /u01/home/oracle/12.1.0.2/QOpatch
OPATCH_SCRIPT_DIR /u01/home/oracle/12.1.0.2/QOpatch

Choose a Directory Name from the above list (case-sensitive).
Enter value for directory_name: DATA_PUMP_DIR
Using the dump directory: DATA_PUMP_DIR
Specify the Name of the Extract Dump File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The prefix for the default dump file name is awrdat_1777_2303.
To use this name, press <return> to continue, otherwise enter
an alternative.
Enter value for file_name: awrw_AASDB
Using the dump file prefix: awrw_AASDB
End of AWR Extract
SQL> !ls -la /u01/admin/AWRWHSE/dpdump/awrw_AASDB.dmp
-rw-r----- 1 oracle oinstall 268779520 May 4 16:49 /u01/admin/AWRWHSE/dpdump/awrw_AASDB.dmp

Now that the extract is completed and the data is safe, it’s time to return to the AWR Warehouse console.

 

Remove the Database from the AWR Warehouse

Click on the database in the AWR Warehouse and choose to remove it from the repository.

awrw_rbld3

 

Once the database has been removed, go through the proper steps to add it again to the AWR Warehouse.  Once added, disable the upload of snapshots.  This prevents concurrency and other issues until we’re finished.

Reload the AWR Data

Log back into the AWR Warehouse repository database; it’s time to load the data back into the AWR Warehouse.

SQL> @$ORACLE_HOME/rdbms/admin/awrload.sql

Specify the Name of the Dump File to Load
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Please specify the prefix of the dump file (.dmp) to load: /u01/admin/AWRWHSE/dpdump/awrw_AASDB.dmp
Pressing <return> will result in the recommended default
tablespace (identified by *) being used.

Enter value for default_tablespace: MGMT_TABLESPACE

As the space will be used at some point anyway, (the MGMT_TABLESPACE is expected to grow) there is no concern here, and the temporary user utilized for the process will be dropped at the end.

Choose to use the standard TEMP tablespace and proceed past this step.  As the AWR objects are already built in the MGMT_TABLESPACE, there is no need to remap from SYSAUX as the process on a standard database would require.

Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "AWR_STAGE"."WRH$_SQL_PLAN"                 89.485 MB   329981 rows
. . imported "AWR_STAGE"."WRH$_SQLTEXT"                  192.34 MB    398276 rows
. . imported "AWR_STAGE"."WRH$_SYSMETRIC_SUMMARY"        232.12 KB    1129472 rows
.......
........
	Append Data for "AWR_STAGE".WRH$_SGASTAT.
INSERT /*+ APPEND */ INTO SYS.WRH$_SGASTAT (SNAP_ID, DBID, INSTANCE_NUMBER, NAME, POOL, BYTES) SELECT SNAP_ID,
3228342000, INSTANCE_NUMBER, NAME, POOL, BYTES FROM "AWR_STAGE".WRH$_SGASTAT WHERE DBID = :c_dbid

Update Metadata Table

Now that we’ve loaded the data back in, we need to update the metadata table to show that more than the initial data exists.

select * from dbsnmp.caw_load_metadata
where dbid=<dbid for database in question>;
aasdb3 oracle_database 4212607231 1777 1778
27-APR-15 27-APR-15 3 1

Update the table to reflect the new data; there should be at most one upload that’s been sent over.  If there is more than one, you’ll need to change the update statement to reflect only the row with the max(end_snap_id), as the AWR Warehouse loads from the lowest snap ID to the most recent.

select * from dbsnmp.caw_load_metadata
where dbid=4212607231;

update dbsnmp.caw_load_metadata
set end_snap_id=2303
where dbid=4212607231
and dump_id=<most recent dump_id>;
commit;
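If more than one row came over, something like this, (a sketch using only the columns already shown above) identifies the dump_id of the most recent load to update:

select dump_id, end_snap_id
from dbsnmp.caw_load_metadata
where dbid=4212607231
order by end_snap_id desc;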

If there isn’t any data at all, then force one upload and then edit it.  This is easily done from the Console:

awrw_rbld4

 

Proceed back and update the row in the metadata table, then check the UI to verify that the beginning and end snapshots look correct.  You should now have all the AWR data loaded back into the AWR Warehouse!  And that’s how to reload AWR data from a database that must be removed and re-added to the AWR Warehouse.

This is a new process for me, so please, let me know if you note anything amiss and I’ll be glad to research it!

 

Posted in AWR Warehouse, Database, Oracle

April 28th, 2015 by dbakevlar

When sizing the AWR Warehouse, one of my requirements is to have certain reports for a sampling of the databases that will source the AWR Warehouse.  This report gives me the right information to build correct sizing requirements instead of relying on assumptions.  The report is the General AWR Information report, (awrinfo.sql) found in the $ORACLE_HOME/rdbms/admin directory.  It provides the following information:

  • Time of Report
  • AWR retention and interval settings, including whether non-default settings are in use.
  • Schema percentage of space used in the SYSAUX tablespace.
  • Objects and space consumption in the SYSAUX tablespace, including breakdown of the AWR space usage by type.
  • Automated tasks scheduled.
  • ASH Usage Info
  • Foreground vs. Background processing info.

I’ve now run into a few projects where questions were raised about why so much space was required.  Sometimes there is an issue with the data being retained in the AWR that must be identified and addressed before the data is brought over from the source database, (target) via the ETL, because it will impact the long-term size demands on the warehouse.  Knowing how to identify this is crucial, but many folks aren’t looking at AWR space usage regularly, so how would you know?  Well, that’s what this post is about, and hopefully it will save me time with how much is on my plate these days… 🙂

The General AWR Information Report

To run the AWR Info report, log into the host of the database in question as a user that has rights to AWR reports.  From the directory where you would like the report housed, run the following:

SQL> @$ORACLE_HOME/rdbms/admin/awrinfo.sql

Once the report is generated, simply open it in a text editor on the host or FTP it to your workstation.

We’ll start with what a standard AWR Info report with “normal” space usage looks like and use it as a baseline.  This will help you, as a DBA, understand what is common for AWR space consumption.

Our standard retention and interval is 8 days with 60-minute snapshot intervals, and the space usage for the SYSAUX tablespace looks like this:

*****************************************************
(1a) SYSAUX usage - Schema breakdown (dba_segments)
*****************************************************
| 
| Total SYSAUX size 2,993.9 MB ( 74% of 4,046.0 MB MAX with AUTOEXTEND OFF ) 
| 
| Schema SYS occupies 2,750.9 MB ( 91.9% ) 
| Schema XDB occupies 67.3 MB ( 2.2% ) 
| Schema AUDSYS occupies 65.4 MB ( 2.2% ) 
| Schema MDSYS occupies 61.7 MB ( 2.1% ) 
| Schema ORDDATA occupies 16.1 MB ( 0.5% ) 
| Schema SYSTEM occupies 15.7 MB ( 0.5% ) 
| Schema WMSYS occupies 7.1 MB ( 0.2% ) 
| Schema EXFSYS occupies 3.7 MB ( 0.1% ) 
| Schema CTXSYS occupies 3.7 MB ( 0.1% )

Non Default AWR Settings

If the settings have been changed from the default, the AWR info report will display the settings and let you know they aren’t the default. Space consumption will change vs. what you see in our first example, too.

Warning: Non Default AWR Setting!
——————————————————————————–
Snapshot interval is 30 minutes and Retention is 8 days    (5.2 GB, vs. 3GB for a 60 min. interval.)

OR

Snapshot interval is 60 minutes and Retention is 42 days  (10.5 GB)

With an increase in interval frequency or retention, an increase in space consumption will result, but it won’t be 1:1.  There are two features that impact the space required:  rollup, which saves space, and partitioning, which requires a bit more, so expect approximately a 70% increase on average when the interval is changed to every 30 minutes.

When an increase in AWR retention is implemented, you should calculate about 2.5GB of data for each week of AWR data retained at a 1-hour snapshot interval, (this depends on the version of the database, too.  AWR in 10g is much smaller than 11g, which is in turn smaller than 12c…)
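To see, (or adjust) the retention and interval that drive these numbers, the standard AWR views and package apply on both the source databases and the warehouse.  A minimal sketch, with the values below intended only as an example of the 8-day, 60-minute default:

SELECT dbid, snap_interval, retention
  FROM dba_hist_wr_control;

BEGIN
  -- retention and interval are both expressed in minutes (8 days = 11520)
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    retention => 11520,
    interval  => 60);
END;
/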

AWR Components

We also need to look at the consumption used by a standard AWR schema breakdown to understand WHAT components are using the space:

**********************************
(3a) Space usage by AWR components (per database)
**********************************
COMPONENT MB % AWR KB_PER_SNAP MB_PER_DAY MB_PER_WEEK TABLE% : INDEX% 
--------- --------- ------ ------------ ---------- ----------- ---------------- 
FIXED 1,559.1 63.3 7,750 181.6 1,271.5 45% : 55% 
EVENTS 489.9 19.9 2,435 57.1 399.5 40% : 60% 
SQL 238.3 9.7 1,184 27.8 194.3 64% : 36% 
SPACE 111.1 4.5 552 12.9 90.6 63% : 37% 
ASH 35.3 1.4 175 4.1 28.7 83% : 17% 
SQLPLAN 11.0 0.4 55 1.3 9.0 64% : 36% 
SQLTEXT 0.9 0.0 5 0.1 0.8 87% : 13% 
SQLBIND 0.6 0.0 3 0.1 0.5 50% : 50% 
RAC 0.6 0.0 3 0.1 0.5 50% : 50%

Note that fixed objects are at the top of the list, followed by events, SQL, space and then ASH.  This greatest-to-least ordering is what you should commonly see.

Now let’s look at an AWR Info report where the space consumption is experiencing an issue:

********************************** 
(3a) Space usage by AWR components (per database) 
********************************** 
 
COMPONENT MB % AWR KB_PER_SNAP MB_PER_DAY MB_PER_WEEK TABLE% : INDEX% 
--------- --------- ------ ------------ ---------- ----------- ---------------- 
ASH 2,410.3 42.5 1,494 70.0 490.2 89% : 11% 
FIXED 2,149.7 37.9 1,332 62.5 437.2 48% : 52% 
EVENTS 489.7 8.6 304 14.2 99.6 43% : 57% 
SPACE 224.4 4.0 139 6.5 45.6 58% : 42% 
SQL 160.6 2.8 100 4.7 32.7 55% : 45% 
SQLPLAN 82.0 1.4 51 2.4 16.7 67% : 33% 
RAC 58.3 1.0 36 1.7 11.8 70% : 30% 
SQLTEXT 7.3 0.1 5 0.2 1.5 96% : 4% 
SQLBIND 6.0 0.1 4 0.2 1.2 33% : 67%

Note that the ASH data is now the first component listed and its size is considerably larger than FIXED, EVENTS, etc.  There are numerous reasons for this to occur, so we’ll investigate what could have caused the increase in space consumption; over time, the extended retention in the AWR Warehouse will consume more space on the destination side, increasing the storage requirements for the AWR Warehouse.

Why Is ASH Using So Much?

First, we’ll compare the minimum and maximum snap_ids in dba_hist_snapshot against those in the underlying ASH table in the AWR:

select min(snap_id),MAX(snap_id) from dba_hist_snapshot;

MIN(SNAP_ID) MAX(SNAP_ID)
------------ ------------
       15027        15189

select min(snap_id),MAX(snap_id) from WRH$_ACTIVE_SESSION_HISTORY;

MIN(SNAP_ID) MAX(SNAP_ID)
------------ ------------
           1        15189

As you can see, the AWR contains ASH data from the first snap_id when the dba_hist snapshot shows that only data from 15027 on should exist.

We’ll next check for orphaned rows of ASH data in the AWR:

SELECT COUNT(*) FROM wrh$_active_session_history a
  WHERE NOT EXISTS
  (SELECT 1
  FROM wrm$_snapshot
  WHERE snap_id       = a.snap_id
  AND dbid            = a.dbid
  AND instance_number = a.instance_number
  );

If orphaned rows exist, follow the steps from Oracle in Manually Purge the Optimizer Statistics & AWR Snaphots to Reduce Space Usage of SYSAUX Tablespace (Doc ID 1965061.1) to split the partitions and purge the data manually from the AWR on the SOURCE DATABASE, (target) so it is addressed before the ETL extracts and loads the data into the AWR Warehouse.
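Before heading down the manual partition-split route, it can be worth seeing what the documented purge API will remove for the affected snapshot range; orphaned ASH rows frequently survive it, which is exactly why the MOS note exists.  A minimal sketch only, with the range taken from the example above and the DBID as a placeholder:

BEGIN
  DBMS_WORKLOAD_REPOSITORY.DROP_SNAPSHOT_RANGE(
    low_snap_id  => 1,
    high_snap_id => 15026,
    dbid         => &&dbid.);
END;
/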

ASH Parameters

The next reason for extensive ASH data in the AWR could be a change to the parameters controlling how ASH data is written to the AWR.  I’ve only recently heard that some shops are doing this as a way to “audit” the SQL happening in their databases.  I have to admit, I would prefer to see DBAs use auditing features rather than ASH samples to track this, but it is happening, and they should expect the following:

1.  Extensive space usage by the AWR

2.  Inaccurate results in ASH and AWR reports, as Oracle expects only 1 in 10 samples to exist in the AWR; having 10 of 10 will skew the results.

The parameters controlling this feature are underscore parameters and should only be changed under the guidance of Oracle.

_ash_sampling_interval = 100  The interval at which ASH samples; lowering it from the default of 1000 causes samples to be created more often.

_ash_sample_all = TRUE  TRUE results in samples being created for even inactive sessions, increasing the amount of ASH data by 10X or more.

_ash_disk_filter_ratio = 1  Would result in ASH writing every sample to the AWR instead of 1 in 10.
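If you need to confirm whether any of these have been changed, they won’t appear in V$PARAMETER unless explicitly set, but as SYS you can check the fixed tables.  A sketch only, since querying X$ structures is a SYS-level, look-don’t-touch exercise:

SELECT p.ksppinm  parameter_name,
       v.ksppstvl current_value,
       v.ksppstdf is_default
  FROM x$ksppi  p,
       x$ksppsv v
 WHERE p.indx = v.indx
   AND p.ksppinm IN ('_ash_sampling_interval',
                     '_ash_sample_all',
                     '_ash_disk_filter_ratio');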

Once you’ve addressed any issues in the storage of the AWR and loaded all snapshots to your new AWR Warehouse, also remember to “dial down” the retention in the source database to the default of 8 days, (or something close) and shrink the SYSAUX tablespace to give the space back to the database, as it no longer needs the space it once consumed.

The AWR Warehouse does require considerable storage for the performance data housed within the Automatic Workload Repository, but with the right process to inspect what is kept in your AWR before building out your environment, you can avoid having to allocate more storage than you really need to.

 

 

Posted in ASH and AWR, AWR Warehouse, Oracle

April 21st, 2015 by dbakevlar

A common issue I’ve noted is dump files generated for the AWR Warehouse that, upon a failure to transfer, simply sit there and never upload, leaving the data stuck in a “limbo” state between the source database, (target) and the AWR Warehouse.  This can be a very difficult issue to troubleshoot, as no errors are seen in the AWR Warehouse “View Errors” screen and no data from the source is present in the AWR Warehouse.

awrw_nw2

Empowered by EM Jobs

If you go to Enterprise –> Jobs –> Activity and search for %CAW%, (the jobs that perform the extraction, transfer and load) you can inspect the Log Report for the run that experienced a problem, view the error and dig into the details of the issue.

awr_nw4

 

Double click on the job in question; note in the example above that the jobs have the naming convention CAW_RUN_ETL_NOW.  This is due to a forced run via the AWR Warehouse console, (Actions –> Upload Snapshots Now.)  If the job was a standard ETL run, the naming convention will be CAW_TRANSFER.

The job is a multi-step process, so you will see where the extraction step failed.  This is a bit misleading, as the failure that set all of this into motion was a preferred credentials issue that stopped the transfer step to the AWR Warehouse.  If you look far enough back in the jobs, you’ll find the original error, but now we are stuck in a loop:  even once the credentials are fixed, the source can’t go forward, and it’s difficult for someone unfamiliar with the process to know where it all went wrong.  The first recommendation is often to remove the database and re-add it, but in this scenario, we are going to kickstart the process now that the credentials have been fixed.

Digging into a Job Error

awrw_nw5

 

 

As we stated earlier, you’ll see in the steps that the extract failed, but the transfer would have succeeded if a file had been created.  Double click on “Failed” to see the error that occurred:

awrw_nw4

 

In the SQL output, the error states, ORA-20137:  NO NEW SNAPSHOTS TO EXTRACT

Now we know that no files have been uploaded to the AWR Warehouse, yet the logic in the AWR Warehouse package in the DBSNMP schema on the source database thinks it has already pulled all of these snapshots into an extract and created a dumpfile.  The error is very clear in telling the user what is going on:  there is data telling the ETL process that the snapshots have ALREADY been extracted.

Disclaimer:  This solution we are about to undertake is for a BRAND NEW ADDITION TO THE AWR WAREHOUSE ONLY.  You wouldn’t want to perform this on a source database that had been loading properly and then stopped after successful uploads to the warehouse, (if you have one of those, I would want to proceed differently, so please keep this in mind before you attempt this in your own environment….) This fix is also dependent upon all preferred credential issues being resolved BEFOREHAND.

The Kickstart

To “kickstart” the process after a failure, first, verify that there are no errors that aren’t displaying in the console:

select * from dbsnmp.caw_extract_metadata;

Next, gather the location for the dumpfiles:

select * from dbsnmp.caw_extract_properties;

There will be one line in this table.  It will include the oldest snap ID, the newest snap ID and the location of the dumpfile, (often the agent home unless otherwise configured.)  This is the table the package logic uses to verify what has already been extracted.  We now need to remove this tracking information and the pre-existing dumpfiles created by the previous failed processes:

  1. Make a copy of the dbsnmp.caw_extract_properties table, (see the SQL sketch after this list.)
  2. Truncate the dbsnmp.caw_extract_properties table.
  3. Delete the extract, (dump) files from the directory shown in the caw_extract_properties table.  Don’t remove anything else from that directory!
  4. Log into the AWR Warehouse console.
  5. Click on the source database you just scoured of the ETL extract files.
  6. Highlight the database, click on Actions, Click on “Upload Snapshots Now.”
  7. View the job via the link displayed at the top of the console and monitor to completion.
  8. Once the job has succeeded completely, remove the dbsnmp.caw_extract_prop2 backup table, along with any copies you kept of the dump files created by the earlier failed extracts.
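A minimal sketch of the SQL behind steps 1 and 2, using the backup table name from the original step:

-- Step 1: preserve the current tracking row
create table dbsnmp.caw_extract_prop2 as
  select * from dbsnmp.caw_extract_properties;

-- Step 2: clear the tracking information so the next ETL run extracts from scratch
truncate table dbsnmp.caw_extract_properties;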

You should now see successful upload jobs from this point on, along with data in your AWR Warehouse:

awrw_nw6

Posted in AWR Warehouse, Enterprise Manager

April 9th, 2015 by dbakevlar

Here’s a quick guide to my schedule of sessions at IOUG Collaborate 2015 for this year.  I’m looking forward to seeing everyone next week, (I’ll be arriving on Saturday, so I really should say in a couple days) and have no doubt we’re up for another phenomenal conference at Las Vegas, this year at the Mandalay!

Additionally, there are some great opportunities in professional development at the IOUG Strategic Leadership Program.  I’ll be speaking with James Lui on how to improve your value with personal brands and social media.

I’m the WIT luncheon speaker on Monday, so for those of you that signed up for this great, yearly event for women in tech before it sold out, I’m looking forward to meeting as many of the powerful women in our industry, (and hopefully a few men, too!) before and after the event.

There are also a few SIG’s that I’ll be attending that aren’t on the schedule:

Tue, 12:30 p.m. – 1:00 p.m.:  Cloud Computing and Virtualization SIG Meeting (ID: 943)

Wed, 12:30 p.m. – 1:00 p.m.:  Oracle Enterprise Manager SIG Meeting (ID: 949)

Each session below is listed with its ID and title, followed by the date and time, product line(s), session track and room:

  • 976 – Everything I Needed to Know About Enterprise Manager I Learned at COLLABORATE – Hands-on Lab
    Sun. Apr. 12, 9:00 am – 1:00 pm | Oracle Enterprise Manager | Manageability | Palm D
  • 0 – Zero to Manageability in One Hour: Build a Solid Foundation for Oracle Enterprise Manager 12c
    Mon. Apr. 13, 9:15 – 10:15 am | Oracle Cloud – DBaaS/PaaS, Oracle Database, Oracle Enterprise Manager | Manageability | Banyan B
  • 112 – The Power of the AWR Warehouse
    Mon. Apr. 13, 3:15 – 4:15 pm | Oracle Enterprise Manager | Manageability | Banyan B
  • 967 – IOUG Strategic Leadership Program: Staying on Top of Key Trends and Growing Your Personal Brand with Social Media
    Mon. Apr. 13, 4:30 – 5:30 pm | Applicable to All | Professional Development | Banyan E
  • 1003 – OakTable: TED-Style Lightning Talks (2:00-3:00pm)
    Wed. Apr. 15, 2:00 – 3:00 pm | Applicable to All | Database | Mandalay K
  • 986 – Twitter 101 for Oracle Professionals
    Mon. Apr. 13, 12:45 – 1:45 pm | Applicable to All, Professional Development | Banyan D

Thanks to everyone at IOUG for the opportunity to be so involved with this great conference and see everyone soon!

Posted in Database, DBA Life, Enterprise Manager

March 31st, 2015 by dbakevlar

Mauro decided to give me another challenge-  run SQLd360 against the AWR Warehouse and let him know how it does straight “out of the box”.  It’s a simpler installation process than SQLTXPLAIN; you simply unzip and run, (no installation of any packages…)

I chose a SQL_ID from one of the source databases loaded and ran it for a 31 day history:

SQL> @sqld360.sql 8ymbm9h6ndphq T 31

I saw a few errors scroll by on the screen as the process to collect the data and zip it up ran.

There isn’t the same kind of log file from successful runs to view the parameters, but the SQL is very straightforward and sits right in the sql directory, so it’s easy enough to see where the DBID is populated as part of sqld360_0b_pre.sql:

-- get dbid
COL sqld360_dbid NEW_V sqld360_dbid;
SELECT TRIM(TO_CHAR(dbid)) sqld360_dbid FROM v$database;

As this is pulled from the V$ objects, it’s going to populate incorrectly for the AWR Warehouse, which requires a query of the AWR instead, preferably DBA_HIST_DATABASE_INSTANCE, for this information.

SQL> desc dba_hist_database_instance;
 Name Null? Type
 ----------------------------------------- -------- --------------
 DBID NOT NULL NUMBER
 INSTANCE_NUMBER NOT NULL NUMBER
 STARTUP_TIME NOT NULL TIMESTAMP(3)
 PARALLEL NOT NULL VARCHAR2(3)
 VERSION NOT NULL VARCHAR2(17)
 DB_NAME VARCHAR2(9)
 INSTANCE_NAME VARCHAR2(16)
 HOST_NAME VARCHAR2(64)
 LAST_ASH_SAMPLE_ID NOT NULL NUMBER
 PLATFORM_NAME VARCHAR2(101)
 CON_ID NUMBER
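A hedged rewrite of that DBID lookup for warehouse use might look like the following, assuming the source database name is supplied as a substitution variable, (db_name here is my placeholder, not part of the shipped script):

COL sqld360_dbid NEW_V sqld360_dbid;
SELECT TRIM(TO_CHAR(MAX(dbid))) sqld360_dbid
  FROM dba_hist_database_instance
 WHERE db_name = '&&db_name.';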

Then the pre run script also queries the plan_table for the SQL_ID:

-- check if SQLD360 is getting called by EDB360
COL from_edb360 NEW_V from_edb360;
SELECT CASE WHEN count(*) > 0 THEN '--' END from_edb360
 FROM plan_table
 WHERE statement_id = 'SQLD360_SQLID' -- SQL IDs list flag
 AND operation = '&&sqld360_sqlid.'
 AND rownum = 1;

We would need it to call a DBA_HIST equivalent of the plan table if we wanted this data, but it does have a flag so that it can continue on as a standalone run, so it’s not a game stopper.

We then get tripped up again when it looks for the instance number:

-- get instance number
COL connect_instance_number NEW_V connect_instance_number;
SELECT TO_CHAR(instance_number) connect_instance_number FROM v$instance;

Again, with the AWR Warehouse, we need to call upon the DBA_HIST_DATABASE_INSTANCE view after adding in a where clause for the dbname.

SELECT TO_CHAR(instance_number)
from dba_hist_database_instance where db_name=p_dbname;

As with SQLTXPLAIN, we’re also tripped up on the database_name_short, host_name_short and db_version lookups:

-- get database name (up to 10, stop before first '.', no special characters)
COL database_name_short NEW_V database_name_short FOR A10;
SELECT LOWER(SUBSTR(SYS_CONTEXT('USERENV', 'DB_NAME'), 1, 10)) database_name_short FROM DUAL;
SELECT SUBSTR('&&database_name_short.', 1, INSTR('&&database_name_short..', '.') - 1) database_name_short FROM DUAL;
SELECT TRANSLATE('&&database_name_short.',
'abcdefghijklmnopqrstuvwxyz0123456789-_ ''`~!@#$%&*()=+[]{}\|;:",.<>/?'||CHR(0)||CHR(9)||CHR(10)||CHR(13)||CHR(38),
'abcdefghijklmnopqrstuvwxyz0123456789-_') database_name_short FROM DUAL;
-- get host name (up to 30, stop before first '.', no special characters)
COL host_name_short NEW_V host_name_short FOR A30;
SELECT LOWER(SUBSTR(SYS_CONTEXT('USERENV', 'SERVER_HOST'), 1, 30)) host_name_short FROM DUAL;
SELECT SUBSTR('&&host_name_short.', 1, INSTR('&&host_name_short..', '.') - 1) host_name_short FROM DUAL;
SELECT TRANSLATE('&&host_name_short.',
'abcdefghijklmnopqrstuvwxyz0123456789-_ ''`~!@#$%&*()=+[]{}\|;:",.<>/?'||CHR(0)||CHR(9)||CHR(10)||CHR(13)||CHR(38),
'abcdefghijklmnopqrstuvwxyz0123456789-_') host_name_short FROM DUAL;
-- get rdbms version
COL db_version NEW_V db_version;
SELECT version db_version FROM v$instance;
DEF skip_10g = '';
COL skip_10g NEW_V skip_10g;
SELECT '--' skip_10g FROM v$instance WHERE version LIKE '10%';
DEF skip_11r1 = '';
COL skip_11r1 NEW_V skip_11r1;

You can understand how all of this is going to return the information on the AWR Warehouse repository, (actual database it’s run on) instead of the source database that the SQL_ID is from.

CPU and RAC Info

We have the same issue when it goes to gather the information on CPU and RAC-

-- get average number of CPUs
COL avg_cpu_count NEW_V avg_cpu_count FOR A3;
SELECT ROUND(AVG(TO_NUMBER(value))) avg_cpu_count FROM gv$system_parameter2 WHERE name = 'cpu_count';
-- get total number of CPUs
COL sum_cpu_count NEW_V sum_cpu_count FOR A3;
SELECT SUM(TO_NUMBER(value)) sum_cpu_count FROM gv$system_parameter2 WHERE name = 'cpu_count';
-- determine if rac or single instance (null means rac)
COL is_single_instance NEW_V is_single_instance FOR A1;
SELECT CASE COUNT(*) WHEN 1 THEN 'Y' END is_single_instance FROM gv$instance;

The first two queries just need to be updated to use the CPU_COUNT and CPU_CORE_COUNT columns of WRI$_DBU_CPU_USAGE.

And the final one can easily be updated to use the DBA_HIST_DATABASE_INSTANCE after passing in the db_name.
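For instance, a sketch of the single-instance check rewritten against AWR history, again assuming db_name arrives as a substitution variable:

COL is_single_instance NEW_V is_single_instance FOR A1;
SELECT CASE COUNT(DISTINCT instance_number) WHEN 1 THEN 'Y' END is_single_instance
  FROM dba_hist_database_instance
 WHERE db_name = '&&db_name.';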

What makes SQLd360 great is that, upon quick inspection, a change to the “pre” script to pass in the DBID and query the DBA_HIST views vs. the V$/GV$ views is all that is really required to make most of SQLd360 AWR Warehouse compliant.

SQLd360 Wins

Without any changes, as is, the product runs and simply provides some wrong info when it comes to stats data, environment and such.  The actual SQL is viable, and the code disregards the DBID when it queries the AWR objects.  It will fail on the ASH pull, but that shouldn’t be expected to work unless they want to hit DBA_HIST_ACTIVE_SESS_HISTORY at some point… 🙂
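For reference, a minimal sketch of what an AWR-based ASH pull could look like for a single statement, with the SQL_ID and DBID as placeholders:

SELECT sample_time, session_id, session_state, event
  FROM dba_hist_active_sess_history
 WHERE sql_id = '&&sql_id.'
   AND dbid   = &&dbid.
 ORDER BY sample_time;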

Thanks to Mauro for the great idea to try SQLd360 against the AWR Warehouse and review it for changes to see what would be required.  Hope this helps, guys!


Posted in ASH and AWR, AWR Warehouse, Oracle

March 31st, 2015 by dbakevlar

I finally have a moment to look into my “hacked” SQLT XPREXT/XTRACT runs and see if the changes I made were enough to run properly with the AWR Warehouse.

The answer is yes and no… and maybe “it depends”… 🙂

The Results

Pulling the corresponding data for the appropriate SQL_ID from the correct source database in the AWR Warehouse repository worked, but the environment information still populates incorrectly, as I didn’t update anything outside of the get_database_id function in the SQLT$A package and the executables that call it, as documented in my earlier blog post.

The first indication that my assumption was right, (that additional functions would need to be updated) came when I noted the environment values in the main HTML page, the main entry point into the zip file provided as output from any SQLT run:

sqlt8

 

The SQL_ID was run against the SH database, but the report instead shows the information for the AWR Warehouse, displaying incorrectly.  The issue is that it’s populating the database name, DBID and the short name from the local views, (v$instance, v$database, etc…)

The CBO environment and system statistics are also wrong, as SQLT pulls this information from the local database, which is simply the AWR Warehouse repository, not the source database the SQL ran in:

sqlt10

 

Yes, the statistics and initialization parameters currently have no way to pull from the dba_hist tables, which means we have the wrong data for most of the higher-level info on the main HTML page.

Where It’s Right

Where I did make changes, (which were minimal vs. what is really required) the report displays correctly.  The cursor sharing, the SQL information and the SQL_ID data are all correct for the statement at hand.

sqlt11

The bind peek info is correct, too-

sqlt12

You can quickly view that the data is pulled from the DBA_HIST_SQL_PLAN table and since I’m pushing the correct SQL_ID and DBID to the query, it responds with the correct information in return.

I get results from any section that sources from DBA_HIST_SQL_PLAN and distinct failures whenever anything queries GV$SQL_PLAN.  This should be expected:  the AWR Warehouse only houses the SH database’s information for this SQL_ID in the DBA_HIST objects.  The GV$ objects aren’t going to know anything about the SH source database.

Success Rate

So how successful was I?  About 30%… 🙂

Any objects, including the r_rows objects populated as part of the SQLT runs, were successful if they sourced from the Automatic Workload Repository, (the WRH$, WRI$ and WRM$ objects, which feed the DBA_HIST* views) and failed when they sourced from a GV$ or V$ object.  I also had to ensure that the appropriate DBID was passed along with the SQL_ID.  Without this added element, failures also occurred.

Recommendations for Change

Additional Changes that need to be made to complete the transition to AWR Warehouse compliance would include:

At the time of installation, offer an additional option:  not just the tuning pack, but also an “A” for AWR Warehouse.  If this option is chosen, a second version of the SQLT$A package is then deployed, along with the executables for the XPREXT and XTRACT SQL scripts.

The SQLT$A package would require updates to the following functions, (a sketch of one such change follows the list):

  1. get_database_id
  2. get_database_name
  3. get_sid
  4. get_instance_number
  5. get_instance_name
  6. get_host_name
  7. get_host_name_short
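As an illustration only, a warehouse-aware version of the first function might look something like this; the name, signature and use of MAX are my assumptions, not the shipped SQLT$A code:

CREATE OR REPLACE FUNCTION get_database_id_awrw (p_dbname IN VARCHAR2)
  RETURN NUMBER
IS
  l_dbid NUMBER;
BEGIN
  -- resolve the DBID from AWR history instead of V$DATABASE
  SELECT MAX(dbid)
    INTO l_dbid
    FROM dba_hist_database_instance
   WHERE db_name = p_dbname;
  RETURN l_dbid;
END get_database_id_awrw;
/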

The XPREXT and XTRACT scripts would then require one additional parameter:  dbname.

So instead of:

>START /u01/home/sqlt/run/sqltxprext.sql aycb49d3343xq <sqlt password>

it would be:

>START /u01/home/oraha/kellyn/sqlt/run/sqltxprext.sql aycb49d3343xq  <dbname> <sqltpassword>

Carlos and Mauro were wonderful and introduced me to Abel, so I’ll be passing this along to him.  Cross your fingers and let’s see if we can get this valuable tool working with the AWR Warehouse….and soon! 🙂

 

 

Posted in ASH and AWR, AWR Warehouse, Oracle

March 5th, 2015 by dbakevlar

While presenting at HotSos in Dallas, TX, an attendee, (thank you Wenju! :)) asked how page performance issues could be identified using the Page Performance console within Cloud Control when the call was being made remotely and not from the Oracle Management Repository, (OMR.)

This is a very good question, and although the answer is not straightforward, I would like to demonstrate what is available to assist in answering the challenge.

First, I have to add that the answer is provided using release 12.1.0.4 of Enterprise Manager 12c, so if you are still running an earlier version, your results may not be as successful, but I will also provide a command line option to help those of you on earlier releases, too.

Page Performance Console

The Page Performance Console is accessed once you’ve entered the Cloud Control management feature from Setup –> Manage Cloud Control –> Health Overview.  After entering the Health Overview, at the top left menu, click on OMS and Repository –> Monitoring –> Page Performance.  There are three tabs available in this console; using the second tab, Page Level Performance, you can view the list of pages experiencing the longest waits, in descending order.

awrw_dash1

Using the first one in the list, we note that this is the AWR Warehouse.  The calls are made to a remote database, (the AWR Repository is not stored in the same database as the Oracle Management Repository, so of course, the calls are 90%+ remote…) so it’s an excellent choice to use as an example.

Clicking on the link for the page, I’m brought to the Page Performance Report, which has all the details on the performance waits for the console page.  As I’ve been a heavy user of this page, it’s no wonder it’s showing up at the top for page processing time.

awrw_dash2

The initial section clearly shows us how much of the performance issue is derived from Java waits.  Knowing that this is the case, you might think we could just bypass this, but having the data that verifies which component is the culprit for the waits is essential for a DBA.  We should never assume or guess when it comes to performance, and this is just one more valuable feature of EM12c that removes the guesswork from the demanding challenges users pose to you each day.  To make the above section easier to understand:  if there were additional time distribution for database, agent or network/browser time, it would be displayed similarly to the top sessions and top SQL in Top Activity:

awrw_dash4

When you are viewing the data in the Page Performance report, the translation from Top Activity for the DBA is:

Time Distribution = Activity(%)

Even though the database calls are completely insignificant, the data is still provided as part of the report.  Just saying the remote calls are insignificant is not enough; we need to see why they were deemed insignificant.  If we want to see how our distributed, (aka remote) calls are contributing to the time consumed by the total process, all we have to do is scan down to the Top SQL part of the report to investigate.

awrw_dash3

Note that no SQL_IDs are displayed, even though these are all queries.  The reason is that the OMR is not able to pull the SQL_IDs for these remote calls being issued by Cloud Control against the AWR Warehouse.  What is displayed is the time CONSUMED by each of the executions until the results are returned to the console, and no Top SQL is listed, as nothing was over 10ms.  As I’ve stated, there is incredible value in knowing that database time is still not the main consumer of the remote calls.  The second piece of valuable information is displayed at the far right of the image:  the remote session IDs, which can be used if you wish to investigate further via EM12c in the AWR repository database, with an ASH report for the time this executed or another session-search option, (there are a number of options available.)

Older Release Options

If you are on an earlier version and don’t have the Page Performance Report available to you, there are great command line tools available via the EM CLI.  The following graphic takes you through the steps; the translation here for the DBA is that it’s similar to TKProf:  you collect the session ID, create a trace of that session, stop the trace, then create the report from the trace file.

awrw_dash5

Although the report isn’t as aesthetically pleasing as what I’ve shown you from the Performance Page Report, you still are provided the data you need to answer those important performance questions about console page response and this data can be very valuable if you are looking into an issue or working with Oracle Support.

Posted in EM12c Performance, Oracle
