EM12c Enterprise Monitoring, Part V “Warning Management”
This is Part V in a multi-part series demonstrating how to take EM12c from out of the box to enterprise level.
You can read Part I, Part II, Part III and Part IV to complete the first phases of this setup and look for future posts in this series to ensure your EM12c is set up to support your database world.
Tracking Warnings through the Incident Manager
Earlier in this series we changed several metrics so that generic alert log alerts are raised as “warning” only. We now need to add one rule that creates incidents for these warnings so they are logged in the system. We don’t want to email on these, so instead of adding them to the current rules, we add a new rule to the rule set.
Click on Create:
Choose the default, “Incoming events and updates to events”, and click on Continue.
Choose the following options and click “Next”
The next step is to create an incident and, by default, assign ownership to SYSMAN so it will not be left unassigned. The priority will be low, as this is data we will monitor on a regular basis from the EM12c Incident Manager console. Incident Manager is a one-stop shop for all alerting in the system, but by default, warning thresholds do not create incidents.
Do NOT fill out any notifications. We do not want these to email. Click on Continue.
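The behavior of this rule can be pictured with a small Python model. This is only a conceptual sketch, not EM12c code or an EM12c API: the `Event` and `Incident` classes, the `warning_rule` function and the target name are hypothetical names made up for illustration. It shows a warning event becoming a low-priority incident owned by SYSMAN with an empty notification list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical stand-in for an incoming EM12c event."""
    target: str
    severity: str  # "WARNING" or "CRITICAL"
    message: str


@dataclass
class Incident:
    """Hypothetical stand-in for an incident created by a rule."""
    event: Event
    owner: str
    priority: str
    # Left empty on purpose: this rule must not send email.
    notify: List[str] = field(default_factory=list)


def warning_rule(event: Event) -> Optional[Incident]:
    """Model of the warning-only rule: log an incident, never notify."""
    if event.severity != "WARNING":
        return None  # other severities are handled by other rules
    return Incident(event=event, owner="SYSMAN", priority="Low")


# Illustrative target name and message, not taken from a real system.
incident = warning_rule(Event("orcl_prod", "WARNING", "Generic alert log error"))
```

In this model a critical event simply falls through (`None`), which mirrors the idea that the existing critical rules, not this one, decide who gets emailed.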
The rule summary will look like the following:
Click Next
Name the rule and give a clear description, then click Next –> Continue –> OK.
Remember, if you are finished with your additions/edits at this time, you must click Save to save all changes to the Rule Set.
The only additional rule we’ve added, beyond the edits shown in the high-level rules section, is the following:
The benefits of this rule design are:
- No after-hours alerting/paging of warning level issues.
- Pro-active addressing of warnings during the business day before problems escalate to critical.
- One location to monitor the environment vs. scanning alert logs, secondary emails, etc.
- Global integration with the system, so no other changes have to be made outside of the monitoring templates to set the global metric thresholds for warnings vs. critical.
Final Rule Set Review
The rule sets that you end up with will look like the following, with the system generated ones disabled and then the two new ones with the changes as seen below.
Assigning Incident Rule Sets to Groups
The Rule Sets we have created treat all target types the same, no matter the use, but if you have already organized your databases into groups, you can assign the critical alert rule sets to just production databases and then create a second set of rules to alert differently depending on the use of the target.
To assign the rules we have now to the production group, click on the Incident Management Rule Set and click on “Edit”.
On the bottom of the main page, update the rule so that instead of “All Targets”, it is set to “Specific Targets” and then click “Add”.
You will see the list of your groups by default. Click on “Production” and click “Select”.
Click on Save to save changes to the Incident Rule Set. This Rule Set is now ONLY enabled for the targets in the “Production Group” and will no longer create incidents or monitor the thresholds for the “Non-Production Group.” The “ASM-Production” group is covered, as its Parent Group is “Production.”
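The parent/child coverage described here can be sketched in Python. This is a conceptual model only, assuming a simple child-to-parent mapping; the `GROUP_PARENT` dictionary and `covered_by` function are hypothetical and do not reflect how EM12c stores group membership internally.

```python
from typing import Dict, Optional, Set

# Hypothetical group hierarchy: group name -> parent group (None = top level).
GROUP_PARENT: Dict[str, Optional[str]] = {
    "Production": None,
    "Non-Production": None,
    "ASM-Production": "Production",
}


def covered_by(group: str, rule_set_groups: Set[str]) -> bool:
    """Return True if the group, or any ancestor of it, is assigned to the rule set."""
    current: Optional[str] = group
    while current is not None:
        if current in rule_set_groups:
            return True
        # Walk up to the parent; unknown groups end the walk.
        current = GROUP_PARENT.get(current)
    return False


# The rule set is assigned to "Production" only.
production_rule_set = {"Production"}
asm_covered = covered_by("ASM-Production", production_rule_set)
nonprod_covered = covered_by("Non-Production", production_rule_set)
```

Walking up the hierarchy is what lets a single assignment to “Production” reach “ASM-Production” without listing it separately.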
Copying Rule Sets to use for Secondary Groups
We now need to create an Incident Management Rule Set for Non-Production after updating our existing Rule Set to only notify on Critical.
Click on the Incident Rule Set and click on Actions –> “Create Like Rule Set”.
For this new rule set, update the name and the description, then remove the “Production” group and replace it with “Non-Production”. Click on Rules and update the notifications so they go to the non-critical email addresses.
With this change, you can send non-critical alerts to one set of email addresses and keep production to “after-hours” alerting only.
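As a rough picture of the resulting split, the routing can be modeled as a lookup on group and severity. This is an illustrative sketch under stated assumptions: the email addresses, the `recipients` function and the exact severity handling are hypothetical, and the real behavior is whatever you configure in the two rule sets.

```python
from typing import List

# Hypothetical recipient lists; substitute your own addresses.
ONCALL_PAGER: List[str] = ["oncall-dba@example.com"]
DBA_TEAM_EMAIL: List[str] = ["dba-team@example.com"]


def recipients(group: str, severity: str) -> List[str]:
    """Model of who is notified for an incident, by group and severity."""
    if group == "Production":
        # Production notifies only on critical; warnings stay in Incident Manager.
        return ONCALL_PAGER if severity == "CRITICAL" else []
    if group == "Non-Production":
        # Non-production issues go to the non-critical email list.
        return DBA_TEAM_EMAIL
    return []  # targets outside both groups are not notified by these rule sets


prod_warning = recipients("Production", "WARNING")
prod_critical = recipients("Production", "CRITICAL")
nonprod_critical = recipients("Non-Production", "CRITICAL")
```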
Once this is complete, you will have the following rules. Remember, only the incident rule set needed to be changed, as it’s the only one that is used for notifications.
Your baseline production/non-production Incident Rule Sets, using groups, should look similar to the following:
Note: there are three Rule Sets in the end. One applies to all targets and covers self-update events; it does not notify, but adds entries in the OMS. Of the other two, one alerts for critical response and the other sends notifications for non-critical (non-production) issues.
With this scenario, you can add email notifications to whatever level required by the business, but for critical alerting, only critical, production outage alerting is recommended.
Bonus- EM Jobs and Notifications