I ran out of time before I was able to provide an adequate white paper this year for my EM12c presentation, but there was some valuable info in what I had started, so I thought I’d turn it into a multi-part blog post…
Oracle Enterprise Manager (OEM) is the standard monitoring tool for Enterprise Edition Oracle databases. The interface allows the DBA to manage the entire Oracle stack from a single console, and both the installation and the interface are easy for most DBAs to implement and use. The newest version, EM12c, encompasses integrated systems management, application management, application-to-disk management and cloud management. The following documentation includes some 10g material but mostly covers the EM12c version of the product.
The goal for any DBA is to be notified of an issue, and only notified when there is an actual issue. One of the most common downfalls of a monitored environment is the misconception that receiving emails upon success, or checks stating that a process is running, amounts to correctly configured monitoring. This leads to a “white noise” effect, where DBAs may mistake a notification of a real issue for one of the routine success notifications.
The optimal design includes redundancy checks of the monitoring system itself, so that if an issue prevents the monitoring environment from monitoring and sending alerts, a redundant check on a secondary server notifies the on-call DBA. Multiple Oracle Management Server repositories residing on separate servers can address this but, in my opinion, would be overkill when simple additional scripts run from cron would suffice.
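The “alert only on failure” idea can be sketched as a small shell wrapper that runs a check silently and sends mail only when the check fails. This is a minimal sketch, not part of the original scripts: the function name alert_on_failure and the MAILER override are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: stay silent on success, mail only on failure.
# MAILER is an assumed override hook; it defaults to the standard mail command.
MAILER="${MAILER:-mail}"

# alert_on_failure SUBJECT RECIPIENT CMD [ARGS...]
alert_on_failure() {
    subject=$1
    recipient=$2
    shift 2

    # Capture the check's output; a zero exit status means nothing to report.
    output=$("$@" 2>&1) && return 0
    rc=$?

    # The check failed: send its captured output as the message body.
    printf '%s\n' "$output" | "$MAILER" -s "$subject" "$recipient"
    return "$rc"
}
```

A call such as alert_on_failure "Cannot ping EM server" "<EML_Address>" ping -c 3 "$who_to_ping" produces no mail at all while the server answers, which is what keeps the inbox free of “I’m OK” noise.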
OEM Basics
Oracle Enterprise Manager, in both 10g and EM12c, comprises the following basic components:
- The Oracle Management Server/Service (OMS).
- The OMS repository database.
- The OMS Home, aka the EM state directory, which contains the bin files, log files, collection files and configuration files.
- The Agent installation, application and configuration on each monitored host server.
EM12c also includes the WebLogic components automatically, along with Cloud support features that can be installed.
Licensing
As long as the OMS is on its own server and is only used for the OMS repository and/or an RMAN backup catalog repository, individual Oracle licensing IS NOT required for the Oracle database used for the repositories. (Please see pg. 15 of the following PDF from Oracle.)
http://download.oracle.com/docs/cd/B19306_01/license.102/b40010.pdf
Monitoring the OEM from a secondary server:
This can be performed easily from a shell script and allows the DBA(s) to rest easy, knowing that if the interface to their database environment is impacted, a secondary server will notify them. This allows for redundant checks without sending an “I’m OK” notification for comfort:
#!/usr/bin/ksh
#----------------------------------------------------------------------------
# Author: Kellyn Pot’Vin
# Redundancy Check to OEM Server to ensure EM is up and Running!
# Verify that all parameters are set in the remote host env. vars...
#----------------------------------------------------------------------------
if (( $# != 2 ))
then
    echo "usage: $0 SID hostname"
    exit 1
fi
#
#----------------------------------------------------------------------------
# Set up Oracle environment...
#----------------------------------------------------------------------------
export ORACLE_SID=$1
export who_to_ping=$2
echo "Oracle SID: ${ORACLE_SID}"
export AVL_LOG=${LOG_DIR}/oem_avl.log
export AVL_ERR=${LOG_DIR}/oem_avl.err
export AVL_PNG_ERR=${LOG_DIR}/ping_avl.err
#Check Repository DB for Access
$ORACLE_HOME/bin/sqlplus oem_chk/"${pass}"@${ORACLE_SID} <<EOF
spool ${AVL_LOG};
select sum(1+1) from dual@grid_chk;
spool off;
exit;
EOF
# Any ORA- error in the spooled output means the repository check failed
grep "ORA-" ${AVL_LOG} > ${AVL_ERR}
if [ -s ${AVL_ERR} ]
then
    mail -s "No Response from Grid Control from Oracle Management Server!" "<EML_Address>" < ${AVL_LOG}
fi
#Check to verify that EM12C is up! This requires SSH authentication from remote server.
ssh oracle/n0c1u3ata11@ "$OMS_HOME/bin/emctl status oms" | grep Down > ${EM_LOG}
if [ -s ${EM_LOG} ]
then
    mail -s "No Response from EM12C Grid Control!" "<EML_ADDRESS>" < ${EM_LOG}
    exit
fi
#Check Grid Server, ensure that you can ping it as well
date
ping -c 3 ${who_to_ping}
if [ $? -ne 0 ]
then
    # Retry once after a short pause before alerting
    sleep 5
    ping -c 3 ${who_to_ping}
    if [ $? -ne 0 ]
    then
        echo "`hostname` CANNOT PING ${who_to_ping} the EM Server!" > /tmp/ping.$$
        # Send the message file as the mail body
        mail -s "`hostname` CANNOT PING ${who_to_ping} from Oracle Management Server!" "<EML_Address>" < /tmp/ping.$$
        rm -f /tmp/ping.$$
    fi
fi
rm -f ${AVL_LOG}
rm -f ${AVL_ERR}
exit
Pretty simple to schedule in cron:
0,15,30,45 * * * * /home/oracle/scripts/admin/chk_grid.ksh <dbname> <servername> > /dev/null 2>&1
I’ve chosen a 15-minute interval for the checks, but any interval can be used, depending on requirements.
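The ping step in the script above retries once before sending an alert, which avoids notifying the DBA over a single dropped packet. The same pattern can be factored into a reusable helper. This is a minimal sketch: retry_check is a hypothetical name and is not part of the original script.

```shell
#!/bin/sh
# retry_check DELAY CMD [ARGS...]
# Run CMD; if it fails, wait DELAY seconds and try exactly once more.
# Returns 0 if either attempt succeeds, otherwise the second attempt's status.
retry_check() {
    delay=$1
    shift
    "$@" && return 0
    sleep "$delay"
    "$@"
}
```

With this in place, the ping section reduces to a single test of retry_check 5 ping -c 3 "$who_to_ping", with the mail command run only when that test fails.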
Escalation
Due to Sarbanes-Oxley and/or outside support contracts, an enhanced escalation process may be required, one that offers more choices and escalation paths than what is currently offered in the 10g and EM12c consoles. A simple package/support object implementation that works with OEM can be created to support this type of requirement. The code presented here allows one to set the on-call DBA, scheduler and escalation outside of the OEM interface, while all OEM alerts and escalations from the OMS utilize the data found in the supporting tables.
I will try to upload and post the supporting schema and code soon on dbakevlar.com
Blacking out DB from Agent Side with Shell Scripts:
Blackouts can be performed via a shell script to support automated processes that would otherwise trigger OEM alerts and send false notifications. A blackout script is often all that Unix admin or application support personnel require.
#!/usr/local/bin/ksh
# #######################################################
# start_blackout.ksh
# Usage ./start_blackout.ksh <oracle_sid>
# Rewrite Date: 4/22/2011
# Modified by: reckl
#########################################################
usage="$0 <db_name>"
if (($# != 1))
then
    print $usage
    exit 1
fi
ORACLE_SID=$1
sudo su - oracle -c "$AGENT_HOME/bin/emctl start blackout ${ORACLE_SID}_blackout ${ORACLE_SID}"
exit
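A blackout that is started by a script also has to be ended, and emctl stop blackout <name> is the counterpart to the start command. The sketch below wraps a maintenance job so that the blackout is always stopped afterwards, even when the job fails. It is an illustration under stated assumptions: run_in_blackout and the EMCTL override are hypothetical names, the blackout name follows the <sid>_blackout convention used above, and the function is assumed to run as the agent owner rather than through sudo.

```shell
#!/bin/sh
# EMCTL is an assumed override hook; by default it points at the agent's emctl,
# so AGENT_HOME must be set in the environment, as in the script above.
EMCTL="${EMCTL:-${AGENT_HOME:-}/bin/emctl}"

# run_in_blackout TARGET CMD [ARGS...]
# Start a blackout named TARGET_blackout, run CMD, then stop the blackout.
run_in_blackout() {
    target=$1
    shift
    name="${target}_blackout"

    # If the blackout cannot be started, do not run the job at all.
    "$EMCTL" start blackout "$name" "$target" || return 1

    "$@"
    job_rc=$?

    # Always end the blackout so that monitoring resumes, even after a failed job.
    "$EMCTL" stop blackout "$name" ||
        echo "WARNING: could not stop blackout $name" >&2

    return "$job_rc"
}
```

The function returns the job's own exit status, so a calling script or cron wrapper can still detect a failed maintenance run after monitoring has resumed.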
Patching
I am a supporter of patch deployments through OEM. If you have not configured this or are working to get this feature approved in your database environments, I highly recommend it. In the “Deployments” tab of the EM console, first ensure that the MOS credentials are configured:
Once this has been set up for your environment, you can then designate a patching strategy to deploy to development, test and then production with a full testing cycle that will make any DBA stop quaking in their boots when they receive the notification that new patches have arrived from Oracle Support.
The Deployment Procedure Manager allows the DBA group to schedule deployments of necessary patching with the most effective schedule and little DBA involvement required.
The DBA can then set up patching resource allocation and requirements from the “Offline Patching” UI and choose what to install for automatic patching:
To be continued in the next post…