
Manage the Manager: Tips on How to Best Manage Oracle Enterprise Manager 12c

Angeline Janet Dhanarani, Product Management, Oracle
Lap Nguyen, Chevron
Andrei Dumitru, CERN

Copyright 2014, Oracle and/or its affiliates. All rights reserved.

Total Cloud Control

Expanded Cloud Stack Management
Complete Cloud Lifecycle Management
Superior Enterprise-Grade Management

Agile, Automated
Optimized, Efficient
Scalable, Secure

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle's products remains at the sole discretion of Oracle.


Program Agenda

1. Architecture Overview of Enterprise Manager
2. Critical Subsystems and Their Monitoring with Self-Monitoring Features
3. High Availability and Disaster Recovery


Architecture Overview
Overall Architecture and Components


CRITICAL SUBSYSTEMS AND THEIR MONITORING WITH SELF-MONITORING FEATURES


Critical Subsystems

Loader Subsystem
Job Subsystem
Console Subsystem
Agent Subsystem
Notification Subsystem


Loader Subsystem

Responsible for processing the data collected by the agents and uploading it to the Repository
Its efficiency greatly impacts the performance and health of the overall system
Uploads data synchronously
Under heavy load, the OMS prioritizes the uploading of data
  Preference is given to agents with higher priorities, such as Mission Critical and Production
  Agents with lower priorities are asked by the OMS to back off for a specific time period
  Backlog accumulates at the agents

[Diagram labels: Agents (backlog), synchronous upload of data, OMS, Repository]
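
The prioritization described above can be pictured with a small sketch. It is only an illustration of priority-ordered admission under a load budget, not the OMS implementation: the ranking table, the `UploadRequest` type, and the `capacity_mb` budget are all assumptions made for this example (the slide names only the Mission Critical and Production priorities).

```python
from dataclasses import dataclass

# Hypothetical ranking: a lower rank is served first. Only "Mission Critical"
# and "Production" come from the slide; the other labels are assumptions.
PRIORITY_RANK = {
    "Mission Critical": 0,
    "Production": 1,
    "Staging": 2,
    "Test": 3,
    "Development": 4,
}


@dataclass
class UploadRequest:
    agent: str
    priority: str
    size_mb: float


def admit_uploads(requests, capacity_mb):
    """Split requests into (accepted, backed_off) agent names.

    capacity_mb is an assumed stand-in for "how much the loader can take
    right now"; agents that do not fit are asked to back off and retry.
    """
    accepted, backed_off = [], []
    remaining = capacity_mb
    # sorted() is stable, so agents of equal priority keep arrival order.
    for req in sorted(requests, key=lambda r: PRIORITY_RANK[r.priority]):
        if req.size_mb <= remaining:
            accepted.append(req.agent)
            remaining -= req.size_mb
        else:
            backed_off.append(req.agent)
    return accepted, backed_off
```

With a 10 MB budget and three 5 MB uploads, the Mission Critical and Production agents are admitted and the Development agent is backed off, which is where its backlog starts to accumulate.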

Loader Subsystem Monitoring

Checking Loader Performance
Monitor the Loader performance charts in Setup > Manage Cloud Control > Management Servers

Loader processing time
  Look for a consistent increase over a time period
Current loader CPU utilization
  A lower value indicates that loader throughput is efficient
  Contact Oracle Support if the loader is consistently running at more than 85% of its capacity
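
As a rough sketch of the two checks above, assuming the chart values have been exported as plain lists of samples (the function names and the reading of "consistently" as "every sample in the window" are assumptions for this example):

```python
def loader_needs_attention(cpu_samples, threshold_pct=85.0):
    """True if every CPU utilization sample exceeds the threshold.

    "Consistently" is interpreted here as "all samples in the window";
    an empty window never triggers.
    """
    return bool(cpu_samples) and all(s > threshold_pct for s in cpu_samples)


def is_consistently_increasing(samples):
    """True if each sample is strictly higher than the one before it."""
    return len(samples) >= 2 and all(
        later > earlier for earlier, later in zip(samples, samples[1:])
    )
```

A single spike above 85% does not trigger the first check; only a window in which every sample is above the threshold does.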


Loader Subsystem Monitoring

Checking Agent Backlog
Monitor the Upload Backlog and Backoff charts in Setup > Manage Cloud Control > Health Overview
  Overall Back-off Requests in the Last 10 Mins
  Overall Upload Backlog (MB) and (Files)
  Overall Upload Rate (MB/sec)

In case of a consistent increase in back-off requests or backlog
  Check that load is evenly distributed across all OMSs with the Loader Statistics Report (Reports / Information Publisher)

Loader Subsystem Monitoring

Checking Agent Backlog
Uneven load on a specific Management Server: check that the SLB configuration is set to the Round-Robin algorithm
Permitted deviation tolerance: 10-20%

General Advice
It is normal to have some agents being backed off
Keep an eye on a consistently growing, large number of backed-off agents
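
One way to apply the 10-20% tolerance is sketched below. The slide does not say how deviation is measured, so this example assumes it is each Management Server's relative distance from the mean load; the function name and input shape are also assumptions.

```python
def uneven_oms(load_by_oms, tolerance=0.20):
    """Return the names of Management Servers outside the tolerance.

    load_by_oms maps an OMS name to any load figure taken from the Loader
    Statistics Report (for example, files processed). Deviation is assumed
    to mean relative distance from the mean across all OMSs.
    """
    if not load_by_oms:
        return []
    mean = sum(load_by_oms.values()) / len(load_by_oms)
    if mean == 0:
        return []
    return sorted(
        name
        for name, load in load_by_oms.items()
        if abs(load - mean) / mean > tolerance
    )
```

If any server is returned, the round-robin setting on the SLB is the first thing to check.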


Job Subsystem

Anything that is scheduled and automated uses the job subsystem, e.g. scheduling blackouts or applying templates
A very crucial sub-component
Critical processes in the Job System
  Step Scheduler: responsible for processing the job steps that are ready to run and marking them Ready
  Job Dispatcher: picks up the steps marked Ready for execution and dispatches them to the job worker threads
  Worker threads: take work from the Job Dispatcher and send it to the appropriate agent
    Different thread pools for different job types

[Diagram labels: User, OMS (Worker Threads, Job Dispatcher, Step Scheduler), Repository, Agents]
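
The hand-off between the three processes can be sketched as follows. This is a toy model under stated assumptions, not the EM job engine: the step dictionaries, the status strings other than Ready, the pool names, and `send_to_agent` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def step_scheduler(steps, now):
    """Mark steps whose scheduled time has arrived as READY."""
    for step in steps:
        if step["status"] == "SCHEDULED" and step["run_at"] <= now:
            step["status"] = "READY"


def send_to_agent(step):
    """Worker-thread body: a stand-in for the real call to an agent."""
    step["status"] = "SENT"
    return (step["agent"], step["name"])


def job_dispatcher(steps, pools):
    """Hand READY steps to the thread pool that serves their job type."""
    futures = []
    for step in steps:
        if step["status"] == "READY":
            step["status"] = "DISPATCHED"
            futures.append(pools[step["job_type"]].submit(send_to_agent, step))
    return [f.result() for f in futures]


def run_once(steps, now, pool_sizes):
    """One scheduler pass followed by one dispatcher pass."""
    pools = {t: ThreadPoolExecutor(max_workers=n) for t, n in pool_sizes.items()}
    try:
        step_scheduler(steps, now)
        return job_dispatcher(steps, pools)
    finally:
        for pool in pools.values():
            pool.shutdown()
```

Keeping one pool per job type means a burst of long-running steps cannot starve the short ones, which is the point of having separate thread pools.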


Job Subsystem Monitoring

Setup > Manage Cloud Control > Health Overview

Monitor Jobs Backlog (Steps)

Indicates the number of job steps past their scheduled execution time.
If this number is high and has not decreased for a long period, the job system is not functioning normally.
Either the job engine resources are unable to meet the inflow, or specific jobs are being processed abnormally because they are stuck for unusual periods.
The rate of change of the backlog is more important than the absolute backlog numbers.


Job Subsystem Monitoring

Setup > Manage Cloud Control > Health Overview > Monitoring > All Metrics > Repository Job Scheduler Performance

Problem: trend analysis of the Job Step Backlog and Overall Job Steps per Second metrics

[Chart annotations: increasing trend over a prolonged period; decreased / constant trend over a prolonged period]

If both Job Step Backlog and Overall Job Steps per Second show an increasing trend, the workload is high and the job engine resources are not able to keep up with the inflow. Increase the resources.
If Job Step Backlog is increasing but Overall Job Steps per Second is not, it indicates abnormal processing of a specific job.
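
The two-metric rule above can be written down as a small decision function. This is a sketch only: the way a "trend" is detected (mean of the second half of the window versus the first half, with an assumed 10% margin) and the result labels are choices made for this example.

```python
def is_rising(samples, min_growth=0.10):
    """True if the second half of the window averages clearly higher
    than the first half. min_growth is an assumed noise margin."""
    if len(samples) < 2:
        return False
    half = len(samples) // 2
    first = sum(samples[:half]) / half
    second = sum(samples[half:]) / (len(samples) - half)
    return second > first * (1 + min_growth)


def diagnose_job_backlog(backlog, steps_per_second):
    """Classify the situation from the two metric histories."""
    if not is_rising(backlog):
        return "OK"
    if is_rising(steps_per_second):
        # Throughput grows with the backlog: the engine is simply overloaded.
        return "HIGH_WORKLOAD"
    # Backlog grows while throughput does not: something is stuck.
    return "ABNORMAL_JOB"
```

For example, a backlog of `[10, 12, 30, 40]` with flat throughput `[2, 2, 2, 2]` is classified as `ABNORMAL_JOB`.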

Job Subsystem Monitoring

Setup > Manage Cloud Control > Management Services > Job System (More Details) > Job Dispatcher details

Monitor thread pool utilization if the inflow of work is high and the backlog is consistently high
If Avg. Steps Dispatched/Min is HIGH and Avg. Threads Available is less than 50% of Configured Threads for a specific pool, increase the thread pool size on each OMS
  Refer to the Appendix for sizing recommendations of thread pool size
If Avg. Steps Dispatched/Min is LOW and Avg. Threads Available is also LOW, this typically means that a thread is stuck or hung
  Contact Support for triaging stuck threads
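
A compact sketch of the thread pool rule follows. The slide gives the 50% figure but does not quantify HIGH or LOW, so `high_rate` and `low_rate` are placeholder thresholds, and "LOW threads available" is assumed to mean the same below-50% condition.

```python
def pool_advice(steps_per_min, threads_available, threads_configured,
                high_rate=100.0, low_rate=10.0):
    """Suggest an action for one job thread pool.

    high_rate and low_rate are assumed cut-offs for a HIGH or LOW
    Avg. Steps Dispatched/Min; tune them to the site's normal load.
    """
    few_available = threads_available < 0.5 * threads_configured
    if steps_per_min >= high_rate and few_available:
        return "INCREASE_POOL_SIZE"
    if steps_per_min <= low_rate and few_available:
        return "POSSIBLE_STUCK_THREAD"
    return "NO_ACTION"
```

A pool that dispatches little work yet has few free threads is the suspicious case, because busy threads should normally show up as dispatched steps.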


Console Subsystem Monitoring

Setup > Manage Cloud Control > Health Overview > OMS and Repository > Monitoring > Page Performance

Monitoring console performance

General Advisories
Proactively check that page access and session load are evenly distributed across the OMSs
  Check the SLB configuration if they are not
Check for the Symptom Analysis icon in the Overall tab and use this feature to narrow down the cause of slow-performing pages
  The icon appears only when the metric Page Processing Time (sec) exceeds the threshold
  Symptom analysis can be done on overall page processing and on individual pages
  The breakdown of processing time by layers helps narrow down the issue


Agent Subsystem Monitoring With Partner Agent

A partner agent is an agent which, in addition to all of its regular functions, monitors the status of its assigned Management Agent and its host

[Diagram labels: OMS; oracle_emd_proxy status; host status; Partner Agent with oracle_emd_proxy (Host1); Agent Push; monitoring with proprietary protocol; signals to partner agent; Monitored Agent (Host2)]

Agent Subsystem Monitoring With Partner Agent

Target statuses with partner agent

SCENARIO: Agent is shut down gracefully and not under blackout
  AGENT: Down
  HOST: Up (Unmonitored)
  MONITORED TARGET: Agent Down

SCENARIO: Agent goes down unexpectedly and host is up (and not under blackout)
  AGENT: Agent Unreachable
  HOST: Up (Unmonitored)
  MONITORED TARGET: Agent Unreachable

SCENARIO: Partner Agent is not available (host or agent is down)
  AGENT: Agent Unreachable
  HOST: Agent Unreachable
  MONITORED TARGET: Agent Unreachable

The partner agent accesses the monitored agent and host with a proprietary protocol
  Can convey to the OMS whether the monitored agent goes DOWN
  Can determine if the host of the monitored agent is UP or DOWN
Agent status detection is done immediately (within a few seconds).
Host status change detection, when the agent is down, is done every minute
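
The scenario table can also be read as a lookup, sketched here. The mapping follows one reading of the flattened slide table (three scenarios, three targets each), and the function and argument names are hypothetical; scenarios the table does not list are rejected rather than guessed.

```python
UNREACHABLE = "Agent Unreachable"


def target_statuses(agent_event, host_up=True, partner_available=True):
    """Return the statuses shown for the agent, its host and its targets.

    agent_event is "graceful_shutdown" or "unexpected_down"; blackout
    cases are not covered by the table and are not modelled here.
    """
    if not partner_available:
        return {"agent": UNREACHABLE, "host": UNREACHABLE,
                "monitored_target": UNREACHABLE}
    if agent_event == "graceful_shutdown":
        return {"agent": "Down", "host": "Up (Unmonitored)",
                "monitored_target": "Agent Down"}
    if agent_event == "unexpected_down" and host_up:
        return {"agent": UNREACHABLE, "host": "Up (Unmonitored)",
                "monitored_target": UNREACHABLE}
    raise ValueError("scenario not covered by the table")
```

The useful distinction is in the host row: with a working partner agent the host can be reported as Up (Unmonitored) instead of being lumped in with the unreachable agent.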


Agent Subsystem-Agent Unreachable Troubleshooting

Sub-statuses added to provide more diagnostic details

Common scenario: Down / Agent Down
  Sub-status description: The agent was brought down in error or as part of planned maintenance.
  Troubleshooting tips: If the agent was brought down in error, restart it from the agent home page. If it was brought down as part of planned maintenance, consider creating a blackout on the agent.

Common scenario: Up Unmonitored
  Sub-status description: Currently this sub-status is set only for host targets with real-time partner agent detection. The host is up but its agent is shut down.
  Troubleshooting tips: If the agent is down, run emctl start agent. To triage the agent issue, go to its agent home page and run the Symptom Analysis tool located next to the Status field.

Common scenario: Cannot Write to File System
  Sub-status description: The agent cannot write to the file system due to a permission issue.
  Troubleshooting tips: Check that the OS user who owns the agent process has write access to the agent instance directory.

Common scenario: Collections Disabled
  Sub-status description: Agent collections have been disabled. The agent will no longer collect any metrics for the managed targets.
  Troubleshooting tips: Check that the agent can upload to the OMS with emctl upload. Check the loader statistics report for loader health.

Common scenario: Disk Full
  Sub-status description: The agent file system is full.
  Troubleshooting tips: Check that the agent can upload to the OMS with emctl upload. Recheck the count of pending files using the command emctl status agent to verify that it has gone down.

Common scenario: Post Blackout
  Sub-status description: The agent is unreachable because its first severity has not yet arrived after the blackout ended.
  Troubleshooting tips: To triage the agent issue, go to its agent home page and run the Symptom Analysis tool located next to the Status field.


Agent Subsystem-Agent Unreachable/Pending Diagnosis

Sub-statuses added to provide more diagnostic details

Common scenario: Blocked Manually
  Sub-status description: The agent has been blocked manually.
  Troubleshooting tips: Unblock the agent from the console: Setup > Manage Cloud Control > Agents.

Common scenario: Blocked (Plug-in Mismatch)
  Sub-status description: The agent has been blocked from communicating with the OMS due to a plug-in mismatch.
  Troubleshooting tips: If the agent has been restored from a backup, perform an agent resync with emcli resyncAgent.

Common scenario: Blocked (Bounce Counter Mismatch)
  Sub-status description: The agent has been blocked from communicating with the OMS due to a bounce counter mismatch.
  Troubleshooting tips: If the agent has been restored from a backup, perform an agent resync with emcli resyncAgent.

Common scenario: Agent Misconfigured
  Sub-status description: The agent is configured for communication with another OMS, or an OMS-agent time skew has been noticed, or consecutive metadata / severity uploads have failed.
  Troubleshooting tips: Check the agent configuration to ensure that the agent is communicating with the correct OMS. Re-secure the agent with emctl secure agent.

Common scenario: Communication Broken
  Sub-status description: The agent is unreachable due to a communication break between the agent and the OMS.
  Troubleshooting tips: Address the network latency, blocked port, or proxy-related issue.

Common scenario: Under Migration
  Sub-status description: The agent is unreachable because it is under migration (2-system upgrade) from a pre-12c release to 12c.
  Troubleshooting tips: Migrate the agent and then start it.

Note: Refer to the Appendix for general troubleshooting steps for Agent Unreachable

Notification Subsystem Monitoring

The notification system allows you to notify Enterprise Manager administrators when specific incidents, events, or problems arise

A backlog in notifications can cause a delay in alerts being sent, or a missing alert altogether
If notifications are not getting delivered
  Check the external systems that are configured to receive notifications
    For email/pager: is the email gateway configured and working?
    For OS Command and PL/SQL, check the external systems that they may connect to
    Contact Oracle Support if external systems are not working as expected
  Find the specific events in the Incident Manager console for non-informational events
    If an event is not found, it is likely to be an event publishing issue
    If it is found in Incident Manager, verify the rule definition

Notification Subsystem Monitoring

Setup > Manage Cloud Control > Health Overview

Check the notification delivery backlog
  Look for a consistent increase
Key metrics to monitor
  Notifications Processed (Last Hour)
  Pending Notifications Count

If Pending Notifications Count remains high over a period of time (such as an hour), check Notifications Processed (Last Hour)
  If it is making good progress, there could be a temporary load that will resolve itself soon
  If it is not making good progress, there could be stuck queues in the notification system or out-of-date incident rules. Contact Oracle Support
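
The triage above reduces to two comparisons, sketched here. The slide does not define "high" or "good progress", so `high_pending` and `good_progress` are placeholder thresholds to be replaced with values that suit the site; the function name and labels are likewise assumptions.

```python
def triage_notifications(pending_samples, processed_last_hour,
                         high_pending=1000, good_progress=500):
    """Classify the notification backlog.

    pending_samples: Pending Notifications Count readings over about an hour.
    processed_last_hour: the Notifications Processed (Last Hour) value.
    Both thresholds are assumed values, not Oracle recommendations.
    """
    stays_high = bool(pending_samples) and all(
        p >= high_pending for p in pending_samples
    )
    if not stays_high:
        return "OK"
    if processed_last_hour >= good_progress:
        return "TEMPORARY_LOAD"
    return "CONTACT_SUPPORT"
```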


A Few Other Critical Subsystems (Appendix)

Events Subsystem

Repository Metrics Collection Jobs

Repository Health

Repository Scheduler Jobs

Metric Rollup Jobs


HIGH AVAILABILITY AND DISASTER RECOVERY


High Availability
Critical components in the Enterprise Manager infrastructure are:
  Repository - Persistent store for all Enterprise Manager data
  OMS - Central application accessed by Agents and end users
  Software Library - Filesystem repository used to store software entities

All of the above should be configured for High Availability if the availability of Enterprise Manager is critical

High Availability
Repository

Oracle RAC provides a standard HA solution for the EM repository

Best Practice: Configure RAC prior to EM installation
Best Practice: Use SCAN and role-based DB services for the OMS-to-Repository connect strings

Advantage of role-based database services with Oracle RAC
  The startup of database services on a database can be controlled automatically by assigning a database role: PRIMARY / PHYSICAL_STANDBY / LOGICAL_STANDBY / SNAPSHOT_STANDBY
  Refer to the whitepaper Best Practices for Highly Available Oracle Databases for details

High Availability
OMS

Additional OMSs can be deployed behind a Server Load Balancer (SLB) for OMS High Availability
Agents and users communicate with the OMS via the load balancer


High Availability End-To-End Topology

All OMS, Repository and Software Library components are active within the same data center
The Software Library must be accessible read/write from all active OMSs
The Software Library should be deployed on highly available storage
Not a Disaster Recovery (DR) solution


Disaster Recovery
Protects applications from catastrophic failures
Keeps data on the primary site synchronized with a standby site
Allows applications to fail over to the standby site

[Diagram labels: Primary Site, Standby Site]

Disaster Recovery
Repository

A Data Guard Physical Standby Database provides the Disaster Recovery solution for the Repository
Use Data Guard Broker to manage switchover / failover of the database

Best Practice: Configure the OMS connect descriptor with the SCAN names and role-based services of the primary and standby data centers
(DESCRIPTION_LIST=
  (LOAD_BALANCE=off)(FAILOVER=on)
  (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
    (ADDRESS_LIST=(LOAD_BALANCE=on)
      (ADDRESS=(PROTOCOL=TCP)(HOST=PRIM_SCAN)(PORT=1521)))
    (CONNECT_DATA=(SERVICE_NAME=DB_ROLE_SERVICE)))
  (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
    (ADDRESS_LIST=(LOAD_BALANCE=on)
      (ADDRESS=(PROTOCOL=TCP)(HOST=STBY_SCAN)(PORT=1521)))
    (CONNECT_DATA=(SERVICE_NAME=DB_ROLE_SERVICE))))
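
Because a descriptor like the one above is easy to mistype, it can help to generate it. The helper below is a hypothetical convenience, not an Oracle tool: it reproduces the slide's structure and timeout values for any list of SCAN names, with the first entry tried first because LOAD_BALANCE is off and FAILOVER is on at the list level.

```python
def build_descriptor(scan_hosts, service, port=1521):
    """Build a DESCRIPTION_LIST with one DESCRIPTION per site.

    scan_hosts is ordered: primary SCAN name first, standby after it.
    The timeout and retry values are copied from the slide.
    """
    descriptions = "".join(
        "(DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)"
        f"(ADDRESS_LIST=(LOAD_BALANCE=on)"
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
        f"(CONNECT_DATA=(SERVICE_NAME={service})))"
        for host in scan_hosts
    )
    return f"(DESCRIPTION_LIST=(LOAD_BALANCE=off)(FAILOVER=on){descriptions})"


def parens_balanced(text):
    """Cheap sanity check: parentheses never close early and all close."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0
```

Calling `build_descriptor(["PRIM_SCAN", "STBY_SCAN"], "DB_ROLE_SERVICE")` yields the same descriptor as shown on the slide, written on a single line.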

[Diagram labels: Primary Site (active), Standby Site (passive), Data Guard Physical Standby]

Disaster Recovery
OMS

Deploy Standby (Passive) OMSs on the Standby Site
  Standby OMS using Standby WebLogic Domain (deprecated)
  Standby OMS using Storage Replication
Use DNS / Global Traffic Manager to redirect traffic
Best Practice: Storage Replication is the recommended method
  No manual application of plug-ins or OMS patches at the Standby Site
  No rebuild of the Standby Site needed after upgrades

[Diagram labels: Agent, Console / EMCLI, DNS / GTM, Primary Site, Standby Site]

Enterprise Manager High Availability Level 4 Solution

Recommended Solution for High Availability and Disaster Recovery with Storage Replication

[Diagram: a DNS lookup directs traffic to the Server Load Balancer of the primary data center or of the standby data center; one site is ACTIVE and the other PASSIVE, and the roles are exchanged on switchover. Each site shows a Primary OMS and an Additional OMS1 on storage holding the OMS Share, OMS1 Share and Swlib Share, plus the EM Repository (Physical Standby on the passive site). Storage is replicated continuously, and the database is replicated with Data Guard from Primary to Standby.]

ACFS Replication

An alternative to using external storage appliances

ACFS storage replication requires Grid Infrastructure to be installed for a cluster
An ACFS filesystem is created on the ACFS server for the OMS install and the software library, and is exported using NFS
The filesystem is mounted on another node (the OMS server) and EM is installed there
A similar setup on a second ACFS server provides another ACFS filesystem to be used as a standby
ACFS replication is established between the primary and standby ACFS servers
Refer to Section 18.7 of the Advanced Installation Guide for configuration details

BI Publisher High Availability

With EM12c R4, BI Publisher (BIP) is bundled and installed by default
  BIP needs to be configured using the configureBIP script
BIP supports Enterprise Manager HA scale-out
  BIP can be configured on all OMS nodes to increase reporting capacity
  This does not provide failover in case one of the BIP instances fails or is otherwise stopped (to be fixed in a future release)

Recommendations:
  Configure BIP on the first OMS node before cloning it
  Always configure BIP on all OMS nodes, and ensure that BIP is always up when that node's OMS is up
  BI Publisher is supported only with the storage-replication-based solution for Disaster Recovery. It is not functional with the Standby OMS using Standby WebLogic Domain method

[Diagram labels: Load Balancer; WebLogic Server 1 (OMS1, BIP1); WebLogic Server 2 (OMS2, BIP2)]


Appendix


Architecture Overview
Oracle Management Service (OMS)

Central Enterprise Manager application
  Source of truth for all management
  Receives and processes data from Agents
  Uses the repository as the persistent store for information
Comprises 2 WebLogic Server application deployments
  Console: provides the UI and target-specific management
  Platform Background Services (PBS): a set of background services critical for monitoring and management
Verify that the status of Console and PBS is marked as UP for each Management Service in Setup > Manage Cloud Control > Management Services


Architecture Overview
Repository

Most critical part of the EM system
  Deploy with performance and availability in mind
Persistent store for data collected from the managed targets
  Performance and availability metrics
  Configuration and compliance information
Used to store a variety of Enterprise Manager configuration information, such as:
  users and privileges
  job definitions

Architecture Overview
Agents

Collect monitoring and configuration data from the targets and store it locally in XML files
  Collected data is uploaded at scheduled intervals to the Management Service using HTTP/HTTPS
  XML files are purged once the data has been uploaded
Execute tasks on behalf of Enterprise Manager users
  Real-time data collections
  Jobs
  Deployment Procedures


Job Subsystem Monitoring

Sizing Recommendations of thread pool size

Sizing recommendations of pool size for a Large configuration with 2 or 4 OMS nodes
Default pool size configuration for Small and Medium configurations
In case of major resource issues, contact Oracle Support for guidance on adding additional threads

Parameters / Value
  oracle.sysman.core.jobs.shortPoolSize
  oracle.sysman.core.jobs.longPoolSize
  oracle.sysman.core.jobs.longSystemPoolSize
  oracle.sysman.core.jobs.systemPoolSize
  oracle.sysman.core.conn.maxConnForJobWorkers

144


Event Subsystem Monitoring

Setup > Manage Cloud Control > Health Overview > OMS and Repository menu > Monitoring > All Metrics

Responsible for processing the events published by different components in the system
Key metrics to check the event backlog: Total Events Pending and Total Events Processed (Last Hour)
If Total Events Pending remains high (for over an hour), check the metric Total Events Processed (Last Hour)
  If it is making good progress (the count is high), there could be a temporary load that can be ignored
  While the pending count continues to be high, the system should sustain a minimum processing rate of 1000 events every 10 minutes
  If it is not making good progress, there could be stuck queues in the event system
    Check the queue statistics in the Event Status metric group to detect a problem in AQ
    Contact Support for triaging issues in AQ / queues
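
The minimum rate above is stated per 10 minutes, while the metric is reported per hour, so the comparison needs a unit conversion: 1000 events per 10 minutes is 6000 events per hour. A minimal sketch, with hypothetical function names:

```python
MIN_EVENTS_PER_10_MIN = 1000  # figure stated on the slide


def min_events_per_hour(per_10_min=MIN_EVENTS_PER_10_MIN):
    """Convert the 10-minute floor into an hourly floor (6 windows/hour)."""
    return per_10_min * 6


def event_processing_keeps_up(total_events_processed_last_hour):
    """True if the hourly count meets the minimum sustained rate."""
    return total_events_processed_last_hour >= min_events_per_hour()
```

So with a persistently high pending count, a Total Events Processed (Last Hour) value below 6000 is the signal to look at the queue statistics.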

Repository Metric Collection Jobs Monitoring

Repository metric jobs are subdivided into long-running and short-running tasks
Some collection workers (default 1) process the short tasks and some (default 1) process the long tasks

Key indicators of performance
  Repository Collection Performance chart
  Repository collection performance metrics
    Key metrics
      Average Collection Duration (seconds)
      Collections Processed
  Repository Collection Task Performance
    Run Duration (Seconds)


Repository Metric Collection Job Monitoring

Average Collection Duration (seconds): indicator of the load on the repository collection subsystem
  An increase has two possible reasons: the number of collections has increased, or some of the metrics are taking a long time to complete
Check the Run Duration (Seconds) metric
  Identifies which metric is taking more than 2 minutes (the default; the threshold can be changed) to execute
  If any metric is taking an unusually long time, disable that specific metric to unblock the others
Check the Collections Processed metric
  If it is consistently high and the backlog is continuous
    Enable Collection Manager for one-off cases
    Configure additional worker threads if backlogs are generally high
    The maximum number of workers is 5
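
Two of the numbers above lend themselves to a short sketch: the 2-minute default run-duration threshold and the cap of 5 workers. The function names and the input shape (a mapping from metric name to run duration in seconds) are assumptions for this example.

```python
DEFAULT_RUN_DURATION_THRESHOLD_SEC = 120  # 2 minutes, the stated default
MAX_COLLECTION_WORKERS = 5                # stated maximum


def slow_metrics(run_duration_sec, threshold_sec=DEFAULT_RUN_DURATION_THRESHOLD_SEC):
    """Return the metric names whose run duration exceeds the threshold."""
    return sorted(
        name for name, secs in run_duration_sec.items() if secs > threshold_sec
    )


def clamp_workers(requested):
    """Keep a requested worker count within 1..MAX_COLLECTION_WORKERS."""
    return max(1, min(requested, MAX_COLLECTION_WORKERS))
```

Metrics returned by `slow_metrics` are the candidates for temporary disabling; adding workers only helps when the backlog is general rather than caused by one slow metric.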

Repository Health Monitoring

General guidelines for Maximum Availability to check in the repository
  Regular backups
  ARCHIVELOG mode ON
  FLASHBACK mode ON
  Refer to the Oracle Database High Availability guidelines

Compliance with the repository database settings given in the sizing guidelines

Repository Scheduler Jobs Monitoring

Setup > Manage Cloud Control > Repository

Monitor the status and processing time of the Repository Scheduler Jobs

Tips to troubleshoot if the status of these jobs is down
  For the repository jobs to run, DBMS_SCHEDULER must be enabled
  Start these jobs with the PL/SQL command exec emd_maintenance.submit_em_dbms_jobs

Repository Scheduler Jobs Monitoring

If a specific job is down, i.e. in a broken state
  Query the mgmt_performance_names table as the repository owner for the dbms_jobname and fetch the job id from all_jobs
  Look for ORA-12012 messages for this job id in the database alert log and trace files to find the problem to fix, then restart the job from the console
  Contact Oracle Support if a fix cannot be easily identified

Key metrics to gauge performance
  Throughput per second
  Processing Time (% of Last Hour)
  If Processing Time is large and Throughput is low, check for errors in the database alert log

Metric Rollup Jobs

Setup > Manage Cloud Control > Repository

Aggregation mechanism: both hourly and daily rollups are done directly from the raw data
Look out for consistently growing backlogs or a prolonged execution time span
Configure additional rollup worker threads using the Configure option in the Metric Rollup Performance chart
If the repository database uses RAC, contention between RAC nodes can negate the gain from additional threads
  To avoid this, create a database service and set affinity to it so that the rollup job runs on only one RAC node

Metric Rollup Jobs Monitoring

Setting affinity with RAC Configuration

Create a database service and set affinity to it so that the rollup job runs on only one RAC node
Create the database service rollup, naming one of the RAC instances as the primary instance with -r:
  srvctl add service -d <dbname> -s rollup -r <primary instance> -a <the other instances> -y automatic
  srvctl start service -d <dbname> -s rollup
  srvctl status service -d <dbname>
As the sys user, execute:
  DBMS_SCHEDULER.create_job_class(job_class_name => 'ROLLUP', service => 'rollup')
  GRANT EXECUTE ON sys.ROLLUP TO sysman;
As the sysman user, execute:
  DBMS_SCHEDULER.SET_ATTRIBUTE(name => 'EM_ROLLUP_SCHED_JOB', attribute => 'job_class', value => 'ROLLUP')
  GC_SCHED_JOB_REGISTRAR.SET_JOB_CLASS('EM_ROLLUP_SCHED_JOB', 'ROLLUP')

Agent Subsystem-New Agent Unreachable sub-statuses

Sub-statuses added to provide more diagnostic details
Agent Unreachable and Status Pending statuses

Agent Unreachable
  Down
  Agent Down
  Up Unmonitored
  Under Migration
  Cannot Write to File System
  Collections Disabled
  Disk Full
  Blocked Manually
  Post Blackout
  Blocked (Plug-in Mismatch)
  Blocked (Bounce Counter Mismatch)
  Agent Misconfigured
  Communication Broken

Status Pending
  Target Addition in Progress
  Status Pending (Post Blackout)
  Status Pending (Post Metric Error)

General Troubleshooting Steps for Agent Unreachable

Setup > Manage Cloud Control > Agents
Target Status Diagnostics Report: Agent-based targets (Information Publisher report)
  Check the Promote Status column and Broken Reason in Target Information
  Check for the latest Clean Heartbeat UTC time in the Agent Ping Status table in the report

Ensure that the OMS is reachable from the agent host and the agent from the OMS host
Check emctl status for the various configuration settings, e.g. that the agent is communicating with the correct OMS
Check agent upload with emctl upload
Contact Oracle Support with these logs
  gcagent.log from the agent home
  emoms_pbs.log, emoms.log

Safe Harbor Statement

The preceding is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle's products remains at the sole discretion of Oracle.

Enterprise Manager (EM) High Availability (HA) Architecture

Lap Nguyen, Database Analyst

Oracle Open World Conference
October 2, 2014

This document is intended for use by Chevron for presentation at the October 2, 2014 Oracle Open World Conference, for
posting on the Oracle Open World Conference website and for handouts to Oracle Open World Conference attendees. No
portion of this document may be copied, displayed, reproduced, published, sold, licensed, downloaded or used in any other
way, unless the use has been specifically authorized by Chevron in writing.
2014 Chevron U.S.A. Inc. All Rights Reserved

Agenda

Our company
Overview of Oracle EM HA
Tips and tricks to reduce downtime when switching over or failing over to disaster recovery (DR)
Benefits of using a storage replication HA solution

Oracle is a registered trademark of Oracle and/or its affiliates.

Chevron is One of the Largest Integrated Energy Companies in the World

2nd largest integrated energy company in the United States
12th largest company in the world
64,500+ employees worldwide (includes service station personnel)
2.59 net million barrels of oil per day in 2012
$21.4 Billion Net Income in 2013
$39.8 Billion Capital and Exploratory budget for 2014

A Global Company Operating on Six Continents

180+ countries in which we operate
30+ countries with exploration and production activities
18 refineries and asphalt plants
35 chemical manufacturing facilities
3 retail brands (Chevron, Texaco and Caltex)
22,000+ retail outlets

[Map legend: Chevron Corporation Headquarters (includes Global Upstream & Gas and Downstream headquarters); Exploration & Production; Refining; Chemicals; No Operations. In some cases, one dot designates multiple locations.]

Overview of Oracle EM HA Architecture Components

Repository
  Local HA: Data Guard fast-start failover with maximum availability protection mode
  DR: Data Guard
Oracle Management Service (OMS) redundancy: two primary OMSs and two DR OMSs
Network attached storage (NAS) replication: application software bits and the Oracle Enterprise Manager Software Library
Global Traffic Manager and F5 Networks BIG-IP
Oracle Access Manager (OAM) single sign-on (SSO): two primary servers and two local DR servers (required due to Kerberos SSO)

F5, F5 Networks, and BIG-IP are trademarks or registered trademarks of F5 Networks, Inc. in the U.S. and in certain other countries.

Oracle EMHA Architecture Diagram

Tips and Tricks to Simplify the DR Configuration

Did you know that you can incorporate the standby database into the OMS configuration using a one-time configuration on the primary OMS?
Command:
emctl config oms -store_repos_details -repos_conndesc "(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=host1)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=host2)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=host3)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=SID_DG)))" -repos_user sysman -repos_pwd password
Where:
  host1 and host2 are the local primary database hosts (HA, fast-start failover)
  host3 is a remote database host (DR)
Note: For RAC, host1 and host2 would be replaced by the SCAN IP

Benefits:
  Reconfiguring the OMS to point to the new repository (after the Data Guard switchover/failover) is not required.
  Reduces downtime, human error and manual work when a switchover/failover occurs.

Benefits of Setting Up Multiple Database Connections

when Using Switchover or Failover to DR
The OMS was set up to connect to multiple hosts, so in a DR situation the
process is simplified: no configuration changes have to be run.
Six simple steps to switch over to DR without running configuration
changes:
1. Stop the OMS
2. Switch over the database to DR
3. Disable F5 on the primary site and enable F5 on the DR site
4. Break the NAS mirror and set the DR site to RW
5. Start up the OMS
6. Configure the OMS to support Chevron logon standards

Note: Step 6 is NOT required if OAM/Kerberos SSO is not in place
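
The six steps above can be written down as an ordered, reviewable plan before anyone touches production. A minimal Python sketch, assuming a hypothetical standby name (emrep_dr) and hypothetical names Step and dr_switchover_plan; only the emctl and Data Guard broker commands are real, and the F5, NAS, and SSO entries are left as site-specific placeholders because the slides do not give them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    action: str
    command: str  # a known command, or a placeholder for a site procedure


def dr_switchover_plan(standby_db, sso_in_place=True):
    """Return the switchover steps from the slide, in order.

    standby_db: name of the DR database as known to the Data Guard broker
    (an assumption; the slides do not name it).
    sso_in_place: step 6 applies only when OAM/Kerberos SSO is in place.
    """
    steps = [
        ("Stop the OMS", "emctl stop oms -all"),
        ("Switch over the database to DR",
         f"DGMGRL> SWITCHOVER TO {standby_db};"),
        ("Disable F5 on the primary site and enable F5 on the DR site",
         "<site-specific F5 procedure>"),
        ("Break the NAS mirror and set the DR site to RW",
         "<site-specific storage procedure>"),
        ("Start up the OMS", "emctl start oms"),
    ]
    if sso_in_place:
        steps.append(("Configure the OMS to support logon standards",
                      "<site-specific SSO procedure>"))
    return [Step(i, action, cmd) for i, (action, cmd) in enumerate(steps, 1)]
```

Printing each Step of dr_switchover_plan("emrep_dr") gives a checklist that can be reviewed and attached to the change request; nothing in the sketch executes a command.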

Benefits of a DR Solution with NAS Replication

Patching/Plugin Deployment steps without NAS replication:

Stop the OMS

Patch or deploy new plugins on Primary

Switch the database to Standby

Start up the OMS on Standby

Patch or deploy new plugins to the OMS on Standby

Switch the database back to primary

Start up the OMS

Patching/Plugin Deployment steps with NAS replication:

Stop the OMS

Patch or deploy new plugins on the primary OMS

Start up the OMS

Benefits:

The standby OMS no longer has to be recreated when upgrading (this was required prior to storage replication)

Reduces downtime by half or more when patching or deploying plugins

Reduces human errors and simplifies the EM infrastructure

Enterprise Manager at CERN

Andrei Dumitru
IT Department / Database Services / openlab

CERN
European Organization for Nuclear Research
Founded in 1954
Research: Finding answers to questions about the
Universe
Technology, International collaboration, Education

21 Member States
7 Observer States
European Commission, USA, Russian Federation, India, Japan, Turkey, UNESCO

Associate State
Serbia

Candidate State
Romania

People
2500 Staff, 560 Fellows, 500 Students, 10600 Users,
Grand Total ~ 15000

CERN Member States

The largest particle accelerators & detectors

27km (17 miles)
long tunnel

Thousands of
superconducting
magnets
Ultra vacuum:
10x emptier
than on the Moon
Coldest place
in the Universe:
-271°C / 1.9 K / -456°F

Credit: Mariusz Piorkowski

Deployment
Agents version: 12.1.0.4
Linux x86-64
Secure agent upload
AD accounts for user login

2-node RAC OMS+OMR

OMS version: 12.1.0.4
Linux RHEL 5 x86-64
8 CPU @ 2.53GHz
48 GB RAM

Agents
Users(https)
Failover VIP

RAC nodes
Cold Failover Cluster

RDBMS version 11.2.0.4

Size: ~200GB
NAS storage
Shared storage (OMS & OMR)

Databases
200 Oracle Database Instances
80 RAC Databases
Middleware
370 WebLogic Servers
340 Java Virtual Machines
over 1000 App Deployments
Apache Tomcat & HTTP Servers
Hosts
270 Red Hat Enterprise Linux 5 & 6
Total
5200 targets

Before

Case Study:
OMS Troubleshooting
1. Launched Agent Upgrade job
2. Started receiving many alerts
Agent is unable to communicate with OMS
Agents not yet upgraded to R4:
no enhanced agent health status available

3. Looking into the new self-monitoring features (Loader):
Throughput was dropping
Backoff (number of files rejected) increasing
Utilized Capacity increasing

over 5000 files in backoff
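
The pattern seen here (throughput falling while backoff keeps rising) can be checked mechanically on any series of loader samples. A minimal Python sketch, assuming the samples have already been collected at regular intervals; the function names and the sample numbers in the usage note are invented for illustration and are not CERN measurements or an Enterprise Manager API:

```python
def sustained_increase(samples, min_run=3):
    """Return True if each of the last `min_run` intervals rose strictly."""
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    if len(samples) < min_run + 1:
        return False  # not enough history to call it a trend
    tail = samples[-(min_run + 1):]
    return all(later > earlier for earlier, later in zip(tail, tail[1:]))


def sustained_decrease(samples, min_run=3):
    """Return True if each of the last `min_run` intervals fell strictly."""
    return sustained_increase([-value for value in samples], min_run)


def loader_under_pressure(throughput, backoff_files, min_run=3):
    """Flag the case-study pattern: throughput dropping, backoff increasing."""
    return (sustained_decrease(throughput, min_run)
            and sustained_increase(backoff_files, min_run))
```

For example, loader_under_pressure([900, 700, 400, 150], [10, 50, 400, 5200]) returns True, while a flat backoff series returns False. Requiring several consecutive intervals avoids flagging a single noisy sample.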

Case Study:
OMS Troubleshooting
4. Diagnosing the Repository Database
High Load on the OMR host
Row lock contention in SYSMAN
schema caused by Agent Upgrade
5. Oracle Support provided patch
6. OMR issue fixed
Throughput rate back to normal
Everything working

After

Agents Overview

Check agent status

Symptom Analysis

Control agent
Properties

Partner Agent
Agents monitor one another
Faster downtime detection
Separate alerts

agent down event
Host=hostname.cern.ch
Target type=Agent
Target name=hostname.cern.ch:1234
Message=Agent has stopped monitoring.
The following errors are reported : agent shutdown.

host down event
Host=hostname.cern.ch
Target type=Host
Target name=hostname.cern.ch
Message=Host Down - Detected by Partner Agent
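
Because the agent-down and host-down alerts arrive as separate events with different target types, a notification handler can tell them apart from the Key=Value text alone. A minimal Python sketch, assuming the event text has the layout shown on this slide; parse_event and classify are hypothetical helpers, not an Enterprise Manager API:

```python
def parse_event(text):
    """Parse 'Key=Value' lines into a dict.

    A line without '=' is treated as a continuation of the previous value,
    which covers the two-line agent-down message.
    """
    event = {}
    last_key = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if sep:
            last_key = key.strip()
            event[last_key] = value.strip()
        elif last_key is not None:
            event[last_key] += " " + line
    return event


def classify(event):
    """Return a coarse label for the two alert kinds shown on the slide."""
    target_type = event.get("Target type")
    message = event.get("Message", "")
    if target_type == "Host" and "Detected by Partner Agent" in message:
        return "host-down (partner agent)"
    if target_type == "Agent":
        return "agent-down"
    return "other"
```

Routing on the label lets a host outage page the system administrators while an agent shutdown goes to the EM administrators.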

Repository
Out of the box checks
based on OMS size

Change the schedule for Repository Jobs

Repository Metrics

Page Performance Analysis

Advantages of new EM12c R4 self-monitoring features

Quickly spot infrastructure problems

Fast host down detection - partner agent
New agent health sub statuses
Change schedule for repository jobs from UI
Performance diagnosis of UI pages
Detailed diagnosis of different sub-systems
Repository recommendations and checks

