OEM 12c - alfredokriegdba.com

Using repvfy to find problems in the OEM 12c repository

The repvfy Kit is very useful when you are trying to diagnose a problem in OEM Cloud Control 12c.

I noticed that some DBMS_SCHEDULER jobs weren’t running on time, which was creating a backlog in the repository.

To get more information about an issue like this, you can use the repvfy Kit. The installation is straightforward and is covered in Oracle Support Note 1426973.1. At the time of this post, repvfy version 2015.0622 is available.

Once it is installed, you can start running tests against individual modules or against the entire OEM 12c repository.

Which modules can I test using repvfy?

$ repvfy -h4

If you want a complete test of the entire OEM 12c repository with all the details, you can run:

$ repvfy -level 9 -details

Keep in mind that this task is going to take some time to finish, as it tests all the available modules.

OK, now back to my problem with scheduler jobs not running on time. I decided to run the performance test to get more details about what is going on in the repository. This is the command I used:

$ repvfy dump performance

The report looks like this:

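-- ------------------------------------------------------------------------
-- REPVFY: 2015.0507   Repository: 12.1.0.4.0   29-Jul-2015 11:27:01
-- ------------------------------------------------------------------------

[----- REPVFY Version details -----------------------------------------------]

COMPONENT          INFO
------------------ ----------------------------------------
EMDIAG Version     2015.0507
Repository Version 12.1.0.4.0
Database Version   11.2.0.4.0
Test Version       2015.0526
Repository Type    CENTRAL

5 rows selected.

[----------------------------------------------------------------------------]
[-- Database information ----------------------------------------------------]
[----------------------------------------------------------------------------]

[----- Database information -------------------------------------------------]

…

[----- Instance information -------------------------------------------------]

…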

[----- DBMS_SCHEDULER execution statistics (last two days) ------------------]

JOB_NAME                     RUNS MIN_DELAY MAX_DELAY AVG_DELAY
-------------------------- ------ --------- --------- ---------
EM_AVAIL_UNKNOWN_STUCK        169       .01      1.89       .43
EM_BEACON_GENSVC_AVAIL        507       .01      1.87       .58
EM_BSLN_SET_THRESHOLDS          8       .01      1.58       .38
EM_DERIV_RETRY_ACTIONS_JOB    101       .01      1.79       .36
EM_ECM_VCPU_JOB                 8       .02      1.72        .7
EM_GATHER_SYSMAN_STATS          5       .05      1.66        .6
EM_GROUP_MEMBER_SYNCUP        503       .01    113.34      2.04
EM_HEALTH_CALC_JOB            507       .01      2.18       .58
EM_JOBS_STEP_SCHED          11953         0      3.89       .35
EM_JOB_PURGE_POLICIES           1       .04       .04       .04
EM_METBSLN_COMPUTE_STATS       16       .01      1.08       .23
EM_PING_MARK_NODE_STATUS     1014       .01      1.89       .44
EM_PURGE_POLICIES               1        .4        .4        .4
EM_REPOS_SEV_EVAL           43077         0      6.94      1.06
EM_ROLLUP_SCHED_JOB             1       .02       .02       .02
EM_SLM_COMP_SCHED_JOB         507       .01      2.09       .58
EM_SYSTEM_MEMBER_SYNUP        507       .01       1.9       .63
EM_TASK_RESUBMIT_FAILED         8       .01      1.58       .37
EM_TASK_WORKER_23             491       .02      1.94       .65
EM_TASK_WORKER_24               1      2.03      2.03      2.03
EM_TASK_WORKER_25              17       .01      1.71       .59
EM_TASK_WORKER_26              17       .02      1.92       .55
EM_TGT_PROP_CONF_PP             1      1.67      1.67      1.67

23 rows selected.

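[----- Worker thread count --------------------------------------------------]

CLASS                     WORKER_COUNT
------------------------- ------------
Short (0)                            1
Long (1)                             1

2 rows selected.

[----- Task worker backlog --------------------------------------------------]

CLASS                            CNT
------------------------- ----------
Short (0)                       3190

1 row selected.

…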

Here, we can clearly see that our Task Worker for Short tasks is building up a huge backlog: a count of 3190 with a single Short worker. Next, I decided to run a system dump to get all the EM infrastructure details.

$ repvfy dump system

Here’s another interesting finding:

[----- PL/SQL tracing levels ------------------------------------------------]

CONTEXT_TYPE_ID CONTEXT_TYPE                         TRACE_LEVEL LAST_UPDATE_DATE
--------------- ------------------------------------ ----------- --------------------
              1 EM_EVENT_RECEIVER                    4-OFF       12-MAY-2014 18:23:13
              2 EM_EVENT_MANAGER                     4-OFF       12-MAY-2014 18:23:13
              4 EM.DERIV                             4-OFF       12-MAY-2014 18:23:13
              5 EM_EVENT_BUS                         4-OFF       12-MAY-2014 18:23:13
              6 EM_NOTIFY                            4-OFF       12-MAY-2014 18:23:13
              7 EM_PPC                               4-OFF       12-MAY-2014 18:23:13
              8 DEFAULT                              4-OFF       12-MAY-2014 18:23:13
              9 TRACER                               4-OFF       12-MAY-2014 18:23:13
             10 LOADER                               4-OFF       12-MAY-2014 18:23:13
             11 NOTIFICATION                         4-OFF       12-MAY-2014 18:23:13
             12 REPOCOLLECTION                       4-OFF       12-MAY-2014 18:23:13
             13 EMCLI                                4-OFF       12-MAY-2014 18:23:13
             14 EM.JOBS                              4-OFF       12-MAY-2014 18:23:13
             15 EM.BLACKOUT                          4-OFF       12-MAY-2014 18:23:13
             16 SVCTESTAVAIL                         4-OFF       12-MAY-2014 18:23:13
             17 COMPLIANCE_EVALUATION                4-OFF       12-MAY-2014 18:23:13
             18 EM.ECM                               4-OFF       12-MAY-2014 18:23:13
             19 EM_SLM_COMPUTATION                   4-OFF       21-MAR-2012 14:24:35
             20 EM_CNTR_QUEUE                        4-OFF       12-MAY-2014 18:23:13
             21 EMD_RAC                              4-OFF       12-MAY-2014 18:23:13
             22 DB_SYSTEM                            4-OFF       12-MAY-2014 18:23:13
             23 EMD_DBSERVICE                        2-WARNING   17-MAR-2015 13:33:16
             24 EM_DBM                               2-WARNING   17-MAR-2015 13:36:38
             25 CAT                                  4-OFF       12-MAY-2014 18:23:13
             26 EM_SSA_XAAS                          4-OFF       12-MAY-2014 18:23:13
             27 MGMT_COLLECTION.COLLECTION_SUBSYSTEM 4-OFF       12-MAY-2014 18:23:13
             28 SEVERITY_EVALUATION                  4-OFF       12-MAY-2014 18:23:13
             29 SEVERITY_TRIGGER                     4-OFF       12-MAY-2014 18:23:13
             30 EM.GDS                               2-WARNING   09-SEP-2014 13:43:36
             31 BLK_TRACE                            2-WARNING   17-MAR-2015 12:22:15
             32 MET_BASELINE                         2-WARNING   17-MAR-2015 12:23:34
             33 METRIC_LOAD                          2-WARNING   17-MAR-2015 12:23:34
             34 USAGE_SUMMARY                        2-WARNING   17-MAR-2015 12:26:36
             35 JVMD_LOG_MODULE                      2-WARNING   17-MAR-2015 13:06:15
             36 EM_HEALTH_CALC                       2-WARNING   17-MAR-2015 13:06:20
             39 CRS_EVENT                            2-WARNING   09-JUN-2015 15:44:03

36 rows selected.

As a best practice, we should have at least 2 Task Workers for each class (Short and Long) and keep tracing disabled for the PL/SQL packages, unless we are troubleshooting an issue with them.
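If you save the dump output to a text file, both checks can be scripted. The following is only a rough sketch, not part of repvfy: the function names are made up for illustration, and the line shapes it looks for (a bracketed section header, rows like "Short (0) 1" and "23 EMD_DBSERVICE 2-WARNING 17-MAR-2015 13:33:16") are assumptions taken from the excerpts in this post, so other repvfy versions may print something different.

```python
import re

# Assumed row shapes, copied from the report excerpts in this post:
#   "23 EMD_DBSERVICE 2-WARNING 17-MAR-2015 13:33:16"   (PL/SQL tracing levels)
#   "Short (0) 1"                                        (worker thread count)
TRACE_ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d)-([A-Z]+)\s+\d{2}-[A-Z]{3}-\d{4}\s")
WORKER_ROW = re.compile(r"^\s*(Short|Long)\s+\(\d\)\s+(\d+)\s*$")


def section(lines, title):
    """Yield the lines that follow the bracketed header containing `title`,
    stopping at the next line that starts with '['."""
    inside = False
    for line in lines:
        if line.lstrip().startswith("["):
            inside = title in line
            continue
        if inside:
            yield line


def contexts_with_tracing(lines):
    """Return (context_type, level_name) for rows whose trace level is not OFF."""
    found = []
    for line in lines:
        m = TRACE_ROW.match(line)
        if m and m.group(4) != "OFF":
            found.append((m.group(2), m.group(4)))
    return found


def undersized_worker_classes(lines, minimum=2):
    """Return {class_name: worker_count} for classes below `minimum` workers."""
    low = {}
    for line in lines:
        m = WORKER_ROW.match(line)
        if m and int(m.group(2)) < minimum:
            low[m.group(1)] = int(m.group(2))
    return low
```

With the report lines loaded into a list called lines, undersized_worker_classes(section(lines, "Worker thread count")) lists the classes with fewer than 2 workers, and contexts_with_tracing(section(lines, "PL/SQL tracing levels")) lists the contexts that still have tracing switched on. The section helper matters because the backlog section prints rows with the same shape as the worker count section.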

At this point repvfy has helped us identify 2 issues in our OEM 12c repository. Now the question is: how do I fix them?

Well, repvfy can also fix problems related to those tests. In fact, if we want to check for the recommended values and have them applied, we can run the following command:

$ repvfy execute optimize

This command will run tests against the internal task system, repository settings and the target system.

After the command finished, I checked again and found that my number of Task Workers was modified to 2 for each type and the trace was disabled for all the PL/SQL packages.
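Since the original symptom was scheduler jobs starting late, it is also worth re-running the performance dump afterwards and looking at the DBMS_SCHEDULER statistics again. Here is a small sketch that picks out the jobs whose MAX_DELAY is above a limit you choose. It assumes rows shaped like "JOB_NAME RUNS MIN_DELAY MAX_DELAY AVG_DELAY" as in the excerpt earlier in this post; the function name and the limit are my own, and the limit is expressed in whatever unit the report uses for its delay columns.

```python
def late_jobs(lines, max_delay_limit):
    """Return [(job_name, max_delay)] for rows whose MAX_DELAY exceeds the limit.

    Expected row shape (an assumption based on the report excerpt):
        JOB_NAME RUNS MIN_DELAY MAX_DELAY AVG_DELAY
    Lines that do not fit this shape (headers, separators, blanks) are skipped.
    """
    late = []
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue
        try:
            int(parts[1])               # RUNS must be an integer
            max_delay = float(parts[3])  # values such as ".01" parse fine
        except ValueError:
            continue
        if max_delay > max_delay_limit:
            late.append((parts[0], max_delay))
    # Worst offenders first
    return sorted(late, key=lambda item: item[1], reverse=True)
```

For the excerpt in this post, a limit of 5 would single out EM_GROUP_MEMBER_SYNCUP and EM_REPOS_SEV_EVAL. The limit itself is arbitrary, so pick one that makes sense for your environment.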

Do you want more information about the execute optimize command? Check Courtney Llamas’s blog:

http://courtneyllamas.com/getting-to-know-emdiag-repvfy-execute-optimize/

Thanks,

Alfredo

Deploy multiple plug-ins at once using OEM 12.1.0.4 console

Today’s post is about a neat Oracle EM 12c feature. At Collaborate 2015 I spoke about deploying multiple plug-ins at once using emcli to save time. I used emcli because the console didn’t have an option to do that. Guess what? The new release, 12.1.0.4, lets you do it from the console! This is especially handy when you don’t know how to use emcli, need to deploy several plug-ins to the OMS, and don’t want to spend a huge amount of time doing it one by one.

To do that:

– Click on Setup

– Navigate to Extensibility -> Plug-ins

– Select one of the plug-ins you want to deploy

The next screen lets you add more plug-ins if required.

It also tells you whether any downtime is required for the plug-in deployment.

Click Next and proceed as usual with the deployment process.

Thanks,

Alfredo

Oracle Enterprise Manager Webinar

This is the recorded presentation of the Oracle Enterprise Manager 12c @JoinSQL seminar held in Romania last month. Hope you find it interesting.

Click Here!

RMAN jobs not working after OEM upgrade to 12.1.0.4

If you are planning to upgrade your OEM to 12.1.0.4 and you have RMAN jobs scheduled in Cloud Control, you should consider applying patch 19519190 to the OMS. I noticed that most of the RMAN jobs were having issues and, even worse, some steps were empty!

Obviously, the jobs were succeeding because the steps were empty. In other words, the jobs were doing nothing.

It looks like this patch is not part of any PSU yet, but having a problem with hundreds of jobs, especially RMAN jobs, is very risky.

Take a look at EM 12c: RMAN Step Commands are Being Removed from Multi-step RMAN Script Jobs in Enterprise Manager 12.1.0.4 Cloud Control (Doc ID 1914916.1).

Thanks,

Alfredo

Using OMS DEBUG mode to troubleshoot OEM 12c problems

This time, I want to show you how to troubleshoot OEM problems by enabling DEBUG mode in the OMS. The virtual machine (VM) running my sandbox installation of OEM 12c 12.1.0.4 crashed during the night. After restarting the VM and all the OEM components, I wasn’t able to log in using the SYSMAN account. The error from the console was not very explicit, just “Authentication failed. If problem persists, contact your system administrator.”

To get more details about the error, I decided to enable DEBUG mode for the OMS and reproduce the error. This is what I did to enable DEBUG mode:

$ cd /u01/app/oracle/oms/oms/bin

$ ./emctl set property -name log4j.rootCategory -value "DEBUG, emlogAppender, emtrcAppender" -module logging

Oracle Enterprise Manager Cloud Control 12c Release 4

SYSMAN password:

Property log4j.rootCategory has been set to value DEBUG, emlogAppender, emtrcAppender for all Management Servers

OMS restart is not required to reflect the new property value

After enabling DEBUG mode, I reproduced the error several times using the console. I also wrote down the approximate time of each attempt, to ease the search in the log file. Searching the emoms.trc file located under /em/EMGC_OMS1/sysman/log/, I found an ORA-14400 error. MOS note 1493151.1 explains how to fix the issue by adding a new audit partition.

$ cd /u01/app/oracle/gc_inst/em/EMGC_OMS1/sysman/log/

$ view emoms.trc

java.sql.SQLException: ORA-14400: inserted partition key does not map to any partition
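With DEBUG enabled the trace file fills up quickly, so paging through it by hand can be slow. As a rough alternative, the sketch below counts every ORA-nnnnn code that appears in a file. It assumes nothing about the emoms.trc line layout beyond that pattern, and the function names are only illustrative.

```python
import re
from collections import Counter

# Matches Oracle error codes such as ORA-14400
ORA_CODE = re.compile(r"\bORA-\d{5}\b")


def count_ora_errors(lines):
    """Count each distinct ORA-nnnnn code found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        counts.update(ORA_CODE.findall(line))
    return counts


def scan_trace_file(path):
    """Open a trace file leniently (bad bytes are replaced) and count its ORA- codes."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_ora_errors(handle)
```

For example, scan_trace_file("/u01/app/oracle/gc_inst/em/EMGC_OMS1/sysman/log/emoms.trc").most_common(5) would show the five most frequent codes in my sandbox. Adjust the path to match your own OMS instance home.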

The final step is to disable DEBUG mode for your OMS; otherwise the log files can grow very large and performance could be affected.

$ ./emctl set property -name log4j.rootCategory -module LOGGING -value "WARN, emlogAppender, emtrcAppender"

Oracle Enterprise Manager Cloud Control 12c Release 4

Copyright (c) 1996, 2014 Oracle Corporation. All rights reserved.

SYSMAN password:

Property log4j.rootCategory has been set to value WARN, emlogAppender, emtrcAppender for all Management Servers

OMS restart is not required to reflect the new property value

I hope this information is useful to you next time you are troubleshooting an OEM 12c issue.

Thanks,

Alfredo

Author: alfredokrieg | Posted on March 26, 2015 | Categories: Uncategorized | Tags: OEM 12c, Oracle

Oracle Enterprise Manager Security – Disable SYSMAN access

In Enterprise Manager 12c, the SYSMAN user is the schema owner, and as a best practice all users should log in with their own individual accounts. To enforce this, you can prevent SYSMAN from logging in to the console and/or emcli by setting SYSTEM_USER to -1 in the MGMT_CREATED_USERS table:

UPDATE MGMT_CREATED_USERS
SET SYSTEM_USER='-1'
WHERE user_name='SYSMAN';

COMMIT;

To re-enable access, just set it back to 1:

UPDATE MGMT_CREATED_USERS
SET SYSTEM_USER='1'
WHERE user_name='SYSMAN';

COMMIT;

Refer to Oracle Support’s note:

How To Disable SYSMAN & SYSTEM Users from Logging into Grid Console? (Doc ID 867360.1)

Thanks,

Alfredo

Author: alfredokrieg | Posted on December 31, 2014 | Categories: Uncategorized | Tags: OEM 12c, Oracle, security

Oracle Enterprise Manager – Reducing the noise, Part 1

Enterprise Manager 12c is a great monitoring tool. With it you can monitor a wide range of target types, from databases to middleware. Although the out-of-the-box metrics can suit your monitoring requirements, they can also generate a considerable amount of white noise. To reduce this noise, you first have to identify the top alerts in your system. Cloud Control comes with several predefined reports that help you dig into multiple areas of your system, and one of them, “20 Most Common Alerts”, shows how often the most common alerts occur.

In my report, the metric “Database Time Spent Waiting (%)” appears twice in the Top 3. Let’s find out the metric settings for my DB targets. To do this, go to a DB home page, then Oracle Database -> Monitoring -> Metrics and Collection Settings.

Wait a minute! Why am I receiving alerts if there are no thresholds set up for any of those metrics? This behavior is explained in MOS note 1500074.1, which describes a default warning threshold of 30% inside the database configuration. Let’s take a look at DBA_THRESHOLDS to confirm.

set lines 300

column METRICS_NAME format a30

column WARNING_OPERATOR format a30

column WARNING_VALUE format a30

column CRITICAL_OPERATOR format a30

column CRITICAL_VALUE format a30

SELECT METRICS_NAME, WARNING_OPERATOR, WARNING_VALUE, CRITICAL_OPERATOR, CRITICAL_VALUE FROM DBA_THRESHOLDS;

METRICS_NAME                        WARNING_OPERATOR               WARNING_VALUE                  CRITICAL_OPERATOR              CRITICAL_VALUE

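----------------------------------- ------------------------------ ------------------------------ ------------------------------ ------------------------------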

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             30                             NONE

Average Users Waiting Counts        GT                             30                             NONE

Blocked User Session Count          GT                             0                              NONE

Current Open Cursors Count          GT                             1200                           NONE

Database Time Spent Waiting (%)     GT                             30                             NONE

Database Time Spent Waiting (%)     GT                             30                             NONE

Database Time Spent Waiting (%)     GT                             30                             NONE

Database Time Spent Waiting (%)     GT                             30                             NONE

Database Time Spent Waiting (%)     GT                             30                             NONE

Database Time Spent Waiting (%)     GT                             30                             NONE

Database Time Spent Waiting (%)     GT                             50                             NONE

Database Time Spent Waiting (%)     GT                             50                             NONE

Logons Per Sec                      GE                             100                           NONE

Session Limit %                     GT                             90                             GT                             97

Tablespace Bytes Space Usage        DO NOT CHECK                   0                              DO_NOT_CHECK                   0

Tablespace Space Usage              GE                             85                             GE                             97

22 rows selected.

There you go! All the thresholds for “Database Time Spent Waiting (%)” are set to 30% or 50%. The trick to silence these alerts is to set the threshold in Metrics and Collection Settings to a different value, like 99%, which overrides the default value.



Let’s look at the database setting again:

set lines 300

column METRICS_NAME format a30

column WARNING_OPERATOR format a30

column WARNING_VALUE format a30

column CRITICAL_OPERATOR format a30

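column CRITICAL_VALUE format a30

SELECT METRICS_NAME, WARNING_OPERATOR, WARNING_VALUE, CRITICAL_OPERATOR, CRITICAL_VALUE FROM DBA_THRESHOLDS;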

METRICS_NAME                        WARNING_OPERATOR               WARNING_VALUE                  CRITICAL_OPERATOR              CRITICAL_VALUE

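----------------------------------- ------------------------------ ------------------------------ ------------------------------ ------------------------------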

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             30                             NONE

Average Users Waiting Counts        GT                             30                             NONE

Blocked User Session Count          GT                             0                              NONE

Current Open Cursors Count          GT                             1200                           NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Database Time Spent Waiting (%)     GT                             99                             NONE

Logons Per Sec                      GE                             100                            NONE

Session Limit %                     GT                             90                             GT                             97

Tablespace Bytes Space Usage        DO NOT CHECK                   0                              DO_NOT_CHECK                   0

Tablespace Space Usage              GE                             85                             GE                             97

25 rows selected.

We successfully raised these thresholds to a very high value. At this point you can decide to stay at 99%, or you can remove the threshold to disable the alert completely.

Now let’s confirm those settings in the database:

set lines 300

column METRICS_NAME format a30

column WARNING_OPERATOR format a30

column WARNING_VALUE format a30

column CRITICAL_OPERATOR format a30

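column CRITICAL_VALUE format a30

SELECT METRICS_NAME, WARNING_OPERATOR, WARNING_VALUE, CRITICAL_OPERATOR, CRITICAL_VALUE FROM DBA_THRESHOLDS;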

METRICS_NAME                        WARNING_OPERATOR               WARNING_VALUE                  CRITICAL_OPERATOR              CRITICAL_VALUE

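----------------------------------- ------------------------------ ------------------------------ ------------------------------ ------------------------------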

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             10                             NONE

Average Users Waiting Counts        GT                             30                             NONE

Average Users Waiting Counts        GT                             30                             NONE

Blocked User Session Count          GT                             0                              NONE

Current Open Cursors Count          GT                             1200                           NONE

Logons Per Sec                      GE                             100                            NONE

Session Limit %                     GT                             90                             GT                             97

Tablespace Bytes Space Usage        DO NOT CHECK                   0                              DO_NOT_CHECK                   0

Tablespace Space Usage              GE                             85                             GE                             97

14 rows selected.

The thresholds are not there anymore, and hopefully neither are the alerts. The same behavior applies to the “Average Users Waiting Counts” metric; if you are receiving considerable white noise from it, you can disable it as well by following the same procedure. A good practice is to create a Monitoring Template to help you modify these thresholds for multiple targets at once.
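If you spool the DBA_THRESHOLDS listing to a file before and after each change, comparing the listings by eye gets tedious. The sketch below is one way to summarize a listing as "metric name -> warning values". It assumes the columns are separated by two or more spaces, as in the listings in this post, and it only recognizes the warning operators that appear here (GT, GE and DO NOT CHECK). The function name is just for illustration.

```python
import re
from collections import defaultdict

# Operators seen in the listings of this post; extend the set if your
# database reports others.
KNOWN_OPERATORS = {"GT", "GE", "DO NOT CHECK"}


def warning_values_by_metric(lines):
    """Map each metric name to the set of warning values listed for it.

    Columns are assumed to be separated by two or more spaces, which keeps
    names such as "Database Time Spent Waiting (%)" in one piece.
    """
    values = defaultdict(set)
    for line in lines:
        cols = re.split(r"\s{2,}", line.strip())
        if len(cols) >= 3 and cols[1] in KNOWN_OPERATORS:
            values[cols[0]].add(cols[2])
    return dict(values)
```

Run against the first listing, this would report the values 30 and 50 for “Database Time Spent Waiting (%)”; against the second it would report only 99, and against the last one the metric would not appear at all.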

Stay tuned for my next post about reducing OEM 12c noise.

Thanks,

Alfredo

Author: alfredokrieg | Posted on July 27, 2014 | Categories: Uncategorized | Tags: alerts, Database, metrics, OEM 12c, Oracle