Blog Feed

ORA-01503: CREATE CONTROLFILE failed on RMAN duplicate between RAC databases

It was while since I performed an RMAN duplicate from a RAC database to another RAC database. You may find this post useless as there’s a lot of information out there about the error I faced, but all my posts are kind of my own little diary.
So, basically. In order to successfully duplicate one database to another you may need this.

    – TNS setup on the server where the database to be duplicated is hosted. You need to be able to tnsping both, the target and the auxiliary database.

     – Create a parameter file and password file for the auxiliary and startup nomount the first instance.
    – Create a full backup of the target database including archivelogs.
    – Connect to rman to both the target and the auxiliary databases from the auxiliary server:
rman target sys/*****@ auxiliary /
    – Issue the duplicate command:

 duplicate target database to ;
This should successfully finish but I got this error:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 01/03/2017 18:29:58
RMAN-05501: aborting duplication of target database
RMAN-06136: ORACLE error from auxiliary database: ORA-01503: CREATE CONTROLFILE failed
ORA-12720: operation requires database is in EXCLUSIVE mode
I searched this and the culprit is that my parameter file for the auxiliary had this:

cluster_database = true

By MOS note “Rman duplicate fail with RMAN-06136, ORA-01503, ORA-12720, ORA-00494 enqueue [CF] (Doc ID 1335479.1)”. Duplicate from a RAC database to another RAC database is not supported, hence we need to modify the parameter cluster_database to false in the auxiliary database.
After doing this the duplicate finished correctly.
Thanks,

Alfredo

OEM 13c New Features – Corrective Actions

Today’s post is about another new feature of Oracle Enterprise Manager (OEM) 13c. Well, not a new feature itself but some awesome enhancements to Corrective Actions (CA).
In this new release 13c several new CA types were added, including “Add Space To Tablespace”, “Chef Recipe” for those who like automation and now can be added to Service Level Agreement (SLA) alerts and job status events.
I’m a big enthusiast of OEM jobs and now having the option to set a CA for them is just more than cool!
I wrote an article a year ago on IOUG’s Select journal about Corrective Actions and didn’t want to finish my day without setting a CA for job. But wait a minute, I know I can set CA for metrics using the “Metric and Collection Settings” page, but there’s no such thing for jobs.
Here’s how we setup a corrective action for jobs. First we need to enable events for jobs, this means that we as administrators decide what kind of job status will create an event.
In order to do this, you need to navigate to Setup -> Incidents -> Job Events
Step 1. Here you decide what job status and job severity generates an event.


Step 2. You decide if you want to generate events for jobs without targets.
Step 3. Choose the target you want to generate job events for.
Once you have the proper events setup, then the next step is to catch this event and do something with it. In this case fire a Corrective Action (CA). The only place we can catch these events and do something with them is an Incident Rule.
Navigate to Setup-> Incident -> Incident Rules
Then create or modify an existing Incident Rule Set. Inside this Rule Set create a new Rule to catch these job events. Inside the Rule Actions there’s a new section to Submit a Corrective Action.
Click on the “Select corrective action” and choose the CA from your CA Library.



Cool doesn’t it! One of the uses I can think of is with Database backup jobs. Sometimes the RMAN backup fails when is not able to find an archivelog and the solution is to run a crosscheck command and retry the backup job. This CA can do that for us and just notify us once it completes.
It still needs further testing but is another tool to save us time.
If you want information about the “Add Space To Tablespace” CA you better check Kellyn’s post about it.
Thanks,

Alfredo

OEM 13c New Features – System Broadcast

I can’t finish the year without a post about OEM 13c. I finally had some time to install it and from first hand go through its new features.  I find this “System Broadcast” feature so useful, especially when you have tons of users using OEM.
You can try to email users, give them a call but they will always forget about the next maintenance window and call you right away when OEM is not available. This feature helps you to send a notification to all the users or a particular list of users.
System Broadcast messages (up to 200 characters) can only be sent using EMCLI, there’s no graphical option yet.
Here’s an example on how to send a message to a user named OEMADMIN:

$ emcli send_system_broadcast -toOption=”SPECIFIC” -to=”OEMADMIN” -message=”OEM will be down for maintenance Friday December 30th”
Successfully requested to send System Broadcast to users.

This is what you see once you login to OEM console:



Send the message to all the users:


$ emcli send_system_broadcast -toOption=”ALL” -message=”OEM will be down for maintenance Friday December 30th”
Successfully requested to send System Broadcast to users.
You need to be logged in EMCLI to be able to send these messages, if you are not you’ll get this error:

$ emcli send_system_broadcast -toOption=”ALL” -message=”OEM will be down for maintenance Friday December 30th”
Status:Unauthorized 401
As I said, you need to be logged in EMCLI:

$emcli login -username=sysman
Enter password:

Login successful

$ emcli send_system_broadcast -toOption=”ALL” -message=”OEM will be down for maintenance Friday December 30th”
Successfully requested to send System Broadcast to users.
Here’s the verb syntax:
emcli send_system_broadcast
      -toOption=”ALL|SPECIFIC”
      [-to=”comma separated user names”]
      [-messageType=”INFO|CONF|WARN|ERROR|FATAL” (default is INFO)]
      -message=”message details”
[ ]  indicates that the parameter is optional.
Thank you and happy new year!
Alfredo

ORA-01555 query duration 0 seconds with Dataguard

How many times we have calls from users complaining about their process that failed due to an ORA-01555 error?

We know that if the queries are not well tuned and they modify a lot of data, the image held in the UNDO Tablespace could not be consistent with the real data. But have you ever seen this error right away after executing a SQL statement against a table?
I just did couple of days ago. Here’s the story:
ORA-01555 error appeared in the alertlog’s database with a query duration of 0 seconds.
ORA-01555 caused by SQL statement below (SQL ID: d3rt4tyudufeu, Query Duration=0 sec, SCN: 0x034f.34f660b4)
Any queries plus an analyze table failed right away with ORA-01555:
ERROR at line 1: ORA-0155: snapshot too old: rollback segment number 10 with name “SYSSMU11_1072300523734$” too small

So weird.
After researching a bit on MOS, found a note regarding a bug.
Some minutes later we also started to receive ORA-600 errors related so scn numbers.
ORA-error stack (00600[ktbdchk1: bad dscn])
The MOS note mentions the ORA-01555 and the ORA-600 errors as part of bug 22241601 with a Dataguard configuration. Is worth to mention that yes, we were doing switchover testing recently in this 12.1.0.2 environment.
ALERT Bug 22241601 ORA-600 [kdsgrp1] / ORA-1555 / ORA-600 [ktbdchk1: bad dscn] / ORA-600 [2663] due to Invalid Commit SCN in INDEX (Doc ID 1608167.1)
The solution is to apply the patch but there’s also a tested workaround that is to rebuild online all the indexes of that table.
Hope this helps.

Alfredo

CRS-2632: There are no more servers to try to place resource ‘resource_name’ on that would satisfy its placement policy

I just got this error from CRS while trying to start a database using srvctl. I find this error kind of missleading as the error doesn’t tell what is really happening with the resource.
I suggest you to manually check the logs of the resource. In this case it had to do with a cluster database that one of the instances was not able to start. I tried to start this instance manually and then I got the real error.
The instance was having a missconfiguration in the parameter file that prevented it from starting.
Hope this note will provide you a guidance while having this CRS-2632 error coming out from CRS.
Thanks,

Alfredo

Tim Gorman is our guest speaker for September 30th NEOOUG’s quarterly meeting.


Tim currently works as a Technical Consultant at Delphix. Oracle ACE Director, Oak Table member and an advocate of the North East Ohio Oracle Users Group for many years. This time Tim brings two interesting sessions, “Split Brain Syndrome” and “Scaling To Infinity”.
Join us for these sessions, have lunch and learn from one of the most remarkable leader in technology.
Keep in mind that this time this quarterly meeting is to be held at the new Oracle’s facility in Beachwood, OH. We also require you to pre-register to this event using below URL:
More information here:
Thanks,
Alfredo

WAIT_FOR_GAP. How to restore missing archivelogs from backup?

In a Dataguard configuration, Oracle’s RFS (Remote File Server) writes redo data to the standby. When for any reason it can’t write this data, MRP (Managed Recovery Process) will wait for the archivelog to be applied and have the status “WAIT_FOR_LOG”. This will lead the standby to be out-of-sync with the primary database.
Sometimes some archivelogs can’t be transferred from primary database to the standby leaving a gap in the archivelog sequence. The MRP process will have the status “WAIT_FOR_GAP”.  
In order to fix the archivelog gap we have to manually transfer the archivelogs missing.
To find the gap you can query v$archive_gap (gv$archive_gap for RAC).
SELECT INST_ID, THREAD#, HIGH_SEQUENCE#, LOW_SEQUENCE# FROM GV$ARCHIVE_GAP;
INST_ID       THREAD#       HIGH_SEQUENCE#       LOW_SEQUENCE#
————- ————— ———————— ————————
2             2             823                  811
You can see that we are missing archivelogs from sequence 811 to 823 for thread 2. If these archivelogs are not available in the primary we have to restore them from backup.
RMAN> RESTORE ARCHIVELOG FROM SEQUENCE 811 UNTIL SEQUENCE 823 THREAD=2;
Keep in mind that parameter THREAD defaults to 1, so you must specify the thread number when you are trying to restore from a different thread.
After restoring these archivelogs the RFS process should transfer them automatically to the standby. Verify if the gap is fixed.
Thanks,

Alfredo

OEM Agent Plug-in Testing

Today I want to write about Oracle Enterprise Manager (OEM) 12c plug-ins and how to test them. 

OEM is divided into two logical parts, the EM Platform and the EM Plug-ins. The EM Platform basically consists of the UI console, job system, metrics, EM agent, etc. The EM Plug-ins are modules that you can attach to you EM Platform to extend the monitoring functionality. As a result, we have plug-ins for Oracle Databases, Fusion Middleware, even hardware or 3rd-party vendors like MS SQL Server.

The plug-ins have a set of Perl scripts that the agent uses to compute status and metrics for the targets. You can find these scripts under “/plugins//scripts”.
Let us see an example:

Fusion Middleware plug-in version 12.1.0.7:
$ cd  /plugins/oracle.sysman.emas.agent.plugin_12.1.0.7.0/scripts

In there you can see tons of Perl scripts. These scripts are being used by the agent to compute response and metrics for the targets registered in the agent.

Now let’s say you have a Web Cache target that shows an incorrect status in OEM and you want to know how the agent computes the response status of this target.

 $ ls webcache*.pl
webcacheesm.pl  webcacheIsStandalone.pl  webcacheresource.pl  webcacheresponse.pl

You can see that there’s a response script available. 
Now the question is, how to test it?

Based on the MOS id 1534087.1, you can create an env.sh script to load the required variables to your shell. I did create the env.sh file and source it:

$ . env.sh

Now I’m able to test the webcacheresponse.pl:

$ perl webcacheresponse.pl
em_result=1

Looks like having 1 as the result means that the Web Cache is up per the script’s logic. This result is passed to the agent and finally to the OMS.

Hope this is helpful for you while trying to investigate status related issues with your targets.
Thanks,

Alfredo

How to upgrade the RMAN recovery catalog to 12c?

Some days back I faced an issue while trying to upgrade the RMAN recovery catalog to support 12c databases.


RMAN> upgrade catalog;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-07539: insufficient privileges to create or upgrade the catalog schema

The problem is that for 12c the catalog owner requires additional privileges. The solution is to run the dbmsrmansys.sql script that comes with the 12c binaries.

You have to copy this file from your 12c home “$ORACLE_HOME/rdbms/admin” directory to where your RMAN recovery catalog database is and execute it.

SQL> @$ORACLE_HOME/rdbms/admin/dbmsrmansys.sql
The script is going to complain for the lack of 2 other scripts, but the upgrade runs just fine.
$ rman CATALOG rman@catalog
recovery catalog database Password:
RMAN> UPGRADE CATALOG;
RMAN> UPGRADE CATALOG;
RMAN> EXIT;
Thanks,

Alfredo

ALTER SYSTEM in a RAC environment

Few weeks ago I had to modify an instance parameter to a different value. In this case, I just wanted to modify it for only one instance and leave the other instance as it was.
I thought that if I didn’t specify the instance name using the SID clause, the default is going to be just that instance. I was really wrong about that and the command ended modifying both instances.
After looking at Oracle’s documentation here’s what I’ve found.
SID Clause
·         Specify SID = ‘*’ if you want Oracle Database to change the value of the parameter for all instances.
·         Specify SID = ‘sid’ if you want Oracle Database to change the value of the parameter only for the instance sid. This setting takes precedence over previous and subsequent ALTER SYSTEM SET statements that specify SID = ‘*’.

If you do not specify this clause:

·         If the instance was started up with a pfile (client-side initialization parameter file), then Oracle Database assumes the SID of the current instance.
·         If the instance was started up with an spfile (server parameter file), then Oracle Database assumes SID = ‘*’.

If you specify an instance other than the current instance, then Oracle Database sends a message to that instance to change the parameter value in the memory of that instance.

So now I know that in future I need to specify the SID in a RAC instance using spfile.

https://docs.oracle.com/cd/B12037_01/server.101/b10759/statements_2013.htm

Thanks,

Alfredo