Oracle DBA – Tips and Techniques

Knowledge is Power – so share it ….

Archive for the ‘High Availability’ Category

11g Standby database creation without any RMAN backups

Posted by Gavin Soorma on December 4, 2009

Quick steps to set up a 11g Standby database with Active Data Guard using the Active Duplication feature available in 11g where we can create a standby database without having to take a backup on the primary database. Datafiles are copied over the network.

Primary machine – OATU036
Standby machine – DROU036

Database Name – SID1O

TNS alias for Primary – sid1o_fc
TNS alias for standby – sid1o_js

Enable force logging on the Primary database

SQL> alter database force logging;

Database altered.

Create the Standby log files on the Primary database

Copy the password file from the $ORACLE_HOME/dbs directory on primary server to $ORACLE_HOME/dbs on the standby server

Update listener.ora on Standby machine

(SID_DESC=
(GLOBAL_DBNAME=sid1o_js)
(ORACLE_HOME=/u01/oracle/product/11.1.0/db_1)
(SID_NAME=sid1o)
)

Stop and Restart the listener on the standby site

Update tnsnames.ora on Standby as well as Primary site with the alias ’sid1o_js’ and ’sid1o_fc’

sid1o_js =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = drou036)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = sid1o_js )
)
)

SID1O_FC =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = oatu036)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = sid1o.bankwest.com)
)
)

Create an init.ora on the Standby machine with just a single line which is the db_name parameter

sid1o:/u01/oracle/product/11.1.0/db_1/dbs> cat initsid1o.ora
db_name=sid1o

Startup the Standby instance in nomount state

SQL> startup nomount;
ORACLE instance started.

Total System Global Area 217157632 bytes
Fixed Size 2152328 bytes
Variable Size 159385720 bytes
Database Buffers 50331648 bytes
Redo Buffers 5287936 bytes

On the Primary launch RMAN and establish an auxiliary connection to the standby instance

sid1o:/u01/oracle> rman target / auxiliary sys/xxx@sid1o_js

Recovery Manager: Release 11.1.0.7.0 – Production on Fri Dec 4 10:28:51 2009

Copyright (c) 1982, 2007, Oracle. All rights reserved.

connected to target database: SID1O (DBID=2860177231)
connected to auxiliary database: SID1O (not mounted)

Run the command to create the Standby Database

DUPLICATE TARGET DATABASE
FOR STANDBY
FROM ACTIVE DATABASE
NOFILENAMECHECK
DORECOVER
SPFILE
SET DB_UNIQUE_NAME=”sid1o_js”
SET LOG_ARCHIVE_DEST_2=”service=sid1o_fc LGWR SYNC REGISTER VALID_FOR=(online_logfile,primary_role)”
SET FAL_SERVER=”sid1o_fc”
SET FAL_CLIENT=”sid1o_js”
SET CONTROL_FILES=’/u02/oradata/sid1o/control01.ctl’,'/u03/oradata/sid1o/control02.ctl’;

Change the init.ora parameters related to redo transport and redo apply

On standby and primary

SQL> alter system set standby_file_management=AUTO scope=both;

System altered.

On Primary

SQL> alter system set fal_server=sid1o_js scope=both;

System altered.

SQL> alter system set fal_client=sid1o_fc scope=both;

System altered.

SQL> alter system set LOG_ARCHIVE_DEST_2=’SERVICE=sid1o_js LGWR SYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=sid1o_js’
scope=both; 2

System altered.

SQL> alter system set LOG_ARCHIVE_DEST_1=’LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=sid1o’;

System altered.

Shutdown the Standby and enable managed recovery (active standby mode)

SQL> shutdown immediate;
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 1043886080 bytes
Fixed Size 2160352 bytes
Variable Size 775948576 bytes
Database Buffers 260046848 bytes
Redo Buffers 5730304 bytes
Database mounted.
Database opened.

SQL> recover managed standby database using current logfile disconnect;
Media recovery complete.
SQL>

Check if the MRP process is running

SQL> !ps -ef |grep mrp
oracle 446526 1 0 10:59:01 – 0:00 ora_mrp0_sid1o

TEST

On Primary

SQL> conn system/xxx
Connected.
SQL> create table test_dr
2 (mydate date);

Table created.

SQL> insert into test_dr
2 values
3 (sysdate);

1 row created.

SQL> commit;

Commit complete.

SQL> alter system switch logfile;

System altered.

On Standby

SQL> conn system/xxx
Connected.
SQL> select to_char(mydate,’DD-MON-YY HH24:MI:SS’) from test_dr;

TO_CHAR(MYDATE,’DD-MON-YYHH
—————————
04-DEC-09 11:15:49

Posted in High Availability, Oracle 11g | Tagged: , , , | Leave a Comment »

Oracle Restart – new in 11g R2

Posted by Gavin Soorma on October 29, 2009

Similar to the clusterware processes in a RAC environment, in 11g R2 even for a standalone instance, functionality is now available to automatically restart components like the database, listener, ASM diskgroup, service etc in the event of their failure.

This feature is called Oracle Restart and it runs out of the Grid Infrastructure home which is separate from the database home.

Oracle Restart is managed by CRSCTL utility and it uses Oracle High Availability Services to start and stop the components managed by Oracle Restart.

Using Oracle Restart, we can stop or start all the components running out of a single Oracle Home with a single srvctl command as well as Oracle Restart will automatically start all components on machine reboot in the proper order taking into account dependencies like ASM instance and the managed database instances. So no more do we have to use the dbstart and dbstop scripts to manage this.

Oracle Restart can be extended to Data Guard configurations as well by integrating with Data Guard Broker to manage the start and stop of database services following a data guard role transition.

Let us examine some of the srvctl commands which can be executed in a single instance environment as well as see the use of the crsctl command – normally previously only used for RAC environments.

-bash-3.2$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA1.dg   ora....up.type ONLINE    ONLINE    redhat64
ora....ER.lsnr ora....er.type ONLINE    ONLINE    redhat64
ora....WARE.dg ora....up.type ONLINE    ONLINE    redhat64
ora.asm        ora.asm.type   ONLINE    ONLINE    redhat64
ora.cssd       ora.cssd.type  ONLINE    ONLINE    redhat64
ora.diskmon    ora....on.type ONLINE    ONLINE    redhat64
ora.eons       ora.eons.type  ONLINE    ONLINE    redhat64
ora.ons        ora.ons.type   ONLINE    ONLINE    redhat64

bash-3.2$ ./crsctl check has
CRS-4638: Oracle High Availability Services is online

-bash-3.2$ ./crsctl check css
CRS-4529: Cluster Synchronization Services is online

-bash-3.2$ ./srvctl start listener

-bash-3.2$ ps -ef |grep tns
oracle   14899     1  0 14:08 ?        00:00:00 /u02/app/oracle/product/11.2.0/grid/bin/tnslsnr LISTENER -inherit

-bash-3.2$ ./srvctl status listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): redhat64

-bash-3.2$ ./srvctl status asm
ASM is running on redhat64

-bash-3.2$ ./srvctl status diskgroup -g DATA1
Disk Group DATA1 is running on redhat64

-bash-3.2$ ./srvctl config asm
ASM home: /u02/app/oracle/product/11.2.0/grid
ASM listener: LISTENER
Spfile: +DATA1/asm/asmparameterfile/registry.253.700932479
ASM diskgroup discovery string: /dev/raw/raw*

In case a process dies unexpectedly, Oracle Restart will automatically start the process. In the example below, we kill the PMON process of the ora11gr2 instance and find that in less than 5 seconds the instance has been restarted by the Oracle High Availablity Service running in the background constantly monitoring the health of the managed components.

-bash-3.2$ ps -ef |grep pmon
oracle   12710     1  0 Oct22 ?        00:00:01 asm_pmon_+ASM
oracle   15752     1  0 14:14 ?        00:00:00 ora_pmon_ora11gr2
oracle   15975 18271  0 14:15 pts/2    00:00:00 grep pmon

-bash-3.2$ kill -9  15752

-bash-3.2$ ps -ef |grep pmon
oracle   12710     1  0 Oct22 ?        00:00:01 asm_pmon_+ASM
oracle   16679     1  0 14:22 ?        00:00:00 ora_pmon_ora11gr2
oracle   16775 18271  0 14:22 pts/2    00:00:00 grep pmon

In this example we stop all the components running out of a particular home by executing the srvctl stop home command. After stopping the database home, we find that only the ASM instance is running as it is managed by a seperate home which is the Grid Infrastructure home. After the database home is started, the database instance ora11gr2 also is automatically started.

[oracle@redhat346 ~]$ srvctl stop home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s /tmp/state

[oracle@redhat346 ~]$ ps -ef |grep pmon
oracle   19841     1  0 Sep18 ?        00:01:24 asm_pmon_+ASM
oracle   31843 27855  0 15:19 pts/1    00:00:00 grep pmon

[oracle@redhat346 ~]$ srvctl start home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s /tmp/state

[oracle@redhat346 ~]$ ps -ef |grep pmon
oracle   19841     1  0 Sep18 ?        00:01:24 asm_pmon_+ASM
oracle   26690     1  0 Oct08 ?        00:00:41 ora_pmon_emrep
oracle   32054     1  0 15:20 ?        00:00:00 ora_pmon_testdb
oracle   32344     1  0 15:20 ?        00:00:00 ora_pmon_ora11gr2
oracle   32620 27855  0 15:22 pts/1    00:00:00 grep pmon

Posted in Administration, High Availability, Oracle 11g release 2 | Tagged: , , , | 2 Comments »

11g Snapshot Standby for real time testing

Posted by Gavin Soorma on September 2, 2009

One of the very good new features in Oracle 11g is the Snapshot Standby database where we can basically convert the physical standby database into a read-write real time clone of the production database and we can then temporarily use this environment for carrying out any kind of development testing, QA type work or to test the impact of a proposed production change on an application.

Basically, the Snapshot Standby database in turn uses the Flashback Database technology to create a guaranteed restore point to which the database can be later flashed back to.

So we need to keep in mind that whatever limitations are present in the Flashback database technology, the same will also be inherent in the Snapshot database fetaure.

Let us illustrate this feature by using an example where we would like to test the impact of the change of some important optimizer related parameters on a key application query using the actual production data.

In production, the SQL query below is shown to be peforming a full table/partition scan and using hash joins. We would like to see if a nested loop join and index usage will prove beneficial to this SQL query that is used frequently by the application.

explain plan for
SELECT c.cust_last_name, sum(s.amount_sold) AS dollars,sum(s.quantity_sold) as quantity
FROM sales s , customers c, products p
WHERE c.cust_id = s.cust_id
AND s.prod_id = p.prod_id
AND c.cust_state_province IN ('Dublin','Galway')GROUP BY c.cust_last_name
/

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
Plan hash value: 1248656060

------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             |   519 | 23355 |   935   (6)| 00:00:12 |       |       |
|   1 |  HASH GROUP BY         |             |   519 | 23355 |   935   (6)| 00:00:12 |       |       |
|*  2 |   HASH JOIN            |             | 99644 |  4378K|   927   (5)| 00:00:12 |       |       |
|   3 |    INDEX FULL SCAN     | PRODUCTS_PK |    72 |   288 |     1   (0)| 00:00:01 |       |       |
|*  4 |    HASH JOIN           |             | 99644 |  3989K|   924   (5)| 00:00:12 |       |       |
|*  5 |     TABLE ACCESS FULL  | CUSTOMERS   |   766 | 18384 |   408   (1)| 00:00:05 |       |       |

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
|   6 |     PARTITION RANGE ALL|             |   918K|    14M|   507   (6)| 00:00:07 |     1 |    28 |
|   7 |      TABLE ACCESS FULL | SALES       |   918K|    14M|   507   (6)| 00:00:07 |     1 |    28 |
------------------------------------------------------------------------------------------------------

We will now convert the physical standby database to a Snapshot Standby database and note that after the conversion takes place, internally a guaranteed restore point has been created and the database has been opened in read-write mode. Also note the DATABASE_ROLE column.

SQL>  ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;

Database altered.

SQL> shutdown immediate;
ORA-01507: database not mounted

ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area  521936896 bytes
Fixed Size                  2139784 bytes
Variable Size             415238520 bytes
Database Buffers           96468992 bytes
Redo Buffers                8089600 bytes
Database mounted.
Database opened.

SQL> select NAME,GUARANTEE_FLASHBACK_DATABASE from  v$restore_point;

NAME                                                         GUA
------------------------------------------------------------ ---
SNAPSHOT_STANDBY_REQUIRED_09/02/2009 10:12:21                YES

SQL> select open_mode,database_role from v$database;

OPEN_MODE  DATABASE_ROLE
---------- ----------------
READ WRITE SNAPSHOT STANDBY

We will now make some changes to some optimizer related parameters:

SQL> alter system set optimizer_index_cost_adj=20;

System altered.

SQL> alter system set optimizer_index_caching=10;

System altered.

SQL> alter system set optimizer_mode=FIRST_ROWS;

System altered.

We run the same query again and we can see that the plan has changed and the optimizer is now using the index scan with nested loop joins.

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |   519 | 23355 |   588  (11)| 00:00:08 |
|   1 |  HASH GROUP BY                        |                      |   519 | 23355 |   588  (11)| 00:00:08 |
|   2 |   NESTED LOOPS                        |                      | 99644 |  4378K|   580  (10)| 00:00:07 |
|*  3 |    HASH JOIN                          |                      | 99644 |  3989K|   577  (10)| 00:00:07 |
|*  4 |     TABLE ACCESS BY INDEX ROWID       | CUSTOMERS            |   766 | 18384 |   241   (2)| 00:00:03 |
|   5 |      BITMAP CONVERSION TO ROWIDS      |                      |       |       |            |          |
|   6 |       BITMAP INDEX FULL SCAN          | CUSTOMERS_GENDER_BIX |       |       |            |          |
|   7 |     PARTITION RANGE ALL               |                      |   918K|    14M|   326  (12)| 00:00:04 |
|   8 |      TABLE ACCESS BY LOCAL INDEX ROWID| SALES                |   918K|    14M|   326  (12)| 00:00:04 |
|   9 |       BITMAP CONVERSION TO ROWIDS     |                      |       |       |            |          |
|  10 |        BITMAP INDEX FULL SCAN         | SALES_PROMO_BIX      |       |       |            |          |
|* 11 |    INDEX UNIQUE SCAN                  | PRODUCTS_PK          |     1 |     4 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

While the original physical standby database has been now converted to snapshot database, some changes are happening on the Primary database and those changes are shipped to the standby site but not yet applied. They will accumulate on the standby site and will be applied after the snapshot standby database is converted back to a physical standby database.

SQL> create table mytest as select * from sales;

Table created.

SQL> update mytest set prod_id=1;

918843 rows updated.

On the standby site, we will now convert the shapshot standby database to the original mode of physical standby database in Active Data Guard mode.

Note that after the physical standby database is opened, changes that were made on the Primary database while it was open in shapshot standby mode are now applied as well.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area  521936896 bytes
Fixed Size                  2139784 bytes
Variable Size             415238520 bytes
Database Buffers           96468992 bytes
Redo Buffers                8089600 bytes
Database mounted.
SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;

Database altered.

SQL> shutdown immediate;
ORA-01507: database not mounted

ORACLE instance shut down.

SQL> startup mount;
ORACLE instance started.

Total System Global Area  521936896 bytes
Fixed Size                  2139784 bytes
Variable Size             415238520 bytes
Database Buffers           96468992 bytes
Redo Buffers                8089600 bytes
Database mounted.
SQL>  recover managed standby database disconnect;
Media recovery complete.
SQL> recover managed standby database cancel;
Media recovery complete.
SQL> alter database open;

Database altered.

Elapsed: 00:01:06.29
SQL> recover managed standby database using current logfile disconnect;
Media recovery complete.

SQL> select count(*) from sh.mytest;

  COUNT(*)
----------
    918843

SQL> select distinct prod_id from sh.mytest;

   PROD_ID
----------
         1

Posted in High Availability, Oracle 11g | Tagged: , , | Leave a Comment »

Using FLASHBACK to rollback a TRUNCATE

Posted by Gavin Soorma on August 18, 2009

This scenario will show how we can use a combination of FLASHBACK database and also recovery to take a database back in time to undo a TRUNCATE operation and then roll forward the database after the flashback operation to bring it to the current point in time.

INSERT ROWS INTO TABLE MYOBJ – THIS WILL BE TRUNCATED

SQL> insert into scott.myobj select * from all_objects;

50496 rows created.

SQL> /

50496 rows created.

SQL> select count(*) from scott.myobj;

  COUNT(*)
----------
    100992

OBTAIN THE CURRENT SCN – FLASHBACK WILL HAPPEN TO THIS SCN

SQL> select current_scn from v$database;

          CURRENT_SCN
---------------------
          15633908021

TRUNCATE THE TABLE

SQL> truncate table scott.myobj;

Table truncated.

SQL> select count(*) from scott.myobj;

  COUNT(*)
----------
         0


AT THE SAME TIME OTHER CHANGES ARE HAPPENING IN THE DATABASE AND THESE CHANGES WILL BE RECOVERED AFTER THE FLASHBACK IS DONE

SQL> insert into scott.myobj2 select * from scott.myobj2;

356874 rows created.

SQL> /

713748 rows created.

SQL> commit;

Commit complete.

SHUTDOWN THE DATABASE AND PERFORM THE FLASHBACK TO THE SCN BEFORE THE TRUNCATE WAS DONE

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area  469762048 bytes
Fixed Size                  2084880 bytes
Variable Size             377491440 bytes
Database Buffers           83886080 bytes
Redo Buffers                6299648 bytes
Database mounted.

SQL> FLASHBACK DATABASE TO SCN 15633908021;

Flashback complete.

OPEN THE DATABASE IN READ ONLY MODE AND EXPORT THE TABLE THAT WAS TRUNCATED EARLIER. THIS TABLE WILL BE IMPORTED AFTER THE RECOVERY IS DONE

SQL> alter database open read only;

Database altered.

SQL>  select count(*) from scott.myobj;

  COUNT(*)
----------
         100992

SQL> quit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
crashdb:/u03/oradata/crashdb/arch> exp file=scott.dmp tables=myobj

Export: Release 10.2.0.4.0 - Production on Fri Feb 6 09:53:00 2009

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Username: scott
Password:

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses WE8ISO8859P1 character set (possible charset conversion)

About to export specified tables via Conventional Path ...
. . exporting table                          MYOBJ          100992 rows exported
Export terminated successfully without warnings.

NOW SHUTDOWN THE DATABASE,STARTUP MOUNT AND PERFORM THE RECOVERY

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area  696254464 bytes
Fixed Size                  2086616 bytes
Variable Size             184551720 bytes
Database Buffers          503316480 bytes
Redo Buffers                6299648 bytes
Database mounted.
SQL> recover database;
Media recovery complete.
SQL> alter database open;

Database altered.

AS EXPECTED TABLE MYOBJ WHICH WAS TRUNCATED NOW AGAIN HAS 0 ROWS AFTER THE RECOVERY – WE CAN IMPORT THE DUMP WE TOOK AFTER THE FLASHBACK

SQL> select count(*) from scott.myobj;

  COUNT(*)
----------
         0

CONFIRM THAT OTHER COMMITTED CHANGES IN THE DATABASE HAVE BEEN RECOVERED

SQL> select count(*) from scott.myobj2;

  COUNT(*)
----------
  713748

Posted in Backup and Recovery, High Availability | Tagged: , , | 1 Comment »

11g RAC Transparent Application Failover (TAF)

Posted by Gavin Soorma on August 14, 2009

This example illustrates Transparent Application Failover (TAF) in a two node AIX 11g RAC configuration.

The database name is racdb and the two instances are racdb1 and racdb2.

Node 1 – middba1 which hosts instance racdb1.
Node2 – middba2 which hosts instance racdb2

We have defined two services racdb1 and racdb2. For service racdb1 the preferred instance is racdb1 and
for service racdb2 the preferred instance is racdb2.

The following srvctl commands illustrates the same.

middba1:/u01/oracle/dump> srvctl status service -d racdb -s racdb1
Service racdb1 is running on instance(s) racdb1

middba1:/u01/oracle/dump> srvctl config service -d racdb -s racdb1
racdb1 PREF: racdb1 AVAIL: racdb2

middba1:/u01/oracle/dump> srvctl status service -d racdb -s racdb2
Service racdb2 is running on instance(s) racdb2

middba1:/u01/oracle/dump> srvctl config service -d racdb -s racdb2
racdb2 PREF: racdb2 AVAIL: racdb1

From a SQL*PLUS client, we establish a session as user SYSTEM using the service racdb1. Note the machine name where this service is running from.

testdb:/u01/oracle> sqlplus system/xxx@racdb1

SQL*Plus: Release 11.1.0.6.0 - Production on Fri Aug 14 10:37:03 2009

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
With the Partitioning, Real Application Clusters and Real Application Testing options

SQL> select host_name from v$instance;

HOST_NAME
----------------------------------------------------------------
middba1

We then execute a long running SELECT statement

SQL> select * from sh.sales;

While this statement is running we reboot the node middba1 where this service is currently connected to.

We will note that the screen will momentarily freeze while the service gets relocated to the other functional node racdb2 and then the query continues from where it got interrupted. The session will not get disconnected – so for the end user it is transparent from which instance the query is being serviced.

If we check the status of the cluster resources, we will find that the VIP from node middba1 has now relocated to middba2. Other resources that were originally running from node middba1 are now offline.

middba2:/u01/oracle> crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    OFFLINE
ora....A1.lsnr application    ONLINE    OFFLINE
ora....ba1.gsd application    ONLINE    OFFLINE
ora....ba1.ons application    ONLINE    OFFLINE
ora....ba1.vip application    ONLINE    ONLINE    middba2
ora....SM2.asm application    ONLINE    ONLINE    middba2
ora....A2.lsnr application    ONLINE    ONLINE    middba2
ora....ba2.gsd application    ONLINE    ONLINE    middba2
ora....ba2.ons application    ONLINE    ONLINE    middba2
ora....ba2.vip application    ONLINE    ONLINE    middba2
ora.racdb.db   application    ONLINE    ONLINE    middba2
ora....cdb1.cs application    ONLINE    ONLINE    middba2
ora....b1.inst application    ONLINE    OFFLINE
ora....db1.srv application    ONLINE    ONLINE    middba2
ora....cdb2.cs application    ONLINE    ONLINE    middba2
ora....b2.inst application    ONLINE    ONLINE    middba2
ora....db2.srv application    ONLINE    ONLINE    middba2

If we check the status of the service, we will find that the service racdb1 is now running on node middba2 connecting to the instance racdb2.

middba2:/u01/oracle> srvctl status service -d racdb -s racdb1
Service racdb1 is running on instance(s) racdb2

Once the node middba1 which was rebooted finally does come online, we will find that the VIP will relocate
to its original node middba1 once the cluster services are brought online after the reboot.

middba2:/u01/oracle> crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    middba1
ora....A1.lsnr application    ONLINE    ONLINE    middba1
ora....ba1.gsd application    ONLINE    ONLINE    middba1
ora....ba1.ons application    ONLINE    ONLINE    middba1
ora....ba1.vip application    ONLINE    ONLINE    middba1
ora....SM2.asm application    ONLINE    ONLINE    middba2
ora....A2.lsnr application    ONLINE    ONLINE    middba2
ora....ba2.gsd application    ONLINE    ONLINE    middba2
ora....ba2.ons application    ONLINE    ONLINE    middba2
ora....ba2.vip application    ONLINE    ONLINE    middba2
ora.racdb.db   application    ONLINE    ONLINE    middba2
ora....cdb1.cs application    ONLINE    ONLINE    middba2
ora....b1.inst application    ONLINE    ONLINE    middba1
ora....db1.srv application    ONLINE    ONLINE    middba2
ora....cdb2.cs application    ONLINE    ONLINE    middba2
ora....b2.inst application    ONLINE    ONLINE    middba2
ora....db2.srv application    ONLINE    ONLINE    middba2

Note however, that the service racdb1 will continue running on node middba2 (instance racdb2)even though the original node which it was running on middba1 has been brought online.

We will have to relocate the service back to its original node using the srvctl command

middba1:/u01/oracle> srvctl relocate service -d racdb -s racdb1 -i racdb2 -t racdb1

Posted in High Availability, Oracle 11g, Real Application Clusters | Tagged: , , , | Leave a Comment »

FLASHBACK DATABASE TO SCN

Posted by Arjun Raja on August 4, 2009

FLASHBACK DATABASE TO SCN

Time to FLASHBACK DATABASE TO – 06:30 a.m on 31 July.

Get oldest flashback time –

SELECT OLDEST_FLASHBACK_SCN,to_char(OLDEST_FLASHBACK_TIME,'DD-MON-YYYY HH24:MI:SS') FROM V$FLASHBACK_DATABASE_LOG;

If there are logs up until 06:30 on 31 Jul, you can flashback.

Get the SCN number to flashback to -

col first_change# format 99999999999
select first_change# ,to_char(first_time,'DD-MON-YYYY HH24:MI:SS'),sequence# from V$LOG_HISTORY ORDER BY 1 ;

23856899540 31-JUL-2009 04:30:09 119695
23856965501 31-JUL-2009 06:30:10 119696
23857031498 31-JUL-2009 08:30:09 119697
23859233947 31-JUL-2009 10:00:29 119704
23859242647 31-JUL-2009 10:31:34 119705

Note down the SCN at the time you want to flashback to which is the FIRST_CHANGE# in the V$LOG_HISTORY view –

Example here the SCN is 23856965501 at 06:30 – the time you want to flashback to.

Restore a few archivelogs prior to and after the flashback time – the log sequence is in the last column above….

Ex – to restore archivelogs from 119695 to 119702

rman target / catalog rman/rman@rman

RMAN > restore archivelog from sequence 119695 until sequence 119705;

Once the logs are restored make sure no automatic RMAN backup job of archivelogs run which delete old archivelogs from disk –

Next flashback the database –

SHUTDOWN IMMEDIATE;

STARTUP MOUNT;

FLASHBACK DATABASE TO SCN 23856965501;

Once FLASHBACK is complete-

Verify the data before opening with a resetlogs -

ALTER DATABASE OPEN READ ONLY;

If satisfied with the data -

Shutdown immediate;
Startup mount ;
Alter database open resetlogs;

Posted in High Availability | Tagged: | Leave a Comment »

Flashback Database after a Data Guard Failover

Posted by Gavin Soorma on July 22, 2009

After a data guard failover operation has been performed, instead of recreating the original primary database as the new standby database, we can use the FLASHBACK DATABASE to SCN command instead. This is particulary important in case the size of the primary database would mean a long backup and restore RMAN operation to recreate the standby during which we have no DR in place.

In this test case we will simulate a failure on the Primary node by shutting down the current Primary database.

We will then perform the standard Failover procedure. Before performing a Failover, we first try and ensure that there is no archive log gap between the Primary and Standby databases. If there is one (and if the Primary server is still accessible), we can register any ‘missing’ archivelog files using the ALTER DATABASE REGISTER PHYSICAL LOGFILE command.

SQL> SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;

no rows selected

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;

Database altered.

These are the relevant lines taken from the alert log ….

Identified End-Of-Redo for thread 1 sequence 19
Terminal Recovery: Updated next available block for thread 1 sequence 19 lno 4 to value 92
Wed Jul 22 12:43:53 2009
Incomplete recovery applied all redo ever generated.
Recovery completed through change 5003771
Wed Jul 22 12:43:53 2009
MRP0: Media Recovery Complete (testdb)
 ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

Database altered.

The alert log will confirm the switchover operation ….

Standby terminal recovery start SCN: 5003412
RESETLOGS after complete recovery through change 5003771
Standby became primary SCN: 5003411

Complete the switchover operation by shutting down and starting the database

SQL> shutdown immediate;
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area  264241152 bytes
Fixed Size                  2083304 bytes
Variable Size             142607896 bytes
Database Buffers          113246208 bytes
Redo Buffers                6303744 bytes
Database mounted.
Database opened.

Note the SCN at which the former standby has now become the primary

SQL> SELECT TO_CHAR(STANDBY_BECAME_PRIMARY_SCN) FROM V$DATABASE;

TO_CHAR(STANDBY_BECAME_PRIMARY_SCN)
----------------------------------------
5003411

On the original primary, we will now flashback the database using the SCN obtained above.

SQL> flashback database to scn 5003411;

Flashback complete.

from the alert log ….

Incomplete Recovery applied until change 5003412
Flashback Media Recovery Complete
Completed: flashback database to scn 5003411

We will now turn off the flashback logging to delete existing falshback log files. Since we will be restoring a standby control file as well in the next step, these flashback logs will become obsolete.

SQL> ALTER DATABASE FLASHBACK OFF;

Database altered.

From the alert log we can see that the flashback logs are being deleted.

Stopping background process RVWR
Deleted Oracle managed file /u02/oradata/testdb/TESTDB_DG/flashback/o1_mf_56dydp02_.flb
.......
Deleted Oracle managed file /u02/oradata/testdb/TESTDB_DG/flashback/o1_mf_56f0pz1n_.flb
Deleted Oracle managed file /u02/oradata/testdb/TESTDB_DG/flashback/o1_mf_56f0q51v_.flb
Deleted Oracle managed file /u02/oradata/testdb/TESTDB_DG/flashback/o1_mf_56f0qmw4_.flb
Deleted Oracle managed file /u02/oradata/testdb/TESTDB_DG/flashback/o1_mf_56f0qsk9_.flb
Deleted Oracle managed file /u02/oradata/testdb/TESTDB_DG/flashback/o1_mf_56f0r312_.flb
Deleted Oracle managed file /u02/oradata/testdb/TESTDB_DG/flashback/o1_mf_56f0rkro_.flb
Deleted Oracle managed file /u02/oradata/testdb/TESTDB_DG/flashback/o1_mf_56f0rtqq_.flb
Flashback Database Disabled

We will now create a standby control file and overwrite the existing control files with the standby control file.

SQL>  ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/tmp/control01.ctl';

Database altered.

SQL> shutdown immediate;
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
[DEV] testdb:/tmp > cp control01.ctl /u02/oradata/testdb/control01.ctl
[DEV] testdb:/tmp > cp control01.ctl /u02/oradata/testdb/control02.ctl
[DEV] testdb:/tmp > cp control01.ctl /u02/oradata/testdb/control03.ctl

We then mount the standby database, turn on flashback logging and the managed recovery process on the standby database

SQL> startup mount;
ORACLE instance started.

Total System Global Area  390070272 bytes
Fixed Size                  2084272 bytes
Variable Size             364905040 bytes
Database Buffers           16777216 bytes
Redo Buffers                6303744 bytes
Database mounted.
SQL> alter database flashback on;

Database altered.

SQL> recover managed standby database disconnect;
Media recovery complete.

Posted in High Availability | Tagged: , , , | 1 Comment »

Data Guard Log Shipping Report

Posted by Gavin Soorma on July 16, 2009

If we have a number of Data Guard physical standby database environments to manage, the following report can help us quickly identify the log transported and log applied status of the Primary and Standby databases in our environment and if any standby database is lagging behind the primary as well.

The script can also be customised so as to email an alert notification if the standby and the primary go out of sync by say 5 log files. Drop me a line if you need that customisation script.

This report is based on a Unix shell script (check_logship.sh) which in turn calls a SQL script (check_logship.sql).

The script requires a user MONITOR to be created in each target database with the CONNECT and SELECT ANY DICTIONARY privileges. We also have a config file (in our case bw_dg.lst) which will contain the list of all the TNS aliases of the Primary databases which we need to monitor.

[PROD] emrep:/u01/oracle/scripts > ./check_logship.sh

#######################################################################################
	Data Guard Log Shipping Summary Report:  Thu Jul 16 14:22:02 WAUST 2009
#######################################################################################

DB_NAME  HOSTNAME     LOG_ARCHIVED LOG_APPLIED APPLIED_TIME  LOG_GAP
-------- ------------ ------------ ----------- ------------  -------

GENPRD   CBDORCA201          16742       16742 16-JUL/14:12       0

CPSPRD   PRDU009N1           11494       11494 16-JUL/14:10       0

LN1P     CBDORCA101          51173       51171 16-JUL/12:25       2

LA1P     CBDORCA105          76971       76970 16-JUL/13:10       1

#######################################################################################

check_logship.sql

SET PAGESIZE 124
SET HEAD OFF
COL DB_NAME FORMAT A8
COL HOSTNAME FORMAT A12
COL LOG_ARCHIVED FORMAT 999999
COL LOG_APPLIED FORMAT 999999
COL LOG_GAP FORMAT 9999
COL APPLIED_TIME FORMAT A12
SELECT DB_NAME, HOSTNAME, LOG_ARCHIVED, LOG_APPLIED,APPLIED_TIME,
LOG_ARCHIVED-LOG_APPLIED LOG_GAP
FROM
(
SELECT NAME DB_NAME
FROM V$DATABASE
),
(
SELECT UPPER(SUBSTR(HOST_NAME,1,(DECODE(INSTR(HOST_NAME,’.'),0,LENGTH(HOST_NAME),
(INSTR(HOST_NAME,’.')-1))))) HOSTNAME
FROM V$INSTANCE
),
(
SELECT MAX(SEQUENCE#) LOG_ARCHIVED
FROM V$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED=’YES’
),
(
SELECT MAX(SEQUENCE#) LOG_APPLIED
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED=’YES’
),
(
SELECT TO_CHAR(MAX(COMPLETION_TIME),’DD-MON/HH24:MI’) APPLIED_TIME
FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED=’YES’
);

check_logship.sh

if [ -f /tmp/dataguard1.out ]
then
rm /tmp/dataguard1.out
fi

if [ -f /tmp/dataguard2.out ]
then
rm /tmp/dataguard2.out
fi

export SCRPT=/u01/app/scripts

for i in `cat $SCRPT/bw_dg.lst`
do
sqlplus -s monitor/xxx@$i <> /tmp/dataguard2.out
@/$SCRPT/check_logship.sql
EOF
echo “#######################################################################################” > /tmp/dataguard1.out
echo ” Data Guard Log Shipping Summary Report: `date ` ” >> /tmp/dataguard1.out
echo “#######################################################################################” >> /tmp/dataguard1.out
echo >> /tmp/dataguard1.out
echo “DB_NAME HOSTNAME LOG_ARCHIVED LOG_APPLIED APPLIED_TIME LOG_GAP” >> /tmp/dataguard1.out
echo “——– ———— ———— ———– ———— ——-” >> /tmp/dataguard1.out
cat /tmp/dataguard2.out >> /tmp/dataguard1.out
done
cat /tmp/dataguard1.out

Posted in High Availability, Unix | Tagged: , , , , , | Leave a Comment »

11g Snapshot Standby Database

Posted by Gavin Soorma on July 7, 2009

Oracle 11g introduces the Snapshot Standby database which essentially is an updateable standby database which has been created from a physical standby database.

We can convert a physical standby database to a snapshot standby database, do some kind of testing on a database which is a read write copy of the current primary or production database and then finally revert it to it’s earlier state as a physical standby database.

While the snapshot standby database is open in read-write mode, redo is being received from the primary database, but is not applied.

After converting it back to a physical standby database, it is resynchronized with the primary by applying the accumalated redo data which was earlier shipped from the primary database but not applied.

Using a snapshot standby, we are able to do real time application testing using near real time production data. Very often we are required to do production clones for the purpose of testing. But using snapshot standby databases we can meet the same requirement sparing the effort,time,resources and disk space.

Initially we have a Data Guard broker configuration in place where APEX is the Primary database
and APEXDG is the Standby database.

DGMGRL> show configuration

Configuration
  Name:                gavin
  Enabled:             YES
  Protection Mode:     MaxAvailability
  Databases:
    apex   - Primary database
    apexdg - Physical standby database

Fast-Start Failover: DISABLED

Current status for "gavin":
SUCCESS

We now convert the physical standby database to a snapshot standby

DGMGRL> convert database 'apexdg' to snapshot standby;
Converting database "apexdg" to a Snapshot Standby database, please wait...
Database "apexdg" converted successfully

DGMGRL> show configuration

Configuration
  Name:                gavin
  Enabled:             YES
  Protection Mode:     MaxAvailability
  Databases:
    apex   - Primary database
    apexdg - Snapshot standby database

Fast-Start Failover: DISABLED

Current status for "gavin":
SUCCESS

We now connect to the snapshot standby database which is open in read-write mode and we create a table 

apex:/u01/oracle/scripts> export ORACLE_SID=apexdg
apexdg:/u01/oracle/scripts> sqlplus sh/SH

SQL*Plus: Release 11.1.0.6.0 - Production on Fri Sep 19 09:33:29 2008

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create table test_snapshot
  2  as select * from all_objects;

Table created.

SQL> select count(*) from test_snapshot;

  COUNT(*)
----------
     56467

At this point in time, we revert back to the original state

DGMGRL> convert database 'apexdg' to physical standby;

Converting database "apexdg" to a Physical Standby database, please wait...
Operation requires shutdown of instance "apexdg" on database "apexdg"
Shutting down instance "apexdg"...
Database closed.
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "apexdg" on database "apexdg"
Starting instance "apexdg"...
ORACLE instance started.
Database mounted.
Continuing to convert database "apexdg" ...
Operation requires shutdown of instance "apexdg" on database "apexdg"
Shutting down instance "apexdg"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "apexdg" on database "apexdg"
Starting instance "apexdg"...
ORACLE instance started.
Database mounted.
Database "apexdg" converted successfully
DGMGRL>

We now connect to the physical standby database and see that all the changes we made while
the standby database was functioning as a snapshot standby have been rolled back. The table
which we created in the snapshot standby database is not physically present in the physical
standby database.

apexdg:/u01/oracle/scripts> sqlplus / as sysdba
SQL*Plus: Release 11.1.0.6.0 - Production on Fri Sep 19 09:48:38 2008

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter database open read only;

Database altered.

SQL> conn sh/SH@apexdg
Connected.

SQL> select count(*) from test_snapshot;
select count(*) from test_snapshot
                     *
ERROR at line 1:
ORA-00942: table or view does not exist

Posted in High Availability, Oracle 11g | Tagged: , , , | Leave a Comment »

Data Guard Switchover Unix shell script

Posted by Gavin Soorma on July 6, 2009

The following Unix shell scripts can be used to automate the Data Guard Switchover process of a physical standby database.

It is very important that the scripts are run in the correct order and on right machine.

These scripts are based on a few customisations. The *.sh scripts are located in a directory called “/var/opt/oracle/dataguard” and there is another file called “set$ORACLE_SID” which is located under “/var/opt/oracle/cronjobs”. This script sets the environment for the individual Oracle instance like the $ORACLE_SID and the $ORACLE_HOME.

On the machine where the Primary Database is running we need to run the following scripts

$ cd /var/opt/oracle/dataguard
$ ./pre_switchover_check.sh
$ ./make_me_standby.sh

On the machine where the Standby Database is running we need to run the following script:

$ cd /var/opt/oracle/dataguard
$ ./make_me_primary.sh

After the switchover is completed, we need to run the following script on the machine where the former Primary (now new Standby) database is running:

$ cd /var/opt/oracle/dataguard
$ ./start_recovery.sh

pre_switchover_check.sh

!/bin/ksh

if [ "$1" = "" ]
then
echo ""
echo "##################################################################"
echo "# PLEASE ENTER ORACLE_SID AS THE FIRST COMMAND LINE ARGUMENT     #"
echo "#      SCRIPT USAGE : pre_switchover_check.sh        #"
echo "##################################################################"
echo ""
exit
fi

ORACLE_SID=`echo $1 |tr '[A-Z]' '[a-z]'`; export ORACLE_SID;

ls -l /var/opt/oracle/cronjobs/set$ORACLE_SID > /dev/null
if [ $? != 0 ]
then
echo ""
echo "##################################################################"
echo "#           ENVIRONMENT SETTING SET FILE NOT FOUND...!!!         #"
echo "#           /var/opt/oracle/cronjobs/set$ORACLE_SID              #"
echo "##################################################################"
echo ""
exit
fi

. /var/opt/oracle/cronjobs/set$ORACLE_SID

DB=`echo $1 |tr '[a-z]' '[A-Z]'`; export DB;

echo ""
echo "##################################################################"
echo "#            PERFORMING PRE-SWITCHOVER CHECKS FOR $DB            #"
echo "##################################################################"
echo ""

sqlplus  -s /nolog |&
print -p  "connect / as sysdba"
print -p  "set feedback off pause off pagesize 0 heading off verify off linesize 500 term off "
print -p  "set sqlprompt ''"
read  -p   JUNK
print -p  "SELECT DATABASE_ROLE DBROLE FROM V\$DATABASE;"
read  -p   DBROLE
print -p  "SELECT COUNT(FILE#) FILESTAT FROM V\$DATAFILE WHERE STATUS IN ('RECOVER', 'OFFLINE');"
read  -p   FILESTAT
print -p  "SELECT COUNT(FILE#) BKPSTAT FROM V\$BACKUP WHERE STATUS = 'ACTIVE';"
read  -p   BKPSTAT
print -p  "SELECT LOG_ARCHIVED-LOG_APPLIED LOGGAP
FROM (SELECT MAX(SEQUENCE#) LOG_ARCHIVED
FROM V\$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES'
AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#)
FROM V\$ARCHIVED_LOG)),
(SELECT MAX(SEQUENCE#) LOG_APPLIED
FROM V\$ARCHIVED_LOG WHERE DEST_ID=2
AND APPLIED='YES'
AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#) FROM V\$ARCHIVED_LOG)) ;"
read  -p   LOGGAP
print -p  "EXIT"

echo "CHECKING CURRENT DATABASE ROLE..."
if [ "$DBROLE" = "PHYSICAL STANDBY"  ]
then
echo ""
echo "##################################################################"
echo "# ERROR ! ! ! NOT  A PRIMARY DATABASE, PLEASE RUN IT ON PRIMARY  #"
echo "##################################################################"
echo ""
exit
else
echo ""
echo "   OK   "
echo ""
fi

echo "CHECKING LOG GAP BETWEEN PRIMARY & STANDBY..."
if [ $LOGGAP -ne 0 ]
then
echo ""
echo "######################################################################"
echo "# ERROR ! ! !  PRIMARY & STANDBY ARE NOT IN SYNC, LOG GAP = $LOGGAP  #"
echo "######################################################################"
echo ""
exit
else
echo ""
echo "   OK   "
echo ""
fi

echo "CHECKING FILES OFFLINE OR RECOVER STATUS..."
if [ $FILESTAT -ne 0 ]
then
echo ""
echo "##################################################################"
echo "#  ERROR ! ! !  ONE OR MORE FILE/S OFFLINE OR IN RECOVER STATUS  #"
echo "##################################################################"
echo ""
exit
else
echo ""
echo "   OK   "
echo ""
fi

echo "CHECKING FILES IN BACKUP MODE..."
if [ $BKPSTAT -ne 0 ]
then
echo ""
echo "##################################################################"
echo "#      ERROR ! ! !  ONE OR MORE FILE/S ARE IN BACKUP MODE        #"
echo "##################################################################"
echo ""
exit
else
echo ""
echo "   OK   "
echo ""
fi

echo ""
echo "##################################################################"
echo "# ALL PRE-SWITCHOVER CHECKS SUCCEEDED... PROCEED WITH SWITCHOVER #"
echo "##################################################################"
echo ""
make_me_standby.sh

#!/bin/ksh

if [ "$1" = "" ]
then
echo ""
echo "##################################################################"
echo "# PLEASE ENTER ORACLE_SID AS THE FIRST COMMAND LINE ARGUMENT     #"
echo "#        SCRIPT USAGE : make_me_standby.sh           #"
echo "##################################################################"
echo ""
exit
fi

ORACLE_SID=`echo $1 |tr '[A-Z]' '[a-z]'`; export ORACLE_SID;

ls -l /var/opt/oracle/cronjobs/set$ORACLE_SID > /dev/null
if [ $? != 0 ]
then
echo ""
echo "##################################################################"
echo "#           ENVIRONMENT SETTING SET FILE NOT FOUND...!!!         #"
echo "#           /var/opt/oracle/cronjobs/set$ORACLE_SID              #"
echo "##################################################################"
echo ""
exit
fi

. /var/opt/oracle/cronjobs/set$ORACLE_SID

DB=`echo $1 |tr '[a-z]' '[A-Z]'`; export DB;

sqlplus  -s /nolog |&
print -p  "connect / as sysdba"
print -p  "set feedback off pause off pagesize 0 heading off verify off linesize 500 term off"
print -p  "set sqlprompt ''"
read  -p   JUNK
print -p  "SELECT DATABASE_ROLE DBROLE FROM V\$DATABASE;"
read  -p   DBROLE
print -p  "SELECT COUNT(FILE#) FILESTAT FROM V\$DATAFILE WHERE STATUS IN ('RECOVER', 'OFFLINE');"
read  -p   FILESTAT
print -p  "SELECT COUNT(FILE#) BKPSTAT FROM V\$BACKUP WHERE STATUS = 'ACTIVE';"
read  -p   BKPSTAT
print -p  "SELECT LOG_ARCHIVED-LOG_APPLIED LOGGAP
FROM (SELECT MAX(SEQUENCE#) LOG_ARCHIVED
FROM V\$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES'
AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#)
FROM V\$ARCHIVED_LOG)),
(SELECT MAX(SEQUENCE#) LOG_APPLIED
FROM V\$ARCHIVED_LOG WHERE DEST_ID=2
AND APPLIED='YES'
AND RESETLOGS_CHANGE#=(SELECT MAX(RESETLOGS_CHANGE#) FROM V\$ARCHIVED_LOG)) ;"
read  -p   LOGGAP
print -p  "EXIT"

if [ "$DBROLE" = "PHYSICAL STANDBY"  ]
then
echo ""
echo "##################################################################"
echo "# ERROR ! ! ! NOT  A PRIMARY DATABASE, PLEASE RUN IT ON PRIMARY  #"
echo "##################################################################"
echo ""
exit
fi

if [ $LOGGAP -ne 0 ]
then
echo ""
echo "######################################################################"
echo "# ERROR ! ! !  PRIMARY & STANDBY ARE NOT IN SYNC, LOG GAP = $LOGGAP  #"
echo "######################################################################"
echo ""
exit
fi

echo "CHECKING FILES OFFLINE OR RECOVER STATUS..."
if [ $FILESTAT -ne 0 ]
then
echo ""
echo "##################################################################"
echo "#  ERROR ! ! !  ONE OR MORE FILE/S OFFLINE OR IN RECOVER STATUS  #"
echo "##################################################################"
echo ""
exit
fi

if [ $BKPSTAT -ne 0 ]
then
echo ""
echo "##################################################################"
echo "#      ERROR ! ! !  ONE OR MORE FILE/S ARE IN BACKUP MODE        #"
echo "##################################################################"
echo ""
exit
fi

echo ""
echo "##################################################################"
echo "#              ALL PRE-SWITCHOVER CHECKS SUCCEEDED...            #"
echo "       SWITCHING $DB TO STANDBY ROLE, PLEASE CONFIRM ...          "
echo "##################################################################"
echo ""
echo ""
echo "Continue? - Please enter 'Y' to proceed & 'N' to exit"
read ans
if [ "$ans" = 'Y' -o "$ans" = 'y' ]
then
sqlplus -s  /nolog < /tmp/make_me_standby.log
connect / as sysdba;
startup force;
alter database commit to switchover to standby with session shutdown;
shutdown immediate;
startup nomount;
alter database mount standby database;
select database_role from v\$database;
EOF
cat /tmp/make_me_standby.log
$ORACLE_HOME/bin/lsnrctl stop PRIMARY_$DB
else
echo "Quitting ....."
exit
fi

echo ""
echo "####################################################################"
echo "# PLEASE CONFIRM THAT THE STATUS OF DATABASE IS NOW "STANDBY"      #"
echo "#                                                                  #"
echo "# On OLD STANDBY Host please run the following script:         #"
echo "# /var/opt/oracle/dataguard/make_me_primary.sh                #"
echo "####################################################################"
echo ""
make_me_primary.sh

#!/bin/ksh

if [ "$1" = "" ]
then
echo ""
echo "##################################################################"
echo "#  PLEASE ENTER ORACLE_SID AS THE FIRST COMMAND LINE ARGUMENT    #"
echo "#         SCRIPT USAGE : make_me_primary.sh          #"
echo "##################################################################"
echo ""
exit
fi

ORACLE_SID=`echo $1 |tr '[A-Z]' '[a-z]'`; export ORACLE_SID;

ls -l /var/opt/oracle/cronjobs/set$ORACLE_SID > /dev/null
if [ $? != 0 ]
then
echo ""
echo "##################################################################"
echo "#           ENVIRONMENT SETTING SET FILE NOT FOUND...!!!         #"
echo "#           /var/opt/oracle/cronjobs/set$ORACLE_SID              #"
echo "##################################################################"
echo ""
exit
fi

. /var/opt/oracle/cronjobs/set$ORACLE_SID

DB=`echo $1 |tr '[a-z]' '[A-Z]'`; export DB;

sqlplus  -s /nolog |&
print -p  "connect / as sysdba"
print -p  "set feedback off pause off pagesize 0 heading off verify off linesize 500 term off"
print -p  "set sqlprompt ''"
read  -p   JUNK
print -p  "SELECT DATABASE_ROLE DBROLE FROM V\$DATABASE;"
read  -p   DBROLE
print -p  "exit"

if [ "$DBROLE" = "PRIMARY"  ]
then
echo ""
echo "##################################################################"
echo "#   ERROR ! ! ! NOT STANDBY DATABASE, PLEASE RUN IT ON STANDBY   #"
echo "##################################################################"
echo ""
exit
fi

echo ""
echo "##################################################################"
echo "#         SWITCHING $DB TO PRIMARY ROLE, PLEASE CONFIRM ...      #"
echo "##################################################################"
echo ""
echo "Continue? - Please enter 'Y' to proceed & 'N' to exit"
read ans
if [ "$ans" = 'Y' -o "$ans" = 'y' ]
then
sqlplus -s /nolog < /tmp/make_me_primary.log
connect / as sysdba;
alter database commit to switchover to primary with session shutdown;
shutdown immediate;
startup mount;
alter system set log_archive_dest_state_2=enable scope=both;
alter database set standby database to maximize performance;
alter database open;
select database_role from v\$database;
EOF
cat /tmp/make_me_primary.log
$ORACLE_HOME/bin/lsnrctl start PRIMARY_$DB
else
echo "Quitting ....."
exit
fi

echo "####################################################################"
echo "# PLEASE CONFIRM THAT THE STATUS OF DATABASE IS NOW "PRIMARY"      #"
echo "#                                                                  #"
echo "# On NEW STANDBY Host please run the following script:             #"
echo "# /var/opt/oracle/dataguard/start_recovery.sh                 #"
echo "####################################################################"
start_recovery.sh

#!/bin/ksh

if [ "$1" = "" ]
then
echo ""
echo "##################################################################"
echo "#   PLEASE ENTER ORACLE_SID AS THE FIRST COMMAND LINE ARGUMENT   #"
echo "#      SCRIPT USAGE : start_recovery.sh              #"
echo "##################################################################"
echo ""
exit
fi

ORACLE_SID=`echo $1 |tr '[A-Z]' '[a-z]'`; export ORACLE_SID;

ls -l /var/opt/oracle/cronjobs/set$ORACLE_SID > /dev/null
if [ $? != 0 ]
then
echo ""
echo "##################################################################"
echo "#           ENVIRONMENT SETTING SET FILE NOT FOUND...!!!         #"
echo "#           /var/opt/oracle/cronjobs/set$ORACLE_SID              #"
echo "##################################################################"
echo ""
exit
fi

. /var/opt/oracle/cronjobs/set$ORACLE_SID

DB=`echo $1 |tr '[a-z]' '[A-Z]'`; export DB;

sqlplus  -s /nolog |&
print -p  "connect / as sysdba"
print -p  "set feedback off pause off pagesize 0 heading off verify off linesize 500 term off"
print -p  "set sqlprompt ''"
read  -p   JUNK
print -p  "SELECT DATABASE_ROLE DBROLE FROM V\$DATABASE;"
read  -p   DBROLE
print -p  "exit"

if [ "$DBROLE" = "PRIMARY"  ]
then
echo ""
echo "##################################################################"
echo "#   ERROR ! ! ! NOT STANDBY DATABASE, PLEASE RUN IT ON STANDBY   #"
echo "##################################################################"
echo ""
exit
fi

echo ""
echo "##################################################################"
echo "#     STARTING RECOVERY FOR $DB  STANDBY, PLEASE CONFIRM ...     #"
echo "##################################################################"
echo ""
echo "Continue? - Please enter 'Y' to proceed & 'N' to exit"
read ans
if [ "$ans" = 'Y' -o "$ans" = 'y' ]
then
sqlplus -s /nolog < /tmp/start_recovery.log
connect / as sysdba;
recover managed standby database disconnect;
alter system set log_archive_dest_state_2=defer scope=both;
EOF
cat /tmp/start_recovery.log
ps -ef | grep ora_mrp0_$ORACLE_SID |grep -v grep > /dev/null
if [ $? != 0 ]
then
echo ""
echo "##################################################################"
echo "#           RECOVERY PROCESS NOT RUNNING... PLEASE CHECK         #"
echo "##################################################################"
echo ""
exit
else
echo "##################################################################"
echo "#           MRP PROCESS SUCESSFULLY STARTED                      #"
echo "##################################################################"
echo ""
echo "####################################################################"
echo "# 		SWITCHOVER COMPLETE                              #"
echo ""
echo "* Perform Database Post-Switchover Checklist!                      "
echo "####################################################################"
fi
else
echo "Quitting ....."
exit
fi

Posted in High Availability, Scripts, Unix | Tagged: , | Leave a Comment »

10g RAC Administration Using SRVCTL

Posted by Gavin Soorma on July 2, 2009

Status of all instances and services

$ srvctl status database -d orcl
Instance orcl1 is running on node linux1
Instance orcl2 is running on node linux2

Status of a single instance

$ srvctl status instance -d orcl -i orcl2
Instance orcl2 is running on node linux2

Status of a named service globally across the database

$ srvctl status service -d orcl -s orcltest
Service orcltest is running on instance(s) orcl2, orcl1

Status of node applications on a particular node

$ srvctl status nodeapps -n linux1
VIP is running on node: linux1
GSD is running on node: linux1
Listener is running on node: linux1
ONS daemon is running on node: linux1

Status of an ASM instance

$ srvctl status asm -n linux1
ASM instance +ASM1 is running on node linux1.

List all configured databases

$ srvctl config database
orcl

Display configuration for our RAC database

$ srvctl config database -d orcl
linux1 orcl1 /u01/app/oracle/product/10.2.0/db_1
linux2 orcl2 /u01/app/oracle/product/10.2.0/db_1

Display all services for the specified cluster database

$ srvctl config service -d orcl
orcltest PREF: orcl2 orcl1 AVAIL:

isplay the configuration for node applications - (VIP, GSD, ONS, Listener)

$ srvctl config nodeapps -n linux1 -a -g -s -l
VIP exists.: /vip-linux1/192.168.1.200/255.255.255.0/eth0:eth1
GSD exists.
ONS daemon exists.
Listener exists.

Display the configuration for the ASM instance(s)

$ srvctl config asm -n linux1
+ASM1 /u01/app/oracle/product/10.2.0/db_1

All running instances in the cluster

SQL>
SELECT
inst_id
, instance_number inst_no
, instance_name inst_name
, parallel
, status
, database_status db_status
, active_state state
, host_name host
FROM gv$instance
ORDER BY inst_id;

Posted in Administration, High Availability | Tagged: , , | Leave a Comment »

10g RAC Installation and Configuration – A Practical Guide

Posted by Gavin Soorma on July 2, 2009

This note will describe in detail the steps required to install and configure a 10g RAC environment on a Red Hat Linux platform. It will discuss the installation of the Clusterware, OCFS, ASM, using DBCA to create the RAC database, configuring services using srvctl and Transparent Application Failover (TAF).

Download …

Posted in High Availability | Tagged: , , , , , | 2 Comments »

10g Flashback and Oracle Data Guard

Posted by Arjun Raja on July 1, 2009

FLASHBACK INSTEAD OF DELAYING APPLYING OF LOGS ON STANDBY:

If you have multiple standby sites, you may utilize the DELAY option in Data Guard to prevent physical/logical corruption or user errors in your primary database.

For example If your primary database has only one standby database , a logical or physical corruption in the primary database will cause an immediate corruption in the standby database.

To avoid such a pitfall, you can implement the ‘Delay’ option (introducing a delay of minutes or hours on the second standby database for applying archive log changes).

This will prevent the corruptions on the second standby database and allow recovery from a possible physical/logical corruption or user errors in the primary database.

You can issue the following command to accomplish this:

SQL> alter database recover managed standby database delay 60 disconnect;

Using Flashback Database in a Standby Database Configuration

However, in Oracle 10g, you can configure the standby database with Flashback Database to achieve the same benefit as the DELAY option.

Therefore, there is no need to implement a second standby database with the DELAY option.

Brief description:

On primary at 10 a.m a corruption occurs which is propagated immediately to the standby database.

At this point you can FLASHBACK the primary database to a time just before 10 a.m. and then flashback the standby as well so all data is recovered up until the point of corruption.

Example in document below.

Download

Posted in High Availability | Tagged: , | Leave a Comment »

High Availability for the EM Grid Control OMS Repository using Data Guard

Posted by Gavin Soorma on July 1, 2009

This note will provide a guide to setting up an active-active DR solution for the 10g EM Grid Control Oracle Management Server (OMS) Repository database using Oracle Data Guard.

Download ….

Posted in High Availability | Tagged: , , , , | Leave a Comment »

Renaming datafile on standby database

Posted by Arjun Raja on June 30, 2009

DATAGUARD TROUBLE SHOOTING -

It is possible that a datafile is added on primary in some other area other than paths covered by the db_file_name_convert parameter which ensures the file is added on the standby database as well.

This documents the real life scenario if a datafile is added to an wrong area on the primary database which is not in the db_file_name_convert path in the standby init.ora and how to recover the standby database in this case.

STANDBY SITE

SQL> show parameter convert

NAME TYPE VALUE
———————————— ———– ——————————
db_file_name_convert string /opt/oracle/, /opt/oracle/
log_file_name_convert string /opt/oracle/, /opt/oracle/

SQL> show parameter standby

NAME TYPE VALUE
———————————— ———– ——————————
standby_file_management string AUTO

PRIMARY SITE:

If a file is created in any other area other than /opt/oracle ( which is in the db_file_name_convert path)

EX: SQL> create tablespace arjun datafile ‘/u01/ORACLE/dgtest9i_bkp/arjun.dbf’ size 10m;

Tablespace created.

SQL> alter system switch logfile;

Since the db_file_name_convert on the standby site only handles files created in /opt/oracle area, the datafile cannot be added to the standby site….

Alert log in standby site…

Media Recovery Log /opt/oracle/dgtest9i/arch/arch235.log
File #7 added to control file as ‘UNNAMED00007′. Originally created as:
‘/u01/ORACLE/dgtest9i_bkp/arjun.dbf’
Recovery was unable to create the file as:
‘/u01/ORACLE/dgtest9i_bkp/arjun.dbf’
MRP0: Background Media Recovery terminated with error 1274
ORACLE AUTOMATICALLY STOPS MEDIA RECOVERY -
Sun Mar 5 18:49:06 2006
Errors in file /opt/oracle/admin/dgtest9i/bdump/dgtest9i_mrp0_30121.trc:
ORA-01274: cannot add datafile ‘/u01/ORACLE/dgtest9i_bkp/arjun.dbf’ – file could not be created
ORA-01119: error in creating database file ‘/u01/ORACLE/dgtest9i_bkp/arjun.dbf’
ORA-27040: skgfrcre: create error, unable to create file
Linux-x86_64 Error: 2: No such file or directory
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
MRP0: Background Media Recovery process shutdown…………..Automatically shuts down the recovery process in this case.

Check whether logs have been applied….

SQL> SQL> select max(SEQUENCE#) “LAST_LOG_GENERATED” FROM V$ARCHIVED_LOG WHERE THREAD# =1 and ARCHIVED=’YES’;

LAST_LOG_GENERATED
——————
237

select max(SEQUENCE#) “LAST_LOG_APPLIED” FROM V$ARCHIVED_LOG WHERE THREAD# = 1 and APPLIED=’YES’;

LAST_LOG_APPLIED
—————-
234

Now check the v$datafile view in standby database…..

SQL> select name from v$datafile;

NAME
——————————————————————————–
/opt/oracle/oradata/dgtest9i/system01.dbf
/opt/oracle/oradata/dgtest9i/undotbs01.dbf
/opt/oracle/oradata/dgtest9i/users01.dbf
/opt/oracle/product9204/dbs/UNNAMED00007 – This is the new file.

To rename the file as required…

In standby database….

alter system set standby_file_management=manual;

SQL> alter database create datafile ‘/opt/oracle/product9204/dbs/UNNAMED00007′ as ‘/opt/oracle/oradata/dgtest9i/arjun1.dbf’ ;
Database altered.

SQL> select name from v$datafile;

NAME
——————————————————————————–
/opt/oracle/oradata/dgtest9i/system01.dbf
/opt/oracle/oradata/dgtest9i/undotbs01.dbf
/opt/oracle/oradata/dgtest9i/users01.dbf
/opt/oracle/oradata/dgtest9i/arjun1.dbf

alter system set standby_file_management=AUTO;

alter database recover managed standby database disconnect;

Now check the alertlog again….and notice recovery has started on the standby site.

ALTER SYSTEM SET standby_file_management=’MANUAL’ SCOPE=BOTH;
Sun Mar 5 18:56:23 2006
alter database create datafile ‘/opt/oracle/product9204/dbs/UNNAMED00007′
as ‘/opt/oracle/oradata/dgtest9i/arjun1.dbf’
Sun Mar 5 18:56:23 2006
Starting control autobackup
Control autobackup failed with following error :
ORA-00236: snapshot operation disallowed: mounted controlfile is a backup
Completed: alter database create datafile ‘/opt/oracle/produc
Sun Mar 5 18:57:01 2006
ALTER SYSTEM SET standby_file_management=’AUTO’ SCOPE=BOTH;
Sun Mar 5 18:57:08 2006
Alter database recover managed standby database disconnect
Attempt to start background Managed Standby Recovery process
MRP0 started with pid=11
MRP0: Background Managed Standby Recovery process started
Starting datafile 1 recovery in thread 1 sequence 235
Datafile 1: ‘/opt/oracle/oradata/dgtest9i/system01.dbf’
Starting datafile 2 recovery in thread 1 sequence 235
Datafile 2: ‘/opt/oracle/oradata/dgtest9i/undotbs01.dbf’
Starting datafile 3 recovery in thread 1 sequence 235
Datafile 3: ‘/opt/oracle/oradata/dgtest9i/users01.dbf’
Starting datafile 4 recovery in thread 1 sequence 235
Datafile 4: ‘/opt/oracle/oradata/dgtest9i/arjun1.dbf’
Media Recovery Log /opt/oracle/dgtest9i/arch/arch235.log
Media Recovery Log /opt/oracle/dgtest9i/arch/arch236.log
Media Recovery Log /opt/oracle/dgtest9i/arch/arch237.log
Media Recovery Waiting for thread 1 seq# 238 (in transit)

Now check logs applied…..

SQL> SQL> select max(SEQUENCE#) “LAST_LOG_GENERATED” FROM V$ARCHIVED_LOG WHERE THREAD# =1 and ARCHIVED=’YES’;

LAST_LOG_GENERATED
——————
237

select max(SEQUENCE#) “LAST_LOG_APPLIED” FROM V$ARCHIVED_LOG WHERE THREAD# = 1 and APPLIED=’YES’;

LAST_LOG_APPLIED
—————-
237

Therefore it is a good idea to check the alert_log on a regular basis to make sure managed recovery has not crashed, just log shipping taking place is not enough as logs will continue to get shipped irrespective of whether recovery is in progress or not.

Posted in Administration, High Availability | Leave a Comment »

Data Guard switchover checklist

Posted by Arjun Raja on June 30, 2009

In some establishments, Dataguard switchovers are manual -

Please perform these pre-requisite checks before undertaking a switchover to primary.

1. ON STANDBY SITE:

SQL> select database_role from v$database;

DATABASE_ROLE
—————-
PHYSICAL STANDBY…………….PROCEED.

If you receive a reply like the one below, then do not proceed- you are most probably firing the sql command in the primary site:

SQL> select database_role from v$database;

DATABASE_ROLE
—————-
PRIMARY…………….do not proceed if this message is received …in this case you are most probably on the primary site..

Again on STANDBY SITE:

SQL> select name,value from v$parameter where name in (‘log_archive_dest_1′,’log_archive_dest_state_1′, ‘log_archive_dest_2′,’log_archive_dest_state_2′);

NAME
—————————————————————-
VALUE
——————————————————————————–
log_archive_dest_1
LOCATION=/opt/oracle/opsdb9i/arch

log_archive_dest_2
SERVICE=opsdb9i_blade07 lgwr sync affirm nodelay……….Make sure lgwr and not arch is mentioned here, otherwise new primary database will not open after switchover ( if the protection_mode is Maximum availability).

log_archive_dest_state_1
ENABLE

log_archive_dest_state_2
DEFER …………..DO NOT PROCEED.

While functioning as a STANDBY, it is better to set log_archive_dest_state_2 to DEFER to avoid errors appearing in the alert_log , but make sure it is set to ENABLE before starting a switchover.

This command can be issued to convert it to ENABLE

sql> alter system set log_archive_dest_state_2=’ENABLE’ scope=both;

Now check again…

SQL>NAME
—————————————————————-
VALUE
——————————————————————————–
log_archive_dest_state_2
ENABLE…………………………………PROCEED .

SQL> select distinct status from v$datafile;

STATUS
——-
ONLINE
SYSTEM

If any file/files are in RECOVER status, DO NOT PROCEED with the switchover…

SQL> select distinct open_mode from v$database;

OPEN_MODE
———-
MOUNTED………This is the correct response ..PROCEED with switchover.

If the STANDBY database has been opened in READ ONLY mode, you will receive the following message…

SQL> select distinct open_mode from v$database;

OPEN_MODE
———-
READ ONLY……………….DO NOT PROCEED with switchover.

To confirm that logs are being shipped and more importantly the LAST LOG archived has been applied…

On PRIMARY site
SQL> select max(SEQUENCE#) “LAST_LOG_GENERATED” FROM V$ARCHIVED_LOG WHERE THREAD# =1 and ARCHIVED=’YES’;

LAST_LOG_GENERATED
——————
14

Now on STANDBY site.
SQL> select max(SEQUENCE#) “LAST_LOG_APPLIED” FROM V$ARCHIVED_LOG WHERE THREAD# = 1 and APPLIED=’YES’;

LAST_LOG_APPLIED
—————-
14

SAFE TO PROCEED.

Posted in Administration, High Availability | Tagged: , , , | Leave a Comment »

Resize standby datafile if disk runs out of space on standby site.

Posted by Arjun Raja on June 30, 2009

PURPOSE: TO AVOID RECREATION OF STANDBY DATABASE IN CASE FILE IS NOT RESIZED ON STANDBY :

ITLINUXDEVBLADE07-PRIMARY

Database is DGTEST9i

[oracle@itlinuxdevblade07 dgtest9i]$ df -h .

Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p2 25G 21G 2.0G 92% /opt

[oracle@itlinuxdevblade07 dgtest9i]$

2 gb freespace on disk on PRIMARY.

ITLINUXDEVBLADE08

[oracle@itlinuxdevblade08 oradata]$ df -h .

Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p2 25G 23G 550M 98% /opt

[oracle@itlinuxdevblade08 oradata]$

The corresponding disk on the standby site as in db_file_name_convert has only 550 mb free.

Now create a new tablespace for the testing purpose..

SQL> create tablespace dropme datafile ‘/opt/oracle/oradata/dgtest9i/dropme.dbf’ size 200m;

Tablespace created.

SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

[oracle@itlinuxdevblade07 dgtest9i]$ df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p2 25G 22G 1.8G 93% /opt
[oracle@itlinuxdevblade07 dgtest9i]$

On STANDBY

[oracle@itlinuxdevblade08 dgtest9i]$ df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p2 25G 23G 350M 99% /opt
[oracle@itlinuxdevblade08 dgtest9i]$

Space available reduced from 550 mb to 350 mb

The logs were succesfully applied on STANDBY disk as there was enough space.

Now add a sum of more than the amount available on the standby disk by RESIZING the file on PRIMARY to the tablespace.

PRIMARY site..

SQL> Alter database datafile ‘/opt/oracle/oradata/dgtest9i/dropme.dbf’ RESIZE 1024M;
Database altered.

SQL> SQL>
SQL> alter system switch logfile;
System altered.
Go to STANDBY…

[oracle@itlinuxdevblade08 dgtest9i]$ df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p2 25G 23G 350M 99% /opt
[oracle@itlinuxdevblade08 dgtest9i]$ ls -lrt
total 801848
-rw-r—– 1 oracle dba 209723392 May 10 14:23 dropme.dbf
-rw-r—– 1 oracle dba 7544832 May 10 14:23 control01.ctl
[oracle@itlinuxdevblade08 dgtest9i]$

The file has not been resized….as there was not enough space…

Now check the logs for errors..

On standby …alert log shows media recovery failed as the datafile could not be resized and media recovery stops

The standby database cannot be opened now as the files need recovery..

Media Recovery Log /opt/oracle/dgtest9i/arch/arch59.log
MRP0: Background Media Recovery terminated with error 1237
Wed May 10 14:23:04 2006
Errors in file /opt/oracle/admin/dgtest9i/bdump/dgtest9i_mrp0_10198.trc:
ORA-01237: cannot extend datafile 3
ORA-01110: data file 3: ‘/opt/oracle/dgtest9i/dropme.dbf’
ORA-19502: write error on file “/opt/oracle/dgtest9i/dropme.dbf”, blockno 71297 (blocksize=8192)
ORA-27072: skgfdisp: I/O error
Linux-x86_64 Error: 2: No such file or directory
Additional information: 71296
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
MRP0: Background Media Recovery process shutdown

STEPS to solve this problem.

1. Shutdown both databases.

2. Create a new db_file_name_convert path as the file in question has to be moved to different disks where space exists.

*.db_file_name_convert=’/opt/oracle/dgtest9i/’,'/opt/oracle/oradata/dgtest9i/’,'/tmp/dgtest9i/’,'/tmp/dgtest9i/’…both machines.

3. Copy dropme.dbf to /tmp/dgtest9i on both machines from original area.

4. Startup mount;..PRIMARY DATABASE

5. alter database rename file ‘/opt/oracle/oradata/dgtest9i/dropmedbf’ to ‘/tmp/dgtest9i/dropme.dbf’;

6. alter database open;

7. alter database create standby controlfile as ‘/tmp/control01.ctl’ ….primary site.

8. ftp the standby controlfile to /opt/oracle/dgtest9i on standby….controlfile area in standby spfile.

9.startup nomount;…….STANDBY site.

SQL> show parameter convert;

NAME TYPE VALUE
———————————— ———– ——————————
db_file_name_convert string /opt/oracle/oradata/dgtest9i/, /opt/oracle/dgtest9i/, /tmp/dgtest9i/, /tmp/dgtest9i/
log_file_name_convert string /opt/oracle/oradata/dgtest9i/, /opt/oracle/dgtest9i/

SQL> alter database mount standby database;

Database altered.

SQL> select name from v$datafile;

/opt/oracle/dgtest9i/system01.dbf
/opt/oracle/dgtest9i/undotbs01.dbf
/tmp/dgtest9i/dropme.dbf

Primary site : Switch a few logfiles….

Now start managed recovery on standby.

SQL>recover managed standby database disconnect;

Alert log below…

Completed: ALTER DATABASE RECOVER managed standby database d
Wed May 10 14:54:30 2006
RFS: Possible network disconnect with primary database
Wed May 10 14:54:38 2006
Fetching gap sequence for thread 1, gap sequence 59-63
Trying FAL server: DGTEST9I_BLADE07
Wed May 10 14:54:41 2006
RFS: Successfully opened standby logfile 4: ‘/opt/oracle/dgtest9i/st01.log’
Wed May 10 14:54:56 2006
Media Recovery Log /opt/oracle/dgtest9i/arch/arch59.log
Media Recovery Log /opt/oracle/dgtest9i/arch/arch60.log
Media Recovery Log /opt/oracle/dgtest9i/arch/arch61.log
Media Recovery Log /opt/oracle/dgtest9i/arch/arch62.log
Media Recovery Log /opt/oracle/dgtest9i/arch/arch63.log
Media Recovery Log /opt/oracle/dgtest9i/arch/arch64.log
Media Recovery Log /opt/oracle/dgtest9i/arch/arch65.log
Media Recovery Waiting for thread 1 seq# 66 (in transit)

Go to /tmp/dgtest9i on STANDBY and make sure file is resized to 1 gb..

[oracle@itlinuxdevblade08 dgtest9i]$ ls -lrt
total 1049612
-rw-r—– 1 oracle dba 1073750016 May 10 14:55 dropme.dbf
[oracle@itlinuxdevblade08 dgtest9i]$ pwd

SUCCESS !

Posted in Administration, High Availability | Tagged: , | 1 Comment »

Recreate online redo logs and standby logs

Posted by Arjun Raja on June 29, 2009

In a DATAGUARD ENVIRONMENT, If online redologs are resized, the standby logs also need to be resized to match the online redo logs in the Primary database.

STEPS TO ACHIEVE THIS -

1. Drop and recreate online redologs on the PRIMARY site.

2. On primary site, create the standby controlfile -

alter database create standby controlfile as ‘/tmp/standby01.ctl’;

3. Copy the standby control file to standby site…

4. Cancel recovery, shutdown immediate the standby database.

5. Startup and mount the standby database -using the new standby controlfile.

5. Defer recovery until standby logs are created..…do not start managed recovery.

6. Add the standby logs with same size as Primary online redologs.

7. The online logs on the standby side will be created at switchover time so no need to do anything now.

8. Start recovery on standby site..

To recreate ONLY the STANDBY logs only…follow from point 2 above.

Posted in High Availability | Tagged: , | Leave a Comment »

Resumable space allocation.

Posted by Arjun Raja on June 29, 2009

RESUMABLE SPACE ALLOCATION AND AFTER SUSPEND TRIGGER

A resumable statement allows you to :

1. Suspend large operations instead of receiving an error.

2. Allows you to fix the problem while the operation is suspended, rather than start from scratch.

A session remains suspended for the following reasons

1. Out of space
2. Max extents reached.
3. Space quota exceeded on tablespace.

Pre-requisites:

1. INIT.ORA prameter : RESUMBABLE_TIMEOUT=3600( 1 HOUR) , default is 7200 seconds or 2 hours…This parameter can be changed dynamically with an ‘alter system set resumable_timeout=3600′

2. Grant RESUMABLE system privilege to schema owner/user….

3. User has to issue ‘ alter session enable resumable’ before running the transaction.

EXAMPLE…

Table test in arjun schema….

SQL> create table test (col_1 number(10),col_2 date);

Table created.

Create a script called loop.sql which is created in this manner…..

vi loop.sql in unix .

declare
x number;
begin
x:= 0;
for i in 1 .. 5000
loop
insert into test
values
(x,sysdate);
x := x+1;
commit;
end loop;
end;
/
save the file as loop.sql

Now set the Resumable_timeout parameter to 0…

SQL> alter system set resumable_timeout=0 scope=both;

System altered.

SQL> @loop
declare
*
ERROR at line 1:
ORA-01653: unable to extend table ARJUN.TEST by 128 in tablespace ARJUN
ORA-06512: at line 7

Now issue command as sysdba.

Sql> alter system set resumable_timeout=3600 scope=both;

Conn arjun/arjun

@loop

Session hangs after sometime….but no error message is thrown( see below the use of after suspend trigger to throw an error message when the session hangs)

From another session….

SQL> select count(*) from arjun.test;

COUNT(*)
———-
98900

SQL> /

COUNT(*)
———-
98900

SQL> /

COUNT(*)
———-
98900

Although space has run out tablespace arjun, the first session does not throw an error but allows you to increase space in the tablespace from another session.

The first session will wait for 1 hour – the time set by you in the resumable_timeout=3600 parameter…If you do not take corrective action, ie add space within that period, an error is thrown and transaction rolled back.

SQL> select sum(bytes/1048576) from dba_free_space where tablespace_name=’ARJUN’;

SUM(BYTES/1048576)
——————
.9375

Add space to tablespace..

SQL> Alter database datafile ‘/u03/ORACLE/test9/arjun.dbf’ resize 5m;

Database altered.

As soon as you add space the table begins to get populated again…

SQL> select count(*) from arjun.test;

COUNT(*)
———-
119097

SQL> /

COUNT(*)
———-
136680

SQL> /

COUNT(*)
———-
145413

##################################################################

USAGE OF AFTER SUSPEND TRIGGER IN RELATION TO RESUMABLE SPACE ALLOCATION:

Another very useful feature is the usage of the AFTER SUSPEND trigger which results in an error message appearing in the session, rather than the session hanging( in which case the DBA or USER will never know there is an error unless he checks from another session.)

This helps the DBA to take corrective action once the error message is thrown.

Once corrective action is taken , the original session can be restarted.

Firstly create the trigger…As SYS user.

CREATE OR REPLACE TRIGGER ARJUN_RESUMABLE
AFTER SUSPEND ON DATABASE
BEGIN
DBMS_RESUMABLE.SET_TIMEOUT(60);
END;
/

This creates a trigger which sets the RESUMABLE_TIMEOUT to 60 seconds or 1 minute if there is a situation where a session is stuck. This is the time allowed to take corrective action before oracle aborts the transaction.

Begin inserting into the TEST table…

@loop

Hangs after sometime….

To check if a session is hanging and the error message—

From another session just select error_msg from the dba_resumable view….

SQL> select error_msg from dba_resumable;

ERROR_MSG
——————————————————————————–
ORA-01653: unable to extend table ARJUN.TEST by 128 in tablespace ARJUN

SQL> select count(*) from arjun.test;

COUNT(*)
———-
353926

Now the original session that was running the insert hangs for 1 minute and then throws the error….

declare
*
ERROR at line 1:
ORA-30032: the suspended (resumable) statement has timed out
ORA-01653: unable to extend table ARJUN.TEST by 128 in tablespace ARJUN
ORA-06512: at line 7

Take corrective action and then rerun the insert …

@loop…

SQL> select count(*) from arjun.test;………notice the session has resumed and count increases….

COUNT(*)
———-
416321

SQL> /

COUNT(*)
———-
421232

SQL> /

COUNT(*)
———-
425740

Posted in Administration, High Availability | Leave a Comment »

Flashback query

Posted by Arjun Raja on June 29, 2009

FLASHBACK QUERY

FLASHBACK QUERY feature allows you to perform queries on the database as of a certain time or user-specified SCN.

FLASHBACK VERSIONS QUERY feature allows you to use the VERSIONS clause to retrieve all of the versions of the rows that exist between two points in time, or two SCN’s.

YOU REQUIRE THE SELECT ANY TRANSACTION PRIVILEGE TO BE ABLE TO ISSUE A QUERY AGAINST FLASHBACK_TRANSACTION_QUERY.

SQL> create table test(salary number(10));

Table created.

SQL> insert into test values(1000);

1 row created.

SQL> select * from test;

SALARY
———-
1000

SQL> update test set salary=2000;

1 row updated.

SQL> commit;

Commit complete.

SQL> update test set salary=4000;

1 row updated.

SQL> commit;

Commit complete.

SQL> update test set salary=5000;

1 row updated.

SQL> commit;

Commit complete.

SQL> select salary from test versions between scn minvalue and maxvalue;

SALARY
———-
5000
4000
2000

Using FLASHBACK VERSIONS query

select salary from test versions between timestamp
TO_TIMESTAMP( ‘2005-09-12 14:38:00′,’YYYY-MM-DD HH24:MI:SS’) and
TO_TIMESTAMP( ‘2005-09-12 14:43:00′,’YYYY-MM-DD HH24:MI:SS’);

SALARY
———-
5000
4000
2000

Select to_char(versions_starttime,’DD-MON HH:MI’) “START DATE”,
to_char (versions_endtime,’DD-MON HH:MI’) “END DATE”,VERSIONS_XID,VERSIONS_OPERATION,empno FROM VERSIONS BETWEEN SCN MINVALUE AND MAXVALUE;

START DATE END DATE VERSIONS_XID V SALARY
———— ———— —————- – ———-
12-SEP 02:40 0700190004010000 U 5000
12-SEP 02:40 12-SEP 02:40 0A00290001020000 U 4000
12-SEP 02:40 12-SEP 02:40 07000C0005010000 I 2000

Using FLASHBACK TRANSACTION query:

The flashback transaction query helps to get the actual query run:

SQL> Select table_name,operation, undo_sql from flashback_transaction_query Where XID=’06002C00F7060000′;

TABLE_NAME
——————————————————————————–
OPERATION
——————————–
UNDO_SQL
——————————————————————————–
TEST
UPDATE
update “ARJUN”.”TEST” set “SALARY” = ‘1000′ where ROWID = ‘AAAC7iAAFAAAAAPAAA’;

TEST
INSERT
delete from “ARJUN”.”TEST” where ROWID = ‘AAAC7iAAFAAAAAPAAA’;

TABLE_NAME
——————————————————————————–
OPERATION
——————————–
UNDO_SQL
——————————————————————————–

To narrow down to a point-in-time…

Select table_name,operation ,undo_sql
From flashback_transaction_query where start_timestamp >= to_timestamp ( ‘2005-09-12 14:38:00′,’YYYY-MM-DD HH24:MI:SS’)
AND TABLE_OWNER=’ARJUN’;

TABLE_NAME
——————————————————————————–
OPERATION
——————————–
UNDO_SQL
——————————————————————————–
TEST
UPDATE
update “ARJUN”.”TEST” set “SALARY” = ‘4000′ where ROWID = ‘AAAC7iAAFAAAAAPAAA’;

TEST
UPDATE
update “ARJUN”.”TEST” set “SALARY” = ‘2000′ where ROWID = ‘AAAC7iAAFAAAAAPAAA’;

Please remember that the FLASHBACK VERSIONS QUERY cannot be used to query:

1. EXTERNAL TABLES
2. TEMPORARY TABLES
3. FIXED TABLES
4. VIEWS

IT also cannot span DDL commands, ie…alter table etc..

Posted in Administration, High Availability | Leave a Comment »

Flashback table

Posted by Arjun Raja on June 29, 2009

FLASHBACK TABLE

Only FLASHBACK DATABASE requires flashback to be ‘ON’ since only FLASHBACK DATABASE uses flashback logs in the flash_recovery_area.

All other forms of FLASHBACK use the recycle bin and undo_tablespace.

Important info:

You cannot ‘flashback table to before drop’ a table which has been created in the SYSTEM tablespace.

The table is sent to the recyclebin only if it existed in some other tablespace other than SYSTEM tablespace and that tablespace must be locally managed.

When you drop a table, the objects are temporarily placed in a ‘recycle bin’ and still belong to the owner.

The space used by recycle bin objects is never reclaimed unless there is space pressure.

The space associated with the dropped object is not immediately reclaimable although it appears in the DBA_FREE_SPACE view.

Query the dba_recyclebin view as SYS or just recyclebin as the user for information about the recycle bin.

Flashback drop allows you to recover a dropped table.

Example.

Connect arjun/arjun

Create table tempp (col_1 number(10)) tablespace users;

Insert into tempp values (10);

1 row created.

SQL> drop table tempp;

Table dropped.

SQL> show recyclebin;

ORIGINAL NAME RECYCLEBIN NAME OBJECT TYPE DROP TIME
—————- —————————— ———— ——————-
TEMPP BIN$AI2Gp/QiZsPgQAw5yUImEA==$0 TABLE 2005-09-12:10:49:16

SQL> flashback table tempp to before drop;

Flashback complete.

SQL> select * from tempp;

COL_1
———-
10

In case the table is created in the system tablespace :

SQL> show user

USER is “SYS”…..Default tablespace SYSTEM

SQL> create table arjun1 (col_1 number(10));

Table created.

SQL> drop table arjun1;

Table dropped.

SQL> select * from dba_recyclebin;

no rows selected

SQL> flashback table arjun1 to before drop;
flashback table arjun1 to before drop
*
ERROR at line 1:
ORA-38305: object not in RECYCLE BIN

Permanently drop without recycle bin
Drop table test purge;

QUERYING DROPPED TABLES

Dropped tables can be queried from the recycle bin. No DML or DDL operations are allowed on the table.

SQL> drop table tempp;

Table dropped.

SQL> show recyclebin;
ORIGINAL NAME RECYCLEBIN NAME OBJECT TYPE DROP TIME
—————- —————————— ———— ——————-
TEMPP BIN$AI9AwvFRdf7gQAw5yUIsGA==$0 TABLE 2005-09-12:13:15:22

While querying the recycle bin, make sure the system generated table name is enclosed in double quotes.

SQL> select * from BIN$AI9AwvFRdf7gQAw5yUIsGA==$0;
select * from BIN$AI9AwvFRdf7gQAw5yUIsGA==$0
*
ERROR at line 1:
ORA-00933: SQL command not properly ended

SQL> select * from ‘BIN$AI9AwvFRdf7gQAw5yUIsGA==$0′;
select * from ‘BIN$AI9AwvFRdf7gQAw5yUIsGA==$0′
*
ERROR at line 1:
ORA-00903: invalid table name

SQL> select * from “BIN$AI9AwvFRdf7gQAw5yUIsGA==$0″;

COL_1
———-
10

You cannot run any DML or DDL on dropped tables ….

SQL> Insert into tempp values(20);
Insert into tempp values(20)
*
ERROR at line 1:
ORA-00942: table or view does not exist

FLASHBACK TABLE TO A TIME IN THE PAST.

Firstly enable row movement for table test..

SQL> ALTER TABLE TEST ENABLE ROW MOVEMENT;

TIME : 08:00:00

SQL> select * from test;

SALARY
———-
5000

TIME :08:00:01

SQL> update test set salary =6000;

1 row updated.

SQL> select * from test;

SALARY
———-
6000

SQL> commit;

Commit complete.

Now flashback table to time 08:00:00

SQL> FLASHBACK TABLE TEST to timestamp TO_TIMESTAMP( ‘2005-09-13 08:00:00′,’YYYY-MM-DD HH24:MI:SS’);

Flashback complete.

SQL> SELECT * FROM TEST;

SALARY
———-
5000

Posted in Administration, High Availability | Leave a Comment »

FLASHBACK DATABASE

Posted by Arjun Raja on June 29, 2009

FLASHBACK DATABASE

Flashback database allows you to rollback database to a time in the past.

Useful if you have :

1. Dropped user
2. Truncated table
3. Batch job:Partial changes.

PRE-REQUISITE: DATABASE MUST BE IN ARCHIVELOG MODE AND YOU MUST SET FLASHBACK ON( BY DEFAULT IT IS OFF)

FLASHBACK LOGS ARE CREATED IN YOUR FLASH_RECOVERY_AREA SPECIFIED BY THE INIT.ORA PARAMETER – DB_RECOVERY_FILE_DEST

To set FLASHBACK ON in your 10g database.

shutdown immediate;

startup mount exclusive;….Must be mounted exclusive.

alter database flashback on;

alter database open;

To Flashback database…

Flashback database can be issued with 3 different conditions:

1. TO_TIME
2. TO SCN
3. TO SEQUENCE( LOG ARCHIVE SEQ)

FLASHBACK TO_TIME

PRESENT TIME IS 2009-06-29 07:40:00

drop user arjun cascade;

user dropped.

shutdown immediate;

startup mount;

flashback database to timestamp TO_TIMESTAMP( ‘2009-06-29 07:38:00′,’YYYY-MM-DD HH24:MI:SS’);

alter database open resetlogs;

select username from dba_users;

SQL> select username from dba_users;

USERNAME
——————————
SYS
SYSTEM
DBSNMP
ARJUN
OUTLN

After this is done you cannot flashback the database to a time before the original flashback…

Shutdown immediate;

Startup mount;

SQL> flashback database to timestamp TO_TIMESTAMP( ‘2009-06-29 07:35:00′,’YYYY-MM-DD HH24:MI:SS’);
flashback database to timestamp TO_TIMESTAMP( ‘2009-06-29 07:35:00′,’YYYY-MM-DD HH24:MI:SS’)
*
ERROR at line 1:
ORA-38729: Not enough flashback database log data to do FLASHBACK.

But you can flashback the tablespace to a time after the original flashback time of 07:38:00

SQL> flashback database to timestamp TO_TIMESTAMP( ‘2009-06-29 07:45:00′,’YYYY-MM-DD HH24:MI:SS’);

Flashback complete.

alter database open resetlogs;

FOR FLASHBACK TO SCN….

Alter database mount;

Flashback database to scn=12355;

To get current_scn: select current_scn from v$database;

TO FLASHBACK TO SEQUENCE NUMBER…

Alter database mount;

Flashback database to sequence=223 thread=1;

Posted in Backup and Recovery, High Availability | Leave a Comment »

ARCHIVE AREA FULL – CONNECT INTERNAL UNTIL FREED

Posted by Arjun Raja on June 29, 2009

When a database is running in ARCHIVELOG mode, if the area which stores the archivelogs on disk gets full, the database comes to a halt and can be accessed only as a sysdba.

The error message when you try to connect as a normal user is ‘ CONNECT INTERNAL UNTIL FREED’

If the following parameter is in use – DB_RECOVERY_FILE_DEST as the archivelog destination , then the size of the parameter DB_RECOVERY_FILE_DEST_SIZE can be increased dynamically and then the database begins to function again -

Ex

SQL> show parameter db_recovery_file_dest;

NAME TYPE VALUE
———————————— ———– ——————————
db_recovery_file_dest string /opt/oracle/flash_recovery_area

db_recovery_file_dest_size big integer 1000M

SQL> SELECT * FROM V$RECOVERY_FILE_DEST;

NAME
——————————————————————————–
SPACE_LIMIT SPACE_USED SPACE_RECLAIMABLE NUMBER_OF_FILES
———– ———- —————– —————
/opt/oracle/flash_recovery_area

1048576000 1048260608 0 163

There are a couple of possible options.

1) Increase the parameter db_recovery_file_dest_size, which is dynamic.

SQL> alter system set db_recovery_file_dest_size=XG; (larger amount) – Example 2G

2) Stop using the db_recovery_file_dest by unsetting the parameter.
( This assumes you never really wanted to use this option )

Posted in Administration, High Availability | Tagged: , , | Leave a Comment »

10g Data Guard Broker Fast Start Failover

Posted by Gavin Soorma on June 25, 2009

FAST START FAILOVER IS DISABLED BY DEFAULT

Configuration
  Name:                streams2.hq.emirates.com
  Enabled:             YES
  Protection Mode:     MaxAvailability
  Fast-Start Failover: DISABLED  Databases:
    streams2 - Primary database
    blade08  - Physical standby database

Current status for "streams2.hq.emirates.com":
SUCCESS

PREPARE TO ENABLE TO FAST START FAILOVER

DGMGRL> EDIT DATABASE 'streams2' SET PROPERTY FastStartFailoverTarget = 'blade08';
Property "faststartfailovertarget" updated
DGMGRL> EDIT DATABASE 'blade08' SET PROPERTY FastStartFailoverTarget = 'streams2';
Property "faststartfailovertarget" updated

WE SET THE THRESHOLD FOR FAILOVER TO 60 SECONDS.
OBSERVER WILL MONITOR FOR NO RESPONSE > 60 SECONDS

DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold =60;
Property "faststartfailoverthreshold" updated

FROM ANOTHER SESSION START OBSERVER - IT WILL RUN IN THE FOREGROUND 

streams2:/opt/oracle/product10gr2/db/network/admin>dgmgrl
DGMGRL for Linux: Version 10.2.0.1.0 - 64bit Production

Copyright (c) 2000, 2005, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys/oracle
Connected.
DGMGRL> start observerObserver started

ENABLE THE FAST START FAILOVER 

DGMGRL> ENABLE FAST_START FAILOVER;
Enabled.

CONFIRM THE CONFIGURATION IF FAST START FAILOVER IS ENABLED 

DGMGRL> show configuration;

Configuration
  Name:                streams2.hq.emirates.com
  Enabled:             YES
  Protection Mode:     MaxAvailability
  Fast-Start Failover: ENABLED
  Databases:
    streams2 - Primary database
    blade08  - Physical standby database
             - Fast-Start Failover target

Current status for "streams2.hq.emirates.com":
SUCCESS

SIMULATE A FAILURE BY SHUTTING DOWN CURRENT PRIMARY DATABASE - STANDBY

standby:/opt/oracle/product10gr2/db/network/admin>sql

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Oracle Label Security, OLAP
and Data Mining options

SQL> shutdown abort
ORACLE instance shut down.

AFTER ONE MINUTE HAS LAPSED, YOU WILL SEE THAT THE OBSERVER HAS INITIATED A FAILOVER
12:51:28.78  Tuesday, May 15, 2007
Initiating fast-start failover to database "streams2"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "streams2"
12:51:40.35  Tuesday, May 15, 2007

CONFIRM THAT THE NEW PRIMARY IS NOW STREAMS2

DGMGRL> show configuration;

Configuration
  Name:                streams2.hq.emirates.com
  Enabled:             YES
  Protection Mode:     MaxAvailability
  Fast-Start Failover: ENABLED
  Databases:
    streams2 - Primary database
    blade08  - Physical standby database (disabled)
             - Fast-Start Failover target

Current status for "streams2.hq.emirates.com":
Warning: ORA-16608: one or more databases have warnings

USING FLASHBACK WE CAN REINITIATE THE CURRENT STANDBY - BLADE08 - NO NEED TO RECREATE FROM SCRATCH

Posted in High Availability | Tagged: , , , | Leave a Comment »

Using FLASHBACK to Reinstate a Standby after a FAILOVER

Posted by Gavin Soorma on June 25, 2009

PERFORM A FAILOVER USING DATA GUARD BROKER

DGMGRL> failover to 'standby_blade07';
Performing failover NOW, please wait...
Failover succeeded, new primary is "standby_blade07"

On the NEW PRIMARY issue this command to obtain the SCN to FLASHBACK the old primary(new standby) to

SQL> conn / as sysdba
Connected.

SQL> SELECT TO_CHAR(STANDBY_BECAME_PRIMARY_SCN) FROM V$DATABASE;
TO_CHAR(STANDBY_BECAME_PRIMARY_SCN)
----------------------------------------
2147132

FLASHBACK THE OLD PRIMARY AND CONVERT IT TO A PHYSICAL STANDBY

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.

Total System Global Area  574619648 bytes
Fixed Size                  2022408 bytes
Variable Size             327156728 bytes
Database Buffers          239075328 bytes
Redo Buffers                6365184 bytes
Database mounted.

SQL> FLASHBACK DATABASE TO SCN 2147132;
Flashback complete.

SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
Database altered.

SQL> shutdown immediate;
ORA-01507: database not mounted

ORACLE instance shut down.
SQL>

USING DATA GUARD BROKER REINSTATE THE NEW STANDBY DATABASE INTO THE DATA GUARD CONFIGURATION

DGMGRL> reinstate database 'streams2';Reinstating database "streams2", please wait...
Operation requires shutdown of instance "streams2" on database "streams2"
Shutting down instance "streams2"...
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "streams2" on database "streams2"
Starting instance "streams2"...
ORACLE instance started.
Database mounted.
Continuing to reinstate database "streams2" ...
Reinstatement of database "streams2" succeeded

DGMGRL> show configuration;
Configuration
  Name:                streams2.hq.emirates.com
  Enabled:             YES
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    streams2        - Physical standby database
    standby_blade07 - Primary database
Current status for "streams2.hq.emirates.com":
SUCCESS

Posted in High Availability | Tagged: , , , | Leave a Comment »

Some top causes for unplanned outages

Posted by Gavin Soorma on June 23, 2009

I have quickly thought about some of the events which had caused unplanned downtime at my past organisation. Idea is to share the same and ensure our current monitoring infrastructure is taking into account aspects discussed below – better to learn from other’s experiences than to experience it yourself.

 

  • Listener log file grew > 2GB in a Red Hat environment. This prevented user sessions from connecting even though the listener was running. Not sure if there is an OS file limit in AIX, but if logging is enabled, a good practice will be to truncate the log on a regular basis.
  • Controlfile was multiplexed over three mount points and one of the mount points got 100% full. Error thrown by the application as well as alert log is ORA-0600 with a lot of arguments. Will not specifically indicate a disk space issue when looked up on Metalink.
  • Temporary Tablespaces created using large size for the tempfiles. At creation time, Oracle will not check if sufficient disk space is available or not. Problem will manifest itself when some operation like a sort or index build will start using temporary tablespace disk space and disk space is not adequate.
  • MAXEXTENTS value reached for tables and indexes. Application outage occurred when new tables and indexes were created and developer hard coded a very small value for MAXEXTENTS instead of leaving it to the default value which in UNLIMITED.
  • Disk holding archive log suddenly got filled due to a spurt in redo activity caused by a month end or year end batch job. Good practice is to have an automated RMAN job which will backup and delete archive log files if a certain disk space usage threshold is crossed.
  • Mount point holding Oracle software binaries got full. This will prevent any SQL*PLUS connections directly to the host machine. Since the trace file and core dump directories are usually on the same mount point as the ORACLE_HOME, we should ensure that effective daily housekeeping jobs are in place to not only clean up old trace files but also to rename the alert log on a daily basis so that the alert log only has entries for one day. The alert log is normally visited when a problem occurs and having the alert log have entries for the past 6 months or take a long time for the file to open using vi because it has grown very large is not a good practice.
  • Procedures or Packages got INVALID because some DDL activity was done on another object with dependencies. A good practice is to ensure that no INVALID objects exist in the database and to run a job which polls the database for any objects that have got invalidated. This will also serve from an auditing viewpoint as it may identify if anyone is doing an unauthorised modifications to the database.
  • Data Guard log shipping terminated because of some activity on the Primary database like password change for SYS, datafile added or resized on the Primary database, but the corresponding mount point on the standby machine did not have adequate space to accommodate the resize. Best practice is to ensure that the disk space allocation and mount point mapping is the same on both Primary as well as Standby machines.
  • Public database links are used by the application and the password of the user account used in the database link was changed as part of a monthly password change. Best practice is to never change the password for the user account that is used to connect via the database link.
  • Limit for parameters SESSIONS or PROCESSES which are defined in the init,ora was crossed. The only way to fix this is to kill sessions at the OS level or change the parameters and restart the database. A good practice would be to have an alerting mechanism which warns you if the threshold set is being reached. That way we can be proactive and kill any idle or inactive sessions so that this upper limit is not reached.
  • Planned outage for a data movement exercise was extended because Import of data failed because of lack of tablespace free space in target database.Consequently the whole exercise had to be repeated. Best practice is to use the RESUMABLE parameter while doing such operations so that they can be resumed from the point of failure and also to write a Database Event Trigger which can alert the DBA in case such an event happens.

Posted in High Availability | Tagged: , , | Leave a Comment »

Some FLASHBACK DATABASE examples using RMAN and SQL*PLUS

Posted by Gavin Soorma on June 18, 2009

Wrong Update

SQL> update customers set credit_limit=100000;

 Shutdown the database

 Startup mount the database

 invoke rman

 $ rman target /

 Flashback database to time one hour prior to current time

 RMAN>  FLASHBACK DATABASE TO TIME ‘(SYSDATE-1/24)’;

 Starting flashback at 27-JUN-05

using target database controlfile instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: sid=156 devtype=DISK

 starting media recovery

media recovery complete

 Finished flashback at 27-JUN-05

 Using SQL*PLUS

 SQL> flashback database to timestamp TO_TIMESTAMP(‘2005-06-27 09:00:00′,’YYYY-MM-DD HH24:MI:SS’);

 Flashback complete.

 SQL> alter database open  resetlogs;

Check the application

 SQL> select credit_limit from customers where rownum < 5;
 CREDIT_LIMIT

————

        2400

        2400

        2400

        1200

Posted in High Availability | Leave a Comment »

Flashback Database – Normal and Guaranteed Restore Points

Posted by Gavin Soorma on June 18, 2009

 

Normal Restore Points

 If the db_recovery_dest_file_size has been allocated less space then what is required for the flashback transaction logs as dictated by the db_flashback_retention_target parameter, then the flashback logs are deleted to free up space for the additional logs that are being generated.

Alert log will show also that flashback logs are being deleted

Deleted Oracle managed file /u02/oradata/testdb/TESTDB/flashback/o1_mf_51nc8494_.flb
Deleted Oracle managed file /u02/oradata/testdb/TESTDB/flashback/o1_mf_51nc86n3_.flb
Deleted Oracle managed file /u02/oradata/testdb/TESTDB/flashback/o1_mf_51nc8963_.flb

Flashback then MAY fail

SQL> flashback database to scn 1580000;
flashback database to scn 1580000
*
ERROR at line 1:
ORA-38729: Not enough flashback database log data to do FLASHBACK.

Guaranteed Restore Points

With guaranteed restore points, logs are not deleted, but if the space allocated for the flashback logs is not adequate, the database may hang until more space is allocated

The parameter db_recovery_file_dest_size is dynamic and can be increased on the fly to allocate more space for the flashback logs.

This query should be run when we have enabled a guaranteed restore point -  the STORAGE_SIZE column will show the disk space that is currently being used by the flashback logs and this should be compared with the db_recovery_file_dest_size value so as to ensure that we have allocated sufficient space for the flashback logs.

select name,scn,time,database_incarnation#,guarantee_flashback_database,storage_size
from v$restore_point
where guarantee_flashback_database = ‘YES’
;

Posted in High Availability | 1 Comment »