RAC FAQ for Interview

Upload: eid2ss

Post on 17-Nov-2014


What is RAC and how is it different from non RAC databases?

RAC stands for Real Application Clusters. It allows multiple nodes in a clustered system to mount and open a single database that resides on shared disk storage. Should a single node fail, the database service will still be available on the remaining nodes.

A non-RAC database is only available on a single system. If that system fails, the database service will be down (single point of failure).

Oracle Clusterware Software Component Processing Details

Oracle Clusterware comprises several background processes that facilitate cluster operations. The Cluster Synchronization Services (CSS), Event Management (EVM), and Oracle Clusterware components communicate with the corresponding component layers in the other instances within the same cluster database environment. These components are also the main communication links between the Oracle Clusterware high availability components and the Oracle Database. In addition, these components monitor and manage database operations.

The following list describes the functions of some of the major Oracle Clusterware components, which run as processes on UNIX and Linux operating systems and as services on Windows.

Cluster Synchronization Services (CSS)—Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using third-party clusterware, then the css process interfaces with your clusterware to manage node membership information.

Cluster Ready Services (CRS)—The primary program for managing high availability operations within a cluster. Anything that the crs process manages is known as a cluster resource which could be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application process, and so on. The crs process manages cluster resources based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. The crs process generates events when a resource status changes. When you have installed Oracle RAC, crs monitors the Oracle instance, Listener, and so on, and automatically restarts these components when a failure occurs. By default, the crs process makes five attempts to restart a resource and then does not make further restart attempts if the resource does not restart.

Event Management (EVM)—A background process that publishes events that crs creates.

Oracle Notification Service (ONS)—A publish and subscribe service for communicating Fast Application Notification (FAN) events.
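ONS follows the classic publish/subscribe pattern: clients register interest in an event type, and the service fans each published event out to every subscriber. The sketch below is a toy dispatcher in Python to illustrate that pattern only; it is not the real ONS or FAN API, and the event-type name used is hypothetical.

```python
# Toy publish/subscribe dispatcher, loosely modeling how a service such
# as ONS delivers FAN-style events to subscribers. Illustration only;
# this is NOT the actual ONS API.
from collections import defaultdict

class Notifier:
    def __init__(self):
        self._subscribers = defaultdict(list)  # event_type -> list of callbacks

    def subscribe(self, event_type, callback):
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, payload):
        # Deliver the event to every subscriber registered for this type.
        for callback in self._subscribers[event_type]:
            callback(payload)

received = []
ons = Notifier()
# A client interested in service-membership events (hypothetical type name):
ons.subscribe("SERVICEMEMBER", received.append)
ons.publish("SERVICEMEMBER", {"service": "RACDB", "status": "down"})
print(received)
```

The key property illustrated: the publisher never needs to know who the subscribers are, which is what lets FAN clients react to cluster state changes without polling.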


RACG—Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.

Process Monitor Daemon (OPROCD)—This process is locked in memory to monitor the cluster and provide I/O fencing. OPROCD periodically performs its check and then sleeps; if it wakes up later than expected, it resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. On Linux platforms, the hangcheck-timer kernel module provides this functionality instead.

Oracle Clusterware Processes on UNIX-Based Systems

crsd—Performs high availability recovery and management operations such as maintaining the OCR and managing application resources. This process runs as the root user, or by a user in the admin group on Mac OS X-based systems. This process restarts automatically upon failure.

evmd—Event manager daemon. This process also starts the racgevt process to manage FAN server callouts.

ocssd—Manages cluster node membership and runs as the oracle user; failure of this process results in a node restart.

oprocd—Process monitor for the cluster. Note that this process only appears on platforms that do not use vendor clusterware with Oracle Clusterware.

The Oracle Real Application Clusters Software Components

Each instance has a buffer cache in its System Global Area (SGA). Using Cache Fusion, Oracle RAC environments logically combine each instance's buffer cache to enable the instances to process data as if the data resided on a logically combined, single cache.

(The SGA size requirements for Oracle RAC are greater than the SGA requirements for single-instance Oracle databases due to Cache Fusion.)

To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances, which effectively increases the size of the System Global Area for an Oracle RAC instance.
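The statement that the GRD is "distributed across all of the active instances" means each resource (for example, a cached block) has one instance acting as its master, and every instance can compute where to look. The Python sketch below illustrates that distribution idea with a simple hash; real GRD mastering is more elaborate (including dynamic remastering), so treat this as a conceptual toy, not Oracle's algorithm.

```python
# Conceptual sketch of distributing a Global Resource Directory (GRD):
# each block resource is "mastered" by exactly one instance, chosen here
# by hashing the resource key. Every instance computes the same answer,
# so any instance knows where to ask about a block's status.
def master_instance(file_no, block_no, active_instances):
    resource_key = (file_no, block_no)
    return active_instances[hash(resource_key) % len(active_instances)]

instances = ["RACDB1", "RACDB2", "RACDB3"]
m = master_instance(4, 1042, instances)

# The directory load spreads over the active instances:
masters = {master_instance(4, b, instances) for b in range(1000)}
print(m, sorted(masters))
```

This is why each instance's SGA grows in RAC: every instance holds a share of the directory in addition to its own buffer cache.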

After one instance caches data, any other instance within the same cluster database can acquire a block image from another instance in the same database faster than by reading the block from disk. Therefore, Cache Fusion moves current blocks between instances rather than re-reading the blocks from disk. When a consistent block is needed or a changed block is required on another instance, Cache Fusion transfers the block image directly between the affected instances. Oracle RAC uses the private interconnect for inter-instance communication and block transfers. The Global Enqueue Service Monitor and the Instance Enqueue Process manage access to Cache Fusion resources as well as enqueue recovery processing.

The Oracle Clusterware Voting Disk and Oracle Cluster Registry

The Oracle Clusterware requires the following two critical files:

Voting Disk—Manages cluster membership by way of a health check and arbitrates cluster ownership among the instances in case of network failures. Oracle RAC uses the voting disk to determine which instances are members of a cluster. The voting disk must reside on shared disk. For high availability, Oracle recommends that you have multiple voting disks. The Oracle Clusterware enables multiple voting disks but you must have an odd number of voting disks, such as three, five, and so on. If you define a single voting disk, then you should use external mirroring to provide redundancy.

Oracle Cluster Registry (OCR)—Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR also manages information about processes that Oracle Clusterware controls. The OCR stores configuration information in a series of key-value pairs within a directory tree structure. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster. The Oracle Clusterware can multiplex the OCR and Oracle recommends that you use this feature to ensure cluster high availability. You can replace a failed OCR online, and you can update the OCR through supported APIs such as Enterprise Manager, the Server Control Utility (SRVCTL), or the Database Configuration Assistant (DBCA).

Can any application be deployed on RAC?

Most applications can be deployed on RAC without any modifications and still scale linearly (well, almost).

However, applications with 'hot' rows (the same row being accessed by processes on different nodes) will not work well. This is because data blocks will constantly be moved from one Oracle Instance to another. In such cases the application needs to be partitioned based on function or data to eliminate contention.

Benefits of RAC

fault tolerance (Fault-tolerant design, also known as fail-safe design, is a design that enables a system to continue operation, possibly at a reduced level (also known as graceful degradation), rather than failing completely, when some part of the system fails.)

load balancing


scalability (scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged)

Do you need special hardware to run RAC?

RAC requires the following hardware components:

A dedicated network interconnect - might be as simple as a fast network connection between nodes; and

A shared disk subsystem.

Example systems that can be used with RAC:

Windows Clusters
Linux Clusters
UNIX Clusters, like SUN PDB (Parallel DB)
IBM z/OS in a SYSPLEX

How many OCR and voting disks should one have?

When you use normal redundancy, Oracle Clusterware automatically maintains two copies of the Oracle Cluster Registry (OCR) file and three copies of the Voting Disk file.

If you choose external redundancy for the OCR and voting disk, then to enable redundancy, your disk subsystem must be configurable for RAID mirroring. Otherwise, your system may be vulnerable because the OCR and voting disk are single points of failure.

How does one convert a single instance database to RAC?

Oracle 10gR2 introduces a utility called rconfig (located in $ORACLE_HOME/bin) that will convert a single instance database to a RAC database.

$ cp $ORACLE_HOME/assistants/rconfig/sampleXMLs/ConvertToRAC.xml racconv.xml
$ vi racconv.xml
$ rconfig racconv.xml

One can also use dbca and enterprise manager to convert the database to RAC mode.

For prior releases, follow these steps:


Shut Down your Database:

SQL> CONNECT SYS AS SYSDBA
SQL> SHUTDOWN NORMAL

Enable RAC - On Unix this is done by relinking the Oracle software.

Make the software available on all computer systems that will run RAC. This can be done by copying the software to all systems or to a shared clustered file system.

Each instance requires its own set of Redo Log Files (called a thread). Create additional log files:

SQL> CONNECT SYS AS SYSDBA
SQL> STARTUP EXCLUSIVE

SQL> ALTER DATABASE ADD LOGFILE THREAD 2
  2    GROUP 4 ('RAW_FILE1') SIZE 500K,
  3    GROUP 5 ('RAW_FILE2') SIZE 500K,
  4    GROUP 6 ('RAW_FILE3') SIZE 500K;

SQL> ALTER DATABASE ENABLE PUBLIC THREAD 2;

Each instance requires its own set of undo segments (rollback segments). To add undo segments for new nodes:

UNDO_MANAGEMENT = AUTO
UNDO_TABLESPACE = undots2

Edit the SPFILE/INIT.ORA files and number the instances 1, 2,...:

CLUSTER_DATABASE = TRUE (PARALLEL_SERVER = TRUE prior to Oracle9i)
INSTANCE_NUMBER = 1
THREAD = 1
UNDO_TABLESPACE = undots1 (or ROLLBACK_SEGMENTS if you use UNDO_MANAGEMENT = manual)
# Include %T for the thread in the LOG_ARCHIVE_FORMAT string.
# Set LM_PROCS to the number of nodes * PROCESSES
# etc....

Create the dictionary views needed for RAC by running catclust.sql (previously called catparr.sql):

SQL> START ?/rdbms/admin/catclust.sql

On all the computer systems, start up the instances:

SQL> CONNECT / AS SYSDBA
SQL> STARTUP


How does one stop and start RAC instances?

There is no difference between the way you start a normal database and a RAC database, except that a RAC database needs to be started from multiple nodes. The CLUSTER_DATABASE=TRUE (PARALLEL_SERVER=TRUE prior to Oracle9i) parameter must be set before a database can be started in cluster mode.

In Oracle 10g one can use the srvctl utility to start instances and listener across the cluster from a single node. Here are some examples:

$ srvctl status database -d RACDB
$ srvctl start database -d RACDB
$ srvctl start instance -d RACDB -i RACDB1
$ srvctl start instance -d RACDB -i RACDB2
$ srvctl stop database -d RACDB
$ srvctl start asm -n node2

Can I test if a database is running in RAC mode?

Use the DBMS_UTILITY package to determine if a database is running in RAC mode or not. Example:

BEGIN
  IF dbms_utility.is_cluster_database THEN
    dbms_output.put_line('Running in SHARED/RAC mode.');
  ELSE
    dbms_output.put_line('Running in EXCLUSIVE mode.');
  END IF;
END;
/

Another method is to look at the database parameters. For example, from SQL*Plus:

SQL> show parameter CLUSTER_DATABASE

If the value of CLUSTER_DATABASE is FALSE, then the database is not running in RAC mode.

How can I keep track of active instances?

You can keep track of active RAC instances by executing one of the following queries:

SELECT * FROM SYS.V_$ACTIVE_INSTANCES;
SELECT * FROM SYS.V_$THREAD;

To list the active instances from PL/SQL, use DBMS_UTILITY.ACTIVE_INSTANCES().


Can one see how connections are distributed across the nodes?

Select from gv$session. Some examples:

SELECT inst_id, count(*) "DB Sessions" FROM gv$session WHERE type = 'USER' GROUP BY inst_id;

With login time (hour):

SELECT inst_id,
       TO_CHAR(logon_time, 'DD-MON-YYYY HH24') "Hour when connected",
       COUNT(*) "DB Sessions"
  FROM gv$session
 WHERE type = 'USER'
 GROUP BY inst_id, TO_CHAR(logon_time, 'DD-MON-YYYY HH24')
 ORDER BY inst_id, TO_CHAR(logon_time, 'DD-MON-YYYY HH24');

What is pinging and why is it so bad?

Starting with Oracle 9i, RAC can transfer blocks from one instance to another across the interconnect* (cache fusion). This method is much faster than the old "pinging" method, where one instance had to write the block to disk before another instance could read it.

*An interconnect is the high-speed, low latency communication link between nodes in a cluster.

Oracle 8i and below:

Pinging is the process whereby one instance requests another to write a set of blocks from its SGA to disk so that the requesting instance can obtain them in exclusive mode. This method of moving data blocks from one instance's SGA to another is extremely slow. The challenge of tuning RAC/OPS is to minimize pinging activity.
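The cost gap between pinging and Cache Fusion can be sketched numerically. The latency figures below are ASSUMED round numbers for illustration only (disk I/O in the millisecond range, an interconnect hop well under a millisecond), not measurements of any real system.

```python
# Back-of-the-envelope comparison of the pre-9i "pinging" protocol
# versus Cache Fusion. All latency figures are ASSUMED round numbers
# chosen only to illustrate the orders of magnitude involved.
DISK_WRITE_MS = 8.0    # assumed: holding instance writes the block to disk
DISK_READ_MS = 8.0     # assumed: requesting instance reads it back
INTERCONNECT_MS = 0.5  # assumed: one block transfer over the interconnect

ping_cost = DISK_WRITE_MS + DISK_READ_MS  # 8i-style ping: write, then read
fusion_cost = INTERCONNECT_MS             # 9i+ Cache Fusion: direct transfer

print(f"ping: {ping_cost} ms, fusion: {fusion_cost} ms")
```

Under these assumptions a fused block transfer is more than an order of magnitude cheaper than a ping, which is why minimizing pinging dominated OPS tuning and why Cache Fusion largely removed that concern.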

What is Cache Fusion?

Prior to Oracle 9i, network-clustered Oracle databases used a storage device as the data-transfer medium (meaning that one node would write a data block to disk and another node would read that data from the same disk), which had the inherent disadvantage of lackluster performance. Oracle 9i addressed this issue: RAC uses a dedicated network connection for communications internal to the cluster.

Since all computers/instances in a RAC cluster access the same database, the overall system must guarantee the coordination of data changes on different computers such that whenever a computer queries data it receives the current version — even if another computer recently modified that data. Oracle RAC refers to this functionality as Cache Fusion. Cache Fusion involves the ability of Oracle RAC to "fuse" the in-memory data cached physically separately on each computer into a single, global cache.

Administrative Tools for Oracle Real Application Clusters Environments

Oracle enables you to administer a cluster database as a single system image through Enterprise Manager, SQL*Plus, or through Oracle RAC command-line interfaces such as Server Control (SRVCTL). You can also use several tools and utilities to manage your Oracle RAC environment and its components as follows:

Enterprise Manager—Enterprise Manager has both the Database Control and Grid Control GUI interfaces for managing both single instance and Oracle RAC environments.

Cluster Verification Utility (CVU)—CVU is a command-line tool that you can use to verify a range of cluster and Oracle RAC-specific components such as shared storage devices, networking configurations, system requirements, and Oracle Clusterware, as well as operating system groups and users. You can use CVU for pre-installation checks as well as for post-installation checks of your cluster environment. CVU is especially useful during pre-installation and during installation of Oracle Clusterware and Oracle RAC components. The OUI runs CVU after the Oracle Clusterware and Oracle Database installations to verify your environment.

Server Control (SRVCTL)—SRVCTL is a command-line interface that you can use to manage an Oracle RAC database from a single point. You can use SRVCTL to start and stop the database and instances and to delete or move instances and services. You can also use SRVCTL to manage configuration information.

Cluster Ready Services Control (CRSCTL)—CRSCTL is a command-line tool that you can use to manage Oracle Clusterware. You can use CRSCTL to start and stop Oracle Clusterware. CRSCTL has many options, such as enabling online debugging.

Oracle Interface Configuration Tool (OIFCFG)—OIFCFG is a command-line tool for both single-instance Oracle databases and Oracle RAC environments that you can use to allocate and de-allocate network interfaces to components. You can also use OIFCFG to direct components to use specific network interfaces and to retrieve component configuration information.

OCR Configuration Tool (OCRCONFIG)—OCRCONFIG is a command-line tool for OCR administration. You can also use the OCRCHECK and OCRDUMP utilities to troubleshoot configuration problems that affect the OCR.


What is Oracle Cluster File System (OCFS)?

A: Oracle Cluster File System (OCFS) is a shared file system designed specifically for Oracle Real Application Clusters (RAC). OCFS eliminates the requirement for Oracle database files to be linked to logical drives. OCFS volumes can span one shared disk or multiple shared disks for redundancy and performance enhancements.

OCFS is designed to provide an alternative to using raw devices for Oracle 9i RAC. Managing raw devices is usually a difficult task.

A cluster file system allows a number of nodes in a cluster to concurrently access a given file system. Every node sees the same files and data. This allows easy management of data that needs to be shared between nodes.

How many nodes support OCFS?

A: OCFS supports a maximum of 32 nodes.

Is OCFS like NFS (Network File System)?

A: No. With NFS, the file system is hosted by one node; the other nodes must access the file system via the network. NFS therefore has a single point of failure and slow data throughput. With OCFS, the file system is natively mounted on all nodes.

What is the benefit of OCFS for customers and developers?

A: OCFS on Linux implements most of the features of a generic cluster file system. For Oracle9i Real Application Clusters customers, OCFS eliminates the need to manage and set up raw devices, making cluster database administration much easier, as it looks and feels just like a regular file system. With raw devices it is possible to have a maximum of 255 raw partitions; on OCFS there is no limit on the number of files.

Moreover, with a shared file system, the different instances of the database can share archive logs, which makes media recovery much more convenient because every node has access to all archived log files when needed.


Known limitations: at this time, Oracle Cluster File System supports only Oracle database files. This includes redo log files, archive log files, control files, and datafiles. The shared quorum disk file for the cluster manager and the shared init file (srv) are also supported.

ASM Features

PURPOSE

Automatic Storage Management is a file system and volume manager built into the database kernel that allows the practical management of thousands of disk drives with 24x7 availability.

It provides management across multiple nodes of a cluster for Oracle Real Application Clusters (RAC) support as well as single SMP machines.

It automatically does load balancing in parallel across all available disk drives to prevent hot spots and maximize performance, even with rapidly changing data usage patterns.

It prevents fragmentation so that there is never a need to relocate data to reclaim space. Data is well balanced and striped over all disks.

It does automatic online disk space reorganization for the incremental addition or removal of storage capacity.

It can maintain redundant copies of data to provide fault tolerance, or it can be built on top of vendor supplied reliable storage mechanisms. Data management is done by selecting the desired reliability and performance characteristics for classes of data rather than with human interaction on a per file basis.

ASM solves many of the practical management problems of large Oracle databases.

As the size of a database server increases towards thousands of disk drives, or tens of nodes in a cluster, the traditional techniques for management stop working. They do not scale efficiently, they become too prone to human error, and they require independent effort on every node of a cluster.

Other tasks, such as manual load balancing, become so complex as to prohibit their application.

These problems must be solved for the reliable management of databases in the tens or hundreds of terabytes. Oracle is uniquely positioned to solve these problems as a result of our existing Real Application Cluster technology.


Oracle’s control of the solution ensures it is reliable and integrated with Oracle products.

This document is intended to give some insight into the internal workings of ASM.

It is not a detailed design document. It should be useful for people who need to support ASM.

Automatic Storage Management is part of the database kernel. It is linked into $ORACLE_HOME/bin/oracle so that its code may be executed by all database processes.

One portion of the ASM code allows for the start-up of a special instance called an ASM Instance. ASM Instances do not mount databases, but instead manage the metadata needed to make ASM files available to ordinary database instances.

Both ASM Instances and database instances have access to some common set of disks.

ASM Instances manage the metadata describing the layout of the ASM files. Database instances access the contents of ASM files directly, communicating with an ASM instance only to get information about the layout of these files. This requires that a second portion of the ASM code run in the database instance, in the I/O path.

Note:

1. One and only one ASM instance is required per node. So you might have multiple databases, but they will share the same single ASM instance.

2. ASM is for DATAFILE, CONTROLFILE, REDOLOG, ARCHIVELOG and SPFILE. For the common Oracle binaries in RAC, you can use a CFS instead.

3. ASM can provide mirroring for files in a disk group.

4. In external redundancy disk groups, ASM does not mirror. For normal redundancy, ASM 2-way mirrors files by default, but it can also leave files unprotected (unprotected files are not recommended). For high redundancy disk groups, ASM 3-way mirrors files.

5. Unless a user specifies an ASM alias filename during file creation, the file is OMF. OMF files are deleted automatically when the higher-level object (e.g., a tablespace) is dropped, whereas non-OMF files must be deleted manually. Oracle recommends using OMF.
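ASM-generated (OMF) file names generally follow the shape +diskgroup/db_unique_name/file_type/tag.file#.incarnation#, for example +DATA/orcl/datafile/users.259.685396947. The small parser below illustrates that shape; the example name is made up for illustration, and real ASM names have additional variants this sketch does not cover.

```python
# Illustrative parser for the common shape of ASM-generated (OMF) names:
#   +diskgroup/db_unique_name/file_type/tag.file#.incarnation#
# The sample name is hypothetical; real names can have more variants.
def parse_asm_name(name):
    if not name.startswith("+"):
        raise ValueError("ASM names start with the diskgroup, e.g. +DATA/...")
    diskgroup, db, file_type, tail = name[1:].split("/")
    # The tail is tag.file#.incarnation#; split from the right so the
    # tag itself may contain dots.
    tag, file_no, incarnation = tail.rsplit(".", 2)
    return {"diskgroup": diskgroup, "db": db, "type": file_type,
            "tag": tag, "file#": int(file_no), "incarnation": int(incarnation)}

info = parse_asm_name("+DATA/orcl/datafile/users.259.685396947")
print(info["diskgroup"], info["type"], info["file#"])
```

The file# and incarnation# suffix is what makes OMF names unique, which is why ASM can safely generate and later delete them with the owning object.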

HOW TO USE?

Use DBCA to configure your ASM.

DBCA eases the configuring and creation of your database while EM provides an integrated approach for managing both your ASM instance and database instance.


Automatic Storage Management is always installed by the Oracle Universal Installer when you install your database software. The Database Configuration Assistant (DBCA) determines if an ASM instance already exists, and if not, then you are given the option of creating and configuring an ASM instance as part of database creation and configuration. If an ASM instance already exists, then it is used instead. DBCA also configures your instance parameter file and password file.

Steps in DBCA:

1. Choose an ASM disk.
2. Create a disk group by choosing available disks.
3. While creating ASM you have a choice of mirroring for files in a disk group; the options are:

HIGH, NORMAL or EXTERNAL.

High -> ASM 3-way mirrors
Normal -> ASM 2-way mirrors
External -> use when the disks are already mirrored in hardware, e.g. by EMC or another third party

4. DBCA will create a separate instance called "+ASM", which stays in NOMOUNT stage, to control your ASM.

5. Choose to place all your datafiles, controlfiles, redo logs, and spfile on your ASM volume.
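The redundancy level chosen in step 3 determines how many copies of each extent ASM keeps, which directly sets usable capacity. The arithmetic below is an illustrative sketch only; real capacity planning must also reserve free space so ASM can re-mirror after a disk failure.

```python
# Rough usable-capacity arithmetic for ASM redundancy levels: external
# redundancy keeps one copy of each extent, normal two, high three.
# Illustrative sketch only -- real planning also reserves rebalance
# headroom for re-mirroring after a disk failure.
MIRROR_COPIES = {"EXTERNAL": 1, "NORMAL": 2, "HIGH": 3}

def usable_gb(raw_gb, redundancy):
    return raw_gb / MIRROR_COPIES[redundancy.upper()]

# A hypothetical 900 GB disk group under each redundancy level:
for level in ("external", "normal", "high"):
    print(level, usable_gb(900, level))
```

So the same raw disks yield one half the space under normal redundancy and one third under high, which is the trade made for surviving one or two simultaneous disk failures respectively.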

Preinstall:

Here the DBA will create the ASM volume, so the sysadmin should give ownership or the proper privileges to the DBA.

Example in LINUX

Say you have two disks, "/dev/sdf" and "/dev/sdg".

Determine what those devices are bound as raw:

$ /usr/bin/raw -qa

If not:

Include the devices by editing /etc/sysconfig/rawdevices, e.g.:

/dev/raw/raw1 /dev/sdf
/dev/raw/raw2 /dev/sdg

Set owner, group and permission on device file for each raw device:

$ chown oracle:dba /dev/raw/rawn
$ chmod 660 /dev/raw/rawn

Bind disk devices to raw devices:

$ service rawdevices restart

So from DBCA you can see the device "raw1" and "raw2".


After the ASM volume creation finishes, when you create a database on an ASM volume you can see the file details using Enterprise Manager (EM), or you can use the V$ or DBA views to check the datafile names. Oracle recommends not specifying a datafile name when adding a datafile or creating a new tablespace, because ASM will automatically generate an OMF file name.

Note: If a DBA, by mistake or intentionally, specifies the datafile name explicitly, dropping the tablespace will not drop the datafile from the ASM volume.

IMPORTANT VIEWS IN ASM

V$ASM_DISKGROUP
V$ASM_CLIENT
V$ASM_DISK
V$ASM_TEMPLATE
V$ASM_ALIAS
V$ASM_OPERATION