Thursday, August 19, 2010

R12 Upgrade Best Practices and a Few Known Issues

Overview of upgrade to R12

Upgrading an application from 11i to R12 involves upgrading the database tier, upgrading the middleware tech stack, and upgrading the application tier.
The supported upgrade path for the application-side upgrade is given below.

(Upgrade path summary: 11.5.7, 11.5.8, 11.5.9 and 11.5.10/11.5.10.2 can be upgraded directly to Release 12; earlier releases must first be brought up to one of these levels. See the category breakdown later in this post.)

Upgrade Steps in brief


Here are the four upgrade steps, briefly presented below. These steps are at a very high level of abstraction; each step is detailed further down.

1) Understand the installed components, system sizing information, and NLS considerations.

2) Prepare for the upgrade using The Upgrade Manual Script (TUMS).

3) Upgrade to R12. This includes upgrading the database and applying the required patches through AutoPatch.

4) Post-upgrade process. Complete the upgrade by applying the latest RUP patches to bring the system fully up to date.



Upgrade steps in detail



1) Understanding installed components


 Technology Stack Components

            Rapid Install automatically installs and configures the required technology stack components for both the database tier and the application tier.

The database tier technology stack for both a new installation and for a system upgrade is based on Oracle10g Release 2.

The technology stack installed on the application tier includes, among other components:

- Oracle 10g Application Server (AS) 10.1.2

- Oracle 10g Application Server (AS) 10.1.3

- Oracle Developer 10g (includes Oracle Forms)

- Java (J2SE) native plug-in 1.5.0_08

- Java Developer Kit (JDK) 5.0


Memory Requirements

       To calculate the memory requirements for an upgrade, consider the following:

- Number of concurrent users

- Infrastructure requirements for multi-tiered architecture

For example:

A test upgrade of the largest Oracle production system (oraprod) used the following:

- Database tier machine – 48 GB of memory

- Application tier machine – 12 GB of memory

A test upgrade of the Vision database and application tier machine used 6 GB of memory.


Database Size

             To estimate the increase in required disk space for upgrading, consider the products, the number of languages being installed, and changes in the data model.

For example:

In a test upgrade of the largest Oracle production system (oraprod), the database increased 10-20 percent. In a test upgrade, the Vision database increased 5 percent. For guidelines based on an upgrade of the Oracle production system (oraprod), see E-Business Suite Release 12 Upgrade Sizing and Best Practices (Doc ID: 399362.1).



Database Backup

*** It is highly recommended that you back up your entire system before you begin the upgrade. ***



Database Initialization Parameters

The initialization parameters required at each stage of the upgrade may vary depending on when you upgrade your database. Review the requirements for these parameters before you begin. Refer to Metalink Note 396009.1 for the recommended initialization parameters.

Change the following initialization parameters as specified below for the upgrade process. Once the upgrade completes, reset the parameters to their original values.


db_file_multiblock_read_count – remove this parameter (it is not required).

_db_file_optimizer_read_count = 8 (the default setting is 8; keep the default).

job_queue_processes (set the value of this parameter equal to the number of CPUs).

parallel_max_servers (set the value of this parameter to twice the number of CPUs).

pga_aggregate_target (refer to Metalink Note 396009.1 for the recommended value).
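
For illustration, here is how the two CPU-derived settings above might be applied on a hypothetical 24-CPU database server (a sketch only; derive the actual values from your own CPU count and Note 396009.1):

SQL> -- job_queue_processes = number of CPUs
SQL> alter system set job_queue_processes=24 scope=spfile;
SQL> -- parallel_max_servers = 2 x number of CPUs
SQL> alter system set parallel_max_servers=48 scope=spfile;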

Make sure that the temporary tablespace you have is locally managed and not dictionary managed. You can check this using the query below.

select CONTENTS, EXTENT_MANAGEMENT, ALLOCATION_TYPE from dba_tablespaces where tablespace_name='TEMP';


CONTENTS    EXTENT_MANAGEMENT    ALLOCATION_TYPE
---------   -----------------    ---------------
TEMPORARY   LOCAL                UNIFORM

If the extent management is not local, you can drop and recreate the temporary tablespace:

SQL> create TEMPORARY tablespace TEMP tempfile 'ts_p_temp1.dbf' size 2048M EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;
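
Note that if TEMP is the database default temporary tablespace, it cannot be dropped while it is the default. A minimal sketch of the swap, assuming a hypothetical interim tablespace TEMP2:

SQL> -- create an interim temporary tablespace and make it the default
SQL> create temporary tablespace TEMP2 tempfile 'ts_p_temp2.dbf' size 512M extent management local uniform size 1M;
SQL> alter database default temporary tablespace TEMP2;
SQL> -- drop TEMP, recreate it locally managed (statement shown above), then switch back
SQL> drop tablespace TEMP including contents and datafiles;
SQL> alter database default temporary tablespace TEMP;
SQL> drop tablespace TEMP2 including contents and datafiles;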

NLS Upgrade Considerations

    For NLS considerations, please refer to the Applications upgrade documentation.


Character Sets


         Be careful when selecting the character set for the APPL_TOP. Depending on whether your Applications system connects to the database during the upgrade process, you may be able to select a new character set for the Release 12 APPL_TOP on the Rapid Install wizard upgrade screens. If you do, the new character set must be either identical to, or compatible with, the existing database character set. If you change the APPL_TOP character set to one that is not compatible with the current database character set, the upgraded system will be corrupted.
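
To see the current database character set before making that choice (a standard dictionary query):

SQL> select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';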





2) Prepare for the upgrade using The Upgrade Manual Script (TUMS) and upgrade the database

          The R12 upgrade process involves replacing the 11i tech stack (9iAS and 8.0.6) with Fusion Middleware (Oracle Application Server 10g).

The basic upgrade process involves Rapid Install and AutoPatch.

Rapid Install lays down the new R12 tech stack, as mentioned in the first point.

The AutoPatch process upgrades the E-Business Suite database to the R12-compatible data model.

The final part of the upgrade updates the data model using an enhanced version of AutoPatch.

The minimum application release from which you can upgrade directly to R12 is 11.5.7.

The minimum database version from which you can upgrade to R12 is 9i.

As per the Oracle R12 upgrade documentation, Apps 11i instances are classified into two categories based on the Apps and DB versions:

Category 1 – 11.5.7, 11.5.8, 11.5.9 (CU1), 11.5.10 (CU1)

Category 2 – 11.5.9 (CU2), 11.5.10 (CU2) or 11.5.10.2


Why can we not upgrade the database to 10.2.0.2 beforehand for Category 1?

This is because there is no interoperability patch for the Category 1 releases; 10.2.0.2 was supported for 11.5.9 (CU2) and 11.5.10 (CU2) only.


What are the advantages of upgrading the database to 10.2.0.2 before the R12 upgrade?

Downtime can be broken into two smaller windows (one for the database upgrade, another for the R12 upgrade), each of which can be taken during a weekend or long weekend (depending on your system and resources).



 
Best Practices - Pre Upgrade
 
  • Use TUMS to eliminate the tasks that are not relevant for your system.
  • Use a shared file system for multi-node configurations.
  • Use Distributed AD for multi-node configurations.
  • Estimate tablespace sizes (Note 399362.1).
  • Modify the database parameters described above; this helps the performance of the upgrade scripts.
  • Perform these tasks in advance to reduce upgrade downtime: convert to Multi-Org, convert to OATM, upgrade the database to 10.2.0.3.
  • Gather statistics before the upgrade.
  • Record the timing of each step during the test upgrade.
  • Add the PL/SQL nocompile option to the R12 upgrade driver to save time during the upgrade.
  • Choose a proper batch size and number of workers for AutoPatch (e.g. for a 24-CPU DB server, Batchsize=10000 and workers=30).
  • Make sure you reset the database parameters that were modified earlier.
  • Merge all NLS patches and apply them as a single merged patch (see the sketch after this list).
  • Merge all post-12.0.4 upgrade patches (including the 12.0.6 R12 RUP6 patch). We applied the rule of thumb of merging the "non-AD" and "non-FND" patches together, and thereafter applied the AD, FND and online-help patches individually. Merging the patches saved a lot of time, both by limiting the number of adpatch runs and by limiting the number of times the adpatch maintenance steps (compiling the Apps schema, compiling jtf files, etc.) were executed.
  • Isolate post-upgrade concurrent programs to a separate manager queue (Note 399362.1).
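
As an illustration of the patch-merging and nocompile points above, a minimal sketch using the standard AD utilities (the directory names and merge name are hypothetical):

# merge all NLS patches unzipped under /patches/nls_src into a single patch
admrgpch -s /patches/nls_src -d /patches/nls_merged -merge_name nls_merged

# apply a merged patch with PL/SQL and JSP compilation deferred to save time
adpatch options=nocompiledb,nocompilejsp

Remember to compile the Apps schema (for example, through adadmin) after the last patch in the sequence.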

Few Known Issues

• Purging Non-Required Data From the "GL_INTERFACE" Table Dramatically Reduced
Overall Patch Runtime

               In reviewing the patch worker runtimes, we noticed that the script "jaiglintfsrcmig.sql" was taking roughly 4 hours 55 minutes to complete. After carefully reviewing the script's SQL, we saw that it was performing several updates on the "GL_INTERFACE" table, and this was occurring on an implementation where "GL_INTERFACE" was not being purged after data loads. We asked the client's development team to identify the non-required data in "GL_INTERFACE", which was then purged before starting the Apps upgrade. Purging the non-required data from "GL_INTERFACE" saved approximately 4 1/2 hours of upgrade time.
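
A hedged sketch of sizing the backlog before deciding what to purge (which rows are obsolete is an application-team decision; the grouping columns are just one reasonable choice):

SQL> -- profile GL_INTERFACE rows by status and journal source
SQL> select status, user_je_source_name, count(*) from GL_INTERFACE group by status, user_je_source_name;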

• Apache Lock File Issues


             While setting up the second and third mid-tier nodes, we successfully completed the configuration of a shared APPL_TOP and shared tech stack. Red Hat GFS was used for the shared filesystem. The only issue faced with the shared filesystem was that Apache would fail to start because the "adapcctl.sh" startup script timed out. After carefully diagnosing the issue (using the "strace" command to see where the "httpd" processes were hanging), we identified that it stemmed from Apache trying to lock files on the GFS mountpoints. This appears to be a generic issue with most shared filesystems when file locking takes place (which is in fact an Apache requirement). The issue was fixed by configuring the Apache parameters "s_lock_pid_dir", "s_pids_dir" and "s_web_pid_file" (in the context XML file) to point to the local mount point "/var/tmp" (instead of pointing to shared mountpoints). These changes were made to the context XML files for all nodes in the cluster.
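
A sketch of verifying and propagating that change on each node, in the same grep style used elsewhere in this post:

$ grep -i s_lock_pid_dir $CONTEXT_FILE    # should now point to /var/tmp
$ grep -i s_pids_dir $CONTEXT_FILE        # should now point to /var/tmp
$ grep -i s_web_pid_file $CONTEXT_FILE    # should now point under /var/tmp
$ $INST_TOP/admin/scripts/adautocfg.sh    # run AutoConfig to propagate the change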

• R12 Login Page not Working After Applying the 12.0.4 Patch


           After applying the 12.0.4 patch, the R12 login page stopped working properly; specifically, it would loop between "/OA_HTML/AppsLogin" and "/OA_HTML/RF.jsp?" and consequently never come up at all. After thorough troubleshooting, we determined that the FND tables and WF tables were not completely synchronized, making "RF.jsp" fall back to the AppsLogin page as the "GUEST" user, which does not have the necessary responsibilities against the WF tables. The problem was fixed by running the "WF_LOCAL_SYNCH.VALIDATEUSERROLES" procedure.
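
A hedged sketch of invoking that procedure from SQL*Plus as the APPS user (whether it takes arguments varies by release; verify its signature in your environment before running it):

SQL> exec WF_LOCAL_SYNCH.VALIDATEUSERROLES;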

• Intermittent Issues


                At one point, we faced two types of intermittent issues. The first involved a report customization where the "Payables Open Interface Import" program failed intermittently with the error "REP-0069: Internal error, REP-50002: Server is shutting down". After researching this issue, we identified patch 6116405 from Metalink Note 549910.1 ("Error REP-0069 and REP-50002 while Generating Reports through Concurrent Managers"), which fixed the issue.

The other intermittent issue involved Apache producing errors of the form "404 not found" in OA Framework pages. After carefully researching and diagnosing the issue, we narrowed it down to OC4J misbehaving, and furthermore identified that the error occurred only when the OC4J oacore JVMs were increased from 1 to 3. We then tracked down Metalink Note 731115.1, which specifies applying patch 7311892 to the 10.1.3 Oracle Home, setting the context file parameter "s_oacore_jvm_start_options" to "-Doracle.ons.numprocs=", and of course running AutoConfig to propagate all changes. This fixed the Apache "404 not found" errors in OA Framework pages.


Let me know if I missed anything or want me to add anything... Take care, all.

Wednesday, April 21, 2010

How to Register EBS with SSO

1) Start the EBS environment

2) Start the SSO Environment

3) Check you can login into SSO

http://[host]:[port]/oiddas

• Log in using the orcladmin userid

• Navigate to Directory > Create.

• Create a test userid (test/welcome123), supplying a password and other user information. Click Submit.

• Log out.

• Log into Oracle Internet Directory Delegated Administration Services using the newly created test userid.

• Ensure the Directory Integration and Provisioning Platform Server is running. The command "ps -ef | grep odi" should show a process called $ORACLE_HOME/bin/odisrv running.

4) Check you can login into the EBS database and R12 Environment

a. Log into the database - sqlplus /@$TWO_TASK

b. Log into the Env - http://[host]:[port]/OA_HTML/AppsLocalLogin.jsp



5) Create a parameter checklist: -

Parameter Checklist:

1. Hostname of the Oracle Application Server Infrastructure database {mandatory}. Example: Ibmaix30.server.com. Comments: a fully qualified name is recommended, e.g. alpha.company.com rather than just alpha.

2. LDAP port of Oracle Internet Directory {mandatory}. Example: 13061. Comments: check the LDAP port number in $ORACLE_HOME/install/portlist.ini.

3. LDAP SSL port of Oracle Internet Directory {mandatory}. Example: 636. Comments: check the LDAP SSL port number in $ORACLE_HOME/install/portlist.ini.

4. Password of the Oracle E-Business Suite database user "APPS" {mandatory}. Example: APPS. Comments: the APPS user password.

5. Password of the Oracle Internet Directory admin user "orcladmin" {mandatory}. Example: Welcome123.

6. Password to register the E-Business Suite instance with Oracle Internet Directory {mandatory}. Example: Welcome123.

7. Oracle Internet Directory administration user name. Example: orcladmin. Comments: the OID superuser name; the default value is "cn=orcladmin".

8. appname. Example: s_contextname. Comments: this instance will be registered with the OID server under this appname; the default value is s_contextname.

9. svcname. Example: s_contextname. Comments: this instance will be registered with the OID server under this svcname; the default value is s_contextname.

10. provisiontype. Example: 2. Comments: specifies the provisioning type between the instance and the OID server. Allowed values are 1 (bidirectional), 2 (instance to OID server), 3 (OID server to instance), and 4 (bidirectional, no creation). The default value is 1.

11. ldaphost. Example: beta.company.com. Comments: for a non-colocated infrastructure, i.e. if the ldaphost is different from the infradbhost, pass the ldaphost value for this parameter on the command line. The default value of ldaphost is infradbhost.

12. dbldapauthlevel. Example: 0. Comments: the authentication level between the E-Business database and the OID server for provisioning purposes. Values are: 0 - non-SSL communication, 1 - SSL with no authentication, 2 - SSL with server authentication, 3 - SSL with client and server authentication.

13. dbwalletdir. Example: FND_DB_WALLET_DIR. Comments: the E-Business database wallet directory. This is a must if dbldapauthlevel > 1. The default dbwalletdir is the value of the site-level profile FND_DB_WALLET_DIR.

14. dbwalletpass. Comments: the E-Business database wallet password. This is a must if dbldapauthlevel > 1.

15. rdbmsdn. Comments: the RDBMS DN of this E-Business database instance as registered with the OID server, e.g. cn=OracleContext.





6) As the owner of the application-tier file system, source the file $APPL_TOP/APPS.env to set the environment correctly. Once that is done, run the SSO registration script on the EBS apps tier: -

a. cd $FND_TOP/bin/

b. txkrun.pl -script=SetSSOReg -provisiontype=2



Parameter Prompts:

The registration script will prompt for several parameters. Use the parameter values from the parameter checklist. The script will prompt for the parameters in the following order:

• Enter the host name where Oracle iAS Infrastructure database is installed? Ibmaix30.server.com

• Enter the LDAP Port on Oracle Internet Directory server ? 389

• Enter SSL LDAP Port on Oracle Internet Directory server ? 636

• Enter the Oracle Internet Directory Administrator (orcladmin) Bind password ? welcome123

• Enter the instance password that you would like to register this application instance with ? welcome123

• Enter Oracle E-Business apps database user password ? APPS



7) When the registration script completes successfully, it will print the following line:-

End of /patch/115/bin/txkSetSSOReg.pl : No Errors encountered

If you do not see this confirmation, examine the following file to investigate the problem: $APPLRGF/TXK/txkSetSSOReg_[timestamp].xml



8) If the registration is successful, bounce the EBS application tier.



9) Validate that Single Sign-On is working correctly by following these steps: -

i) Run the Diagnostic Utility: -

• Login as sysadmin locally (http[s]://[host]:[port]/OA_HTML/AppsLocalLogin.jsp)

• Select the responsibility "CRM HTML Administration" from the Navigator's left pane

• Select the function "Diagnostics" from the Navigator's right pane. This will launch a new window. If you do not see a new window, make sure any browser pop-up blockers are disabled.



Run SSO Diagnostics

• Click on the "Basic" tab

• Choose "Application Object Library" from the Applications drop down

• Click on "SSO Setup Tests" - Click on "Run Without Pre-Requisite"

• All the tests should complete successfully

• Click on the "Report" icon for each test and verify the results

Run OID Diagnostics

• Click on "OID Setup" - Click on "Run Without Pre-Requisite"

• All the tests should complete successfully

• Click on the "Report" icon for each test and verify the results

ii) Verify SSO Integration with Oracle E-Business Suite

• Create a user in the EBS environment locally. (Local URL = http://[host]:[port]/OA_HTML/AppsLocalLogin.jsp)

• Check that user can login to the EBS environment through the SSO login

o Enter EBS login URL = http://[EBS_server]:[port]/OA_HTML/AppsLogin

o This should redirect to the SSO login screen.

o Enter the user and password created above, which will log you into the EBS env.

• Check you can log in to SSO OIM with the same user/password through the URL: - http://[host]:[port]/oiddas

How to Deregister SSO with EBS

1) SSO-OID Deregistration can be done using a single command, which does the following: -

• OID Deregistration

• SSO Deregistration

• Instance Deregistration

To deregister run the following command, which starts the interactive mode deregistration: -

$FND_TOP/bin/txkrun.pl -script=SetSSOReg -deregister=yes

The above command will prompt for the following: -

• Enter Oracle E-Business apps database user password ? APPS

• Enter the Oracle Internet Directory Administrator (orcladmin) Bind password ? welcome123

2) SSO-OID registration stores a set of preferences in the E-Business database. If an E-Business instance is cloned from an SSO/OID-registered instance, the cloned environment has the same preferences as the source environment and throws errors during SSO/OID registration. The following command should therefore be run in the post-cloning phase, or before proceeding with SSO/OID registration, to remove all such preferences and settings from the cloned environment.

NOTE – ONLY NEEDS TO BE DONE IF EBS HAS BEEN SSO REGISTERED PREVIOUSLY

To remove the references in EBS run the following command:-

$FND_TOP/bin/txkrun.pl -script=SetSSOReg -removereferences=Yes

The above command will prompt for the following: -

• Enter Oracle E-Business apps database user password? ******



3) Bounce the EBS environment and test login: -

http://[EBS_server]:[port]/OA_HTML/AppsLogin

Thursday, September 10, 2009

Oracle RAC Questions

What is RAC?
RAC stands for Real Application Clusters. It is a clustering solution from Oracle Corporation that ensures high availability of databases by providing instance failover and media failover features.

Mention the Oracle RAC software components:-

Oracle RAC is composed of two or more database instances. They are composed of memory structures and background processes, the same as a single-instance database. Oracle RAC instances use two additional services, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable Cache Fusion.
Oracle RAC instances are composed of the following background processes:
ACMS - Atomic Controlfile to Memory Service
GTX0-j - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor

What is GRD?

GRD stands for Global Resource Directory. The GES and GCS maintain records of the status of each datafile and each cached block using the Global Resource Directory. This process is referred to as Cache Fusion and helps ensure data integrity.

Details on Cache Fusion:-

Oracle RAC is composed of two or more instances. When a block of data is read from a datafile by an instance within the cluster and another instance needs the same block, it is faster to get the block image from the instance that holds it in its SGA than to read it from disk. To enable inter-instance communication, Oracle RAC makes use of interconnects. The Global Enqueue Service (GES) monitors, and the Instance Enqueue Process manages, Cache Fusion.

Details on ACMS:-
ACMS stands for Atomic Controlfile to Memory Service. In an Oracle RAC environment, ACMS is an agent that ensures a distributed SGA memory update, i.e. SGA updates are globally committed on success or globally aborted in the event of a failure.

Details on GTX0-j :-
This process provides transparent support for XA global transactions in a RAC environment. The database autotunes the number of these processes based on the workload of XA global transactions.

Details on LMON:-
This process monitors global enqueues and resources across the cluster and performs global enqueue recovery operations. It is called the Global Enqueue Service Monitor.

Details on LMD:-
This process is called the Global Enqueue Service Daemon. It manages incoming remote resource requests within each instance.

Details on LMS:-
This process is called the Global Cache Service Process. It maintains the status of datafiles and of each cached block by recording information in the Global Resource Directory (GRD). It also controls the flow of messages to remote instances, manages global data block access, and transmits block images between the buffer caches of different instances. This processing is part of the Cache Fusion feature.

Details on LCK0:-
This process is called the Instance Enqueue Process. It manages non-Cache Fusion resource requests such as library and row cache requests.

Details on RMSn:-
These processes are called the Oracle RAC Management Processes. They perform manageability tasks for Oracle RAC, including the creation of Oracle RAC-related resources when new instances are added to the cluster.

Details on RSMN:-
This process is called the Remote Slave Monitor. It manages background slave process creation and communication on remote instances. It is a background slave process that performs tasks on behalf of a coordinating process running in another instance.

What components in RAC must reside in shared storage?
All datafiles, controlfiles, SPFILEs, and redo log files must reside on cluster-aware shared storage.

What is the significance of using cluster-aware shared storage in an Oracle RAC environment?
All instances of an Oracle RAC database can access the datafiles, control files, SPFILEs, and redo log files when these files are hosted on cluster-aware shared storage, i.e. a group of shared disks.

A few examples of solutions that support cluster storage:-
ASM (Automatic Storage Management), raw disk devices, Network File System (NFS), OCFS2 and OCFS (Oracle Cluster File System).

What is an interconnect network?
An interconnect network is a private network that connects all of the servers in a cluster. The interconnect network uses a switch (or multiple switches) that only the nodes in the cluster can access.

How can we configure the cluster interconnect?
Configure User Datagram Protocol (UDP) on Gigabit Ethernet for the cluster interconnect. On UNIX and Linux systems, UDP and RDS (Reliable Datagram Sockets) are the protocols used by Oracle Clusterware. Windows clusters use the TCP protocol.

Can we use crossover cables with Oracle Clusterware interconnects?
No, crossover cables are not supported with Oracle Clusterware interconnects.

What is the use of cluster interconnect?
The cluster interconnect is used by Cache Fusion for inter-instance communication.

How do users connect to database in an Oracle RAC environment?
Users can access a RAC database using a client/server configuration or through one or more middle tiers, with or without connection pooling. Users can use the Oracle Services feature to connect to the database.

What is the use of a service in an Oracle RAC environment?
Applications should use the Services feature to connect to the Oracle database. Services enable us to define rules and characteristics that control how users and applications connect to database instances.

What characteristics are controlled by the Oracle Services feature?
The characteristics include a unique name, workload balancing and failover options, and high-availability characteristics.

What enables the load balancing of applications in RAC?
Oracle Net Services enables the load balancing of application connections across all of the instances in an Oracle RAC database.

What is a virtual IP address or VIP?
A virtl IP address or VIP is an alternate IP address that the client connectins use instead of the standard public IP address. To configureVIP address, we need to reserve a spare IP address for each node, and the IP addresses must use the same subnet as the public network.

What is the use of VIP?
If a node fails, the node's VIP address fails over to another node, on which the VIP address can accept TCP connections but cannot accept Oracle connections.

Situations under which VIP address failover happens:-
VIP address failover happens when the node on which the VIP address runs fails, when all interfaces for the VIP address fail, or when all interfaces for the VIP address are disconnected from the network.

What is the significance of VIP address failover?
When a VIP address failover happens, clients that attempt to connect to the VIP address receive a rapid "connection refused" error; they do not have to wait for TCP connection timeout messages.

What are the administrative tools used for Oracle RAC environments?
An Oracle RAC cluster can be administered as a single image using OEM (Enterprise Manager), SQL*Plus, Server Control (SRVCTL), the Cluster Verification Utility (CVU), DBCA, and NETCA.

How do we verify that RAC instances are running?
Issue the following query from any one node, connecting through SQL*Plus.
$ connect sys/sys as sysdba
SQL> select * from V$ACTIVE_INSTANCES;
The query gives the instance number under the INST_NUMBER column and host_name:instance_name under the INST_NAME column.

What is FAN?
Fast Application Notification, abbreviated FAN, relates to events concerning instances, services, and nodes. It is a notification mechanism that Oracle RAC uses to notify other processes about configuration and service-level information, including service status changes such as UP or DOWN events. Applications can respond to FAN events and take immediate action.

Where can we apply FAN UP and DOWN events?
FAN UP and FAN DOWN events can be applied to instances, services, and nodes.

Use of FAN events in case of a cluster configuration change?
During cluster configuration changes, the Oracle RAC high-availability framework publishes a FAN event immediately when a state change occurs in the cluster, so applications can receive FAN events and react immediately. This saves applications from polling the database and detecting a problem only after such a state change.

Why should we have a separate home for the ASM instance?
It is good practice to keep the ASM home separate from the database home (ORACLE_HOME). This helps in upgrading and patching ASM and the Oracle database software independently of each other. Also, we can deinstall the Oracle database software independently of the ASM instance.

What is the advantage of using ASM?
ASM is the Oracle-recommended storage option for RAC databases, as it maximizes performance by managing the storage configuration across the disks. ASM does this by distributing the database files across all of the available storage within the cluster database environment.

What is rolling upgrade?
It is a new ASM feature in Database 11g. ASM instances in Oracle Database 11g (from 11.1) can be upgraded or patched using the rolling upgrade feature. This enables us to patch or upgrade ASM nodes in a clustered environment without affecting database availability. During a rolling upgrade, we can maintain a functional cluster while one or more of the nodes in the cluster run different software versions.

Can rolling upgrade be used to upgrade from 10g to 11g database?
No, it can be used only for Oracle Database 11g releases (from 11.1).

Initialization parameters that must have same value for every instance in an Oracle RAC database:-
Some initialization parameters are critical at database creation time and must have the same values. Their value must be specified in the SPFILE or PFILE for every instance. The parameters that must be identical on every instance are listed below:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
COMPATIBLE
CLUSTER_DATABASE
CLUSTER_DATABASE_INSTANCES
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
INSTANCE_TYPE (RDBMS or ASM)
PARALLEL_MAX_SERVERS
REMOTE_LOGIN_PASSWORD_FILE
UNDO_MANAGEMENT

Can DML_LOCKS and RESULT_CACHE_MAX_SIZE differ across instances?
These parameters need to be identical on every instance only when their values are set to zero.

What two parameters must be set at the time of starting up an ASM instance in a RAC environment?
The parameters CLUSTER_DATABASE and INSTANCE_TYPE must be set.

Mention the components of Oracle Clusterware:-
Oracle Clusterware is made up of components such as the voting disk and the Oracle Cluster Registry (OCR).

What is a CRS resource?
Oracle Clusterware is used to manage high-availability operations in a cluster. Anything that Oracle Clusterware manages is known as a CRS resource. Some examples of CRS resources are a database, an instance, a service, a listener, a VIP address, an application process, etc.

What is the use of OCR?
Oracle Clusterware manages CRS resources based on the resources' configuration information stored in the OCR (Oracle Cluster Registry).

Some Oracle Clusterware tools and their uses:-
OIFCFG - allocating and deallocating network interfaces; it can also identify the interconnect being used
OCRCONFIG - command-line tool for managing the Oracle Cluster Registry
OCRDUMP - dumps the contents of the Oracle Cluster Registry to a file
CVU - Cluster Verification Utility, used to verify the overall cluster and Clusterware setup

What are the modes of deleting instances from Oracle Real Application Clusters databases?
We can delete instances using silent mode or interactive mode with DBCA (Database Configuration Assistant).

How do we remove ASM from an Oracle RAC environment?
We need to stop and delete the instance on the node first, in interactive or silent mode. After that, ASM can be removed using the srvctl tool as follows:
srvctl stop asm -n node_name
srvctl remove asm -n node_name
We can verify that ASM has been removed by issuing the following command:
srvctl config asm -n node_name

How do we verify that an instance has been removed from OCR after deleting an instance?
Issue the following srvctl command:
srvctl config database -d database_name
cd $CRS_HOME/bin
./crs_stat

How do we verify the current backup of the OCR?
We can verify the current backup of the OCR using the following command:
ocrconfig -showbackup

What are the performance views in an Oracle RAC environment?
We have V$ views that are instance specific. In addition, we have GV$ views, called global views, which have an INST_ID column of numeric data type. GV$ views obtain information from the individual V$ views.
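
As a quick illustration of the GV$ pattern (gv$session is used here as an example; any GV$ view carries the same INST_ID column):

SQL> -- count sessions per instance across the cluster
SQL> select inst_id, count(*) from gv$session group by inst_id;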

What are the types of connection load-balancing?
There are two types of connection load balancing: server-side load balancing and client-side load balancing.

What is the difference between server-side and client-side connection load balancing?
Client-side balancing happens on the client side, where connections are balanced across the listeners. With server-side load balancing, the listener uses a load-balancing advisory to redirect connections to the instance providing the best service.
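
A minimal sketch of a client-side load-balancing TNS entry, in the same style as the TNS entries later in this blog (the VIP host names and service name are hypothetical):

RAC_LB = (DESCRIPTION = (ADDRESS_LIST = (LOAD_BALANCE = ON) (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = orcl)))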

Give the usage of srvctl:-
srvctl start instance -d db_name -i "inst_name_list" [-o start_options]
srvctl stop instance -d name -i "inst_name_list" [-o stop_options]
srvctl stop instance -d orcl -i "orcl3,orcl4" -o immediate
srvctl start database -d name [-o start_options]
srvctl stop database -d name [-o stop_options]
srvctl start database -d orcl -o mount

Tuesday, May 5, 2009

Configure Parallel Concurrent Processing R12

Configure Parallel Concurrent Processing



Set Up PCP



1. Execute AutoConfig by running $INST_TOP/admin/scripts/adautocfg.sh on all concurrent nodes.

2. Source the application environment.

3. Check the configuration files tnsnames.ora and listener.ora located at $INST_TOP/ora/10.1.2/network/admin. Ensure that you have information of all the other concurrent nodes for FNDSM and FNDFS entries.

4. Restart the application listener process on each application tier node.

5. Log on to Oracle E-Business Suite Release 12 as SYSADMIN, and choose the System Administrator Responsibility. Navigate to Install > Nodes screen and ensure that each node in the cluster is registered.


6. Verify that the Internal Monitor for each node is defined properly, with the correct primary and secondary node specification and work shift details. Also ensure the Internal Monitor manager is activated by going to Concurrent -> Manager -> Administrator and activating the manager. (For example, Internal Monitor: Host2 must have host2 as its primary node and host3 as its secondary node.)

In our case, the secondary nodes for Internal Monitor: host2 and Internal Monitor: host3 were not defined, and the work shifts were not present. We set the work shifts to 24 hours with processes set to 1, then went to the Concurrent -> Manager -> Administrator screen and activated the above two managers.


7. Set the $APPLCSF environment variable on all the CP nodes pointing to a log directory on a shared file system.

A change needs to be made in the context file. Initially it was:

$ORACLE_SID $> grep -i applcsf $CONTEXT_FILE
/ORACLE_HOME/inst/apps/SID_HOST/logs/appl/conc

Changed it to:

$ORACLE_SID $> grep -i applcsf $CONTEXT_FILE
/ORACLE_HOME/inst/apps/$ORACLE_SID/logs/appl/conc

Do not run autoconfig yet, you have to make more changes in the below step. We will run autoconfig after that.

These changes need to be on just the conc manager tiers.


8. Set the $APPLPTMP environment variable on all the CP nodes to the value of the UTL_FILE_DIR entry in init.ora on the database nodes. This value should point to a directory on a shared file system.

Initially, APPLPTMP was set to '/usr/tmp'.

We need this to be on a file system that is shared across all the CM nodes and is also writable by the owner of the Oracle database. This directory must also be the first entry in the "utl_file_dir" parameter in the database.

Use Grid Control to make this change to the "utl_file_dir" parameter for the database; it requires a database bounce.

Also, modify the CONTEXT FILE on all the apps tiers so that APPLPTMP and APPLTMP point to the shared mount point, and run AutoConfig.

$ORACLE_SID $> grep -i s_applptmp $CONTEXT_FILE
/usr/tmp

$ORACLE_SID $> grep -i s_appltmp $CONTEXT_FILE
/ORACLE_HOME/inst/apps/$ORACLE_SID/appltmp

Both of the above were changed to:

$ORACLE_SID $> grep -i s_applptmp $CONTEXT_FILE
/ORACLE_HOME/inst/apps/temp

$ORACLE_SID $> grep -i s_appltmp $CONTEXT_FILE
/ORACLE_HOME/inst/apps/temp


These changes need to be on just the conc manager tiers.

9. Set the profile option 'Concurrent: PCP Instance Check' to OFF if database instance-sensitive failover is not required. When set to 'ON', Concurrent Managers will fail over to a secondary application-tier node if the database instance they are connected to fails.
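
To confirm the current site-level value from SQL*Plus (a sketch; the internal profile name CONC_PCP_INSTANCE_CHECK is an assumption here, so verify it in your release):

SQL> -- check the site-level value of 'Concurrent: PCP Instance Check'
SQL> select fnd_profile.value('CONC_PCP_INSTANCE_CHECK') from dual;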




Set Up Transaction Managers



1. Shut down the application services (servers) on all nodes.

2. In the Oracle RAC environment, shut down all the database instances cleanly with the command:
SQL>shutdown immediate;

3. Edit $ORACLE_HOME/dbs/_ifile.ora. Add the following parameters:

_lm_global_posts=TRUE
_immediate_commit_propagation=TRUE

4. Start the instances on all database nodes, one by one.

5. Start up the application services (servers) on all nodes.

6. Log on to Oracle E-Business Suite Release 12 using SYSADMIN login and System Administrator Responsibility.
Navigate to Profile > System and change the profile option 'Concurrent: TM Transport Type' to 'QUEUE', then verify that the transaction managers work across the RAC instances.

7. Navigate to Concurrent > Manager > Define screen, and set up the primary and secondary node names for transaction managers.

8. Restart the concurrent managers.

9. If any of the transaction managers have a deactivated status, activate them using Concurrent > Manager > Administrator.




Set Up Load Balancing on Concurrent Processing Nodes



1. Edit the applications context file through the Oracle Applications Manager interface and set the value of the Concurrent Manager TWO_TASK (s_cp_twotask) to the load-balancing alias (<service_name>_balance).


The change needs to be made in the CONTEXT_FILE.

Initially it was:
$ORACLE_SID $> grep -i s_cp_twotask $CONTEXT_FILE
$ORACLE_SID

It was changed to:

$ORACLE_SID $> grep -i s_cp_twotask $CONTEXT_FILE
$ORACLE_SID_BALANCE (the load-balancing alias)

These changes need to be made on just the concurrent manager tiers (host2 and host3).
2. Execute AutoConfig by running $INST_TOP/admin/scripts/adautocfg.sh on all concurrent nodes.

Wednesday, January 21, 2009

Oracle 10g Data Guard Configuration (Physical Standby)

This configuration has been successfully tested on Oracle 10g (Release 2). We will set up this configuration on the following machines.
Contents:

1 PRIMARY SITE: (Information)
2 STANDBY SITE: (Information)
3 CONFIGURATION ON PRIMARY
4 CONFIGURATION ON STANDBY
5 LOG SHIPPING
6 START PHYSICAL LOG APPLY SERVICE.
7 ROLE TRANSITION :
8 Manual Switchover:
9 DATAGUARD BROKER :
10 BROKER CONFIGURATION:
11 Switchover:
12 DATAGUARD OBSERVER :
13 OBSERVER CONFIGURATION:
14 OBSERVER SITE: (Information)
15 Step 1:
16 Step 2:
17 Step 3:
18 Step 4:
19 Step 5:
20 Data Guard With Real Time
21 Real Time Apply – Benefits
22 Note
23 Converting To Real time
24 Enable Real Time

PRIMARY SITE: (Information)
Database name (db_name): ORCL
Database Unique Name (db_unique_name): primary
TNS Service Name: to_standby (through this service, the primary machine connects to the STANDBY machine)

STANDBY SITE: (Information)
Database name (db_name): ORCL
Database Unique Name (db_unique_name): standby
TNS Service Name: to_primary (Through this service, the standby machine will be connected to PRIMARY machine)

CONFIGURATION ON PRIMARY
Create pfile from spfile on the primary database:
SQL> Create pfile from spfile='C:\oracle\product\10.2.0\db_1\dbs\spfilePrimary.ora';


Then make/add following settings in the initPrimary.ora file on the PRIMARY Machine.

db_unique_name='PRIMARY'
FAL_Client='to_primary'
FAL_Server='to_standby'
Log_archive_config='DG_CONFIG=(primary,standby)'
Log_archive_dest_1='Location=c:\oracle\backup VALID_FOR=(ALL_LOGFILES,ALL_ROLES) db_unique_name=primary'
Log_archive_dest_2='Service=to_standby lgwr async VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) db_unique_name=standby'
Log_archive_dest_state_1=ENABLE
Log_archive_dest_state_2=DEFER
Service_names='primary'
Standby_File_Management='AUTO'
Create a password file using 'cmd':
C:\> orapwd file=C:\oracle\product\10.2.0\db_1\database\PWDOrcl.ora password=oracle entries=5 [force=y]
The force option is used to replace an existing password file.
Now startup the PRIMARY database to MOUNT stage.
• SQL> startup mount;

Make the following changes: Take the database to Archive Mode.

• SQL> Alter database ArchiveLog;

Enable Force Logging.

• SQL> Alter database Force Logging;
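
A quick check that both settings took effect, using standard v$database columns:

SQL> select log_mode, force_logging from v$database;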

On the PRIMARY site, also create a standby redo logfile for the standby database. This standby redo logfile will be used by the Data Guard observer later on. If you don't want to use the DG broker (observer), then there is no need to create a standby redo logfile.

• SQL> Alter database add standby logfile ('C:\oracle\product\10.2.0\oradata\ORCL\StandbyRedo.log') size 150m;

Now shutdown the primary database.

• SQL> shutdown immediate;

Copy all the Datafiles and standby redo logfile from PRIMARY site to the same location on the STANDBY site. Then again startup the PRIMARY database to mount stage.
• SQL> startup mount;
Now create a standby controlfile on the PRIMARY site.
• SQL> Alter database create standby controlfile as 'c:\oracle\backup\standcontrol.ctl';

Now copy this newly created standby control file to the standby site, where the other database files (datafiles, logfiles, and control files) are located. Copy it in as Control01.ctl, Control02.ctl and Control03.ctl.

Create spFile from pfile.

• SQL> Create spfile from pfile;

Restart the primary database.
Now on PRIMARY site create a service in TNSnames.ora file through which the PRIMARY site will be connected to the Standby machine.

TO_STANDBY = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.12.196)(PORT = 1521)) ) (CONNECT_DATA = (SERVICE_NAME = standby) ) )

Also check the connectivity from the SQL Prompt.

• SQL> connect sys/oracle@to_standby as sysdba
Connected.

The service can also be created through the Net Manager utility available with Oracle Server, and connectivity can be checked there as well. Register the primary database in the Listener.ora file.
Then stop and start the listener from 'cmd':
> Lsnrctl stop
> Lsnrctl start

Query the DATABASE_ROLE column from V$DATABASE to view the role of primary database. It should return ‘PRIMARY’.

CONFIGURATION ON STANDBY

Check the mode of Archiving by following command:
• SQL> Archive Log List

Then create pfile from spfile on the standby database:

SQL> Create pfile from spfile='C:\oracle\product\10.2.0\db_1\dbs\spfileStandby.ora';

Then make/add following settings in the initStandby.ora file on the STANDBY Machine.

db_unique_name='STANDBY'
FAL_Client='to_standby'
FAL_Server='to_primary'
Log_archive_config='DG_CONFIG=(primary,standby)'
Log_archive_dest_1='Location=c:\oracle\backup VALID_FOR=(ALL_LOGFILES,ALL_ROLES) db_unique_name=standby'
Log_archive_dest_2='Service=to_primary VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) db_unique_name=primary'
Log_archive_dest_state_1=ENABLE
Log_archive_dest_state_2=ENABLE
Service_names='STANDBY'
Standby_File_Management='AUTO'
db_file_name_convert='/home/u001/primary/','/home/u001/standby/'
log_file_name_convert='/home/u001/primary/','/home/u001/standby/'
lock_name_space=standby

Create a password file using 'cmd':
C:\> orapwd file=C:\oracle\product\10.2.0\db_1\database\PWDOrcl.ora password=oracle entries=5 [force=y]
The force option is used to replace an existing password file.

Now on STANDBY site create a service in TNSnames.ora file through which the STANDBY site will be connected to the PRIMARY machine.

TO_PRIMARY = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.12.146)(PORT = 1521)) ) (CONNECT_DATA = (SERVICE_NAME = primary) ) )

Check the connectivity from the SQL Prompt.
• SQL> connect sys/oracle@to_primary as sysdba
Connected.

The service can also be created through the Net Manager utility available with Oracle Server, and connectivity can be checked there as well. Register the standby database in the Listener.ora file. Then stop and start the listener from 'cmd':

> Lsnrctl stop
> Lsnrctl start

Create spfile from pfile.
• SQL> Create spfile from pfile;
Restart the database. Now startup the STANDBY database to mount stage.
• SQL> startup mount
Enable Force Logging.
• SQL> Alter database Force Logging;

Query the column DATABASE_ROLE from V$DATABASE to view the role of standby database. It should return ‘PHYSICAL STANDBY’.

LOG SHIPPING

On PRIMARY site enable Log_archive_dest_state_2 to start shipping archived redo logs.
• SQL> Alter system set Log_archive_dest_state_2=ENABLE scope=both;
System Altered.

Check the sequence # and the archiving mode by executing following command.
• SQL> Archive Log List

Then switch the logfile.
• SQL> Alter system switch logfile;
System Altered.

Now on the PRIMARY site check the status of Standby Archiving destination.
• SQL> Select Status, Error from v$Archive_dest where dest_id=2;

The STATUS should return – VALID. If it returns Error, then check the connectivity between the Primary and Standby machines.

START PHYSICAL LOG APPLY SERVICE.

On the STANDBY database execute the following command to start Managed Recovery Process (MRP). This command is executed on Mount stage.

• SQL> Alter Database Recover Managed Standby Database;
Database Altered.

By executing the above command, the current session will hang because MRP is a foreground recovery process: it waits for logs to arrive and applies them. To avoid this, you can execute the following command with the DISCONNECT option.

• SQL> Alter Database Recover Managed Standby Database Disconnect;
Database Altered.

Now the session will be available to you and MRP will work as a background process and apply the redo logs.

You can check whether the log is applied or not by querying V$ARCHIVED_LOG.
• SQL> Select Name, Applied, Archived from v$Archived_log;

This query will return the name of archived files and their status of being archived and applied.
ROLE TRANSITION :

In a Data Guard configuration, two roles are defined: primary and standby. The primary database is the production database used by the users. For high availability, Data Guard provides a standby database that remains available alongside the primary database.
There can be more than one standby database, and they can be at remote locations as well.
Oracle Data Guard supports two role transition operations:

Switchover:
Switchover allows the primary database to switch roles with one of its standby databases. There is no data loss during a switchover. After a switchover, each database continues to participate in the Data Guard configuration with its new role.

Failover:
Failover transitions a standby database to the primary role in response to a primary database failure. If the primary database was not operating in either maximum protection mode or maximum availability mode before the failure, some data loss may occur. After a failover, the failed database no longer participates in the Data Guard configuration. It needs to be reinstated to become an active part of Data Guard configuration.

Manual Switchover:

On the PRIMARY Database: (Open stage)

Query V$DATABASE to check the role of Primary Database.
• SQL> Select Database_role from v$Database;
It will return "PRIMARY".

Now check the Switchover Status of the Primary Database.

• SQL> Select switchover_status from v$Database;

It will return “SESSIONS ACTIVE”.

Now you are ready to perform a manual switchover. Execute the following command using “WITH SESSION SHUTDOWN” option.

• SQL> Alter Database Commit to Switchover to Physical Standby with session Shutdown;
Database Altered.

Now your PRIMARY Database has become Physical Standby.
To verify this change, again query the Database_role column of V$DATABASE. Now it will return “PHYSICAL STANDBY”.

Shutdown the database and again Start it to mount stage.

On the PHYSICAL STANDBY Database: (Mount stage) Query V$DATABASE to check the role of Standby Database.

• SQL> Select Database_role from v$Database;
It will return "PHYSICAL STANDBY".

Now check the Switchover Status of the Standby Database.
• SQL> Select switchover_status from v$Database;
It will return “SESSIONS ACTIVE”.

Now cancel the MRP which is running in the background of the Standby Database. Execute the following command:

• SQL> Alter database Recover Managed Standby Database Cancel;
Database Altered.

Now you are ready to perform a manual switchover from Physical Standby to Primary. Execute the following command using “WITH SESSION SHUTDOWN” option.

• SQL> Alter Database Commit to Switchover to PRIMARY with session Shutdown;
Database Altered.

Now your PHYSICAL STANDBY Database has become PRIMARY. To verify this change, again query the Database_role column of V$DATABASE. Now it will return “PRIMARY”.

Shutdown the database and again Start it to Open stage.

DATAGUARD BROKER :
The Data Guard broker is a distributed management framework that automates and centralizes the creation, maintenance, and monitoring of Data Guard configurations. You can use either the Oracle Enterprise Manager graphical user interface (GUI) or the command-line interface (CLI) to automate and simplify:

• Creating and enabling Data Guard configurations, including setting up log transport services and log apply services.
• Managing an entire Data Guard configuration from any system in the configuration.
• Managing and monitoring Data Guard configurations that contain Real Application Clusters primary or standby databases.

BROKER CONFIGURATION:
On both Primary and Standby sites, change the initialization parameter in the spfile to enable the Data guard broker.

•SQL> Alter system set dg_broker_start=True scope=both;
System Altered.

On the PRIMARY site, open the ‘cmd’ and start Command Line Interface (CLI) of the Dataguard Broker (DGMGRL).

C:\> dgmgrl
DGMGRL for 32-bit Windows: Version 10.2.0.1.0 - Production
Copyright (c) 2000, 2005, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
DGMGRL>

Now connect to the database through the service you made previously.

DGMGRL> connect sys/oracle@to_primary
Connected.
Create broker configuration.

DGMGRL> create configuration 'broker1' as
> primary database is 'primary'
> connect identifier is to_primary;

('to_primary' as the connect identifier is the service name through which the broker connects to the PRIMARY database)

Add Standby Database to the above configuration.

DGMGRL> Add database 'standby' as
> connect identifier is to_standby
> maintained as physical;

('to_standby' as the connect identifier is the service name through which the broker connects to the STANDBY database)

Now the configuration has been set up but it is still disabled. You can view the configuration by executing:
DGMGRL> show configuration
Configuration
Name: broker1
Enabled: NO
Protection Mode: MaxPerformance
Fast-Start Failover: DISABLE
Databases:
primary - Physical standby database
standby - Primary database
Current status for "broker1":
DISABLE

The next step is to ENABLE the configuration 'broker1'.

DGMGRL> enable configuration;
Enabled
Again view the configuration.
DGMGRL> show configuration
Configuration
Name: broker1
Enabled: YES
Protection Mode: MaxPerformance
Fast-Start Failover: DISABLE
Databases:
primary - Physical standby database
standby - Primary database
Current status for "broker1":
SUCCESS

Switchover:
Now we are ready to switch over the PRIMARY database Role to STANDBY database Role.


DGMGRL> switchover to 'standby';
…..…..

Primary Database Successfully converted to Physical Standby. You can again switch over the Standby Database to Primary by executing following command.

DGMGRL> switchover to 'primary';
…..…..

Standby Database is successfully converted to Primary Database.

Failover:
Failover can be done through the same configuration without any alteration. You simply need to execute following command:


DGMGRL> failover to 'standby';
….
Failover to standby succeeded.
And also…

DGMGRL> failover to 'primary';
….
Failover to primary succeeded.

DATAGUARD OBSERVER :
The observer is a utility available with Data Guard. Its basic purpose is to keep the database available to clients at all times. It is started at a separate location, other than the primary and standby locations. After starting, the observer continuously watches both sites. Whenever the PRIMARY database fails for any reason and the connection between the PRIMARY site and the observer breaks, the observer waits for a certain time, which is set up during broker configuration. When this time passes, the observer automatically starts the failover process. As a result, the physical standby database is converted to the PRIMARY database. After this failover, users again connect to the database; the connection redirects them automatically to the new PRIMARY database.

OBSERVER CONFIGURATION:
There are some important configuration settings that need to be in place before starting the Data Guard (DG) observer. The main thing is to set the standby database to the Maximum Availability protection mode; the DG broker configuration must also be set to the "MaxAvailability" protection mode. After this, we need to enable the "fast_start failover" property of the broker configuration. Without these settings, the broker will not allow the observer to perform a remote failover. Following are the steps to enable the observer.


OBSERVER SITE:
OBSERVER Machine IP: 192.168.12.165 PRIMARY Machine IP: 192.168.12.146 STANDBY Machine IP: 192.168.12.196

TNS Service Name for PRIMARY site: to_primary (Through this service, the observer machine will be connected to PRIMARY machine)

TNS Service Name for STANDBY site: to_standby (Through this service, the observer machine will be connected to STANDBY machine)

(NOTE: Fresh DG broker configuration will be created for implementation of Observer. Therefore, remove any previously created configuration.)

Step 1:
Create TNS services on the observer machine for the PRIMARY and STANDBY sites, through which the observer machine will connect to the primary and standby databases.
The TNS service name for the PRIMARY site is 'to_primary'. The TNS service name for the STANDBY site is 'to_standby'.

Step 2:
Before starting the broker CLI (DGMGRL), we have to enable FLASHBACK on both the primary and standby databases.
At the mount stage, execute:

• SQL> Alter database Flashback on;
Database altered.

Step 3:
On the STANDBY site, edit the Log_Archive_Dest_2 parameter and add three attributes, i.e. LGWR AFFIRM SYNC. These attributes are needed to bring the broker protection mode to MaxAvailability.

• SQL> Alter System set Log_archive_dest_2='Service=to_primary LGWR AFFIRM SYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) db_unique_name=primary';
System altered.

Step 4:
Now execute the following command on the primary database to change the protection mode from MAXIMUM PERFORMANCE to MAXIMUM AVAILABILITY.


• SQL> Alter database set standby database to maximize availability;

Shut down the PRIMARY database and then restart it. Check the protection mode on both PRIMARY and STANDBY databases.

• SQL> select protection_mode from v$database;
It should return MAXIMUM AVAILABILITY.

Step 5:
Now start the DG CLI (DGMGRL) to start the broker.
C:\> dgmgrl
DGMGRL for 32-bit Windows: Version 10.2.0.1.0 - Production
Copyright (c) 2000, 2005, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.

DGMGRL>
DGMGRL> connect sys/oracle@to_primary
Connected.
Check any previously created configuration and remove it.

DGMGRL> show configuration;
If a configuration is displayed then remove it.

DGMGRL> remove configuration;
Removed.

Now create a new configuration.
DGMGRL> create configuration 'broker1' as
> primary database is 'primary'
> connect identifier is to_primary;
Configuration "broker1" created with primary database "primary"

DGMGRL> add database 'standby' as
> connect identifier is to_standby
> maintained as physical;
Database "standby" added

DGMGRL> show configuration;
Configuration
Name: broker1
Enabled: NO
Protection Mode: MaxPerformance
Fast-Start Failover: DISABLE
Databases: primary - Physical standby database
standby - Primary database
Current status for "broker1":
DISABLE

DGMGRL> enable configuration;
Enabled.

DGMGRL> show configuration;
Configuration
Name: broker1
Enabled: YES
Protection Mode: MaxPerformance
Fast-Start Failover: DISABLE
Databases:
primary - Physical standby database
standby - Primary database
Current status for "broker1":
SUCCESS
DGMGRL> edit database PRIMARY set property LogXptMode='SYNC';
Property "logxptmode" updated
DGMGRL> edit database STANDBY set property LogXptMode='SYNC';
Property "logxptmode" updated
DGMGRL> edit configuration set protection mode as MaxAvailability;
Succeeded.
DGMGRL> enable fast_start failover;
Enabled.
DGMGRL> show configuration;
Configuration
Name: broker1
Enabled: YES
Protection Mode: MaxAvailability
Fast-Start Failover: ENABLED
Databases:
primary - Physical standby database
- Fast-Start Failover target
standby - Primary database

Current status for "broker1":
Warning: ORA-16608: one or more databases have warnings

If you check this error by querying StatusReport property of the PRIMARY database, you will find that Fast Start Failover OBSERVER is not started.
Simply start the observer.
DGMGRL> show database primary StatusReport

Now start the observer.
DGMGRL> start observer
Observer Started.

The Observer has started its working. The current session of ‘cmd’ will not be returned to you because it is a foreground process. It will continuously observe PRIMARY site. To check the configuration, open another ‘cmd’ window and check the configuration.

DGMGRL> show configuration;
Configuration
Name: broker1
Enabled: YES
Protection Mode: MaxAvailability
Fast-Start Failover: ENABLED
Databases:
primary - Primary database
standby - Physical standby database
        - Fast-Start Failover target

Fast-Start Failover
Threshold: 30 seconds
Observer: orat

Current status for "broker1": SUCCESS

At this stage everything is running normally: an identical copy of the PRIMARY database is being maintained at the STANDBY site. If the PRIMARY database crashes for any reason, the observer will initiate a fast-start failover.

DGMGRL> start observer
Observer started.

15:21:21.69 Wednesday, January 07, 2009
Initiating fast-start failover to database “standby”…
Performing failover NOW, please wait…
Operation requires shutdown of instance “orcl” on database “standby”
Shutting down instance “orcl”…
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.

Operation requires startup of instance “orcl” on database “standby”
Starting instance “orcl”…


Data Guard With Real Time Apply
With real-time apply, redo data is applied to the standby database as soon as it is received from the primary database. In Oracle9i Data Guard, the apply had to wait until an archived log was created on the standby database.

For Redo Apply:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE


For SQL Apply:

ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE

When real-time apply is enabled, the RECOVERY_MODE column in V$ARCHIVE_DEST_STATUS displays “MANAGED REAL TIME APPLY”.
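A quick way to check this from the primary (a minimal sketch, assuming dest_id 2 is the standby destination):

SQL> select dest_id, recovery_mode from v$archive_dest_status where dest_id = 2;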


Real Time Apply – Benefits

• Standby databases more closely synchronized with the primary
• More up-to-date, real-time reporting
• Faster switchover and failover times
• Reduced planned and unplanned downtime
• Better Recovery Time Objective (RTO) for DR

Note

For real-time apply we must create standby redo log (SRL) files. Standby redo logs can be created even after the standby has been created.

Create the SRLs:

SQL> ALTER DATABASE ADD STANDBY LOGFILE GROUP 4 ('C:\oracle\product\10.2.0\oradata\ORCL\StandbyRedo4.log') SIZE 100M;
SQL> ALTER DATABASE ADD STANDBY LOGFILE GROUP 5 ('C:\oracle\product\10.2.0\oradata\ORCL\StandbyRedo5.log') SIZE 100M;
SQL> ALTER DATABASE ADD STANDBY LOGFILE GROUP 6 ('C:\oracle\product\10.2.0\oradata\ORCL\StandbyRedo6.log') SIZE 100M;
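To confirm the standby redo logs were created (a minimal check against the groups added above):

SQL> select group#, bytes/1024/1024 as mb, status from v$standby_log;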

Converting To Real Time
Real-time apply: when real-time apply is enabled, the log apply services can apply redo data as it is received, without waiting for the current standby redo log file to be archived. In this example we are going to use LGWR on the primary for redo transport, just to show that a record committed on the primary shows up on the standby without switching a log. However, real-time apply works with both LGWR and ARCH, as long as SRLs are in place.


Set up log_archive_dest_2 on the primary with LGWR ASYNC or SYNC:

log_archive_dest_2='SERVICE=to_standby LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=standby'
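If you would rather set this dynamically than edit the pfile, a sketch using the same service and unique name as above:

SQL> ALTER SYSTEM SET log_archive_dest_2='SERVICE=to_standby LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=standby' SCOPE=BOTH;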

Shut down and start up the primary or, if the change was made dynamically, switch a log file.
You will see the following message in the alert log:

*LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2

Enable Real Time

On the standby, cancel the current managed recovery:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Place it back in recovery with real-time apply:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
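To see that the managed recovery process is applying redo in real time, a hedged check (exact status values can vary by version):

SQL> select process, status, sequence# from v$managed_standby;

The MRP0 process should show APPLYING_LOG rather than WAIT_FOR_LOG.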


Test Case
Create a table on Primary.


SQL> create table test
  2  (name varchar2(30));

Table created.

Now insert a record and commit.

SQL> insert into test
  2  values
  3  ('Vikram');

1 row created.

SQL> commit;

Commit complete.
Now, on the standby, run these commands.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
SQL> ALTER DATABASE OPEN READ ONLY;
Database altered.

Now check the table and record which you inserted.
SQL> select * from test;
NAME
------------------------------
Vikram

We can see our table and record without any log switch.

After checking your real-time apply work, don’t forget to put your database back in recovery mode ;)
Place the standby back in managed recovery mode:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
Database altered.

This will take the standby directly from read only mode and place it in managed recovery mode.

Saturday, July 26, 2008

Clusterware Commands

To work efficiently with Clusterware, one should be familiar with its commands.

Below are few commands which I came across while working on it.

-> To Check whether clusterware is up or not, and which services are up and running.

$CRS_HOME/bin > ./crs_stat -t

This will show the complete list of installed components and their status. For example, on a RAC environment with ASM it will show the following components:

Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE node name
ora....03.lsnr application ONLINE ONLINE node name
ora....003.gsd application ONLINE ONLINE node name
ora....003.ons application ONLINE ONLINE node name
ora....003.vip application ONLINE ONLINE node name
ora....SM2.asm application ONLINE ONLINE node name
ora....04.lsnr application ONLINE ONLINE node name
ora....004.gsd application ONLINE ONLINE node name
ora....004.ons application ONLINE ONLINE node name
ora....004.vip application ONLINE ONLINE node name
ora.database.db application ONLINE ONLINE node name
ora....d1.inst application ONLINE ONLINE node name
ora....d2.inst application ONLINE ONLINE node name

-> To Start clusterware services

$CRS_HOME/bin > ./crsctl start crs

-> To Stop clusterware services

$CRS_HOME/bin > ./crsctl stop crs

-> To start one specific service. For example, after checking the service status you find that the listener is down on node A; to bring it up:

$CRS_HOME/bin > ./srvctl start listener -n nodeA

-> To check all the commands to start/stop/add etc related to clusterware services

$CRS_HOME/bin > ./srvctl

This will show all the available commands.
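A few more srvctl examples, as a sketch; the database name orcl and instance name orcl1 are assumptions:

$CRS_HOME/bin > ./srvctl status database -d orcl
$CRS_HOME/bin > ./srvctl stop instance -d orcl -i orcl1
$CRS_HOME/bin > ./srvctl start instance -d orcl -i orcl1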

Friday, June 27, 2008

Installing Clusterware and RAC DB

Well today I have to install an Oracle RAC environment on AIX.

We use ASM for our shared disks between our servers. This is where our database (tablespaces), redo logs, spfiles, and archive logs reside. For Oracle Cluster Ready Services you can get by with 1 OCR disk and 2 voting disks; for redundancy, we run with 2 OCR disks and 3 voting disks. The redundancy, RAID, striping etc. is done in our SAN. One thing you do need to know is that if you use ASM, your LUNs (or raw devices) need to be the same size within each diskgroup; otherwise it can throw off the rebalancing algorithm. NOTE: in the latest 10.2.0.3

The basic requirements for installing Oracle on AIX/Unix still apply. You need to create the oracle user, the dba group, and the oinstall group. This is in the Oracle Database install guide and also in the Oracle RAC install guide. It took our SAs about 2 days to build the boxes according to specs. Another thing I would highly advise is to get sudo for ALL, just to make it easier. You will also find it works well to be in constant contact with your SAs, should you need something or find something was missed. After the install is done, they can remove the sudo access.

Pre-Configuring the Servers

A couple of things we did that you won’t find in the doc: we set the I/O buffer cache to 2-5% of available RAM, and we had our LUNs pre-carved out as 64GB LUNs (a LUN being the physical disk presented from the SAN). The SAN admin and AIX admin handle attaching these and making them shared; I will get into how you do that below.

We did the same thing in Linux, where we used fdisk to carve out the OCR and voting disks. In AIX you use an LVM. The problem with the LVM in AIX, according to a couple of our AIX admins, is that they can’t give both nodes access to the same disk at the same time. They would need something like HACMP, which we don’t have a license for on this server. The answer? The admins get 5 LUNs at 1GB apiece. Each is attached and has a major and minor spindle number; the admin can then use mknod to essentially create an alias, and voila.

I will also show what the cluster verification utility is and how I use it to verify things like user equivalence, node connectivity, etc. I will even show where we get an error and I ignore it.

Also, consider your striping. If your database does a lot of reading and writing, you will want fine striping like 128K; if you are heavily read- and write-intensive like a data warehouse, then you want coarse striping.


My main reference point was the Oracle 10gR2 RAC - AIX install guide and the Oracle ASM guide.


System Requirements for all nodes:

AIX 5L version 5.3 maintenance Level 02 or later


Hardware
· Allocate swap space equivalent to 1.5 x physical memory
· 1GB of disk space in the /tmp directory
· 64-bit kernel / processor is required


OS filesets
· bos.adt.base
· bos.adt.lib
· bos.adt.libm
· bos.perf.libperfstat
· bos.perf.perfstat
· bos.perf.proctools
· rsct.basic.rte
· rsct.compat.client.rte
· xlC.aix50.rte 7.0.0.4 or later
· xlC.rte 7.0.0.1 or later


C/C++ Compiler
· IY65361
· IY65362


AIX PTF for APARs
· IY89080
· IY92037
· IY68989
· IY68874
· IY70031
· IY76140

JDK
· IY63533 - SDK 1.4 64-bit 1.4.2.1
· IY65305 - SDK 1.4 32-bit 1.4.2.2
· IY58350 - SDK 1.3.1.16 32-bit



Shell Limits and Network Tuning Parameters
· Ensure ssh2 functions on all nodes
· Soft File size = -1 (Unlimited)
· Soft CPU time = -1 (Unlimited)
· Soft Data segment = -1 (Unlimited)
· Soft Stack size = -1 (Unlimited)
· maxuproc = 65536
· ipqmaxlen = 512
· rfc1323 = 1
· sb_max = 2*655360
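A sketch of how these might be set on AIX, using the values from the list above (run as root; ipqmaxlen is a reboot-time tunable, hence the -r):

chdev -l sys0 -a maxuproc=65536
no -r -o ipqmaxlen=512
no -p -o rfc1323=1
no -p -o sb_max=1310720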


Set ulimit for the Oracle account as follows:
· sudo -u root ulimit -a
o time(seconds) unlimited
o file(blocks) unlimited
o data(kbytes) unlimited
o stack(kbytes) unlimited
o memory(kbytes) unlimited
o coredump(blocks) unlimited
o nofiles(descriptors) unlimited


You should have unlimited set for the oracle and root users; otherwise process spawning will fail.
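One way to make these limits persistent on AIX is chuser, run as root; a minimal sketch:

chuser fsize=-1 cpu=-1 data=-1 stack=-1 oracle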



We also need to set up the network configuration. We have a public network and a VIP that goes on that public network. The SA only configures the static public address on your public interface, but he puts both that and the VIP into the /etc/hosts file, and into DNS if you want it there. Oracle’s CRS does the binding of the VIP to the card; that way, if a node fails, CRS can bring the VIP up on another node in the cluster.

You also need to configure your interconnects. For this we have 2 NICs on each node going through 2 separate switches, so there is no single point of failure; however, if the interconnect that CRS uses fails, it will fence and evict other nodes from the cluster. So how did we get around this? We had the SA bond the two interconnects into one IP. We then used EtherChannel to do load balancing, and the speed of the interconnects is full 1000FD. Something else you can do is use jumbo frames of 9000 MTU instead of the default 1500 MTU. For this to work on Cisco switches you have to have a dedicated switch, not VLANs, because it is a global setting on the switch and you don’t want to use jumbo frames for anything else (in most cases). Since RAC can feed a lot of blocks across the interconnect, jumbo frames remove a lot of the overhead and allow plenty of room for data. There is also less work to break the blocks into small packets and reassemble them, since there are fewer parts.

Also, you never ever want to use a crossover cable for your interconnect. If the cable breaks or a card fails, which one do you know is bad? And if you want to scale beyond two nodes, how do you do it? For a reference on this, look in Metalink and also Google Mike Erwin and RAC. He has some discussions about it.



So this is what our network configuration looks like:

cat /etc/hosts

127.0.0.1 loopback localhost

#PUBLIC INTERFACE

10.10.10.1 racnode1 racnode1.domain
10.10.10.2 racnode2 racnode2.domain

#VIP ON PUBLIC - Assuming a subnet mask of 255.255.255.0

#Notice how it is on the same subnet

10.10.10.11 racnode1-vip racnode1-vip.domain
10.10.10.12 racnode2-vip racnode2-vip.domain

#PRIVATE INTERCONNECTS

#These are bonded - really 2 NICs on each node

192.168.1.1 racnode1-priv racnode1-priv.domain
192.168.1.2 racnode2-priv racnode2-priv.domain



Once this is all configured, you need to set up your raw devices. We did not want to license HACMP just to carve out a couple of raw disks from logical volumes on existing physical volumes or LUNs, so we had the SAN admins carve out the smallest LUNs they would give me, which were 1GB. The AIX SA then made them available to me. The 5 disks were /dev/hdisk6 - /dev/hdisk10: 2 for the OCR and 3 for the voting disks. The OCRs on AIX only need to be 256MB and the voting disks only 20MB apiece, so this might seem like overkill, but it was cheaper to lose 3.5GB of disk than to license HACMP.

There were also devices labeled /dev/rhdisk6 - /dev/rhdisk10, which are the raw character devices. This is what Oracle actually accesses. The 2 OCR disks need to be owned by root and the dba group; the 3 voting disks need to be owned by oracle and the dba group. I have seen these work with the oinstall group as well, but the Oracle doc says to use the dba group, so I will. You also need to change permissions on the underlying raw device unless you are using multipathing software, which creates a virtual name to address multiple paths to the same disk (multiple HBAs). To change the permissions, issue the following command as root or use sudo.



chown root:dba /dev/rhdisk6
chmod 640 /dev/rhdisk6



Do the same thing for rhdisk7, or whichever your raw devices are. For the 3 voting disks, as well as all your other ASM disks (if you are using ASM), issue the following command as root or use sudo:


chown oracle:dba /dev/rhdisk8
chmod 660 /dev/rhdisk8



Do the same thing for the rest of the devices.


There are a couple of things to note about setting ownership on these “raw devices”.


First, find out where the logical character device is pointing. For example, run ls -l /dev/rdsk to see where it points, and then grant permissions on that as well. If you are using multipathing software, you can ignore this. For example, if you are using EMC PowerPath, just grant ownership of emcpowera through emcpowerz, which are the virtual names.


Secondly, you need to set the no-reservation flag to prevent locking on the hdisks and allow concurrent I/O.


For all multipath I/O or MPIO capable devices such as ESS, DS8000, and DS6000, use this command.


chdev -l hdisk1 -a reserve_policy=no_reserve

For EMC Symmetrix, CLARiiON, Hitachi, IBM DS4000, and other non-MPIO capable devices, do the following:

chdev -l hdisk1 -a reserve_lock=no
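To apply the no-reservation setting to all five disks in one pass, a small sketch (device names assumed from the layout above; use reserve_lock=no instead on non-MPIO devices):

for d in hdisk6 hdisk7 hdisk8 hdisk9 hdisk10
do
  chdev -l $d -a reserve_policy=no_reserve
done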


Next, user equivalence needs to be set up. User equivalence allows Oracle to log in to other nodes in the cluster without needing a password, via non-interactive ssh logins. We also have issue and motd files in our /etc directory, which throw it off. To get around this, either remove or rename those files, or create a .hushlogin file in the Oracle user’s home directory.


To setup user equivalence, issue the following commands:


NOTE: The docs say to give the keys a passphrase, but I gave them none. It might be a little less secure, but this way I won’t have to issue the ssh-agent $SHELL command and then the ssh-add command to add the keys for each session.


/usr/bin/ssh-keygen -t rsa
/usr/bin/ssh-keygen -t dsa


You then create an authorized_keys file, add the keys to it, and copy it to all nodes in the RAC.


Example for a 2 node RAC cluster:

touch ~/.ssh/authorized_keys

ssh racnode1 cat /home/oracle/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

ssh racnode1 cat /home/oracle/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

ssh racnode2 cat /home/oracle/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

ssh racnode2 cat /home/oracle/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

scp ~/.ssh/authorized_keys racnode2:/home/oracle/.ssh/


Now we can verify that it works. When you run the command below, you should ONLY get a directory listing. If you get any banners or anything else, user equivalence will fail.

ssh racnode2 ls

Now you need to verify things using Oracle’s Cluster Verification Utility. It can fail if you have private interfaces that look public, like anything that starts with 10, for example; if you get this, just ignore it. You also want to verify connectivity to the shared storage.


./runcluvfy.sh stage -pre crsinst -n litprxor01dm,litprxor11dm

./runcluvfy.sh comp ssa -n litprxor01dm,litprxor11dm -s /dev/rhdisk8,/dev/rhdisk9,/dev/rhdisk10

Installation:

After this has been verified it will be time to start the install. The first thing you need to install is Oracle’s CRS. I would also recommend reading through the documentation first.


I ran the rootpre.sh that was in the rootpre folder on each node, using scp -r to copy the directory over to each node first. The software itself only needs to be on one node, though, because the installer uses user equivalence to copy it over. If you have a slow interconnect, it could be a long install : ).


After running rootpre, I bring up ./runInstaller on my local screen using X redirection, since my AIX box is not running X Windows. I then proceed through the install and select both nodes. I answer a few questions, including what I want to use for my interconnect. If your /etc/hosts file is off, it will throw errors, so make sure the entries match up correctly. You will also need to tell it which interface is public (VIPs), which is private (interconnect), and which ones are not used. At the end it will ask you to run a series of scripts and tell you which node to run them on. Sudo and the command threw an error; I had to sudo su - root and then run them. I can’t recall the error, but if I do I will update this post. It was about something missing because of the path of the Oracle user, not root.


Also, you will need to run VIPCA manually if it errors. The reason it errors is that it looks at the VIP address and marks a private-range IP as an invalid VIP candidate. It runs silently as part of the CRS install; manually running it after the install fixes the problem.


After this, I installed the Oracle Database binaries. Again, I had the software on one node and it installed across all nodes. I told it not to create a database. One thing you have to decide if you use ASM is whether you want it to share the same binaries as the database. If you do, patching becomes much faster with less downtime. The drawback comes if you want multiple versions of the DB software to use ASM: for example, Oracle 11g comes out and I want to put it on this server using ASM; what happens if it requires a new version of ASM? 10g won’t work with that new version. Having separate homes also takes up more space. As you can see, there are pros and cons to both sides. After installing it in the same home, without creating any instances or databases, I ran the 10.2.0.3 patch on the CRS home and the database home. You need a Metalink account to get this.



Patching:


NOTE: I have included my path but if it is set in your $PATH variable then issuing crsctl should work.


You need to issue with root privileges the following command:

/u01/app/oracle/product/10.2.0/crs/crsctl stop crs

You will have to start it after you finish patching, with the following command:

/u01/app/oracle/product/10.2.0/crs/crsctl start crs

You can verify that CRS starts by issuing the following command:

crs_stat -t

You will have to do the same thing for the Oracle home. Once patched, you can then create a database. For now, and because of the length of this post already, I would say use DBCA. Before you create the database, run NETCA and set up your listeners on both nodes. This will set the correct local_listener and remote_listener values in your DB parameters, as well as your TNSNAMES.ORA and LISTENER.ORA files.


When you set up ASM using DBCA, you will need to use a pfile and not an spfile. Use the default location. It will then create it in your admin/+ASM/pfile directory, but it also creates it in your $ORACLE_HOME/dbs directory; the dbs directory is where you will want to make changes to your files later. When I set up ASM and discovered the disks, I chose external redundancy because the SAN has it built in with RAID 5 or RAID 10. It will also ask you for an admin password. Remember this, because if you use Grid Control you will need that password to monitor your ASM disks; if you forget it, you will need to create a new password file. Also, if you use the default of +ASM, then the instance on the first node will be +ASM1 and on the second +ASM2. You can only log into ASM as sysdba. It is not a database, just an instance with some data dictionary views.

Also, when I built my diskgroups, I made my flash recovery area diskgroup bigger than my database diskgroup. I multiplex my redo logs and control files here. You can also decide if you want to multiplex your archive logs between them. I send my RMAN backups uncompressed to DG2 as well, so you can see why this has to be bigger than the diskgroup I use for the DB. You can use lower-tier storage here than you do for the database; work with your SAN guys in setting that up.


When you create the database and have it use ASM, you can select the diskgroups you want to use. In my case I chose both; I named them DG1 and DG2. When I turned on archiving, I used DG2. For the multiplexing of redo logs I used +DG1 and +DG2, and for the Flash Recovery Area I used +DG2. When I got into my redo logs, you will notice there are 4 by default: 2 for the first node and 2 for each additional node. If you look at them you will see the thread numbers are different. I went ahead and created a 5th and 6th group (the 5th on thread 1 and the 6th on thread 2), which gives me 3 redo logs for each node; when they archive, they will not overwrite each other (see the sketch below). I also made my redo logs 100MB in size, because Oracle makes them way too small for my liking; if I can reduce the number of log switches, I can reduce some overhead. When you are done, DBCA will create your database on both nodes. After it finishes, you can issue the crs_stat -t command again and you will see your services are started. You can also create other RAC databases the same way.
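Adding those extra redo log groups per thread might look like this (a sketch; the group numbers and the +DG1/+DG2 diskgroups follow the naming above):

SQL> ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 5 ('+DG1','+DG2') SIZE 100M;
SQL> ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 6 ('+DG1','+DG2') SIZE 100M;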


A gotcha I ran into was load balancing. The remote_listener parameter points to a TNS name in your tnsnames.ora, and whatever host that entry contains is what gets returned to the client when the listener load balances. If it is not in DNS, then when the connect string for the other node is returned, your local machine will not be able to resolve it. You could make an entry in your /etc/hosts file, or you could just hardcode the VIP address so it will always resolve. Up to you.
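A hedged sketch of such a tnsnames.ora entry, hardcoding the VIP addresses from the /etc/hosts example above (the alias LISTENERS_ORCL and the port are assumptions):

LISTENERS_ORCL =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.10.10.11)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.10.10.12)(PORT = 1521))
  )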