rac troubleshooting and diagnosability sangam2016
Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
Troubleshooting and Diagnosing Oracle RAC in the Private Cloud
Sandesh Rao, Senior Director, RAC Development

Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

Agenda
• Architectural Overview
• Troubleshooting Scenarios
• Proactive and Reactive tools
• Q&A

Grid Infrastructure Overview
• Oracle Clusterware is required for 11gR2+ RAC databases
• Oracle Clusterware can manage non-RAC database resources using agents
• Oracle Clusterware can manage HA for any business-critical application with the agent infrastructure, also called XAG
  – Oracle publishes Bundled Agents for some non-RAC DB resources
    • SAP, GoldenGate, Siebel, Apache, ...

Grid Infrastructure Overview
• Grid Infrastructure is the name for the combination of:
  – Oracle Cluster Ready Services (CRS)
  – Oracle Automatic Storage Management (ASM)
• The Grid Home contains the software for both products
• CRS can also be Standalone for ASM and/or Oracle Restart
• CRS can run by itself or in combination with other vendor clusterware
• Grid Home and RDBMS home must be installed in different locations
  – The installer locks the Grid Home path by setting root permissions.

Grid Infrastructure Overview
• CRS requires a shared Oracle Cluster Registry (OCR) and Voting files
  – Must be in ASM or CFS
  – OCR is backed up automatically every 4 hours to GI_HOME/cdata
  – Backups kept: 4, 8, 12 hours, 1 day, 1 week
  – Restored with ocrconfig
  – Voting file is backed up into the OCR at each change.
  – Voting file is restored with crsctl

Grid Infrastructure Overview
• For the network, CRS requires
  – One or more high-speed, low-latency, redundant private networks for inter-node communications
  – Think of the interconnect as a memory backplane for the cluster
  – Should be a separate physical network or a managed converged network
  – VLANs are supported
  – Used for:
    • Clusterware messaging
    • RDBMS messaging and block transfer
    • ASM messaging
    • HANFS for block traffic

Grid Infrastructure Overview
• Only one set of Clusterware daemons can run on each node
• The CRS stack is spawned from the Oracle HA Services Daemon (ohasd)
• On Unix, ohasd runs out of inittab with respawn
• A node can be evicted when deemed unhealthy
  – May require a reboot, but at least a CRS stack restart (rebootless restart)
  – IPMI integration, or diskmon in the case of Exadata
• CRS provides Cluster Time Synchronization services
  – Always runs, but in observer mode if ntpd is configured

Grid Infrastructure Processes
11.2+ Agents change everything.
• Multi-threaded daemons
• Manage multiple resources and types
• Implement entry points for multiple resource types
  – Start, stop, check, clean, fail
• oraagent, orarootagent, application agent, script agent, cssdagent
• Single process started from init on Unix (ohasd)
• Diagram below shows all core resources

Grid Infrastructure Processes
[Diagram: Grid Infrastructure process startup hierarchy, Level 0 through Level 4b (Levels 0, 1, 2a, 2b, 3, 4a, 4b)]

Grid Infrastructure Processes
Init Scripts
• /etc/init.d/ohasd (location O/S dependent)
  – RC script with "start" and "stop" actions
  – Initiates Oracle Clusterware autostart
  – Control file coordinates with CRSCTL
• /etc/init.d/init.ohasd (location O/S dependent)
  – OHASD Framework Script runs from init/upstart
  – Control file coordinates with CRSCTL
  – Named pipe syncs with OHASD

Grid Infrastructure Processes
• Level 1: OHASD spawns:
  – cssdagent - Agent responsible for spawning CSSD
  – orarootagent - Agent responsible for managing all root-owned ohasd resources
  – oraagent - Agent responsible for managing all oracle-owned ohasd resources
  – cssdmonitor - Monitors CSSD and node health (along with the cssdagent)
• Level 2a: OHASD orarootagent spawns:
  – CRSD - Primary daemon responsible for managing cluster resources.
  – CTSSD - Cluster Time Synchronization Services Daemon
  – Diskmon (Exadata)
  – ACFS (ASM Cluster File System) drivers

Grid Infrastructure Processes
• Level 2b: OHASD oraagent spawns:
  – MDNSD - Multicast DNS daemon
  – GIPCD - Grid IPC Daemon
  – GPNPD - Grid Plug and Play Daemon
  – EVMD - Event Monitor Daemon
  – ASM - ASM instance started here, as it may be required by CRSD
• Level 3: CRSD spawns:
  – orarootagent - Agent responsible for managing all root-owned crsd resources.
  – oraagent - Agent responsible for managing all non-root-owned crsd resources.
    • One is spawned for every user that has CRS resources to manage.

Grid Infrastructure Processes
Startup Sequence
• Level 4: CRSD oraagent spawns:
  – ASM Resource - ASM instance(s) resource (proxy resource)
  – Diskgroup - Used for managing/monitoring ASM diskgroups.
  – DB Resource - Used for monitoring and managing the DB and instances
  – SCAN Listener - Listener for single client access name, listening on SCAN VIP
  – Listener - Node listener listening on the Node VIP
  – Services - Used for monitoring and managing services
  – ONS - Oracle Notification Service
  – eONS - Enhanced Oracle Notification Service (pre 11.2.0.2)
  – GSD - For 9i backward compatibility
  – GNS (optional) - Grid Naming Service - Performs name resolution

Troubleshooting Scenarios
Cluster Startup Problem Triage (11.2+)
[Flowchart: Cluster Startup Diagnostic Flow, beginning at the Startup Sequence]
• Decision 1: is OHASD running?
  ps -ef | grep init.ohasd
  ps -ef | grep ohasd.bin
  – If not running: check crsctl config crs and ohasd.log
• Decision 2: are the rest of the stack daemons and agents running?
  ps -ef | grep cssdagent
  ps -ef | grep ocssd.bin
  ps -ef | grep orarootagent
  ps -ef | grep ctssd.bin
  ps -ef | grep crsd.bin
  ps -ef | grep cssdmonitor
  ps -ef | grep oraagent
  ps -ef | grep ora.asm
  ps -ef | grep gpnpd.bin
  ps -ef | grep mdnsd.bin
  ps -ef | grep evmd.bin
  crsctl check crs
  crsctl check cluster
  – If not running: check ohasd.log, agent logs and process logs
• Further items to check: ohasd.log, OLR permissions, comparison with a reference system
• Each check ends in an "Obvious?" decision. The outcomes shown are: engage the sysadmin team, or run TFA Collector and engage Oracle Support and the sysadmin team
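
The process checks in the startup flow can be scripted. The following is a minimal Python sketch, not an Oracle tool: it takes the text of `ps -ef` and reports which of the daemon names from the slide appear on no line. The function name `missing_daemons`, the sample output and the `/u01/app/grid` path are illustrative assumptions, and only a subset of the slide's checks is covered.

```python
# Daemon and agent names copied from the ps checks in the diagnostic flow.
EXPECTED = [
    "ohasd.bin", "cssdagent", "ocssd.bin", "orarootagent", "ctssd.bin",
    "crsd.bin", "cssdmonitor", "oraagent", "gpnpd.bin", "mdnsd.bin", "evmd.bin",
]


def missing_daemons(ps_output, expected=EXPECTED):
    """Return the expected names that appear on no line of `ps -ef` output."""
    # Ignore the grep commands themselves, which would otherwise match.
    lines = [ln for ln in ps_output.splitlines() if "grep" not in ln]
    return [name for name in expected if not any(name in ln for ln in lines)]


# Hypothetical sample: OHASD and CSS are up, the rest of the stack is not.
SAMPLE_PS = """\
root  2101    1  0 10:00 ?  00:00:05 /u01/app/grid/bin/ohasd.bin reboot
root  2230    1  0 10:00 ?  00:00:01 /u01/app/grid/bin/cssdagent
grid  2250    1  0 10:00 ?  00:00:02 /u01/app/grid/bin/ocssd.bin
"""

print(missing_daemons(SAMPLE_PS))
```

An empty result only says the processes exist; `crsctl check crs` and `crsctl check cluster` from the same flow are still needed to confirm that they are healthy.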

Troubleshooting Scenarios
Cluster Startup Problem Triage
• Multicast Domain Name Service Daemon (mDNS(d))
  – Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX and on Windows.
  – Uses multicast for cache updates on service advertisement arrival/departure.
  – Advertises/serves on all found node interfaces.
  – Log is GI_HOME/log/<node>/mdnsd/mdnsd.log

Troubleshooting Scenarios
Cluster Startup Problem Triage
<?xml version="1.0" encoding="UTF-8"?>
<gpnp:GPnP-Profile Version="1.0" xmlns="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:gpnp="http://www.grid-pnp.org/2005/11/gpnp-profile" xmlns:orcl="http://www.oracle.com/gpnp/2005/11/gpnp-profile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.grid-pnp.org/2005/11/gpnp-profile gpnp-profile.xsd" ProfileSequence="6" ClusterUId="b1eec1fcdd355f2bbf7910ce9cc4a228" ClusterName="staij-cluster" PALocation="">
<gpnp:Network-Profile><gpnp:HostNetwork id="gen" HostName="*"><gpnp:Network id="net1" IP="192.168.1.0" Adapter="eth0" Use="public"/><gpnp:Network id="net2" IP="192.168.2.0" Adapter="eth1" Use="cluster_interconnect"/></gpnp:HostNetwork></gpnp:Network-Profile>
<orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/>
<orcl:ASM-Profile id="asm" DiscoveryString="" SPFile="+SYSTEM/staij-cluster/asmparameterfile/registry.253.693925293"/>
<ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"><ds:SignedInfo><ds:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/><ds:SignatureMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"><InclusiveNamespaces xmlns="http://www.w3.org/2001/10/xml-exc-c14n#" PrefixList="gpnp orcl xsi"/></ds:Transform></ds:Transforms><ds:DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/><ds:DigestValue>x1H9LWjyNyMn6BsOykHhMvxnP8U=</ds:DigestValue></ds:Reference></ds:SignedInfo><ds:SignatureValue>N+20jG4=</ds:SignatureValue></ds:Signature>
</gpnp:GPnP-Profile>

Troubleshooting Scenarios
Cluster Startup Problem Triage
• cssdagent and cssdmonitor
  – Same functionality in both agent and monitor
  – Functionality of several pre-11.2 daemons consolidated in both
    • OPROCD: system hang
    • OMON: Oracle Clusterware monitor
    • VMON: vendor clusterware monitor
  – Run realtime with locked-down memory, like CSSD
  – Provides enhanced stability and diagnosability
  – Logs are
    • GI_HOME/log/<node>/agent/oracssdagent_root/oracssdagent_root.log
    • GI_HOME/log/<node>/agent/oracssdmonitor_root/oracssdmonitor_root.log
    • 12.1: ORACLE_BASE/diag/node/agent/..

Troubleshooting Scenarios
Node Evictions
[Flowchart: Node Eviction Diagnostic Flow]
• Eviction scenario: start from the cluster alert log and ocssd.log
• NHB (network heartbeat) problem? See Doc IDs 1050693.1, 1534949.1, 1546004.1. If yes, engage the networking team
• DHB (disk heartbeat) problem? See Doc IDs 1549428.1, 1466639.1
• Fenced? Resource starvation? See Doc IDs 1531223.1, 1328466.1 and the system log
  – Checks: free memory, CPU load, node response
• Outcomes shown: engage the appropriate team (sysadmin team, storage team); where the cause is not obvious or the problem is not resolved, run TFA Collector and engage Oracle Support

Missing Network Heartbeat (1)
• ocssd.log from node 1
• ===> Sending network heartbeats to other nodes. Normally, this message is output once every 5 messages (seconds)
• 2016-08-13 17:00:20.023: [CSSD][4096109472]clssnmSendingThread: sending status msg to all nodes
• 2016-08-13 17:00:20.023: [CSSD][4096109472]clssnmSendingThread: sent 5 status msgs to all nodes
• ===> The network heartbeat is not received from node 2 (drrac2) for 15 consecutive seconds.
• ===> This means that 15 network heartbeats are missing, and this is the first warning (50% threshold).
• 2016-08-13 17:00:22.818: [CSSD][4106599328]clssnmPollingThread: node drrac2 (2) at 50% heartbeat fatal, removal in 14.520 seconds
• 2016-08-13 17:00:22.818: [CSSD][4106599328]clssnmPollingThread: node drrac2 (2) is impending reconfig, flag 132108, misstime 15480
• ===> Continuing to send the network heartbeats and log messages once every 5 messages
• 2016-08-13 17:00:25.023: [CSSD][4096109472]clssnmSendingThread: sending status msg to all nodes
• 2016-08-13 17:00:25.023: [CSSD][4096109472]clssnmSendingThread: sent 5 status msgs to all nodes
• ===> The 75% threshold of missing network heartbeats is reached. This is the second warning.
• 2016-08-13 17:00:29.833: [CSSD][4106599328]clssnmPollingThread: node drrac2 (2) at 75% heartbeat fatal, removal in 7.500 seconds
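
The warning lines can be read mechanically. The sketch below is a minimal Python example that pulls the node, percentage and time-to-removal out of a clssnmPollingThread warning, and adds the reported misstime to the remaining time. For the first warning that gives 15480 ms + 14.520 s = 30.000 s, which is consistent with a 30 second network heartbeat timeout; the slide does not state the configured value, so treat that as an inference. The names `parse_warning` and `implied_timeout_ms` are illustrative.

```python
import re

# clssnmPollingThread warning; \s* tolerates spaces lost when logs are pasted.
WARNING = re.compile(
    r"node\s*(?P<name>\w+?)\s*\((?P<num>\d+)\)\s*at\s*(?P<pct>\d+)%\s*"
    r"heartbeat\s*fatal,\s*removal\s*in\s*(?P<secs>\d+\.\d+)\s*seconds"
)


def parse_warning(line):
    """Return (node name, node number, percent, seconds to removal) or None."""
    m = WARNING.search(line)
    if m is None:
        return None
    return (m.group("name"), int(m.group("num")),
            int(m.group("pct")), float(m.group("secs")))


def implied_timeout_ms(misstime_ms, removal_in_s):
    """Time already missed plus time remaining, in milliseconds."""
    return misstime_ms + round(removal_in_s * 1000)


LINE = ("2016-08-13 17:00:22.818: [CSSD][4106599328]clssnmPollingThread: "
        "node drrac2 (2) at 50% heartbeat fatal, removal in 14.520 seconds")

print(parse_warning(LINE))
print(implied_timeout_ms(15480, 14.520))
```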

Missing Network Heartbeat (2)
• ===> Continuing to send the network heartbeats and log messages once every 5 messages
• 2016-08-13 17:00:30.023: [CSSD][4096109472]clssnmSendingThread: sending status msg to all nodes
• 2016-08-13 17:00:30.023: [CSSD][4096109472]clssnmSendingThread: sent 5 status msgs to all nodes
• ===> Continuing to send the network heartbeats, but the message is logged after 4 messages
• 2016-08-13 17:00:34.021: [CSSD][4096109472]clssnmSendingThread: sending status msg to all nodes
• 2016-08-13 17:00:34.021: [CSSD][4096109472]clssnmSendingThread: sent 4 status msgs to all nodes
• ===> The last warning shows that the 90% threshold of missing network heartbeats is reached.
• ===> The eviction will occur in 2.49 seconds.
• 2016-08-13 17:00:34.841: [CSSD][4106599328]clssnmPollingThread: node drrac2 (2) at 90% heartbeat fatal, removal in 2.490 seconds, seedhbimpd 1
• ===> Eviction of node 2 (drrac2) started
• 2016-08-13 17:00:37.337: [CSSD][4106599328]clssnmPollingThread: Removal started for node drrac2 (2), flags 0x2040c, state 3, wt4c 0
• ===> This shows that node 2 is actively updating the voting disks
• 2016-08-13 17:00:37.340: [CSSD][4085619616]clssnmCheckSplit: Node 2, drrac2, is alive, DHB (1281744040, 1396854) more than disk timeout of 27000 after the last NHB (1281744011, 1367154)
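
The clssnmCheckSplit message carries its own arithmetic. The sketch below, a minimal Python example, subtracts the second number of the NHB pair from the second number of the DHB pair and compares the result with the disk timeout printed in the same line. Reading those second numbers as a millisecond counter is an assumption drawn from the values themselves (the first numbers differ by 29, the second by 29700); the name `heartbeat_gap` is illustrative.

```python
import re

# DHB (<secs>, <ticks>) ... disk timeout of <ms> ... NHB (<secs>, <ticks>)
SPLIT = re.compile(
    r"DHB\s*\((?P<dhb_s>\d+),\s*(?P<dhb_ms>\d+)\).*?disk\s*timeout\s*of\s*"
    r"(?P<timeout>\d+).*?NHB\s*\((?P<nhb_s>\d+),\s*(?P<nhb_ms>\d+)\)"
)


def heartbeat_gap(line):
    """Return (gap in ms, disk timeout in ms, gap exceeds timeout) or None."""
    m = SPLIT.search(line)
    if m is None:
        return None
    gap = int(m.group("dhb_ms")) - int(m.group("nhb_ms"))
    timeout = int(m.group("timeout"))
    return gap, timeout, gap > timeout


NODE1_VIEW = ("clssnmCheckSplit: Node 2, drrac2, is alive, DHB (1281744040, 1396854) "
              "more than disk timeout of 27000 after the last NHB (1281744011, 1367154)")

print(heartbeat_gap(NODE1_VIEW))
```

A disk heartbeat that keeps arriving long after the last network heartbeat means the other node is alive but unreachable over the interconnect, which is the split-brain situation the following slides resolve.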

Missing Network Heartbeat (3)
• ===> Evicting node 2 (drrac2)
• 2016-08-13 17:00:37.340: [CSSD][4085619616](:CSSNM00007:)clssnmrEvict: Evicting node 2, drrac2, from the cluster in incarnation 169934272, node birth incarnation 169934271, death incarnation 169934272, stateflags 0x24000
• ===> Reconfigured the cluster without node 2
• 2016-08-13 17:01:07.705: [CSSD][4043389856]clssgmCMReconfig: reconfiguration successful, incarnation 169934272 with 1 nodes, local node number 1, master node number 1

Missing Network Heartbeat (4)
• ocssd.log from node 2:
• ===> Logging the message to indicate 5 network heartbeats are sent to other nodes
• 2016-08-13 17:00:26.009: [CSSD][4062550944]clssnmSendingThread: sending status msg to all nodes
• 2016-08-13 17:00:26.009: [CSSD][4062550944]clssnmSendingThread: sent 5 status msgs to all nodes
• ===> First warning of reaching the 50% threshold of missing network heartbeats
• 2016-08-13 17:00:26.213: [CSSD][4073040800]clssnmPollingThread: node drrac1 (1) at 50% heartbeat fatal, removal in 14.540 seconds
• 2016-08-13 17:00:26.213: [CSSD][4073040800]clssnmPollingThread: node drrac1 (1) is impending reconfig, flag 394254, misstime 15460
• ===> Logging the message to indicate 5 network heartbeats are sent to other nodes
• 2016-08-13 17:00:31.009: [CSSD][4062550944]clssnmSendingThread: sending status msg to all nodes
• 2016-08-13 17:00:31.009: [CSSD][4062550944]clssnmSendingThread: sent 5 status msgs to all nodes
• ===> Second warning of reaching the 75% threshold of missing network heartbeats
• 2016-08-13 17:00:33.227: [CSSD][4073040800]clssnmPollingThread: node drrac1 (1) at 75% heartbeat fatal, removal in 7.470 seconds

Missing Network Heartbeat (5)
• ===> Logging the message to indicate 4 network heartbeats are sent
• 2016-08-13 17:00:35.009: [CSSD][4062550944]clssnmSendingThread: sending status msg to all nodes
• 2016-08-13 17:00:35.009: [CSSD][4062550944]clssnmSendingThread: sent 4 status msgs to all nodes
• ===> Third warning of reaching the 90% threshold of missing network heartbeats
• 2016-08-13 17:00:38.236: [CSSD][4073040800]clssnmPollingThread: node drrac1 (1) at 90% heartbeat fatal, removal in 2.460 seconds, seedhbimpd 1
• ===> Logging the message to indicate 5 network heartbeats are sent to other nodes
• 2016-08-13 17:00:40.008: [CSSD][4062550944]clssnmSendingThread: sending status msg to all nodes
• 2016-08-13 17:00:40.009: [CSSD][4062550944]clssnmSendingThread: sent 5 status msgs to all nodes
• ===> Eviction started for node 1 (drrac1)
• 2016-08-13 17:00:40.702: [CSSD][4073040800]clssnmPollingThread: Removal started for node drrac1 (1), flags 0x6040e, state 3, wt4c 0
• ===> Node 1 is actively updating the voting disk, so this is a split-brain condition
• 2016-08-13 17:00:40.706: [CSSD][4052061088]clssnmCheckSplit: Node 1, drrac1, is alive, DHB (1281744036, 1243744) more than disk timeout of 27000 after the last NHB (1281744007, 1214144)
• 2016-08-13 17:00:40.706: [CSSD][4052061088]clssnmCheckDskInfo: My cohort: 2
• 2016-08-13 17:00:40.707: [CSSD][4052061088]clssnmCheckDskInfo: Surviving cohort: 1

Missing Network Heartbeat (6)
• ===> Node 2 is aborting itself to resolve the split brain and ensure cluster integrity
• 2016-08-13 17:00:40.707: [CSSD][4052061088](:CSSNM00008:)clssnmCheckDskInfo: Aborting local node to avoid splitbrain. Cohort of 1 nodes with leader 2, drrac2, is smaller than cohort of 1 nodes led by node 1, drrac1, based on map type 2
• 2016-08-13 17:00:40.707: [CSSD][4052061088]###################################
• 2016-08-13 17:00:40.707: [CSSD][4052061088]clssscExit: CSSD aborting from thread clssnmRcfgMgrThread
• 2016-08-13 17:00:40.707: [CSSD][4052061088]###################################
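
The cohort decision in the log above can be pictured with a small Python sketch. It is not CSSD's algorithm: the log only states the outcome (two cohorts of one node each, and the cohort led by node 1 survives). The rule coded here, larger cohort wins and a tie goes to the cohort holding the lowest node number, is an assumption chosen because it reproduces that outcome. The name `surviving_cohort` is illustrative.

```python
def surviving_cohort(cohort_a, cohort_b):
    """Pick the cohort that stays up after a split.

    Assumed rule for illustration: the larger cohort survives; on a tie,
    the cohort containing the lowest node number survives.
    """
    if len(cohort_a) != len(cohort_b):
        return max(cohort_a, cohort_b, key=len)
    return min(cohort_a, cohort_b, key=min)


# The two-node case from the log: node 2 sees "My cohort: 2", "Surviving cohort: 1".
my_cohort = {2}
other_cohort = {1}
survivor = surviving_cohort(my_cohort, other_cohort)
print("survivor:", sorted(survivor))
print("local node must abort:", survivor != my_cohort)
```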

Missing Network Heartbeat (7)
• Observations
  1. Both nodes reported missing heartbeats at the same time
  2. Both nodes sent heartbeats to other nodes all the time
  3. Node 2 aborted itself to resolve split brain
• Conclusion
  1. This is likely a network problem, engage the network team
  2. Check OSWatcher output (netstat and traceroute)
     1. Configure the private.net file, which is not configured by default
  3. Check CHMOS
  4. Check system log

Voting Disk Access Problem (1)
ocssd.log:
===> The first error indicating that it could not read the voting disk, and the first message to indicate a problem accessing the voting disk
2016-08-13 18:31:19.787: [SKGFD][4131736480]ERROR: -9(Error 27072, OS Error (Linux Error: 5: Input/output error
Additional information: 4
Additional information: 721425
Additional information: -1)
)
2016-08-13 18:31:19.787: [CSSD][4131736480](:CSSNM00060:)clssnmvReadBlocks: read failed at offset 529 of /dev/sdb8
2016-08-13 18:31:19.802: [CSSD][4131736480]clssnmvDiskAvailabilityChange: voting file /dev/sdb8 now offline

Voting Disk Access Problem (2)
====> The error message that shows a problem accessing the voting disk repeats once every 4 seconds
2016-08-13 18:31:23.782: [CSSD][150477728]clssnmvDiskOpen: Opening /dev/sdb8
2016-08-13 18:31:23.782: [SKGFD][150477728]Handle 0xf43fc6c8 from lib :UFS:: for disk :/dev/sdb8:
2016-08-13 18:31:23.782: [CLSF][150477728]Opened hdl:0xf4365708 for dev:/dev/sdb8:
2016-08-13 18:31:23.787: [SKGFD][150477728]ERROR: -9(Error 27072, OS Error (Linux Error: 5: Input/output error
Additional information: 4
Additional information: 720913
Additional information: -1)
)
2016-08-13 18:31:23.787: [CSSD][150477728](:CSSNM00060:)clssnmvReadBlocks: read failed at offset 17 of /dev/sdb8

Voting Disk Access Problem (3)
====> The last error that shows a problem accessing the voting disk.
====> Note that the last message is 200 seconds after the first message
====> because the long disk timeout is 200 seconds
2016-08-13 18:34:37.423: [CSSD][150477728]clssnmvDiskOpen: Opening /dev/sdb8
2016-08-13 18:34:37.423: [CLSF][150477728]Opened hdl:0xf4336530 for dev:/dev/sdb8:
2016-08-13 18:34:37.429: [SKGFD][150477728]ERROR: -9(Error 27072, OS Error (Linux Error: 5: Input/output error
Additional information: 4
Additional information: 720913
Additional information: -1)
)
2016-08-13 18:34:37.429: [CSSD][150477728](:CSSNM00060:)clssnmvReadBlocks: read failed at offset 17 of /dev/sdb8

Voting Disk Access Problem (4)
====> This message shows that ocssd.bin tried accessing the voting disk for 200 seconds
2016-08-13 18:34:38.205: [CSSD][4110736288](:CSSNM00058:)clssnmvDiskCheck: No I/O completions for 200880 ms for voting file /dev/sdb8)
====> ocssd.bin aborts itself with an error message that the majority of voting disks are not available. In this case there was only one voting disk, but if three voting disks were configured, ocssd.bin would not abort as long as two of them were accessible.
2016-08-13 18:34:38.206: [CSSD][4110736288](:CSSNM00018:)clssnmvDiskCheck: Aborting, 0 of 1 configured voting disks available, need 1
2016-08-13 18:34:38.206: [CSSD][4110736288]###################################
2016-08-13 18:34:38.206: [CSSD][4110736288]clssscExit: CSSD aborting from thread clssnmvDiskPingMonitorThread
2016-08-13 18:34:38.206: [CSSD][4110736288]###################################
• Conclusion: the voting disk was not available, engage the storage team
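
The majority rule and the timing in this log can be checked with a few lines of Python. This is a sketch of the arithmetic only: the function names are illustrative, and the rule coded is the one the log and commentary state (1 of 1 needed, 2 of 3 needed), that is, more than half of the configured voting files.

```python
from datetime import datetime


def required_voting_files(configured):
    """Strict majority of the configured voting files."""
    return configured // 2 + 1


def css_must_abort(available, configured):
    """True when fewer than a majority of voting files are accessible."""
    return available < required_voting_files(configured)


print(required_voting_files(1), css_must_abort(0, 1))  # the case in the log
print(required_voting_files(3), css_must_abort(2, 3))  # three files, one lost

# Time from the first read error to the abort, using the log timestamps.
FMT = "%Y-%m-%d %H:%M:%S.%f"
first_error = datetime.strptime("2016-08-13 18:31:19.787", FMT)
abort = datetime.strptime("2016-08-13 18:34:38.206", FMT)
elapsed = (abort - first_error).total_seconds()
print(elapsed)
```

The elapsed time of a little over 198 seconds is in line with the 200 second long disk timeout; the 200880 ms in the CSSNM00058 message is counted from the last I/O completion, not from the first logged error.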

Troubleshooting Scenarios
Node Eviction Triage
• Time synchronisation issue
• Cluster Time Synchronisation Services daemon
  – Provides time management in a cluster for Oracle.
• Observer mode when vendor time synchronisation software is found
  – Logs the time difference to the CRS alert log
• Active mode when no vendor time synchronisation software is found

Troubleshooting Scenarios
Node Eviction Triage
• Cluster Ready Services Daemon
  – The CRSD daemon is primarily responsible for maintaining the availability of application resources, such as database instances. CRSD is responsible for starting and stopping these resources, relocating them when required to another node in the event of failure, and maintaining the resource profiles in the OCR (Oracle Cluster Registry). In addition, CRSD is responsible for overseeing the caching of the OCR for faster access, and also backing up the OCR.
  – Log file is GI_HOME/log/<node>/crsd/crsd.log
    • Rotation policy 10-50M
    • Retention policy 10 logs
    • Dynamic in 12.1 and can be changed

Troubleshooting Scenarios
Node Eviction Triage
• CRSD oraagent
  – CRSD's oraagent manages
    • all database, instance, service and diskgroup resources
    • node listeners
    • SCAN listeners, and ONS
  – If the Grid Infrastructure owner is different from the RDBMS home owner, then you would have 2 oraagents, each running as one of the installation owners. The database and service resources would be managed by the RDBMS home owner and other resources by the Grid Infrastructure home owner.
  – Log file is
    • GI_HOME/log/<node>/agent/crsd/oraagent_<user>/oraagent_<user>.log

Troubleshooting Scenarios
Node Eviction Triage
• CRSD orarootagent
  – CRSD's orarootagent manages
    • GNS and its VIP
    • Node VIP
    • SCAN VIP
    • network resources.
  – Log file is
    • GI_HOME/log/<node>/agent/crsd/orarootagent_root/orarootagent_root.log

Troubleshooting Scenarios
Node Eviction Triage
• Agent return codes
  – The check entry point must return one of the following return codes:
    • ONLINE
    • UNPLANNED_OFFLINE
      – Target=online, may be recovered or failed over
    • PLANNED_OFFLINE
    • UNKNOWN
      – Cannot determine; if previously online or partial, then monitor
    • PARTIAL
      – Some of a resource's services are available. Instance up but not open.
    • FAILED
      – Requires clean action
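
As an illustration of how a check entry point maps what it observes onto these return codes, here is a minimal Python sketch. It does not use the Clusterware agent framework API; the enum, the function names and the three boolean inputs are all hypothetical, and the mapping follows only the notes on the slide (instance up but not open is PARTIAL, FAILED requires the clean action).

```python
from enum import Enum


class CheckState(Enum):
    """Return codes listed for the check entry point."""
    ONLINE = "ONLINE"
    UNPLANNED_OFFLINE = "UNPLANNED_OFFLINE"
    PLANNED_OFFLINE = "PLANNED_OFFLINE"
    UNKNOWN = "UNKNOWN"
    PARTIAL = "PARTIAL"
    FAILED = "FAILED"


def check_instance(process_running, database_open, stopped_by_admin=False):
    """Toy check for a database instance resource (hypothetical inputs)."""
    if process_running:
        # Instance up but not open is only partially available.
        return CheckState.ONLINE if database_open else CheckState.PARTIAL
    if stopped_by_admin:
        return CheckState.PLANNED_OFFLINE
    return CheckState.UNPLANNED_OFFLINE


def needs_clean(state):
    """Per the slide, FAILED is the state that requires the clean action."""
    return state is CheckState.FAILED


print(check_instance(True, False))
print(needs_clean(CheckState.FAILED))
```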

Troubleshooting Scenarios
Automatic Diagnostic Repository (ADR)
§ Important logs and traces
§ 11.2: only databases use ADR
  • Grid Infrastructure files are in $GI_HOME/log/<node_name>/<component_name>
    – $GI_HOME/log/myHost/cssd
    – $GI_HOME/log/myHost/alertmyHost.log
§ 12.1: Grid Infrastructure and Database both use ADR
§ Different locations for Grid Infrastructure and Databases
§ Grid Infrastructure
  • Alert.log, cssd.log, crsd.log, etc.
§ Databases
  § Alert.log, background process traces, foreground process traces
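
The 11.2 layout is regular enough to build paths from. The sketch below is a small Python helper that follows the `$GI_HOME/log/<node_name>/<component_name>` pattern and the `alertmyHost.log` example from the slide; the function names and the Grid home used in the example are assumptions, and the 12.1 ADR layout is not covered.

```python
import posixpath


def gi_component_log_dir(gi_home, node, component):
    """11.2 layout: $GI_HOME/log/<node_name>/<component_name>."""
    return posixpath.join(gi_home, "log", node, component)


def gi_alert_log(gi_home, node):
    """11.2 cluster alert log: $GI_HOME/log/<node_name>/alert<node_name>.log."""
    return posixpath.join(gi_home, "log", node, "alert" + node + ".log")


GI_HOME = "/u01/app/11.2.0/grid"  # hypothetical Grid home
print(gi_component_log_dir(GI_HOME, "myHost", "cssd"))
print(gi_alert_log(GI_HOME, "myHost"))
```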

Oracle's Database and Clusterware Tools
• What if issues were detected before they had an impact?
• What if you were notified with a specific diagnosis and corrective actions?
• What if resource bottlenecks threatening SLAs were identified early?
• What if bottlenecks could be automatically relieved just in time?
• What if database hangs and node reboots could be eliminated?
[Diagram: the tools]
• Cluster Verification Utility
• ORAchk
• Cluster Health Monitor
• Trace File Analyzer
• Quality of Service Management
• Hang Manager
• EXAchk
• Cluster Health Advisor
• Memory Guard

Oracle EXAchk/ORAchk (Proactive)
Lightweight & non-intrusive Oracle Stack Health Checks
– Automated risk identification and proactive notification before business is impacted
– Health checks based on the most impactful recurring problems across the Oracle customer base
– Runs in your environment, no need to send anything to Oracle
– Scheduled email health check reports
– Findings can be integrated into other tools of choice
[Diagram: Oracle EXAchk (Engineered Systems) and Oracle ORAchk (Non-Engineered Systems) built on a Common Framework]

Roll Out & Maintain EXAchk
1. Included in the base image and the latest OEDA
2. Download the latest version from My Oracle Support (install < 1 min): 1070954.1
3. Auto update when a later version is available

Installation
1. Download orachk.zip to your local machine from MOS Note 1268927.2
2. Transfer it to a directory on the target system
3. Unzip orachk.zip
   o As the owner of the Oracle database or grid home

Profiles (EXAchk)
• Profiles provide logical grouping of checks which are about similar topics
• Run only checks in a specific profile
  ./exachk -profile <profile>
• Run everything except checks in a specific profile
  ./exachk -excludeprofile <profile>

Profile               Description
asm                   ASM checks
avdf                  Audit Vault configuration checks
clusterware           Oracle Clusterware checks
control_VM            Checks only for Control VM (ec1-vm, ovmm, db, pc1, pc2). No cross-node checks
corroborate           Exadata checks that need further review by the user to determine pass or fail
dba                   DBA checks
ebs                   Oracle E-Business Suite checks
eci_healthchecks      Enterprise Cloud Infrastructure health checks
ecs_healthchecks      Enterprise Cloud System health checks
goldengate            Oracle GoldenGate checks
hardware              Hardware-specific checks for Oracle Engineered Systems
maa                   Maximum Availability Architecture checks
ovn                   Oracle Virtual Networking
platinum              Platinum certification checks
preinstall            Pre-installation checks
prepatch              Checks to execute before patching
security              Security checks
solaris_cluster       Solaris Cluster checks
storage               Oracle Storage Server checks
switch                InfiniBand switch checks
sysadmin              Sysadmin checks
user_defined_checks   Run user-defined checks from user_defined_checks.xml

Profiles (ORAchk)
• Profiles provide logical grouping of checks which are about similar topics
• Run only checks in a specific profile
  ./orachk -profile <profile>
• Run everything except checks in a specific profile
  ./orachk -excludeprofile <profile>

Profile               Description
asm                   ASM checks
bi_middleware         Oracle Business Intelligence checks
clusterware           Oracle Clusterware checks
dba                   DBA checks
ebs                   Oracle E-Business Suite checks
emagent               Cloud Control agent checks
emoms                 Cloud Control management server
em                    Cloud Control checks
goldengate            Oracle GoldenGate checks
hardware              Hardware-specific checks for Oracle Engineered Systems
oam                   Oracle Access Manager checks
oim                   Oracle Identity Manager checks
oud                   Oracle Unified Directory server checks
ovn                   Oracle Virtual Networking
peoplesoft            PeopleSoft best practices
preinstall            Pre-installation checks
prepatch              Checks to execute before patching
security              Security checks
siebel                Siebel checks
solaris_cluster       Solaris Cluster checks
storage               Oracle Storage Server checks
switch                InfiniBand switch checks
sysadmin              Sysadmin checks
user_defined_checks   Run user-defined checks from user_defined_checks.xml

Enterprise Manager Integration
• Check results integrated into the EM compliance framework via plugin
• View results in native EM compliance dashboards
• Related checks grouped into compliance standards
• View targets checked, violations & average score
• Drill down into a compliance standard to see individual check results
• View breakdown by target

Enterprise Manager Plugin Prerequisites
• Integration is via the Enterprise Manager ORAchk Healthchecks plugin, with the following support:

Hardware Types                          Supported By Plugin
Exadata (physical configuration only)   Yes
Exadata (virtual configuration)         No
Recovery Appliance                      Yes
Exalogic (physical configuration)       Yes
Exalogic (virtualized configuration)    Yes
Oracle SuperCluster                     No
Oracle Private Cloud Machine            No

• The following prerequisites must be met before you can deploy the plug-in:
  o Verify that your Engineered Systems hardware and software are at the supported level as described in Supported Hardware and Software Versions
  o All Engineered System plug-ins should be deployed
  o InfiniBand switches and storage cells should be Enterprise Manager-managed targets for the respective engineered system
  o The Expect package should be installed on the hosts

JSON Output to Integrate with Kibana, Elastic Search etc.

Oracle Health Check Collection Manager Dashboard

Oracle Stack Coverage
• Engineered Systems
  • Oracle Exadata Database Machine
  • Oracle SuperCluster
  • Oracle Private Cloud Appliance
  • Oracle Database Appliance
  • Oracle Big Data Appliance
  • Oracle Exalogic Elastic Cloud
  • Oracle Exalytics In-Memory Machine
  • Oracle Zero Data Loss Recovery Appliance
  • Oracle ZFS Storage Appliance
• ASR
• Systems
  • Oracle Solaris
  • Cross stack checks
  • Solaris Cluster
  • OVN
• Oracle Database
  • Standalone Database
  • Grid Infrastructure & RAC
  • Maximum Availability Architecture (MAA) Scorecard
  • Upgrade Readiness Validation
  • GoldenGate
• Enterprise Manager Cloud Control
  • Repository
  • Agent
  • OMS
• Middleware
  • Application Continuity
  • Oracle Identity and Access Management Suite (Oracle IAM)
• E-Business Suite
  • Oracle Payables
  • Oracle Workflow
  • Oracle Purchasing
  • Oracle Order Management
  • Oracle Process Manufacturing
  • Oracle Receivables
  • Oracle Fixed Assets
  • Oracle HCM
  • Oracle CRM
  • Oracle Project Billing
• Siebel
  • Database best practices
• PeopleSoft
  • Database best practices
• SAP
  • Exadata best practices

Cluster Health Monitor (CHM)
Generates Diagnostic Metrics View of Cluster and Databases
• Always on - Enabled by default
• Provides detailed OS resource metrics
• Assists node eviction analysis
• Locally logs all process data
• User can define pinned processes
• Listens to CSS and GIPC events
• Categorizes processes by type
• Supports plug-in collectors (e.g. traceroute, netstat, ping, etc.)
• New CSV output for ease of analysis
[Diagram: osysmond on each node sends OS data to ologgerd (master), which stores it in the GIMR (12c Grid Infrastructure Management Repository)]

Cluster Health Monitor (CHM)
oclumon CLI or Full Integration with EM Cloud Control

Why TFA? (Proactive and reactive)
• Provides one interface for all diagnostic needs
• Collects data across the cluster and consolidates it in one place
• Collects all relevant diagnostic data at the time of the problem, with only what is needed to diagnose the problem
• Reduces the time required to obtain diagnostic data, which saves your business money

Supported Platforms and Versions
• All major Operating Systems are supported
  – Linux (OEL, RedHat, SUSE, Itanium & zLinux)
  – Oracle Solaris (SPARC & x86-64)
  – AIX
  – HP-UX (Itanium & PA-RISC)
  – Windows
• All Oracle Database & Grid versions 10.2+ are supported
• You probably already have TFA installed, as it is included with:
  – Oracle Grid Infrastructure:
    • 11.2.0.4+
    • 12.1.0.2+
    • 12.2.0.1+
  – Oracle Database:
    • 12.2.0.1+
• Also available from Doc 1513912.2

Monitoring By TFA & Automated Collections
[Diagram: Oracle Grid Infrastructure & Database(s), TFA, DBA(s)/SysAdmin(s)]
A significant problem occurs, then TFA will:
1. Automatically detect the event
2. Collect & package relevant diagnostics
3. Notify the relevant DBA and/or SysAdmin by email
4. Upload the collection to Oracle Support for further help

Collect
• Trim & collect all important log files updated in the past 12 hours:
  tfactl diagcollect
• Collect a problem-specific Service Request Data Collection (SRDC):
  tfactl diagcollect -srdc ora600
• Collections are stored in the repository directory
• Change the diagcollect time frame with -since <n>h|d
• For a list of the types of SRDC collections, use tfactl diagcollect -srdc help
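
When collections are triggered from a script, it helps to build the command line in one place and validate the time frame first. The following Python sketch does only that: it returns an argument list and does not run anything. It uses just the options named on the slide (`-since <n>h|d` and `-srdc <type>`); the function name and the choice to reject both options together are assumptions of this sketch, not documented TFA behaviour.

```python
import re


def diagcollect_args(since=None, srdc=None):
    """Build an argument list for tfactl diagcollect (does not execute it)."""
    if since is not None and srdc is not None:
        # Conservative choice for this sketch; not a documented restriction.
        raise ValueError("pass either since or srdc, not both")
    args = ["tfactl", "diagcollect"]
    if since is not None:
        # The slide gives the form <n>h|d, for example 4h or 2d.
        if not re.fullmatch(r"\d+[hd]", since):
            raise ValueError("since must look like 4h or 2d")
        args += ["-since", since]
    if srdc is not None:
        args += ["-srdc", srdc]
    return args


print(" ".join(diagcollect_args()))
print(" ".join(diagcollect_args(since="4h")))
print(" ".join(diagcollect_args(srdc="ora600")))
```

The returned list can be handed to `subprocess.run` on a node where tfactl is installed.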

TFA dbglevel profiles
• Example
  – tfactl dbglevel -set node_eviction
  – would be used for enhancing diagnostics when node evictions are being investigated, and would perform the following operations internally
    • crsctl set log css "CSSD=4"
    • crsctl set log css "CSSDNMC=4"
    • crsctl set log css "CLSF=4"
    • crsctl set log css "CSSDGMCC=4"
    • crsctl set log css "CSSDGMPC=4"
• To revert to the original or default logging levels, the following command
  – $ tfactl dbglevel -unset node_eviction
• would perform the following operations internally
    • crsctl set log css "CSSD=2"
    • crsctl set log css "CSSDNMC=2"
    • crsctl set log css "CLSF=0"
    • crsctl set log css "CSSDGMCC=2"
    • crsctl set log css "CSSDGMPC=2"
• In this way of setting the logging levels a degree of automation and simplification is
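
A dbglevel profile is essentially a named table of module levels that expands into individual crsctl commands. The Python sketch below models the node_eviction profile with the levels listed on the slide and generates the corresponding command strings. The data structure and function name are illustrative; this is not how TFA stores its profiles.

```python
# Levels copied from the slide for the node_eviction profile.
NODE_EVICTION = {
    "set": {"CSSD": 4, "CSSDNMC": 4, "CLSF": 4, "CSSDGMCC": 4, "CSSDGMPC": 4},
    "unset": {"CSSD": 2, "CSSDNMC": 2, "CLSF": 0, "CSSDGMCC": 2, "CSSDGMPC": 2},
}


def crsctl_commands(profile, action):
    """Expand one action of a profile into crsctl set log css commands."""
    return ['crsctl set log css "%s=%d"' % (module, level)
            for module, level in profile[action].items()]


for command in crsctl_commands(NODE_EVICTION, "set"):
    print(command)
```

Keeping the set and unset levels side by side makes it clear that unsetting restores each module to its own default (CLSF goes back to 0, the others to 2) instead of applying one blanket level.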

Log Management
• TFA will be the log management interface for the software stack
  – Rotate logs
  – Archive logs
  – Purge old logs
• Intelligent log management based on an understanding of what is in the logs and what is still important
[Diagram: TFA Log Management applies Rotate, Archive and Purge to all logs across the software stack, with the action driven by prediction or user input]
Set Email Notification Addresses
• TFA can send email notification when significant problems are detected
• To set notification email for any problem detected:
    tfactl set notificationAddress=[email protected]
• To set notification email for specific ORACLE_HOMEs, include the OS owner:
    tfactl set notificationAddress=oracle:[email protected]
[Diagram: significant problem occurs; 1. automatically detect event; 2. collect & package relevant diagnostics; 3. notify relevant DBA and/or SysAdmin by email; 4. upload collection to Oracle Support for further help. Actors: Oracle Grid Infrastructure & Database(s), TFA, DBA(s)/SysAdmin(s)]
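The two notification forms differ only in an optional OS-owner prefix. As a small sketch, assuming the unqualified form uses the same notificationAddress parameter as the owner-qualified one, a helper can build either command. The function name and the example.com address are hypothetical placeholders, not values from the slide.

```python
def notification_command(address, os_owner=None):
    """Build a tfactl set command for a notification address (nothing is executed).

    os_owner, when given, restricts the address to ORACLE_HOMEs owned by that OS user.
    """
    if "@" not in address:
        raise ValueError(f"not an email address: {address!r}")
    value = f"{os_owner}:{address}" if os_owner else address
    return ["tfactl", "set", f"notificationAddress={value}"]


if __name__ == "__main__":
    # dba@example.com is a placeholder address for illustration only.
    print(" ".join(notification_command("dba@example.com")))
    print(" ".join(notification_command("dba@example.com", os_owner="oracle")))
```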
Analyze
• Analyze all important recent log entries:
    tfactl analyze -since 1d
• Search recent log entries:
    tfactl analyze -search "ora-00600" -since 8h
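To make the meaning of a time-bounded search concrete, the sketch below filters already-parsed log entries by a -since style window and a search string. It illustrates the idea only and is not how tfactl analyze is implemented; the function names and sample entries are hypothetical, and the case-insensitive match is an assumption suggested by the lowercase "ora-00600" in the slide's example.

```python
import re
from datetime import datetime, timedelta


def parse_since(spec):
    """Convert a <n>h or <n>d window such as '8h' or '1d' into a timedelta."""
    match = re.fullmatch(r"([1-9]\d*)([hd])", spec)
    if match is None:
        raise ValueError(f"expected <n>h or <n>d, got {spec!r}")
    amount = int(match.group(1))
    return timedelta(hours=amount) if match.group(2) == "h" else timedelta(days=amount)


def search_recent(entries, pattern, since, now):
    """Return (timestamp, text) entries inside the window whose text contains pattern."""
    cutoff = now - parse_since(since)
    needle = pattern.lower()  # assumption: the search ignores case
    return [(ts, text) for ts, text in entries if ts >= cutoff and needle in text.lower()]


if __name__ == "__main__":
    # Hypothetical sample entries, not taken from a real alert log.
    sample = [
        (datetime(2016, 6, 15, 9, 0), "ORA-00600: internal error code"),
        (datetime(2016, 6, 14, 1, 0), "ORA-00600: internal error code"),
        (datetime(2016, 6, 15, 11, 0), "Thread 1 advanced to log sequence 42"),
    ]
    for ts, text in search_recent(sample, "ora-00600", "8h", datetime(2016, 6, 15, 12, 0)):
        print(ts.isoformat(), text)
```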
Analyze
• TFA includes all key database support tools
• tfactl provides a single interface to them all
Most of these support tools are only available in the My Oracle Support download; they are not included in the base Grid or Database install

Tool | Description | Details
ORAchk | Oracle Stack Health Checks on non-engineered systems | 1268927.2
EXAchk | Oracle Stack Health Checks on Engineered Systems | 1070954.1
oswatcher | Collect and archive OS metrics, useful for instance/node evictions & performance issues | 301137.1
procwatcher | Automate & capture database performance diagnostics & session-level hangs | 459694.1
oratop | Near real-time database monitoring | 1500864.1
sqlt | Capture SQL trace data useful for tuning | 215187.1

Tool | Description
alertsummary | Provides a summary of events for one or more database or ASM alert files from all nodes
ls | Lists all files TFA knows about for a given file name pattern across all nodes
pstack | Generate process stack for specified processes across all nodes
grep | Search alert or trace files with a given database and file name pattern, for a search string
summary | High-level summary of the configuration
vi | Open alert or trace files for viewing, for a given database and file name pattern, in the vi editor
tail | Run a tail on alert or trace files for a given database and file name pattern
param | Show all database and OS parameters that match a specified pattern
dbglevel | Set and unset multiple CRS trace levels with one command
history | Show the shell history for the tfactl shell
changes | Report any noted changes in the system setup over a given time period. This includes database parameters, OS parameters, patches applied etc.
Incident Based Collections with SRDC
• Use srdc <incident type>:
    tfactl srdc ora4030

Incident Type | Description
ora4030 | For ORA-04030 errors
ora4031 | For ORA-04031 errors
dbperf | For basic db performance problems
ora600 | For ORA-00600 errors
ora700 | For ORA-00700 errors
ora7445 | For ORA-07445 errors

• To specify sid use -sid <oracle sid>
• To specify database use -db <dbname>
• To specify incident date & time use -inc_date <YYYY-MM-DD> -inc_time <HH:MM:SS>
• To upload directly to the SR use -sr <SR#>
    tfactl srdc ora4030 -sid orcl -db RDBMS121 \
      -inc_date 2016-06-15 -inc_time 02:48:23 \
      -sr 3-123456789
• For dbperf use these parameters to specify the good & bad performance periods to compare:

Parameter | Description
perf_base_sd | Start date for a good performance period
perf_base_st | Start time for a good performance period
perf_base_ed | End date for a good performance period
perf_base_et | End time for a good performance period
perf_comp_sd | Start date for a bad performance period
perf_comp_st | Start time for a bad performance period
perf_comp_ed | End date for a bad performance period
perf_comp_et | End time for a bad performance period

    tfactl srdc dbperf -db RDBMS121 \
      -perf_base_sd 2016-06-15 -perf_base_st 01:30:00 \
      -perf_base_ed 2016-06-15 -perf_base_et 02:00:00 \
      -perf_comp_sd 2016-06-16 -perf_comp_st 09:30:00 \
      -perf_comp_ed 2016-06-16 -perf_comp_et 10:00:00
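The dbperf form takes eight date and time flags, which are easy to mistype by hand. The following sketch, with hypothetical function names, derives the flags from datetime values so that the YYYY-MM-DD and HH:MM:SS formats are always well-formed. It only builds argument lists using the flags shown on this slide and does not run tfactl.

```python
from datetime import datetime

DATE_FMT = "%Y-%m-%d"
TIME_FMT = "%H:%M:%S"


def srdc_command(incident_type, sid=None, db=None, incident_at=None, sr=None):
    """Build an incident-based SRDC command from optional qualifiers."""
    args = ["tfactl", "srdc", incident_type]
    if sid is not None:
        args += ["-sid", sid]
    if db is not None:
        args += ["-db", db]
    if incident_at is not None:
        args += ["-inc_date", incident_at.strftime(DATE_FMT),
                 "-inc_time", incident_at.strftime(TIME_FMT)]
    if sr is not None:
        args += ["-sr", sr]
    return args


def dbperf_command(db, base_start, base_end, comp_start, comp_end):
    """Build a dbperf SRDC command comparing a good (base) and a bad (comp) period."""
    if base_end <= base_start or comp_end <= comp_start:
        raise ValueError("each period must end after it starts")
    args = ["tfactl", "srdc", "dbperf", "-db", db]
    for prefix, start, end in (("perf_base", base_start, base_end),
                               ("perf_comp", comp_start, comp_end)):
        args += [f"-{prefix}_sd", start.strftime(DATE_FMT),
                 f"-{prefix}_st", start.strftime(TIME_FMT),
                 f"-{prefix}_ed", end.strftime(DATE_FMT),
                 f"-{prefix}_et", end.strftime(TIME_FMT)]
    return args


if __name__ == "__main__":
    print(" ".join(dbperf_command(
        "RDBMS121",
        datetime(2016, 6, 15, 1, 30), datetime(2016, 6, 15, 2, 0),
        datetime(2016, 6, 16, 9, 30), datetime(2016, 6, 16, 10, 0),
    )))
```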
Cluster Health Advisor (CHA)*
Discovers Potential Cluster & DB Problems - Notifies with Corrective Actions
• Always on - Enabled by default
• Detects node and database performance problems
• Provides early-warning alerts and corrective action
• Supports on-site calibration to improve sensitivity
• Integrated into EMCC Incident Manager and notifications
• Standalone Interactive GUI Tool
[Diagram labels: OS Data, DB Data, CHM, GIMR, ochad, Node Health Prognostics Engine, Database Health Prognostics Engine]
* Requires, and is included with, a RAC or R1N license
Oracle 12c Hang Manager
Autonomously Preserves Database Availability and Performance
• Always on - Enabled by default
• Reliably detects database hangs and deadlocks
• Autonomously resolves them
• Supports QoS Performance Classes, Ranks and Policies to maintain SLAs
• Logs all detections and resolutions
• New SQL interface to configure sensitivity (Normal/High) and trace file sizes
[Diagram labels: Session, DIA0, DETECT, ANALYZE, EVALUATE, Hung?, VERIFY, Victim, QoS Policy]
Oracle 12c Hang Manager
Full Resolution Dump Trace File and DB Alert Log Audit Reports

Dump file …/diag/rdbms/hm6/hm62/incident/incdir_5753/hm62_dia0_12656_i5753.trc
Oracle Database 12c Enterprise Edition Release 12.2.0.0.0 - 64bit Beta
With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics
and Real Application Testing options
Build label: RDBMS_MAIN_LINUX.X64_151013
ORACLE_HOME: …/3775268204/oracle
System name: Linux
Node name: slc05kyr
Release: 2.6.39-400.211.1.el6uek.x86_64
Version: #1 SMP Fri Nov 15 13:39:16 PST 2013
Machine: x86_64
VM name: Xen Version: 3.4 (PVM)
Instance name: hm62
Redo thread mounted by this instance: 2
Oracle process number: 19
Unix process pid: 12656, image: oracle@slc05kyr (DIA0)

*** 2015-10-13T16:47:59.541509+17:00
*** SESSION ID:(96.41299) 2015-10-13T16:47:59.541519+17:00
*** CLIENT ID:() 2015-10-13T16:47:59.541529+17:00
*** SERVICE NAME:(SYS$BACKGROUND) 2015-10-13T16:47:59.541538+17:00
*** MODULE NAME:() 2015-10-13T16:47:59.541547+17:00
*** ACTION NAME:() 2015-10-13T16:47:59.541556+17:00
*** CLIENT DRIVER:() 2015-10-13T16:47:59.541565+17:00

2015-10-13T16:47:59.435039+17:00
Errors in file /oracle/log/diag/rdbms/hm6/hm6/trace/hm6_dia0_12433.trc (incident=7353):
ORA-32701: Possible hangs up to hang ID=1 detected
Incident details in: …/diag/rdbms/hm6/hm6/incident/incdir_7353/hm6_dia0_12433_i7353.trc
2015-10-13T16:47:59.506775+17:00
DIA0 requesting termination of session sid:40 with serial # 43179 (ospid:13031) on instance 2
    due to a GLOBAL, HIGH confidence hang with ID=1.
    Hang Resolution Reason: Automatic hang resolution was performed to free a
    significant number of affected sessions.
DIA0: Examine the alert log on instance 2 for session termination status of hang with ID=1.

In the alert log on the instance local to the session (instance 2 in this case), we see the following:

2015-10-13T16:47:59.538673+17:00
Errors in file …/diag/rdbms/hm6/hm62/trace/hm62_dia0_12656.trc (incident=5753):
ORA-32701: Possible hangs up to hang ID=1 detected
Incident details in: …/diag/rdbms/hm6/hm62/incident/incdir_5753/hm62_dia0_12656_i5753.trc
2015-10-13T16:48:04.222661+17:00
DIA0 terminating blocker (ospid: 13031 sid: 40 ser#: 43179) of hang with ID = 1
    requested by master DIA0 process on instance 1
    Hang Resolution Reason: Automatic hang resolution was performed to free a
    significant number of affected sessions.
    by terminating session sid:40 with serial # 43179 (ospid:13031)
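When auditing Hang Manager activity across many alert logs, it can help to extract the termination requests mechanically. The sketch below uses a regular expression modelled on the single "DIA0 requesting termination" message shown on this slide. The exact message format may differ between releases, so the pattern and the function name should be treated as assumptions.

```python
import re

# Pattern modelled on the one alert log message shown on the slide.
TERMINATION = re.compile(
    r"DIA0 requesting termination of session sid:(?P<sid>\d+) "
    r"with serial # (?P<serial>\d+) \(ospid:(?P<ospid>\d+)\) "
    r"on instance (?P<instance>\d+)"
)


def find_terminations(alert_log_text):
    """Return one dict of integer fields per termination request found."""
    return [
        {name: int(value) for name, value in match.groupdict().items()}
        for match in TERMINATION.finditer(alert_log_text)
    ]


if __name__ == "__main__":
    sample = (
        "2015-10-13T16:47:59.506775+17:00\n"
        "DIA0 requesting termination of session sid:40 with serial # 43179 "
        "(ospid:13031) on instance 2\n"
        " due to a GLOBAL, HIGH confidence hang with ID=1.\n"
    )
    for event in find_terminations(sample):
        print(event)
```

The extracted sid, serial number and ospid can then be matched against the "DIA0 terminating blocker" entry in the alert log of the instance that owns the session.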
Oracle 12c Domain Services Cluster (DSC)
Deploys with Minimum Footprint and Maximum Manageability
• Hosts Framework as Services
• Reduces local resource footprint
• Centralizes management
• Speeds deployment and patching
• Optional Shared Storage
• Supports multiple versions and platforms going forward
[Diagram: an Oracle Cluster Domain containing the Oracle Domain Services Cluster, Database Member Clusters and Application Member Clusters]
Services shown for the Oracle Domain Services Cluster:
• Management Repository Service
• Trace File Analyzer Receiver
• ORAchk Collection Service
• Grid Names Service
• Storage Services
• Rapid Home Provisioning Service
Oracle Cluster Domain
[Diagram labels]
• Oracle Domain Services Cluster
  – Mgmt Repository (GIMR) Service
  – ASM Service
  – IO Service
  – ACFS Services
  – Additional Optional Services: Rapid Home Provisioning (RHP) Service
• Database Member Cluster: uses local ASM
• Database Member Cluster: uses ASM Service
• Database Member Cluster: uses IO & ASM Service of DSC
• Application Member Cluster: GI only
• Shared ASM
• Private Network, SAN, NAS