RAC+ASM: Stories to Share

RAC+ASM 3 years in production Stories to share Presented by: Christo Kutrovsky

Upload: kutrovsky

Post on 13-May-2015


DESCRIPTION

RAC+ASM: Lessons learned after 2 years in production. Managing over 70 databases for 4 major customers, I have some good stories to share, running almost all possible combinations of ASM, RAC, NetApp and NFS: successes, failures and gotchas. This presentation is the equivalent of years of experience, condensed into major highlights in 45 minutes. To list a few stories:

TRANSCRIPT

Page 1: RAC+ASM: Stories to Share

RAC+ASM
3 years in production
Stories to share

Presented by: Christo Kutrovsky

Page 2: RAC+ASM: Stories to Share

© 2009/2010 Pythian - Presentation for ABC Company

Who Am I

• Oracle ACE
• 10 years in Oracle field
• Joined Pythian 2003
• Part of Pythian Consulting Team
• Special projects
• Performance tuning
• Critical services

“oracle pinup”

Page 3: RAC+ASM: Stories to Share


Pythian Facts

Founded in 1997

90 employees

120 customers worldwide

20 customers more than $1 billion in revenue

5 offices in 5 countries

10 years profitable private company

Page 4: RAC+ASM: Stories to Share


What Pythian does

Pythian provides database and application infrastructure services.

Page 5: RAC+ASM: Stories to Share

Agenda

• 2 nodes RAC
• ASMLIB with multipathing
• Migrating to new servers with ASM
• Thin provisioning
• ASM + restores = danger
• Device naming conventions
• spfile location
• JBOD configuration

Page 6: RAC+ASM: Stories to Share


2 Node RAC for High Availability

Page 7: RAC+ASM: Stories to Share

2 Node RACs for HA

• Two-node RAC
• 13 databases
• Dev databases
• Shut down databases (and ASM) on node 1
• Perform maintenance
• Unplug the interconnect cable
• What happens?

Page 8: RAC+ASM: Stories to Share

2 nodes RAC

[Diagram: 2-node RAC. Labels: Interconnect, ASMDG, OCR/V, VIP (one per node), Node 1, Node 2, instances SID_A1, SID_B1, SID_A2, SID_B2, Fibre Channel.]

Page 9: RAC+ASM: Stories to Share

2 nodes RAC

[Diagram: same 2-node RAC layout and labels as on the previous slide.]

Page 10: RAC+ASM: Stories to Share

[Diagram: the same 2-node RAC layout, with each node reporting on the other: "I can't see Node 1" and "I can't see Node 2".]

Page 11: RAC+ASM: Stories to Share

One is not Quorum

• 50% chance your working node gets restarted
• Depends on clusterware version
• Who will shoot the other guy first

Page 12: RAC+ASM: Stories to Share

One is not Quorum

• Conclusion?
• Turn off clusterware when you have only 2 nodes and are performing maintenance
• Upgrade to a more predictable clusterware
• Lowest ‘leader’ always survives
• Add a 3rd tie-breaker node
• Doesn’t have to run a database, just clusterware (observer)
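
The "one is not quorum" point comes down to majority arithmetic. A minimal sketch, assuming a plain majority rule (more than half of the configured members must be visible). The helper name is hypothetical, and this is a generic illustration, not Oracle Clusterware's actual eviction logic:

```shell
# has_majority VISIBLE TOTAL
# Succeeds when VISIBLE members form a strict majority of TOTAL.
# Hypothetical helper, for illustration only.
has_majority() {
  visible=$1; total=$2
  [ $(( visible * 2 )) -gt "$total" ]
}

# 2-node cluster, interconnect cut: each side sees only itself.
if has_majority 1 2; then
  split2="survivor is well defined"
else
  split2="no majority on either side"
fi

# 3 members (2 database nodes plus a tie-breaker), one node isolated:
# the remaining two still agree with each other.
if has_majority 2 3; then
  split3="survivor is well defined"
else
  split3="no majority on either side"
fi

echo "2 nodes: $split2"
echo "3 nodes: $split3"
```

With two members neither side can claim a majority, which is why the outcome depends on who shoots first; a third member makes the surviving side unambiguous.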

Page 13: RAC+ASM: Stories to Share

One is not Quorum

Production cases, what happens if
• All network dies on one node?
• All disk dies on one node?

Page 14: RAC+ASM: Stories to Share

ASMLIB with Multipathing

Page 15: RAC+ASM: Stories to Share

Building ASMLIB devices when multipathing is present

• Devices used for creating ASMLIB
• /dev/emcpowerc1
• /dev/mapper/raid10_data_disk
• Devices used to create ASM diskgroup
• ASMLIB
• The reboot changes everything
• ASMLIB re-discovers the devices without multipath
• Difficult to diagnose

Page 16: RAC+ASM: Stories to Share

Visual

[Diagram: LUN_1 and LUN_2, reached through HBA1 and HBA2, appear as the single-path devices /dev/sdb, /dev/sdc, /dev/sdd and /dev/sde and as the multipath devices /dev/mapper/data1 and /dev/mapper/data2.]

Page 17: RAC+ASM: Stories to Share

Building ASMLIB devices when multipathing is present

• Do not use ASMLIB
• If you have to (why?)
• Must set up “ORACLEASM_SCANORDER”
• asm_diskstring parameter
• Permissions
• Udev files
• Boot/startup script
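
If ASMLIB has to stay, the scan order is set in the ASMLIB configuration file. A minimal sketch, assuming device-mapper multipath (dm devices) on top of sd single-path devices and the usual /etc/sysconfig/oracleasm location; with PowerPath the prefix would be emcpower rather than dm. The values are assumptions to check against your own device names:

```shell
# Fragment of /etc/sysconfig/oracleasm (assumed location; plain shell variable syntax).

# Scan the multipath devices first ...
ORACLEASM_SCANORDER="dm"

# ... and skip the underlying single-path devices entirely.
ORACLEASM_SCANEXCLUDE="sd"
```

Excluding the single-path prefix is what keeps a reboot from re-discovering the disks without multipath.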

Page 18: RAC+ASM: Stories to Share

Removing ASMLIB

• Why
• Extra layer
• Requires new driver for every new kernel
• Can cause downtime if not careful
• ASMLIB header is the same as ASM disk header
• Just has extra field for ASMLIB name
• Disks can be accessed directly, without ASMLIB, without having to drop/recreate them

Page 19: RAC+ASM: Stories to Share

Removing ASMLIB

• Unmount all affected diskgroups
• Change or set asm_diskstring
• Remount diskgroups via new paths
• Can be done in rolling fashion in RAC
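
The steps above could look like this for one ASM instance. This is a sketch only: the diskgroup names and the /dev/mapper/*p1 disk string are assumptions, and the hypothetical helper just prints the statements so they can be reviewed before being run in SQL*Plus against the ASM instance:

```shell
# print_asmlib_removal_sql DISKSTRING DISKGROUP...
# Hypothetical helper: prints (does not run) the SQL for one ASM instance.
print_asmlib_removal_sql() {
  diskstring=$1; shift
  # 1. Dismount every affected diskgroup on this node.
  for dg in "$@"; do
    printf 'alter diskgroup %s dismount;\n' "$dg"
  done
  # 2. Point disk discovery at the device paths instead of ASMLIB.
  printf "alter system set asm_diskstring='%s';\n" "$diskstring"
  # 3. Mount the diskgroups again; the disks are now found via the new paths.
  for dg in "$@"; do
    printf 'alter diskgroup %s mount;\n' "$dg"
  done
}

print_asmlib_removal_sql '/dev/mapper/*p1' DATA FRA
```

Databases using these diskgroups have to be down on the node while its diskgroups are dismounted, which is why the change is done one node at a time in RAC.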

Page 20: RAC+ASM: Stories to Share


SAN Migration

Page 21: RAC+ASM: Stories to Share

Migrating from EMC to 3PAR

• New SAN
• New concept
• Thin provisioning
• A big project
• Or not

Page 22: RAC+ASM: Stories to Share

Add/drop/go home

• No brainer
• Thin provisioning rocks
• SA adds disks
• Add disk to diskgroup
• Drop all old disks
• Wait
• Never be paged on space
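
Adding the new disks and dropping the old ones can go into a single statement, so ASM rebalances only once. A sketch with hypothetical names throughout (diskgroup DATA, new LUNs under /dev/mapper/tpar_*, old disks DATA_0000 and DATA_0001); the helper prints the statements for review rather than running them:

```shell
# Hypothetical helper: prints the migration SQL for the ASM instance.
# All diskgroup, device and disk names below are placeholders.
print_san_migration_sql() {
  cat <<'SQL'
alter diskgroup DATA
  add disk '/dev/mapper/tpar_*p1'
  drop disk DATA_0000, DATA_0001
  rebalance power 4;

-- "Wait": the rebalance is finished when this returns no rows.
select operation, state, est_minutes from v$asm_operation;
SQL
}

print_san_migration_sql
```

The old LUNs should only be handed back to the storage admin once the rebalance has completed.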

Page 23: RAC+ASM: Stories to Share


Server Migration

Page 24: RAC+ASM: Stories to Share

Server migration

• Current setup
• 2 nodes RAC with ASM
• New servers
• Better, Faster, Stronger
• Fastest (effort wise) way to migrate, with minimal downtime
• Possible with zero downtime

Page 25: RAC+ASM: Stories to Share

Server migration options

• Create standby on new server
• Requires extra copy of data
• Add the new nodes, drop existing ones
• Possible clusterware issues
• Move the LUNs
• Easy
• New servers tested

Page 26: RAC+ASM: Stories to Share

LUN Migration

• Install clusterware and create RAC database with same name
• Test hardware / wiring / configuration
• Migrate
• Stop production
• Re-assign LUNs
• Start production

Page 27: RAC+ASM: Stories to Share


ASM Restore creates database black hole

Page 28: RAC+ASM: Stories to Share

ASM + Same host restore = DANGER

• Production database
• Diskgroup +PROD
• Snapshot database
• Diskgroup +SNAP
• Rebuild monthly via duplicate database
• Except this one time…

Page 29: RAC+ASM: Stories to Share

The concept

• “SNAP” backups not taken
• If a given “SNAP” is to be restored, simply re-create it from the given “PROD” backup
• Independent from Production

Page 30: RAC+ASM: Stories to Share

Restore with ASM

• Restore FRA files into separate directory
• Start up SNAP instance
• Catalog backup files
• Restore into SNAP diskgroup
• The missing piece? “restore” writes to the original file locations recorded in the backup
• Must use “set newname for datafile” in run block
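
A sketch of a run block with the missing piece in place. The datafile numbers and the +SNAP target are assumptions for illustration, and the hypothetical helper only prints an RMAN script for review:

```shell
# print_snap_restore_rman FILE_NO...
# Hypothetical helper: emits a run block that redirects every listed datafile
# to +SNAP before restoring, so that nothing is written to the +PROD
# locations recorded in the backup.
print_snap_restore_rman() {
  echo "run {"
  for n in "$@"; do
    echo "  set newname for datafile $n to '+SNAP';"
  done
  for n in "$@"; do
    echo "  restore datafile $n;"
  done
  # Make the control file point at the restored copies.
  echo "  switch datafile all;"
  echo "}"
}

print_snap_restore_rman 1 2 3
```

Every datafile in the backup needs its own "set newname" line; any file left out is restored to its original name.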

Page 31: RAC+ASM: Stories to Share

Restore with ASM – the result

• Unrecoverable corruption on production database
• Lost about 3-4 hours of changes
• If this was filesystem and not ASM, no corruption would have occurred

Page 32: RAC+ASM: Stories to Share

Corruption – what happened

[Diagram: SGA, Disk and REDO panels showing blocks with 5 rows and blocks with 2 rows; redo entries "BLK1 add Row 6" and "BLK3 add Row 3"; disk images labelled "Original datafile" and "Partially overwritten datafile".]

Page 33: RAC+ASM: Stories to Share

Corruption – what should’ve hap.

[Diagram: SGA, Disk and REDO panels showing blocks with 5 rows and blocks with 2 rows; redo entries "BLK1 add Row 6", "BLK3 add Row 3" and "BLK3 add Row 6"; disk images labelled "Original datafile" and "Partially overwritten datafile".]

Page 34: RAC+ASM: Stories to Share


Corruption – what happened

Page 35: RAC+ASM: Stories to Share

Corruption

• Why wouldn’t this have happened with a filesystem?
• File names are just pointers to a data stream
• If a file is re-created, a new data stream is associated with it
• Processes that have the file currently open still use the old data stream
• This is why “undelete” is possible
• My blog about undeleting files
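
The filesystem behaviour described above can be reproduced in a few lines. A minimal sketch using a temporary directory; it assumes the file is removed and re-created (as opposed to being overwritten in place), and the file name is just a stand-in for a datafile:

```shell
dir=$(mktemp -d)

# "Original datafile"
printf 'original\n' > "$dir/file1"

# "Process 1" opens file 1 and keeps it open on descriptor 3.
exec 3< "$dir/file1"

# File 1 is re-created: the name now points to a new data stream.
rm "$dir/file1"
printf 'restored\n' > "$dir/file1"

# The open handle still reads the old stream; the path reads the new one.
old=$(cat <&3)
new=$(cat "$dir/file1")
exec 3<&-

echo "open handle sees: $old"
echo "path now sees:    $new"

rm -rf "$dir"
```

The process holding the file open never sees the restored contents, so it cannot end up working on a partially overwritten file the way the production instance did on ASM.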

Page 36: RAC+ASM: Stories to Share

Corruption

[Diagram: Process 1 opens “file 1”; File 1 points to Stream X1.]

Page 37: RAC+ASM: Stories to Share

Corruption – recreate File 1

[Diagram: Process 1 has “file 1” open; the labels are File 1, Stream X1 and a new Stream X2.]

Page 38: RAC+ASM: Stories to Share


Device names convention causes user error

Page 39: RAC+ASM: Stories to Share

Device naming conventions

• Using /dev/mapper/<name>
• ASM uses <name>p1 – first partition
• Permissions set script uses: “*p1”
• Then came /dev/mapper/backup1
• First partition is: /dev/mapper/backup1p1
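
The trap is easy to see by running the wildcard against the names involved. A small sketch; the device list is taken from the slides, plus an unpartitioned data1 for contrast, and the helper name is hypothetical:

```shell
# Succeeds when the name matches the wildcard used by the permissions script.
matches_p1() {
  case "$1" in
    *p1) return 0 ;;
    *)   return 1 ;;
  esac
}

matched=""
for dev in /dev/mapper/data1 /dev/mapper/data1p1 \
           /dev/mapper/backup1 /dev/mapper/backup1p1; do
  if matches_p1 "$dev"; then
    matched="$matched $dev"
  fi
done

echo "matched by *p1:$matched"
```

The whole disk /dev/mapper/backup1 matches because "backup1" itself ends in "p1", so it receives ASM permissions alongside its own first partition.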

Page 40: RAC+ASM: Stories to Share

Device naming conventions

• V$ASM_DISK

PATH                        HEADER_STATUS
--------------------------- -------------
/dev/mapper/backup1         CANDIDATE
/dev/mapper/redop1          MEMBER
/dev/mapper/backup1p1       MEMBER
/dev/mapper/data2p1         MEMBER
/dev/mapper/data1p1         MEMBER

Page 41: RAC+ASM: Stories to Share

Naming conventions

[Diagram: a DISK and its Partition 1, with the labels IN USE and ADDED.]

Page 42: RAC+ASM: Stories to Share

New convention

• Now we use generic names, as we do re-assign disks
• We also use a prefix and suffix with a clear delimiter

/dev/mapper/asm-raid5-dev01-part1

Page 43: RAC+ASM: Stories to Share


spfile location in RAC

Page 44: RAC+ASM: Stories to Share

spfile location

• Intended configuration
• init.ora: spfile='+ASM_DSKGRP/dbname.spfile'
• no spfile

Page 45: RAC+ASM: Stories to Share

Changing parameters en masse

• create pfile='your_initials.ora' from spfile;
• edit
• create spfile='+ASM_DSK/spfile' from pfile='ck.ora';

Page 46: RAC+ASM: Stories to Share

What not to do

• create pfile from spfile;
• edit
• create spfile from pfile;

Page 47: RAC+ASM: Stories to Share

Result

• One node uses local spfile
• Other(s) use global spfile
• Parameter changes to “BAD” node are sent to other nodes
• not persistent on GOOD nodes
• persistent on BAD nodes
• Parameter changes on GOOD nodes have reversed behaviour
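
The split follows from the default parameter file lookup at instance startup: a local spfile<SID>.ora is tried before spfile.ora and init<SID>.ora, and a bare "create spfile from pfile" writes exactly that local file. A sketch that mimics the lookup order in a temporary directory standing in for $ORACLE_HOME/dbs; the SID PROD1 and the helper name are hypothetical:

```shell
# pick_parameter_file DBS_DIR SID
# Prints the first parameter file found, in the default startup search order.
pick_parameter_file() {
  dbs=$1; sid=$2
  for f in "spfile$sid.ora" "spfile.ora" "init$sid.ora"; do
    if [ -f "$dbs/$f" ]; then
      echo "$f"
      return 0
    fi
  done
  return 1
}

dbs=$(mktemp -d)

# Intended configuration: init.ora that only points at the shared spfile.
echo "spfile='+ASM_DSKGRP/dbname.spfile'" > "$dbs/initPROD1.ora"
before=$(pick_parameter_file "$dbs" PROD1)

# What a bare "create spfile from pfile;" leaves behind on this node.
: > "$dbs/spfilePROD1.ora"
after=$(pick_parameter_file "$dbs" PROD1)

echo "before: $before"
echo "after:  $after"

rm -rf "$dbs"
```

After the stray local spfile appears, the node stops reading the init.ora pointer on its next restart and so no longer shares the spfile in ASM with the other nodes.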

Page 48: RAC+ASM: Stories to Share


Adding ASM disks crashes databases

Page 49: RAC+ASM: Stories to Share

Adding disks

• Must be visible on all servers
• Otherwise your diskgroup gets dismounted on the nodes that don’t see the disk
• All databases using this diskgroup crash

Page 50: RAC+ASM: Stories to Share

ASM add disk process

1. Is the disk visible locally?
2. Initialize disk header, add it to diskgroup
3. Notify all nodes to rescan disks and add the new disk
4. If one or more nodes cannot see the disk, raise error
5. Dismount diskgroup on all nodes not seeing the new disk
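
Since ASM only checks local visibility before it touches the disk, the check on the other nodes is worth doing by hand first. A minimal sketch, assuming password-less ssh to every node and identical device paths cluster-wide; the function name, the RUN_ON_NODE variable and the node names are hypothetical:

```shell
# Command used to run a check on a node; ssh by default (assumption).
RUN_ON_NODE=${RUN_ON_NODE:-ssh}

# disk_visible_everywhere DISK NODE...
# Returns 0 only if DISK is a block device on every listed node.
disk_visible_everywhere() {
  disk=$1; shift
  missing=0
  for node in "$@"; do
    if "$RUN_ON_NODE" "$node" test -b "$disk"; then
      echo "$node: sees $disk"
    else
      echo "$node: CANNOT see $disk"
      missing=1
    fi
  done
  return "$missing"
}

# Usage, before running "alter diskgroup ... add disk":
#   disk_visible_everywhere /dev/mapper/asm-raid5-dev01-part1 node1 node2 \
#     && echo "safe to add"
```

This only checks that the device node exists; permissions and asm_diskstring still have to match on every node.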

Page 51: RAC+ASM: Stories to Share

ASM with JBOD welcomes simplicity

Page 52: RAC+ASM: Stories to Share

JBOD Configuration

• Linux data warehouse
• 10 TB space
• 28 disks of 430/285 GB
• All redundancy/striping provided by ASM

Page 53: RAC+ASM: Stories to Share

JBOD Configuration

• Simplicity
• No ASMLIB
• Straight devices
• Naming convention – use only 1 partition, and use partition 4
• /dev/sd*4
• is ASM partition
• is permissions wildcard
• is asm_diskstring

Page 54: RAC+ASM: Stories to Share

Testing your speed

• Verify read speed of each device
• Verifies each device is performing as expected
• Verify read speed from all devices
• Verify your total bandwidth
• Verify read speed from all devices, towards the end of the device
• Disk read speed is not linear

Page 55: RAC+ASM: Stories to Share


Read Speed of a single disk

* Courtesy google image search

Page 56: RAC+ASM: Stories to Share

Testing your speed

• One device at a time
for dsk in /dev/sd[c-q]; do echo $dsk; dd if=$dsk of=/dev/null iflag=direct bs=2M count=100; done
• All devices (total bandwidth)
for dsk in /dev/sd[c-q]; do echo $dsk; dd if=$dsk of=/dev/null iflag=direct bs=2M count=100 & done; wait
• Test end speed
• Add skip=x
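
For the end-of-device test, skip is counted in bs-sized blocks, so the offset has to be derived from the device size. A sketch, assuming GNU dd and the util-linux blockdev tool; the helper name is hypothetical:

```shell
# end_skip SIZE_BYTES BS_BYTES COUNT
# Prints the skip value that makes COUNT blocks of BS_BYTES end at the
# last full block of a device of SIZE_BYTES.
end_skip() {
  size_bytes=$1; bs_bytes=$2; count=$3
  echo $(( size_bytes / bs_bytes - count ))
}

# Usage on a real device (needs root):
#   dsk=/dev/sdc
#   bs=$((2 * 1024 * 1024))
#   dd if=$dsk of=/dev/null iflag=direct bs=2M count=100 \
#      skip=$(end_skip "$(blockdev --getsize64 $dsk)" "$bs" 100)
```

Comparing this figure with the start-of-device figure shows how far read speed falls across the disk.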

Page 57: RAC+ASM: Stories to Share

Sample output

/dev/sdc
100+0 records in
100+0 records out
209715200 bytes (210 MB) copied, 1.60325 seconds, 131 MB/s
/dev/sdd
100+0 records in
100+0 records out
209715200 bytes (210 MB) copied, 1.60188 seconds, 131 MB/s
/dev/sde
100+0 records in
100+0 records out
209715200 bytes (210 MB) copied, 1.60067 seconds, 131 MB/s
/dev/sdf
100+0 records in
100+0 records out
209715200 bytes (210 MB) copied, 1.59928 seconds, 131 MB/s
/dev/sdg
100+0 records in
100+0 records out

Page 58: RAC+ASM: Stories to Share

JBOD configuration

• Disk adding/removal is very easy
• Add disks in bulk:
alter diskgroup XXX add disk '/dev/sd[c-q]4';
• Performance rocks
• controller speed
• Diagnostics are easy
• iostat -x 5 /dev/sd*4
• Manageability is easy
• 1 diskgroup – no filenames, no mountpoints

Page 59: RAC+ASM: Stories to Share

Final Thoughts

• RAC for HA requires 3 nodes
• ASM
• Keep it simple
• Reduce layers
• Runs fast
• Still need to be careful

Page 60: RAC+ASM: Stories to Share


The End

Thank You

Questions?

I blog at

http://www.pythian.com/news/author/kutrovsky/