102006-rlp open world 2006

R.L. Polk: Moving from the Mainframe to 10g RAC
Douglas Miller, Director Database Development, R.L. Polk & Company
Suresh Yarlagadda, Lead DBA, R.L. Polk
Darrin Deeter, Manager Data Architecture, RLPTechnologies
Wayne Taylor, Senior DBA, R.L. Polk

Upload: darrin-deeter-mba

Post on 23-Jan-2017



Program Agenda

• R. L. Polk & Company
• Project Overview
• Project Architecture
• Oracle RAC/Grid Footprint
• Technical Achievements
• Lessons Learned
• Future Plans

R. L. Polk & Company: Global Operations

• R.L. Polk & Company is a global company based in Southfield, MI with 1,300 employees and offices in 16 countries.
• Polk is the gold standard for automotive intelligence
  • What's selling, who's buying, and how to reach them
• RLPTechnologies is a wholly owned subsidiary of R.L. Polk & Co.
  • RLPTechnologies develops and sells software solutions to revolutionize the way data is collected, standardized, enhanced and compiled for use in analytical and operational applications.

Polk's Data Assets

• Polk manages a complex set of online vehicle data to support the automotive industry
• ~2.6 Billion Transactions
  • Titles
  • Registrations
  • Sales
  • Lease / Lien
• ~500 Million Unique Vehicles
  • Passenger
  • Commercial
• ~250 Million Unique Households
  • Personal
  • Firms
  • Financial Institutions
  • Vehicle Manufacturers


Project Overview

Darrin Deeter

Project Overview

• Re-engineer Polk's Data Collection, Standardization, Enhancement, Storage and Data Assembly functions
• Migrate from a mainframe environment to a commodity grid-based infrastructure
• Support 1 billion transactions per year
• 5 TB warehouse
  • Real-time updates
  • Concurrent updates and extracts

Business Drivers

• 50 Percent More Efficient
  • Lower Total Cost of Ownership
  • Eliminate the Mainframe
• 50 Percent Faster
  • Improve data processing, timeliness and availability
• 100 Percent Quality
  • Protect Polk's rich heritage as the industry standard, and provide improvements in identifying problems earlier in the process

The scale of the solution introduced a unique set of challenges

Design Principles

• Apply the concepts of lean manufacturing to the discipline of data processing
  • Continuous Material Flow and Just-In-Time Delivery
  • Quality Measurement
  • Standard Processes
  • Eliminate Waste
• Architecture principles must provide a flexible and agile environment
  • Scalable platform
  • Provide high availability
  • Support high transaction volume


Leaving the Mainframe: What choice do you have?

• Large SMP server
• Blade Servers
• Specialized warehouse appliance
• RAC cluster of SMP servers
• 10g RAC on Commodity

[Diagram: solution architecture. Recoverable labels: SSOT (Single Source of Truth); DAL (Data Access Layer); Business Rules; Data Sources; Source Profiles; File Profiles; Receive Data; Validate File; Archive File; Gatekeeper; Capture; Standardize; Enhance; Assemble; Assembled Data; Assembly Profile; Publish; Extract & Format; Data Management; Manage Mappings; Source Mapping; Manage Reference Data; Enhancement Services; Name; Decode; Address; Geo; Dealer; Other; Operations Management; Exception Handling; Task Management; Metrics; Reports; Job Consoles; Data Standards; Data Dictionary; Quality Standards; Data Usage; Data Model.]

Business Benefits Achieved

• Provided world-class data collection & compilation services
  • Productivity improvements of up to 70%
  • Hardware costs lowered by >50%
  • Lowered IT Total Cost of Ownership by >30%
• Enabled a Single Source of the Truth
  • Ensure industry-leading data quality
• Shifted to a grid-based computing model to empower capacity on demand
  • Flexible and open environment providing future computing power, at significantly reduced costs
  • Real-time data availability


Project Architecture

Suresh Yarlagadda

Oracle RAC Physical Architecture

[Diagram: four Dell 6850 servers (Node 1 to Node 4) with a Virtual IP, joined by an Interconnect and attached through a SAN Fabric to an EMC DMX 3000.]

Oracle RAC Logical Architecture

[Diagram: four cluster nodes sharing ASM storage on an EMC DMX 3000, drawn as a grid of 140GB disks.]

• Node 1: Instance ODSP1, ASM +ASM1, Services NAS and DAL, Groups DG_ARCH_01 and DG_DATA_01
• Node 2: Instance ODSP2, ASM +ASM2, Services NAS and DAL, Groups DG_ARCH_01 and DG_DATA_01
• Node 3: Instance ODSP3, ASM +ASM3, Services ASMBL and DAL, Groups DG_ARCH_01 and DG_DATA_01
• Node 4: Instance ODSP4, ASM +ASM4, Services ASMBL and DAL, Groups DG_ARCH_01 and DG_DATA_01
• Physical storage for ASM volumes is shared across all cluster nodes
  • DG_DATA_01: composed of 38 x 148GB LUNs
  • DG_ARCH_01: composed of 38 x 148GB LUNs
• Service legend: DAL (DAL Service), NAS (Name and Address Service), ASMBL (Assembly Service)

Oracle RAC Logical Architecture - with failed node

[Diagram: the same four-node layout, instances (ODSP1 to ODSP4), services and ASM disk groups as the Oracle RAC Logical Architecture slide, shown with one node failed. The extracted labels do not identify the failed node; the only added label is a second "Service : NAS" box.]
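The extra NAS box on the failed-node slide is consistent with a RAC service restarting on a surviving instance. The sketch below models that idea only. The preferred placement is taken from the logical architecture slide; the `AVAILABLE` fallback lists and the function name `after_failure` are assumptions made for illustration, since the slides do not state how the services were actually configured.

```python
# Preferred instances per service, as listed on the logical architecture slide.
PREFERRED = {
    "NAS": ["ODSP1", "ODSP2"],
    "ASMBL": ["ODSP3", "ODSP4"],
    "DAL": ["ODSP1", "ODSP2", "ODSP3", "ODSP4"],
}

# Hypothetical fallback ("available") instances; not stated in the slides.
AVAILABLE = {
    "NAS": ["ODSP3", "ODSP4"],
    "ASMBL": ["ODSP1", "ODSP2"],
}


def after_failure(preferred, available, failed):
    """Return the instances each service runs on once `failed` instances are gone.

    Each lost preferred instance is replaced by the next healthy fallback
    instance, if one is configured for that service.
    """
    result = {}
    for service, instances in preferred.items():
        running = [i for i in instances if i not in failed]
        lost = len(instances) - len(running)
        spares = [
            i for i in available.get(service, [])
            if i not in failed and i not in running
        ]
        running.extend(spares[:lost])
        result[service] = running
    return result


if __name__ == "__main__":
    for service, instances in after_failure(PREFERRED, AVAILABLE, {"ODSP1"}).items():
        print(service, instances)
```

Under these assumed fallbacks, losing ODSP1 leaves DAL on the three surviving instances and starts NAS on one additional instance, while ASMBL is unaffected.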

Warehouse Design Overview

Core Warehouse: supports OLTP style updates and batch style extracts

• Fact tables are Range Partitioned
• Dimension Tables are Hash Partitioned
• Compression is used on older data
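To illustrate the difference between the two partitioning schemes named above, here is a minimal sketch of how a row could be routed to a partition. It is not Oracle's implementation: the partition names, the yearly boundaries, the partition count and the use of CRC32 as the hash are all assumptions chosen for the example.

```python
from bisect import bisect_right
from datetime import date
import zlib

# Hypothetical exclusive upper bounds for yearly range partitions.
RANGE_BOUNDS = [date(2004, 1, 1), date(2005, 1, 1), date(2006, 1, 1)]
# One name per bound, plus a catch-all partition for later dates.
RANGE_NAMES = ["p2003", "p2004", "p2005", "p_max"]

HASH_PARTITIONS = 8  # assumed partition count


def range_partition(txn_date: date) -> str:
    """Return the first partition whose upper bound is greater than the date."""
    return RANGE_NAMES[bisect_right(RANGE_BOUNDS, txn_date)]


def hash_partition(key: str) -> int:
    """Map a key to a partition number; CRC32 stands in for the database hash."""
    return zlib.crc32(key.encode("utf-8")) % HASH_PARTITIONS


if __name__ == "__main__":
    print(range_partition(date(2004, 6, 15)))
    print(hash_partition("example-dimension-key"))
```

Range routing keeps rows for the same period together, which is what makes it practical to treat older partitions differently (for example, compressing them), while hash routing spreads keys evenly with no regard to their order.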

Oracle RAC/Grid Footprint

• Oracle 10g Release 2 (10.2.0.2 patch)
• Oracle Real Application Clusters (RAC)
• Oracle Clusterware
• Oracle Automatic Storage Management (ASM)
• Oracle Enterprise Manager - Grid Control (10.2.0.2)
• Dell 6850: 4-node cluster
  • 4 Dual Core Xeon CPU 3.0GHz
  • 12GB RAM
  • Intel Gb Ethernet Copper (dual port bonded)
  • Emulex LP10000 2Gb Fibre Channel HBA
• Red Hat Advanced Server 3U7
• EMC Storage (DMX 3000)
  • Database allocated 38x140GB LUNs (5.2TB)
  • Flashback area for archiving and mirrors 2x140GB LUNs (256GB)
  • CRS/OCR Multiplexed /5 1.5GB Disk
• Cisco 9509 Fiber Switch
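The database allocation quoted on this slide (38 x 140GB LUNs, stated as 5.2TB) can be checked with a few lines of arithmetic. The sketch assumes binary units (1 TB = 1024 GB), which is the convention under which the slide's rounded figure matches; the function name is illustrative.

```python
LUN_COUNT = 38      # from the slide
LUN_SIZE_GB = 140   # from the slide


def raw_capacity_tb(lun_count: int, lun_size_gb: int) -> float:
    """Raw capacity in TB, assuming 1 TB = 1024 GB."""
    return lun_count * lun_size_gb / 1024


total_gb = LUN_COUNT * LUN_SIZE_GB
total_tb = raw_capacity_tb(LUN_COUNT, LUN_SIZE_GB)

if __name__ == "__main__":
    print(f"{total_gb} GB raw, about {total_tb:.1f} TB")
```

This is raw LUN capacity only; usable space depends on ASM redundancy settings, which the slides do not specify.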

Technical Achievements

• Successful implementation of cutting-edge Oracle 10gR2 RAC
• Entire environment composed of commodity Intel-based architecture
• Commodity Intel-based architecture allows more frequent upgrades as newer hardware becomes available
• Upgraded all servers to dual-core servers with little to no downtime
• Oracle RAC cluster can be easily expanded
• All monitoring done with Oracle Enterprise Manager Grid Control
• ASM significantly simplifies storage management
• Migrated 3TB of data from legacy data warehouse
• XML documents processed at a rate of 400,000 per hour
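For a sense of scale, the 400,000 documents per hour figure can be compared with the project goal of 1 billion transactions per year. The comparison below assumes, purely for illustration, that one XML document corresponds to one transaction and that load is spread evenly over a 365-day year; neither assumption comes from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

XML_DOCS_PER_HOUR = 400_000          # from the Technical Achievements slide
TARGET_TXNS_PER_YEAR = 1_000_000_000  # from the Project Overview slide


def per_second(per_hour: float) -> float:
    """Convert an hourly rate to a per-second rate."""
    return per_hour / 3600


def required_per_hour(per_year: float) -> float:
    """Average hourly rate needed to reach a yearly volume."""
    return per_year / HOURS_PER_YEAR


# Ratio of the measured rate to the average rate the yearly target implies.
headroom = XML_DOCS_PER_HOUR / required_per_hour(TARGET_TXNS_PER_YEAR)

if __name__ == "__main__":
    print(f"{per_second(XML_DOCS_PER_HOUR):.1f} documents per second")
    print(f"{required_per_hour(TARGET_TXNS_PER_YEAR):.0f} transactions per hour needed on average")
    print(f"headroom factor: {headroom:.1f}")
```

Real workloads are bursty, so the headroom factor is only a rough indicator of how the quoted rate relates to the yearly target.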


Lessons Learned

Wayne Taylor

Lessons Learned - Configuration

• Verify that the entire architecture is Oracle certified
• Install latest patches
• Baseline performance of cluster architecture
• Oracle RAC is sensitive to excessive latency
• Temporary Tablespace
• Define Service Names with Oracle Database Configuration Assistant (DBCA)
• Red Hat 3 kernel hyper-threading issues

Lessons Learned - Management

• RMAN issues backing up bigfile tablespaces
• Use separate ASM disk groups for DATA and flash recovery
• Manage Cluster Ready Services logs/traces
• Verify instance listener cross-registration
• Problems shrinking temporary and UNDO tablespaces online

Lessons Learned - Performance

• Partitioning Issues
• Tablespace Segment Sizing
• ASSM vs Manual
• Concurrent DML
• Parallel Queries
• Work Area Size Policy
• Referential Integrity / Constraints
• VLM on Linux

Lessons Learned - What went well

• ASM
  • Performance is very close to raw devices
  • Online rebalancing works (with 10.2.0.2 patch set)
  • Online addition / removal of disks
• Oracle Enterprise Manager
  • Diagnostic Reports
    • Automatic Database Diagnostic Monitor (ADDM)
    • Active Session History (ASH)
    • Automatic Workload Repository (AWR)
  • ASM management now included
  • Useful performance metrics for I/O and interconnect
• Adding nodes to RAC cluster
  • Server and client load balancing
  • Saw no performance overhead adding two cluster nodes
  • Additional nodes gave near-linear improvement in scalability
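The online addition of disks and rebalancing mentioned under ASM can be pictured with a toy model: when an empty disk joins a disk group, data is redistributed so every disk holds roughly the same amount. The sketch below is only that toy model, assuming equal-sized disks and counting data in abstract "extents"; it does not reproduce ASM's actual algorithm, and the disk names and counts are made up for the example.

```python
from typing import Dict, List


def rebalance(extents_per_disk: Dict[str, int], new_disks: List[str]) -> Dict[str, int]:
    """Return an even extent distribution after adding empty, equal-sized disks."""
    counts = dict(extents_per_disk)
    for disk in new_disks:
        counts[disk] = 0
    base, extra = divmod(sum(counts.values()), len(counts))
    # Any remainder is handed out one extent at a time, in name order.
    return {
        name: base + (1 if index < extra else 0)
        for index, name in enumerate(sorted(counts))
    }


def extents_moved(before: Dict[str, int], after: Dict[str, int]) -> int:
    """Count extents that land on a disk that previously held fewer of them."""
    return sum(max(0, after[disk] - before.get(disk, 0)) for disk in after)


if __name__ == "__main__":
    before = {"disk1": 300, "disk2": 300, "disk3": 300}
    after = rebalance(before, ["disk4"])
    print(after, extents_moved(before, after))
```

In this model, adding a fourth disk to three full ones moves only a quarter of the data, which gives some intuition for why rebalancing can run online alongside normal work.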


Future Plans

• Red Hat Advanced Server 4 / 5
• Intel / AMD 64bit / Blade Servers
• Expanding Oracle RAC grid and incorporating additional Data Warehouses
• Consolidating clusters into a single GRID
• Evaluation of commodity storage arrays
• Adding additional fiber HBAs
• Fully utilize Oracle Enterprise Manager for provisioning and patching
• Expanding use of Oracle Service Names

Q & A: Questions and Answers

• Douglas Miller, Director Database Development, R.L. Polk & Company: [email protected]
• Suresh Yarlagadda, Lead DBA, R.L. Polk: [email protected]
• Darrin Deeter, Manager Data Architecture, RLPTechnologies: [email protected]
• Wayne Taylor, Senior DBA, R.L. Polk: [email protected]