102006-rlp open world 2006
TRANSCRIPT
R.L. Polk – Moving from the Mainframe to 10g RAC
Douglas Miller – Director Database Development, R.L. Polk & Company
Suresh Yarlagadda – Lead DBA, R.L. Polk
Darrin Deeter – Manager Data Architecture, RLPTechnologies
Wayne Taylor – Senior DBA, R.L. Polk
Program Agenda
• R. L. Polk & Company
• Project Overview
• Project Architecture
• Oracle RAC/Grid Footprint
• Technical Achievements
• Lessons Learned
• Future Plans
R. L. Polk & Company
Global Operations
• R.L. Polk & Company is a global company based in Southfield, MI with 1,300 employees and offices in 16 countries.
• Polk is the gold standard for automotive intelligence
  • What’s selling, Who’s Buying, and How to reach them
• RLPTechnologies is a wholly owned subsidiary of R.L. Polk & Co.
• RLPTechnologies develops and sells software solutions to revolutionize the way data is collected, standardized, enhanced and compiled for use in analytical and operational applications.
Polk’s Data Assets
• Polk manages a complex set of online vehicle data to support the automotive industry
• ~2.6 Billion Transactions
  • Titles
  • Registrations
  • Sales
  • Lease / Lien
• ~500 Million Unique Vehicles
  • Passenger
  • Commercial
• ~250 Million Unique Households
  • Personal
  • Firms
  • Financial Institutions
  • Vehicle Manufacturers
Project Overview
• Re-engineer Polk’s Data Collection, Standardization, Enhancement, Storage and Data Assembly Functions
• Migrate from a Mainframe environment to a commodity GRID based infrastructure
• Support 1 Billion transactions per year
• 5 TB warehouse
• Real-time updates
• Concurrent updates and extracts
Business Drivers
• 50 Percent More Efficient
  • Lower Total Cost of Ownership
  • Eliminate the Mainframe
• 50 Percent Faster
  • Improve data processing, timeliness, and availability
• 100 Percent Quality
  • Protect Polk’s rich heritage as the industry standard, and provide improvements in identifying problems earlier in the process
The scale of the solution introduced a unique set of challenges
Design Principles
• Apply the concepts of lean manufacturing to the discipline of data processing
  • Continuous Material Flow, and Just-In-Time Delivery
  • Quality Measurement
  • Standard Processes
  • Eliminate Waste
• Architecture principles must provide a flexible and agile environment
  • Scalable platform
  • Provide High-availability
  • Support high-transaction volume
Leaving the Mainframe
What choice do you have?
• Large SMP server
• Blade Servers
• Specialized warehouse appliance
• RAC cluster of SMP servers
• 10g RAC on Commodity
Project Architecture
[Diagram labels]
• SSOT: Single Source of Truth
• DAL: Data Access Layer
• Data Sources: Source Profiles, File Profiles
• Processing steps: Receive Data, Capture, Validate File, Archive File, Gatekeeper, Standardize, Enhance, Assemble, Publish, Extract & Format
• Data Management: Manage Mappings, Source Mapping, Manage Reference Data, Business Rules
• Enhancement Services: Name, Decode, Address, Geo, Dealer, Other
• Assembly: Assembly Profile, Assembled Data
• Operations Management: Exception Handling, Task Management, Metrics, Reports, Job Consoles
• Data Standards: Data Dictionary, Quality Standards, Data Usage, Data Model
Business Benefits Achieved
• Provided world-class data collection & compilation services
  • Productivity improvements of up to 70%
  • Hardware costs lowered by >50%
  • Lowered IT Total Cost of Ownership by >30%
• Enabled a Single Source of the Truth
  • Ensure industry-leading data quality
• Shifted to a grid based computing model to empower capacity on demand
  • Flexible and open environment providing future computing power, at significantly reduced costs
  • Real Time Data Availability
Oracle RAC Physical Architecture
[Diagram labels]
• Virtual IP
• Dell 6850: Node 1, Node 2, Node 3, Node 4
• Interconnect
• SAN Fabric
• EMC DMX 3000
Oracle RAC Logical Architecture
[Diagram labels]
• Node 1: Instance ODSP1; ASM: +ASM1; Services: NAS, DAL; Groups: DG_ARCH_01, DG_DATA_01
• Node 2: Instance ODSP2; ASM: +ASM2; Services: NAS, DAL; Groups: DG_ARCH_01, DG_DATA_01
• Node 3: Instance ODSP3; ASM: +ASM3; Services: ASMBL, DAL; Groups: DG_ARCH_01, DG_DATA_01
• Node 4: Instance ODSP4; ASM: +ASM4; Services: ASMBL, DAL; Groups: DG_ARCH_01, DG_DATA_01
• EMC DMX 3000: Physical Storage for ASM Volumes shared across all cluster nodes (disks drawn as 140GB units)
  • DG_DATA_01: Composed of 38 x 148GB LUNs
  • DG_ARCH_01: Composed of 38 x 148GB LUNs
• Legend: DAL Service (DAL), Name and Address Service (NAS), Assembly Service (ASMBL)
Oracle RAC Logical Architecture - with failed node
[Diagram labels]
• Node 1: Instance ODSP1; ASM: +ASM1; Services: NAS, DAL; Groups: DG_ARCH_01, DG_DATA_01
• Node 2: Instance ODSP2; ASM: +ASM2; Services: NAS, DAL; Groups: DG_ARCH_01, DG_DATA_01
• Node 3: Instance ODSP3; ASM: +ASM3; Services: ASMBL, DAL; Groups: DG_ARCH_01, DG_DATA_01
• Node 4: Instance ODSP4; ASM: +ASM4; Services: ASMBL, DAL; Groups: DG_ARCH_01, DG_DATA_01
• EMC DMX 3000: Physical Storage for ASM Volumes shared across all cluster nodes (disks drawn as 140GB units)
  • DG_DATA_01: Composed of 38 x 148GB LUNs
  • DG_ARCH_01: Composed of 38 x 148GB LUNs
• Legend: DAL Service (DAL), Name and Address Service (NAS), Assembly Service (ASMBL)
• Additional service label shown in this diagram: Service : NAS
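The failed-node slide suggests that a service moves to a surviving instance when its node goes down. The following is a minimal Python sketch of that idea, not Oracle's actual placement logic. The service-to-node layout is taken from the slide. The function name `fail_node` and the relocation policy (move the service to the lowest-numbered surviving node that is not already running it) are assumptions made for illustration. In a real RAC cluster, relocation depends on how each service's preferred and available instances were configured.

```python
from typing import Dict, Set

# Service layout as listed on the slide: service name -> nodes running it.
LAYOUT: Dict[str, Set[int]] = {
    "NAS": {1, 2},
    "ASMBL": {3, 4},
    "DAL": {1, 2, 3, 4},
}


def fail_node(layout: Dict[str, Set[int]], failed: int) -> Dict[str, Set[int]]:
    """Return a new layout after node `failed` goes down.

    Hypothetical policy: a service that lost a copy is restarted on the
    lowest-numbered surviving node that does not already run it, if any.
    """
    all_nodes = set().union(*layout.values())
    survivors = sorted(all_nodes - {failed})
    result: Dict[str, Set[int]] = {}
    for service, nodes in layout.items():
        remaining = set(nodes) - {failed}
        if failed in nodes:
            spare = [n for n in survivors if n not in remaining]
            if spare:
                remaining.add(spare[0])
        result[service] = remaining
    return result


if __name__ == "__main__":
    for service, nodes in fail_node(LAYOUT, failed=1).items():
        print(service, sorted(nodes))
```

Under this assumed policy, losing Node 1 leaves NAS on Node 2 plus one relocated copy, while DAL simply runs on the three surviving nodes because it was already everywhere.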
Warehouse Design Overview
Core Warehouse
Supports OLTP style updates and Batch style extracts
• Fact tables are Range Partitioned
• Dimension Tables are Hash Partitioned
• Compression is used on older data
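As a rough illustration of the two partitioning schemes named above, the sketch below routes a row to a partition by range (as for fact tables) and by hash (as for dimension tables). The slide does not state the partition keys, so everything specific here is an assumption: the date key, the yearly boundaries, the partition names, the eight hash partitions, and the CRC32 hash. Oracle uses its own internal hash function, so this only shows the principle.

```python
import bisect
import zlib
from datetime import date

# Hypothetical exclusive upper bounds, in the spirit of VALUES LESS THAN.
RANGE_UPPER_BOUNDS = [date(2005, 1, 1), date(2006, 1, 1), date(2007, 1, 1)]
# One name per bound, plus a catch-all partition for later dates.
RANGE_PARTITIONS = ["p_2004", "p_2005", "p_2006", "p_max"]


def range_partition(event_date: date) -> str:
    """Pick the first partition whose upper bound is greater than the date."""
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, event_date)
    return RANGE_PARTITIONS[index]


def hash_partition(key: str, partitions: int = 8) -> int:
    """Spread keys across a fixed number of partitions with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % partitions


if __name__ == "__main__":
    print(range_partition(date(2005, 7, 15)))
    print(hash_partition("HOUSEHOLD-000123"))
```

Range partitioning keeps rows of the same period together, which fits the slide's point about compressing older data. Hash partitioning spreads rows evenly when there is no natural ordering to exploit.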
Oracle RAC/Grid Footprint
• Oracle 10g Release 2 (10.2.0.2 patch)
• Oracle Real Application Clusters (RAC)
• Oracle Clusterware
• Oracle Automatic Storage Management (ASM)
• Oracle Enterprise Manager - Grid Control (10.2.0.2)
• Dell 6850 – 4 Node cluster
  • 4 Dual Core Xeon CPU 3.0GHz
  • 12GB RAM
  • Intel Gb Ethernet Copper (dual port bonded)
  • Emulex LP10000 2Gb Fiber HBA
• Red Hat Advanced Server 3U7
• EMC Storage (DMX 3000)
  • Database Allocated 38x140GB LUNs (5.2TB)
  • Flashback area for Archiving and mirrors 2x140GB LUNs (256GB)
  • CRS/OCR Multiplexed /5 1.5GB Disk
• Cisco 9509 Fiber Switch
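The storage figures above can be checked with simple arithmetic. This short Python sketch assumes 1 TB = 1024 GB, which is the convention that reproduces the slide's 5.2TB. The helper name `lun_capacity_gb` is made up for illustration. The raw product for the flashback area (2 x 140GB) is 280GB, whereas the slide quotes 256GB. The transcript does not explain the difference.

```python
def lun_capacity_gb(count: int, size_gb: int) -> int:
    """Raw capacity of `count` LUNs of `size_gb` each, in GB."""
    return count * size_gb


# Database allocation from the slide: 38 LUNs of 140GB each.
data_gb = lun_capacity_gb(38, 140)
data_tb = data_gb / 1024  # assumes binary terabytes

# Flashback area from the slide: 2 LUNs of 140GB each.
flash_gb = lun_capacity_gb(2, 140)

if __name__ == "__main__":
    print(f"data: {data_gb} GB = {data_tb:.1f} TB")
    print(f"flashback: {flash_gb} GB raw")
```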
Technical Achievements
• Successful implementation of cutting edge Oracle 10gR2 RAC
• Entire environment composed of commodity Intel based architecture
• Commodity Intel based architecture allows more frequent upgrades as newer hardware becomes available
• Upgraded all servers to dual-core servers with little to no downtime
• Oracle RAC Cluster can be easily expanded
• All monitoring using Oracle Enterprise Manager Grid Control
• ASM significantly simplifies storage management
• Migrated 3TB of data from legacy data warehouse
• XML Documents processed at a rate of 400,000 per hour.
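To put the quoted XML rate next to the project target of 1 Billion transactions per year, the sketch below converts both to common units. It treats one XML document as one transaction and assumes a 365-day year with even load around the clock. Neither assumption is stated in the slides, so the comparison is only indicative.

```python
XML_DOCS_PER_HOUR = 400_000            # from the Technical Achievements slide
TARGET_TRANSACTIONS_PER_YEAR = 10**9   # from the Project Overview slide
HOURS_PER_YEAR = 365 * 24              # assumes a non-leap year

# Measured rate expressed per second.
xml_docs_per_second = XML_DOCS_PER_HOUR / 3600

# Average hourly rate needed to reach the yearly target with even load.
target_per_hour = TARGET_TRANSACTIONS_PER_YEAR / HOURS_PER_YEAR

if __name__ == "__main__":
    print(f"measured: {xml_docs_per_second:.1f} documents per second")
    print(f"target average: {target_per_hour:,.0f} transactions per hour")
```

Under these assumptions the measured rate is roughly three and a half times the average rate the yearly target requires, which leaves room for uneven, batch-style arrivals.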
Lessons Learned - Configuration
• Verify that entire architecture is Oracle certified
• Install latest patches
• Baseline performance of Cluster architecture
• Oracle RAC sensitive to excessive latency
• Temporary Tablespace
• Define Service Names with Oracle Database Configuration Assistant
• Red Hat 3 Kernel Hyper-threading issues
Lessons Learned - Management
• RMAN issues backing up big-file tablespaces
• Use separate ASM disk groups for DATA and flash recovery
• Manage Cluster Ready Services log/traces
• Verify Instance Listener Cross Registration
• Problems shrinking temporary and UNDO tablespaces online
Lessons Learned - Performance
• Partitioning Issues
• Tablespace Segment Sizing
• ASSM vs Manual
• Concurrent DML
• Parallel Queries
• Work Area Size Policy
• Referential Integrity / Constraints
• VLM on Linux
Lessons Learned – What went well
• ASM
  • Performance is very close to RAW devices
  • Online rebalancing works (with 10.2.0.2 patch-set)
  • Online addition / removal of disks
• Oracle Enterprise Manager
  • Diagnostic Reports
    • Automatic Database Diagnostic Monitor
    • Active Session History
    • Automatic Workload Repository
  • ASM Management now included
  • Useful performance metrics for I/O and Interconnect
• Adding Nodes to RAC cluster
  • Server and Client load balancing
  • Saw no performance overhead adding two cluster nodes
  • Additional nodes gave near linear improvement in scalability
Future Plans
• Red Hat Advanced Server 4 / 5
• Intel / AMD 64bit / Blade Servers
• Expanding Oracle RAC grid and incorporating additional Data Warehouses
• Consolidating Clusters into single GRID
• Evaluation of commodity storage arrays
• Adding additional fiber HBAs
• Fully utilize Oracle Enterprise Manager for provisioning and patching
• Expanding use of Oracle Service Names
• Douglas Miller – Director Database Development, R.L. Polk & Company, [email protected]
• Suresh Yarlagadda – Lead DBA, R.L. Polk, [email protected]
• Darrin Deeter – Manager Data Architecture, RLPTechnologies, [email protected]
• Wayne Taylor – Senior DBA, R.L. Polk, [email protected]