presentation - oracle exadata v2 - technical deep dive

Upload: kinankazuki104

Post on 20-Oct-2015

DESCRIPTION

Presentation - Oracle Exadata V2 - Technical Deep Dive

TRANSCRIPT

  • Next Generation Oracle Exadata Storage Server and Oracle Database Machine - V2

    Copyright 2009, Oracle Corporation and/or its affiliates

  • Exadata Success

    Exadata is succeeding in all geographies and industries against every competitor

  • The Products

    Exadata Storage Server & Database Machine

    Exadata Storage Server

    Storage product optimized for Oracle Database

    Extreme I/O and SQL processing performance

    Combination of hardware and software

    Sun Oracle Database Machine

    Pre-Configured High Performance

    Balanced performance configuration

    Takes the guesswork out of building an Oracle deployment

    Figure labels: Oracle Database 11.2, Exadata Storage Server Software

  • Why Exadata

    Extreme Performance:

    Up to 10X-100X for data warehouses

    Up to 20X for OLTP applications

    Linear Scalability:

    Performance scales linearly with increase in data volumes.

    Enterprise Ready:

    Complete system ready to deploy on day one

    Single point of support from Oracle for hardware and software

    Standard:

    Works transparently with existing applications.

    Manage your databases and applications the same way you do today.

  • Exadata Smart Storage

    Breaks Data Bandwidth and Random I/O Bottleneck

    Oracle addresses the data bandwidth bottleneck in 3 ways

    Massively parallel storage grid of high-performance Exadata storage servers (cells)

    Data bandwidth scales with data volume

    Data-intensive processing runs in Exadata storage

    Smart Scans: queries run in storage as data streams from disk, offloading database server CPUs

    Storage Indexes: eliminate unnecessary I/Os to disk

    Exadata Hybrid Columnar Compression reduces data volume up to 10x

    10x lower cost, 10x higher performance

    Oracle solves random I/O bottlenecks using Exadata Smart Flash Cache

    Smart caching increases random I/Os by a factor of 20X

    Sub-millisecond response times

    Figure label: Exadata Storage Cells

  • Exadata V2 Hardware

  • Sun Oracle Database Machine

    Grid is the architecture of the future

    Highest performance, lowest cost, redundant, incrementally scalable

    Sun Oracle Database Machine delivers the first and only complete grid architecture for all data management needs

    Exadata Storage Server Grid

    14 high-performance, low-cost storage servers

    100 TB raw SAS disk storage or 336 TB raw SATA disk storage

    5 TB+ flash storage!

    RAC Database Server Grid

    8 high-performance, low-cost compute servers

    2 Intel quad-core Xeons each

    InfiniBand Network

    40 Gb/sec fault-tolerant unified server and storage network

    3 36-port QDR Sun Datacenter InfiniBand Switches

  • Sun Exadata Storage Server Hardware

    2 Quad-Core Intel Xeon E5540 Processors

    24 GB DRAM (6 x 4GB)

    12 x 3.5-inch Disk Drives (600GB SAS or 2TB SATA)

    Disk Controller HBA with 512M battery backed cache

    4 x 96GB Sun Flash PCIe Cards

    InfiniBand QDR (40Gb/s) dual port card

    ILOM

    Dual-redundant, hot-swappable power supplies

    Installed Software:

    Oracle Exadata Storage Server Software

    Oracle Enterprise Linux

    Drivers

  • X4170 Database Server Hardware

    2 Quad-Core Intel Xeon E5540 Processors

    72 GB DRAM (18 x 4GB)

    4 x 2.5-inch 146GB Disk Drives

    Disk Controller HBA with 512M battery backed cache

    InfiniBand QDR (40Gb/s) dual port card

    4 x 1GbE Interfaces

    ILOM

    Dual-redundant, hot-swappable power supplies

    Installed Software:

    Oracle Enterprise Linux

    Oracle Database 11.2 Software

    Drivers

  • InfiniBand Network

    Unified InfiniBand Network

    Storage Network

    RAC Interconnect

    External Connectivity (optional)

    High Performance, Low Latency Network

    80 Gb/s bandwidth per link (40 Gb/s each direction)

    SAN-like Efficiency (Zero copy, buffer reservation)

    Simple manageability like IP network

    Protocols

    Zero-copy Zero-loss Datagram Protocol (ZDP RDSv3)

    Internet Protocol over InfiniBand (IPoIB)

    Figure label: Sun Datacenter InfiniBand Switch 36

  • Start Small and Grow

                       Quarter Rack   Half Rack   Full Rack
    Raw Disk (SAS)     21 TB          50 TB       100 TB
    Raw Disk (SATA)    72 TB          168 TB      336 TB
    Raw Flash          1.1 TB         2.6 TB      5.3 TB
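
    The capacity figures follow from the per-cell hardware listed on the storage server slide (12 disks and 4 x 96GB flash cards per cell). A minimal sketch of that arithmetic is below. Only the full-rack count of 14 storage servers is stated in these slides; the counts of 3 and 7 for the quarter and half rack are assumptions, as are all names in the code.

```python
# Per-cell figures from the storage server hardware slide.
DISKS_PER_CELL = 12
SAS_DISK_GB = 600
SATA_DISK_GB = 2000
FLASH_CARDS_PER_CELL = 4
FLASH_CARD_GB = 96

# Storage servers per configuration. Only the full rack (14) is stated in
# the slides; 3 and 7 are assumptions for the quarter and half rack.
CELLS = {"quarter": 3, "half": 7, "full": 14}


def raw_capacity_tb(cells: int) -> dict:
    """Return raw SAS, SATA and flash capacity in TB (1 TB = 1000 GB)."""
    return {
        "sas": cells * DISKS_PER_CELL * SAS_DISK_GB / 1000,
        "sata": cells * DISKS_PER_CELL * SATA_DISK_GB / 1000,
        "flash": cells * FLASH_CARDS_PER_CELL * FLASH_CARD_GB / 1000,
    }


for name, cells in CELLS.items():
    cap = raw_capacity_tb(cells)
    print(f"{name:8s} SAS {cap['sas']:.1f} TB  SATA {cap['sata']:.0f} TB  "
          f"flash {cap['flash']:.2f} TB")
```

    Under these assumptions the computed values (for example 100.8 TB SAS and 5.38 TB flash for a full rack) agree with the slide once the slide's rounding down is taken into account.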

  • Scale to Multiple Racks

    Multi-Petabyte Databases: Scale up to Hundreds of Storage Servers

    Scale up to 8 Database Machine Full Racks by using more InfiniBand cables

    Scale to more Racks by using External InfiniBand Switches

  • Exadata Product Performance

                                        Full Rack    Half Rack    Quarter Rack
    Raw Disk Data Bandwidth (1,3)
      SAS                               21 GB/s      10.5 GB/s    4.5 GB/s
      SATA                              12 GB/s      6 GB/s       2.5 GB/s
    Raw Flash Data Bandwidth (1,3)      50 GB/s      25 GB/s      11 GB/s
    Flash IOPS (2,3)                    1,000,000    500,000      225,000

    1 Bandwidth is peak physical disk scan bandwidth, assuming no compression.

    2 IOPS based on I/O requests of size 8K.

    3 Actual performance will vary by application.

  • Exadata V2 vs. V1 Hardware Comparisons

    Same architecture as previous Database Machine

    Same number and type of Servers, CPUs, Disks

    Latest Technologies

    Faster

    80% Faster CPUs (Xeon 5500 Series)

    100% Faster Networking (40 Gb InfiniBand)

    Better

    33% More SAS Disk Capacity (600 GB SAS Disks)

    100% More SATA Disk Capacity (2 TB SATA Disks)

    50% Faster Disk Throughput (6 Gb SAS Links)

    125% More Memory (72 GB per DB Node)

    200% Faster Memory (DDR3 DRAM)

    100% More Ethernet Connectivity (4 Ethernet links per DB Node)

    New

    Plus Flash Storage!

  • Exadata V2 Flash

  • Sun FlashFire technology

    Based on Open Flash Module

    Minimum size / Maximum performance

    NOT an SSD drive... This is Flash memory.

    Two products announced at OOW

    F5100 flash array: used in TPC-C benchmark announced in Larry's keynote Sunday eve.

    Sun Flash Accelerator F20 PCIe Card

  • NAND Basics

    Single Level Cell (SLC)

    30MB/sec per Die (Write)

    Data Retention: 10 years

    Endurance: 100,000 cycles

    1.8-2.0x price of MLC

    Multi Level Cell (MLC)

    10MB/sec per Die (Write)

    Data Retention: 3-4 years

    Endurance: 1,500 cycles

    2x Capacity of SLC
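
    The endurance figures can be turned into a rough lifetime write volume. The sketch below is a back-of-the-envelope estimate that assumes ideal wear levelling and no write amplification. The 24GB module size is borrowed from the F20 card slide purely for illustration, and the function name is hypothetical.

```python
def lifetime_writes_tb(capacity_gb: float, endurance_cycles: int) -> float:
    """Total data (TB) writable before cells reach their rated endurance.

    Assumes perfect wear levelling and a write amplification of 1, which
    real devices do not achieve, so treat the result as an upper bound.
    """
    return capacity_gb * endurance_cycles / 1000


slc = lifetime_writes_tb(24, 100_000)  # SLC: 100,000 cycles
mlc = lifetime_writes_tb(24, 1_500)    # MLC: 1,500 cycles
print(f"SLC: {slc:.0f} TB, MLC: {mlc:.0f} TB, ratio {slc / mlc:.1f}x")
```

    For equal capacity, the gap in rated cycles translates directly into the gap in total writable data, which is why the worst-case write pattern matters for enterprise use.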

  • True Enterprise Flash

    I/O pattern and wear specifications confusing

    Customer must always design for worst case

    Service level agreements are everything

    Sun/Samsung Any I/O standard

    Designed for worst case I/O pattern

    100% Write Duty

    100% Read Duty

    Any combination including data retention

  • Sun Flash Accelerator F20 PCIe Card

    96GB Storage Capacity

    4 x 24GB Flash modules (DOMs)

    6GB reserved for failures

    x8 PCIe card

    Super Capacitor backup

    Optimized for Database caching

    Fully integrated into the Exadata V2 database machine.

    Measured end-to-end performance

    3.6GB/sec/cell

    75,000 IOPs/cell

  • FlashFire in Exadata V2

    Cell cache on the storage cells.

    Write-through cache, transparently used to accelerate reads

    4 x Cards (384GB/cell) used to create a cache at the cell level.

    Able to pull 3.6GB/sec total bandwidth from 4 Cards on each storage cell.

    50GB/sec total from flash.

    21GB/sec from disk.
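
    The rack-level flash numbers are consistent with the per-cell measurements multiplied by the 14 storage servers of a full rack. A small sketch of that scaling, assuming performance adds up linearly across cells (the constant and function names are illustrative):

```python
CELLS_FULL_RACK = 14          # storage servers in a full rack
FLASH_GBPS_PER_CELL = 3.6     # measured flash bandwidth per cell (GB/s)
FLASH_IOPS_PER_CELL = 75_000  # measured flash IOPS per cell


def rack_totals(cells: int) -> tuple:
    """Scale the per-cell flash figures linearly across a number of cells."""
    return cells * FLASH_GBPS_PER_CELL, cells * FLASH_IOPS_PER_CELL


gbps, iops = rack_totals(CELLS_FULL_RACK)
print(f"{gbps:.1f} GB/s, {iops:,} IOPS")
```

    This gives roughly 50 GB/s and just over 1 million IOPS, in line with the full-rack figures quoted in the slides.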

  • Monitoring Flash Response Time

    Use standard Oracle tools

    AWR

    Enterprise Manager

    End-to-End

    Random Reads

    db_file_sequential_reads become cell single block physical reads

                                                               Avg
                                                              wait   % DB
    Event                                 Waits     Time(s)   (ms)   time  Wait Class
    ------------------------------ ------------ ----------- ------ ------ ----------
    cell single block physical rea    2,029,590       1,223      1   31.7  User I/O
    DB CPU                                              822          21.3
    gc cr grant 2-way                 2,161,346         805      0   20.9  Cluster
    cell multiblock physical read       170,707         719      4   18.7  User I/O
    gc cr multi block request         2,533,564         340      0    8.8  Cluster
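
    The report rounds the average wait to whole milliseconds, which hides how far below 1 ms the single block reads are. A short sketch that recomputes the average from the Waits and Time(s) columns of the excerpt (the helper name is illustrative, and the first event name is spelled out in full):

```python
# (event, waits, total wait time in seconds) copied from the AWR excerpt.
EVENTS = [
    ("cell single block physical read", 2_029_590, 1_223),
    ("gc cr grant 2-way", 2_161_346, 805),
    ("cell multiblock physical read", 170_707, 719),
    ("gc cr multi block request", 2_533_564, 340),
]


def avg_wait_ms(waits: int, time_s: float) -> float:
    """Average wait per event occurrence, in milliseconds."""
    return time_s * 1000 / waits


for name, waits, time_s in EVENTS:
    print(f"{name:34s} {avg_wait_ms(waits, time_s):.2f} ms")
```

    For the single block reads this works out to about 0.6 ms per read, consistent with the sub-millisecond response times claimed for flash.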

  • Exadata Smart Flash Cache

    Performance: 50GB/s and 1 million IOPs

    Use PCIe cards instead of SSDs to avoid slow disk interface

    Exadata storage, InfiniBand and PCIe can drive higher levels of performance

    Traditional Storage Arrays and SANs already have internal bottlenecks which prevent them from exploiting the full spinning disk performance and hence are unable to leverage the higher performance of flash technology

    Capacity

    Linearly scalable, no bottlenecks as you add more storage

    Efficient Compression increases effective performance and capacity by up to 10X

    Smart Caching

    Integrated Database and Exadata Storage Server software ensures only frequently accessed data is cached

    Automatically skips caching of data that will not be frequently accessed, and avoids caching data that will not fit in the cache

    Database awareness enables caching only data likely to be accessed again

    User can fine-tune caching policies online

  • Exadata V2 Software

  • Exadata v1 Software Features

    Exadata Smart Scans

    Exadata cells implement smart scans to greatly reduce the data that needs to be processed by database

    Only return relevant rows and columns to database

    Offload predicate evaluation

    Data reduction is usually very large

    Column and row reduction often decrease data to be returned to the database by 10x

    Join Filtering

    Bloom filters used for join filtering in storage
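
    The ideas on this slide (predicate offload, column projection, Bloom-filter join filtering) can be illustrated with a toy storage-side scan. This is a conceptual sketch only, not Oracle's implementation: the `BloomFilter` class, the `smart_scan` function and the sample rows are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, hashes: int = 3):
        self.size = size_bits
        self.hashes = hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive several bit positions per key from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def smart_scan(rows, predicate, columns, join_key=None, join_filter=None):
    """Filter rows, apply the optional join filter, then project columns."""
    for row in rows:
        if not predicate(row):
            continue
        if join_filter is not None and not join_filter.might_contain(row[join_key]):
            continue
        yield {c: row[c] for c in columns}


# Hypothetical rows held by one storage cell.
calls = [
    {"id": 1, "cust": 10, "minutes": 5, "notes": "n/a"},
    {"id": 2, "cust": 11, "minutes": 42, "notes": "n/a"},
    {"id": 3, "cust": 12, "minutes": 70, "notes": "n/a"},
]

# The database side builds the filter from the join keys it is interested in.
wanted = BloomFilter()
wanted.add(12)

result = list(smart_scan(calls, lambda r: r["minutes"] > 30,
                         ["id", "minutes"], "cust", wanted))
print(result)
```

    Only the columns asked for leave the scan, and rows that fail the predicate never do. Because a Bloom filter can report false positives, the database would still need to perform the real join on whatever rows come back.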

  • Exadata v1 Software Features

    Smart Incremental Backup

    Block Change Tracking maintains the set of changed blocks with a bitmap that has 1 bit per 32K, and does a large (approx. 1MB) I/O if needed.

    When a large I/O for incremental backup is done at Exadata, Exadata filters out most of the data and returns only the data that needs to be part of the incremental backup.

    Change Tracking File Content for 1MB

    0 0 1 0 1 0 1 1 0 0 1 0 1 0 1 1 0 0 1 0 1 0 1 1 0 0 1 0 1 0 0 0 0

    Smart Incremental Backup Request
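
    The filtering step can be sketched as follows: given a 1MB region and its change bitmap, return only the 32K chunks whose bit is set. This is an illustration under stated assumptions, not the actual Exadata logic. The slide shows 33 bitmap values; the sketch uses the first 32, matching 1MB / 32K, and every name in it is hypothetical.

```python
CHUNK = 32 * 1024     # one change-tracking bit covers 32K
REGION = 1024 * 1024  # one large backup I/O covers about 1MB


def changed_chunks(region: bytes, bitmap: list) -> list:
    """Return (offset, data) for each 32K chunk whose change bit is set."""
    out = []
    for i, bit in enumerate(bitmap):
        if bit:
            start = i * CHUNK
            out.append((start, region[start:start + CHUNK]))
    return out


# First 32 values of the bitmap shown on the slide.
bitmap = [int(b) for b in
          "0 0 1 0 1 0 1 1 0 0 1 0 1 0 1 1 0 0 1 0 1 0 1 1 0 0 1 0 1 0 0 0".split()]
region = bytes(REGION)  # placeholder contents for the 1MB region

chunks = changed_chunks(region, bitmap)
sent = sum(len(data) for _, data in chunks)
print(f"{len(chunks)} of {len(bitmap)} chunks returned, "
      f"{sent // 1024}K of {REGION // 1024}K")
```

    With this bitmap, fewer than half of the chunks in the region would need to travel back to the database server.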

  • Exadata v1 Software Features

    Fast File Creation

    Files created by the database are initialized

    Full blocks initialized by database and written to storage

    With Exadata, only metadata is sent by Database to Exadata

    Initialization is done by the Exadata storage server software on the drives

    Tremendous reduction in I/O between database and storage

  • Exadata v2 Software Features

    Hybrid Columnar Compression

    Smart Flash Cache

    Storage Index

    Smart Scan Additions

    Manageability Enhancements

  • Hybrid Columnar Compression

    Useful for data that is bulk loaded and queried

    Tables are organized into Compression Units (CUs)

    CUs are larger than database blocks

    Usually around 32K

    Within Compression Unit, data is organized by column instead of by row

    Column organization brings similar values close together, enhancing compression

    Reduces Table Size 4x to 40x

    10x to 50x Reduction

    Figure label: Compression Unit
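
    Why column organization helps can be seen with a toy example. The sketch below uses simple run-length encoding as a stand-in compressor (it is not the algorithm HCC uses) and compares the same hypothetical rows laid out by row and by column.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(v, len(list(g))) for v, g in groupby(values)]


# Hypothetical rows inside one compression unit: (colour, size).
rows = [("red", "S"), ("red", "S"), ("red", "M"), ("blue", "M")]

# Row layout: values of different columns alternate.
row_major = [v for row in rows for v in row]
# Column layout: all values of one column are stored together.
column_major = [v for col in zip(*rows) for v in col]

print(len(run_length_encode(row_major)), "runs when stored by row")
print(len(run_length_encode(column_major)), "runs when stored by column")
```

    Stored by row, no two neighbouring values are equal, so nothing collapses. Stored by column, repeated values sit next to each other and the number of runs halves.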

  • Hybrid Columnar Compression

    Modes of Compression

    Query Mode: 10x average savings

    Archive Mode: 15x average savings

    Smart Scans on HCC tables in Exadata

    Column Projection

    Decompression

    Row Filtering

  • Smart Flash Cache

    Understands different types of I/Os from database

    Skips caching I/Os to mirror copies

    Skips caching backups

    Skips caching data pump I/O

    Skips caching tablespace formatting

    Resistant to table scans

    Control File Reads and Writes are cached

    File header reads and writes are cached

    Data Blocks and Index blocks are cached
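
    The caching rules listed on this slide amount to an admission policy keyed on the type of I/O. A minimal sketch of such a policy is shown below. The `IOKind` labels and the `should_cache` function are hypothetical, and treating "resistant to table scans" as "scans are cached only for objects marked KEEP" is an interpretation based on the Keep Directive slide.

```python
from enum import Enum, auto


class IOKind(Enum):
    """Hypothetical labels for the I/O types named on the slide."""
    MIRROR_COPY = auto()
    BACKUP = auto()
    DATA_PUMP = auto()
    TABLESPACE_FORMAT = auto()
    TABLE_SCAN = auto()
    CONTROL_FILE = auto()
    FILE_HEADER = auto()
    DATA_BLOCK = auto()
    INDEX_BLOCK = auto()


# I/O types the slide says are never cached.
SKIP = {IOKind.MIRROR_COPY, IOKind.BACKUP, IOKind.DATA_PUMP,
        IOKind.TABLESPACE_FORMAT}


def should_cache(kind: IOKind, keep: bool = False) -> bool:
    """Decide whether an I/O is admitted to the flash cache.

    `keep` models an object marked CELL_FLASH_CACHE KEEP; in this sketch,
    table scans are only cached for such objects.
    """
    if kind in SKIP:
        return False
    if kind is IOKind.TABLE_SCAN:
        return keep
    return True


print(should_cache(IOKind.DATA_BLOCK), should_cache(IOKind.BACKUP))
```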

  • Smart Flash Cache Keep Directive

    DBA can enforce that an object is kept in flash cache

    ALTER TABLE calldetail STORAGE (CELL_FLASH_CACHE KEEP)

    Can be set like other storage clause values

    At table level, partition level, during creation time etc.

    Table scans on objects marked with cell_flash_cache keep run through the flash cache

    Disk bandwidth full rack 21GB/s

    Flash bandwidth full rack 50GB/s

  • Smart Flash Cache Benefits

    Smart Flash Cache w/ HCC compressed table

    Converts 5TB of flash into 50TB of flash cache

    Flash Cache does not use space for redundancy

    Better utilization of premium storage

    Scans through flash cache take advantage of disks too!

    Disks 21GB/s, Flash 50GB/s

    Flash cache > 69GB/s (featured in Exadata demo)

    1 million 8K I/Os per second on a full rack at sub-millisecond latency

  • Storage Index Basic example

    Exadata Storage Indexes maintain summary information about table data in memory

    Store MIN and MAX values of columns

    Typically one index entry for every MB of disk

    Eliminates disk I/Os if MIN and MAX can never match where clause of a query

    Completely automatic and transparent

    Figure: a Table with columns A, B, C, D next to its Index. The first region holds B values 1, 3, 5 (Min B = 1, Max B = 5); the second region holds B values 5, 8, 3 (Min B = 3, Max B = 8).
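
    The pruning logic can be sketched with the values from the figure: two regions of column B with min/max of (1, 5) and (3, 8). The code below is an illustrative model, not the Exadata implementation, and its names are hypothetical.

```python
# Each region stands for roughly 1MB of disk; values are column B.
REGIONS = [[1, 3, 5], [5, 8, 3]]

# The storage index keeps only (min, max) per region, in memory.
INDEX = [(min(r), max(r)) for r in REGIONS]


def regions_to_read(low, high):
    """Return the region numbers that may hold B values in [low, high]."""
    return [i for i, (lo, hi) in enumerate(INDEX) if hi >= low and lo <= high]


def scan(low, high):
    """Read only regions the index cannot rule out, then filter the rows."""
    return [v for i in regions_to_read(low, high)
            for v in REGIONS[i] if low <= v <= high]


print(INDEX)
print(regions_to_read(6, 10), scan(6, 10))
print(regions_to_read(9, 10))
```

    A query for B between 6 and 10 touches only the second region, and a query for B of 9 or more touches no region at all, so no disk I/O would be issued for it.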

  • Storage Index with Partitions Example

    Orders Table

    Order#   Order_Date (Partitioning Column)   Ship_Date   Item
    1        2007                               2007
    2        2008                               2008
    3        2009                               2009

    Queries on Ship_Date do not benefit from Order_Date partitioning

    However, Ship_Date and Order# are highly correlated with Order_Date

    e.g. ship dates are usually near order dates and are never earlier

    Storage index provides partition-pruning-like performance for queries on Ship_Date and Order#

    Takes advantage of ordering created by partitioning or sorted loading

  • Smart Scan Additions

    Features

    Smart Scans on Encrypted Tablespaces

    Smart Scans on Encrypted Columns

    Smart Scans for Data Mining Scoring

    Benefits

    CPU utilization on database node dramatically improves

    Less data shipped to database nodes

  • Manageability Improvements

    OCR and Voting Disk in ASM Diskgroup

    OCR and voting disk can reside on Exadata

    Firmware upgrades of components automatically handled on replacements

  • Resources

    Oracle.com:

    http://www.oracle.com/exadata

    Oracle Exadata Technology Portal on OTN:

    http://www.oracle.com/technology/products/bi/db/exadata
