A Hybrid Row-column OLTP Database Architecture for Operational Reporting

Jan Schaffner, Anja Bog, Jens Krüger, Alexander Zeier

August 24, 2008 | A Hybrid Row-column OLTP Database Architecture

Agenda

■ Operational Reporting

■ Related Work

■ Architecture of Hybrid System

■ Virtual Cube

■ Outlook and Discussion


Operational Reporting

Distinction according to Inmon:

■ Informational Reporting

□ Supports long-term, strategic decisions

□ Summarized data

□ Long-term horizons

Typically done using a data warehouse (DW)

■ Operational Reporting

□ Supports day-to-day decisions

□ Data on a more detailed level

□ Takes up-to-the-minute data into account

Done using a DW or an OLTP system?


Operational Reporting (contd.)

■ Using a DW for Operational Reporting

□ DW must be designed to the same level of granularity as the OLTP systems, which leads to huge data volumes

□ Updates must be replicated into the DW frequently, which leads to endless optimization

■ Using an OLTP Store for Operational Reporting

□ Operational reporting queries are relatively long-running in comparison to pure OLTP workloads

□ Resource contention: locks held by long-running queries block the short-running ones

□ Different data model: not optimized for reporting (e.g. no star schema)


Common Data Warehouse Architecture

■ DW contains ETL processor which

□ ...extracts data from various OLTP sources into a staging area

□ ...applies transformations for cleansing and integration

□ ...stores data in a dimensional layout

■ OLAP engine runs queries against dimensional data store
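
The three ETL steps above can be sketched roughly as follows. This is a minimal illustration, not the architecture of any particular DW product: the source rows, the function names (`extract`, `transform`, `load`) and the tiny one-dimension layout are all assumptions made up for the example.

```python
# Hypothetical OLTP source rows; names and values are illustrative only.
SOURCE_A = [{"customer": " alice ", "amount": "10.50", "day": "2008-08-01"}]
SOURCE_B = [{"customer": "Bob", "amount": "4.00", "day": "2008-08-01"},
            {"customer": "ALICE", "amount": "2.50", "day": "2008-08-02"}]


def extract(*sources):
    """Copy rows from every source into one staging area (a list)."""
    staging = []
    for source in sources:
        staging.extend(dict(row) for row in source)
    return staging


def transform(staging):
    """Cleanse and integrate: normalize names, convert amounts to numbers."""
    return [{"customer": r["customer"].strip().lower(),
             "amount": float(r["amount"]),
             "day": r["day"]} for r in staging]


def load(rows):
    """Store rows in a dimensional layout: one dimension table plus facts."""
    customer_dim = {}   # customer name -> surrogate key
    facts = []          # (customer_key, day, amount)
    for r in rows:
        key = customer_dim.setdefault(r["customer"], len(customer_dim) + 1)
        facts.append((key, r["day"], r["amount"]))
    return customer_dim, facts


customer_dim, facts = load(transform(extract(SOURCE_A, SOURCE_B)))
```

An OLAP engine would then aggregate over `facts` and join to `customer_dim` instead of touching the OLTP sources.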


“Real-Time” DW Architectures

■ Microbatch

□ Configure ETL process to run in very short intervals

□ Up-to-date data but very resource intensive

■ Push Architectures

□ Handling of deltas on a business or database transaction level

□ Up-to-date data but still resource intensive

■ Operational Data Store (ODS)

□ Store copy of the OLTP data using an integrated schema

□ High data granularity but no up-to-date data


“Real-Time” DW Architectures (contd.)

■ ELT

□ Data is extracted from the OLTP sources and loaded into the ODS

□ Transformations are done in the warehouse at query-runtime

□ High granularity (transactional data) but no up-to-date data

■ Virtual ODS

□ Virtual in the sense that queries are redirected against OLTP system

□ High granularity (transactional data) and up-to-date data

□ Performs ETL on-the-fly

□ Affects performance of OLTP system


Column-Stores: New “Trend” for OLAP

□ Column-store databases:

◊ Vertical fragmentation

◊ Fast aggregations (sum, min, max, avg, …), which gives more flexibility for ad-hoc reporting

◊ Each column can be compressed individually

□ Both disk-based …

◊ Vertica

◊ Greenplum

□ … and in-memory:

◊ SAP BIA

◊ MonetDB

◊ Exasol

Figure: row-oriented vs. column-oriented storage of the same table

■ Row-oriented: one table with columns (sID, c1, c2, c3) and rows (1, v11, v12, v13), (2, v21, v22, v23), (3, v31, v32, v33)

■ Column-oriented: three tables (sID, c1), (sID, c2), (sID, c3), each holding one column's values for sID 1, 2, 3
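
The difference between the two layouts can be made concrete with a small sketch. It assumes a toy three-row table with numeric placeholder values in place of v11 … v33; `to_columns` is a hypothetical helper, not part of any of the systems named above.

```python
# Row-oriented: each record is stored together (values are placeholders).
rows = [
    {"sID": 1, "c1": 10, "c2": 1.5, "c3": "x"},
    {"sID": 2, "c1": 20, "c2": 2.5, "c3": "y"},
    {"sID": 3, "c1": 30, "c2": 3.5, "c3": "z"},
]


def to_columns(rows):
    """Vertically fragment a row-oriented table into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row store: every full record is visited although only c1 is needed.
total_from_rows = sum(row["c1"] for row in rows)

# Column store: the aggregate reads a single contiguous list.
total_from_columns = sum(columns["c1"])
```

Both totals are identical; the column layout simply reads less data per aggregate, and each list in `columns` could be compressed on its own.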

Encoding Schemes

■ Ordered

□ Few distinct values: sequence of triples (value, offset position, # occurrences)

□ Many distinct values: delta representation

■ Unordered

□ Few distinct values: sequence of tuples (value, bitmap for positional occurrence)

□ Many distinct values: ??
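
The three representations named on this slide can be sketched as below. This is a minimal sketch, not the encoding code of any specific column store; the function names are hypothetical, and the pairing of each scheme with "ordered" or "unordered" columns in the comments follows the usual column-store classification and should be read as an assumption.

```python
def rle_encode(column):
    """Triples (value, offset position, # occurrences).

    Assumed to suit ordered columns with few distinct values.
    """
    triples = []
    for pos, value in enumerate(column):
        if triples and triples[-1][0] == value:
            v, start, count = triples[-1]
            triples[-1] = (v, start, count + 1)   # extend the current run
        else:
            triples.append((value, pos, 1))       # start a new run
    return triples


def bitmap_encode(column):
    """Tuples (value, bitmap) with one bit per row position.

    Assumed to suit unordered columns with few distinct values.
    """
    bitmaps = {}
    for pos, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[pos] = 1
    return sorted(bitmaps.items())


def delta_encode(column):
    """First value followed by the differences between neighbours.

    Assumed to suit ordered numeric columns with many distinct values.
    """
    if not column:
        return []
    return column[:1] + [b - a for a, b in zip(column, column[1:])]
```

For example, `rle_encode(["a", "a", "b"])` yields one triple per run, while `delta_encode` on a sorted key column yields small integers that compress well.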

Architecture of Hybrid System


■ Essentially an integration of a row-store and a column-store DB

■ MaxDB is used as the row store

□ Database underlying SAP Business ByDesign

□ Supports ACID transactions

■ TREX is used as the column store

□ Main memory

□ Engine underlying SAP BIA

□ Has a copy of (some of) the OLTP data

□ Primary OLTP system and main-memory database (MMDB) are governed using a single resource manager
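
The slides do not describe how the single resource manager works internally. Purely as an illustration of the idea that the row store and its column-store copy change together or not at all, here is a toy sketch; the `HybridStore` class and its all-or-nothing `commit` are assumptions for this example and are not MaxDB or TREX interfaces.

```python
class HybridStore:
    """Toy coordinator keeping a row store and a column-store copy in step."""

    def __init__(self, column_names):
        self.rows = []                                       # row store (primary)
        self.columns = {name: [] for name in column_names}   # column-store copy

    def commit(self, new_rows):
        """Apply a batch to both stores, or to neither if any row is invalid."""
        for row in new_rows:
            if set(row) != set(self.columns):
                raise ValueError(f"row does not match schema: {row}")
        # The whole batch passed validation, so both stores are updated.
        for row in new_rows:
            self.rows.append(dict(row))
            for name, value in row.items():
                self.columns[name].append(value)


store = HybridStore(["id", "amount"])
store.commit([{"id": 1, "amount": 5.0}])
```

A batch containing a malformed row raises `ValueError` before either store is touched, so reporting queries on the column copy never see data the row store lacks.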


Architecture of Hybrid System (contd.)


Virtual Cube

■ Architecture similar to the virtual ODS

□ Virtual Cube provides the same interface as a typical cube (slice, dice, drill-down, …)

□ Virtual Cube rewrites queries and issues them against the MMDB (TREX in our case)

□ TREX has a copy of the OLTP data

□ Primary OLTP system and MMDB are tied together as described above
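
The query-rewriting step can be sketched as follows. This is an assumption-laden illustration: an in-memory SQLite database stands in for the MMDB (it is not TREX), and the `sales_item` table, its columns, and the `rewrite` and `query_cube` functions are hypothetical names invented for the example.

```python
import sqlite3

# Stand-in for the column store's copy of the OLTP data (SQLite is NOT TREX).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales_item (region TEXT, product TEXT, amount REAL)")
db.executemany("INSERT INTO sales_item VALUES (?, ?, ?)", [
    ("EMEA", "laptop", 1200.0),
    ("EMEA", "phone", 300.0),
    ("APAC", "laptop", 1000.0),
])

ALLOWED_COLUMNS = {"region", "product", "amount"}


def rewrite(measure, group_by, slice_filter=None):
    """Rewrite a cube request into SQL text plus bound parameters."""
    slice_filter = slice_filter or {}
    if not group_by:
        raise ValueError("at least one dimension is required")
    for col in [measure, *group_by, *slice_filter]:
        if col not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {col}")
    dims = ", ".join(group_by)
    sql = f"SELECT {dims}, SUM({measure}) FROM sales_item"
    if slice_filter:
        sql += " WHERE " + " AND ".join(f"{c} = ?" for c in slice_filter)
    sql += f" GROUP BY {dims} ORDER BY {dims}"
    return sql, list(slice_filter.values())


def query_cube(measure, group_by, slice_filter=None):
    """Answer a cube request directly from the transactional-level copy."""
    sql, params = rewrite(measure, group_by, slice_filter)
    return db.execute(sql, params).fetchall()


by_region = query_cube("amount", ["region"])                      # roll-up
emea_products = query_cube("amount", ["product"], {"region": "EMEA"})  # slice
```

No pre-aggregated cube is materialized here: every slice or drill-down becomes an aggregate query over the detailed rows, which is what makes the cube "virtual".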


Outlook


■ Build a “real” hybrid database in-memory as part of ChunkyStore

■ Data can be stored as either:

□ Rows

□ Columns

□ Chunks (adjacent fragments of rows and columns)

■ DB decides which physical storage alternative is most suitable

■ Main-memory implementation will cater for fast updates as well as fast operational reporting capabilities
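
One way to picture the three storage alternatives is as rectangular fragments of different shapes. The sketch below is an interpretation only: the slides do not specify ChunkyStore's actual layout, and `to_chunks` with its parameters is a hypothetical helper.

```python
def to_chunks(table, rows_per_chunk, cols_per_chunk):
    """Split a table (list of equal-length rows) into rectangular chunks.

    Returns a dict mapping (first_row, first_col) to the chunk's cells.
    """
    if not table:
        return {}
    chunks = {}
    for r0 in range(0, len(table), rows_per_chunk):
        for c0 in range(0, len(table[0]), cols_per_chunk):
            chunks[(r0, c0)] = [row[c0:c0 + cols_per_chunk]
                                for row in table[r0:r0 + rows_per_chunk]]
    return chunks


table = [[1, 2, 3, 4],
         [5, 6, 7, 8]]

as_rows = to_chunks(table, rows_per_chunk=1, cols_per_chunk=4)     # row store
as_columns = to_chunks(table, rows_per_chunk=2, cols_per_chunk=1)  # column store
as_chunks = to_chunks(table, rows_per_chunk=2, cols_per_chunk=2)   # in between
```

Seen this way, rows and columns are the two extreme chunk shapes, and a database choosing the physical layout is choosing a chunk shape per table or workload.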


Thank you

■ Questions?
