An Introduction to Column Store Indexes and Batch Mode

Upload: chris-adkin

Post on 28-Nov-2014

Category:

Data & Analytics


DESCRIPTION

An introduction to SQL Server column store indexes and batch mode, including a look at the new features that SQL Server 2014 introduces in this area.

TRANSCRIPT

Page 1: An introduction to column store indexes and batch mode

An Introduction to Column Store Indexes and Batch Mode

CPU + DBA Level 300

Page 2: An introduction to column store indexes and batch mode

About me

An independent SQL consultant.

A user of SQL Server from version 2000 onwards, with 12+ years' experience.

I have a passion for understanding how the database engine works at a deep level.

Page 3: An introduction to column store indexes and batch mode

A Brief History Of Column Store Technology

The lineage of column store databases can be traced back to the MonetDB and VectorWise projects from Holland, developed around the turn of the millennium.

Storage is column oriented: the values of each column are stored together, rather than row by row.

Column store technology aims to exploit modern CPU architectures.

Virtually all database vendors now have a column store database offering.

Many people predict a future where all OLAP workloads will be serviced by column oriented databases.

Page 4: An introduction to column store indexes and batch mode

Column Store Compression Schemes

Colour column: Red, Red, Blue, Blue, Green, Green, Green

Dictionary

Lookup ID   Label
1           Red
2           Blue
3           Green

Segment

Lookup ID   Run Length
1           2
2           2
3           3

Compressing data going down the column using run length compression.

Global and local dictionaries are used to store compression metadata.
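The Colour example on this slide can be reproduced with a minimal sketch. It assumes lookup IDs are handed out from 1 in first-seen order, and the function names `dictionary_encode` and `run_length_encode` are made up for illustration; this is not how SQL Server implements its compression.

```python
from itertools import groupby


def dictionary_encode(values):
    """Give each distinct value a lookup ID (starting at 1) in first-seen order."""
    dictionary = {}
    ids = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = len(dictionary) + 1
        ids.append(dictionary[value])
    return dictionary, ids


def run_length_encode(ids):
    """Collapse consecutive repeats into (lookup ID, run length) pairs."""
    return [(key, len(list(group))) for key, group in groupby(ids)]


colours = ["Red", "Red", "Blue", "Blue", "Green", "Green", "Green"]
dictionary, ids = dictionary_encode(colours)
segment = run_length_encode(ids)

print(dictionary)  # {'Red': 1, 'Blue': 2, 'Green': 3}
print(segment)     # [(1, 2), (2, 2), (3, 3)]
```

Seven values are reduced to three (lookup ID, run length) pairs plus a three entry dictionary. The runs only stay this long when equal values sit next to each other in the column.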

Page 5: An introduction to column store indexes and batch mode

Column store segments

Local Dictionary

Global dictionary

Deletion Bitmap

Column Store Index ‘Anatomy’

Page 6: An introduction to column store indexes and batch mode

What Levels Of Compression Can Be Achieved?

[Bar chart. Categories: heap, row compression, page compression, clustered column store index, clustered column store index with archive compression. Axis: 0 to 350. Data labels: 59 %, 53 %, 64 %, 72 %.]

* Posts tables from the four largest Stack Exchange sites combined (superuser, serverfault, maths and Ubuntu)

Page 7: An introduction to column store indexes and batch mode

Demonstration 1: The Difference Batch Mode Makes

Test Data Creation

Page 8: An introduction to column store indexes and batch mode

Demonstration 1: The Difference Batch Mode Makes

Test Queries

Page 9: An introduction to column store indexes and batch mode

How Queries Are Executed

How do rows travel between iterators? Row by row.

Control flow

Data flow
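The difference between handing rows over one at a time and handing them over in batches can be sketched with generators. This is an illustrative model only: the iterator names, the `Counted` wrapper and the batch size of 4 are assumptions made for the example, not SQL Server internals.

```python
class Counted:
    """Wraps an iterator and counts how many times it hands data to its parent."""

    def __init__(self, child):
        self.child = child
        self.handoffs = 0

    def __iter__(self):
        for item in self.child:
            self.handoffs += 1
            yield item


def scan(rows):
    """Row mode scan: one row per hand-off."""
    for row in rows:
        yield row


def filter_rows(child, predicate):
    """Row mode filter: pulls one row at a time from its child."""
    for row in child:
        if predicate(row):
            yield row


def scan_batches(rows, batch_size):
    """Batch mode scan: a list of rows per hand-off."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def filter_batches(child, predicate):
    """Batch mode filter: processes a whole batch per pull."""
    for batch in child:
        kept = [row for row in batch if predicate(row)]
        if kept:
            yield kept


def is_even(value):
    return value % 2 == 0


rows = list(range(10))

row_scan = Counted(scan(rows))
row_mode_total = sum(filter_rows(row_scan, is_even))

batch_scan = Counted(scan_batches(rows, 4))
batch_mode_total = sum(sum(batch) for batch in filter_batches(batch_scan, is_even))

print(row_mode_total, row_scan.handoffs)      # 20 10
print(batch_mode_total, batch_scan.handoffs)  # 20 3
```

Both plans return the same answer, but the batch version crosses the iterator boundary 3 times instead of 10, so the per-call overhead is spread across many rows.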

Page 10: An introduction to column store indexes and batch mode

Modern CPU Architecture

[Diagram labels: CPU; cores, each with an L0 UOP cache, a 32KB L1 instruction cache, a 32KB L1 data cache and a 256KB unified L2 cache; L3 cache; bi-directional ring bus; un-core; power and clock; QPI; memory controller; IOTLB; memory bus.]

System-on-chip (SOC) design with CPU cores as the basic building block.

Utility services are provisioned by the ‘un-core’ part of the CPU die.

Four level cache hierarchy.

Page 11: An introduction to column store indexes and batch mode

Memory Is The “New Disk”

Access type                       Clock cycles
L1 Cache sequential access        4
L1 Cache In Page Random access    4
L1 Cache Full Random access       4
L2 Cache sequential access        11
L2 Cache In Page Random access    11
L2 Cache Full Random access       11
L3 Cache sequential access        14
L3 Cache In Page Random access    18
L3 Cache Full Random access       38
Main memory                       167

Batch mode is about working in the 4 ~ 38 clock cycle range and NOT the 167 cycle “CPU stall” range.

Page 12: An introduction to column store indexes and batch mode

How Can A Column Store Index Fit Inside The CPU Cache?

[Diagram labels: CPU, column store object pool, segment, batches.]

Page 13: An introduction to column store indexes and batch mode

The Column Store Object Pool

Page 14: An introduction to column store indexes and batch mode

Batch Mode Pre-Requisites

Feature                                 SQL Server 2012  SQL Server 2014
Presence of column store indexes        Yes              Yes
Parallel execution plan                 Yes              Yes
No outer joins, NOT INs or UNION ALLs   Yes              No
Hash joins do not spill from memory     Yes              No
Scalar aggregates cannot be used        Yes              No

Page 15: An introduction to column store indexes and batch mode

SQL Server 2012 / 2014 Column Store Comparison

Feature                                                                         SQL Server 2012  SQL Server 2014
Column store indexes                                                            Yes              Yes
Clustered column store indexes                                                  No               Yes
Updateable column store indexes                                                 No               Yes
Column store archive compression                                                No               Yes
Columns in a column store index can be dropped                                  No               Yes
Support for GUID, binary, datetimeoffset precision > 2, numeric precision > 18  No               Yes
Enhanced compression by storing short strings natively (instead of 32 bit IDs)  No               Yes
Bookmark support (row_group_id:tuple_id)                                        No               Yes
Mixed row / batch mode execution                                                No               Yes
Optimized hash build and join in a single iterator                              No               Yes
Hash memory spills cause row mode execution                                     No               Yes

Iterators supported
  SQL Server 2012: Scan, filter, project, hash (inner) join and (local) hash aggregate
  SQL Server 2014: Yes

Page 16: An introduction to column store indexes and batch mode

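How Column Store Index Updates Are Handled

[Diagram: rows are divided into row groups; the columns (A, B, C) of each row group are encoded and compressed into segments, which are stored as blobs. Delta stores hold < 1,048,576 rows; the tuple mover encodes and compresses them into segments.]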
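The flow on this slide can be modelled with a small sketch. It assumes that a delta store is closed once it reaches the row limit shown on the slide and that the tuple mover then turns each closed delta store into column-wise storage. The class name `ColumnStoreSketch` and its methods are hypothetical, and the "encode and compress" step is reduced to pivoting rows into per-column lists.

```python
ROWGROUP_MAX_ROWS = 1_048_576  # row limit shown on the slide


class ColumnStoreSketch:
    """Toy model of delta stores and the tuple mover (not SQL Server's implementation)."""

    def __init__(self, max_rows=ROWGROUP_MAX_ROWS):
        self.max_rows = max_rows
        self.open_delta_store = []       # row-wise, still accepting inserts
        self.closed_delta_stores = []    # full, waiting for the tuple mover
        self.compressed_row_groups = []  # one {column: values} dict per row group

    def insert(self, row):
        """New rows always land in the open delta store."""
        self.open_delta_store.append(row)
        if len(self.open_delta_store) == self.max_rows:
            # Assumption: a full delta store is closed and a new one is opened.
            self.closed_delta_stores.append(self.open_delta_store)
            self.open_delta_store = []

    def tuple_mover(self):
        """Convert every closed delta store into column-wise segments."""
        while self.closed_delta_stores:
            rows = self.closed_delta_stores.pop(0)
            segments = {name: [row[name] for row in rows] for name in rows[0]}
            self.compressed_row_groups.append(segments)


store = ColumnStoreSketch(max_rows=3)  # tiny limit so the example stays readable
for i in range(7):
    store.insert({"A": i, "B": i * 2})

closed_before = len(store.closed_delta_stores)
store.tuple_mover()

print(closed_before)                   # 2
print(store.compressed_row_groups[0])  # {'A': [0, 1, 2], 'B': [0, 2, 4]}
print(store.open_delta_store)          # [{'A': 6, 'B': 12}]
```

Inserts stay cheap because they only append to a row-wise structure; the column-wise conversion happens later and in bulk.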

Page 17: An introduction to column store indexes and batch mode

Demonstration 2: Delta Stores In Action

Page 18: An introduction to column store indexes and batch mode

Demonstration 3: Pre-sorting and Segment Elimination

Test Data Creation

Page 19: An introduction to column store indexes and batch mode

Demonstration 3: Pre-sorting and Segment Elimination

Test Queries

Page 20: An introduction to column store indexes and batch mode

Demonstration 4: Pre-sorting and Hash Aggregate Performance

Page 21: An introduction to column store indexes and batch mode

Test Setup

CPU: 6 core 2.0 GHz (Sandy Bridge)

Warm large object cache used in all tests to remove storage as a factor.

CPU: 6 core 2.0 GHz (Sandy Bridge)

48 GB quad channel 1333 MHz DDR3 memory

Hyper-threading enabled, unless specified otherwise.

Page 22: An introduction to column store indexes and batch mode

A Typical Data Warehouse Query On Extra Large Non-Sorted Data

1,095,500,000 rows

1,798 MB in size

Page 23: An introduction to column store indexes and batch mode

A Typical Data Warehouse Query On Extra Large Pre-Sorted Data

1,095,500,000 rows

8,555 MB in size

Page 24: An introduction to column store indexes and batch mode

Elapsed Time (ms) / Degree of Parallelism

[Line chart: elapsed time in ms (axis 0 to 80000) against degree of parallelism (2 to 24). Series: non-sorted column store, sorted column store.]

Page 25: An introduction to column store indexes and batch mode

Lowering Clock Cycles Per Instruction By Leveraging SIMD

Scalar instruction: C = A + B

1 + 2 = 3

SIMD instruction: Vector C = Vector A + Vector B

[1 2 3 4] + [2 3 4 5] = [3 5 7 9]
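The instruction count saving on this slide can be modelled as follows. Python runs both functions as ordinary loops, so this is a model of how many add instructions would be issued, not real SIMD code. The four lane width matches the slide's example, and the function names are made up for illustration.

```python
def scalar_add_all(a, b):
    """Model: one add instruction per pair of values."""
    result, instructions = [], 0
    for x, y in zip(a, b):
        result.append(x + y)
        instructions += 1
    return result, instructions


def simd_add_all(a, b, lanes=4):
    """Model: one vector add instruction per group of `lanes` values."""
    result, instructions = [], 0
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1  # the whole chunk counts as a single instruction
    return result, instructions


vector_a = [1, 2, 3, 4]
vector_b = [2, 3, 4, 5]

scalar_result, scalar_instructions = scalar_add_all(vector_a, vector_b)
simd_result, simd_instructions = simd_add_all(vector_a, vector_b)

print(scalar_result, scalar_instructions)  # [3, 5, 7, 9] 4
print(simd_result, simd_instructions)      # [3, 5, 7, 9] 1
```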

Page 26: An introduction to column store indexes and batch mode

Takeaways

Column store indexes are only half the story; it is column store indexes together with batch mode that make the real difference to performance.

Pre-sort data where applicable and possible to encourage segment elimination.

Pre-sort data on the fact table key column that is subject to the heaviest hash join / aggregate activity.

Column store indexes and batch mode are fast, but not scalable.

Many other vendors leverage SIMD; Microsoft has yet to do this, and doing so could bring another step change in performance.
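Why pre-sorting encourages segment elimination can be shown with a minimal sketch. It assumes each segment carries the minimum and maximum of the values it holds, and that a range predicate only needs to read segments whose range overlaps it. The segment size of 4 and the function names are made up for the example.

```python
def build_segments(values, segment_size):
    """Split a column into segments and record min / max metadata for each."""
    segments = []
    for start in range(0, len(values), segment_size):
        chunk = values[start:start + segment_size]
        segments.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return segments


def segments_to_scan(segments, low, high):
    """Keep only the segments whose [min, max] range overlaps [low, high]."""
    return [s for s in segments if s["max"] >= low and s["min"] <= high]


unsorted_values = [7, 1, 12, 4, 9, 2, 11, 5, 8, 3, 10, 6]
sorted_values = sorted(unsorted_values)

unsorted_segments = build_segments(unsorted_values, 4)
sorted_segments = build_segments(sorted_values, 4)

# Predicate: value BETWEEN 1 AND 4
unsorted_hits = len(segments_to_scan(unsorted_segments, 1, 4))
sorted_hits = len(segments_to_scan(sorted_segments, 1, 4))

print(unsorted_hits)  # 3
print(sorted_hits)    # 1
```

With unsorted data every segment spans nearly the whole value range, so none can be skipped. After sorting, the ranges no longer overlap and two of the three segments are eliminated.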

Page 27: An introduction to column store indexes and batch mode

Questions ?