![Page 1: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/1.jpg)
Who Moved My Tuple—Columnstore Indexes in SQL Server 2014
Joe D’Antoni, Philadelphia SQL Server Users Group, 25 March 2014
![Page 2: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/2.jpg)
Joe D’Antoni
Joe has over 15 years of experience with a wide variety of data platforms, in both Fortune 50 companies and smaller organizations
He is a frequent speaker on database administration, big data, and career management
He is the co-president of the Philadelphia SQL Server User’s Group
He wants you to make sure you can restore your data
Joedantoni.wordpress.com – Blog, Slides
http://bit.ly/SQLColumnstore -- Slides, Resources
![Page 3: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/3.jpg)
Agenda
• Indexes: a basic overview
• Columnstore: an introduction
• Query Performance: Demo
• 2012 and 2014: What’s Changing?
• 2014: Demo
• Questions
![Page 4: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/4.jpg)
Indexes
• Data structure that allows us to speed up data retrieval by maintaining an extra copy of data
• Can be filtered
• Can be function based, or ordered
• Penalty is that writes become more expensive
• More storage required
![Page 5: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/5.jpg)
Indexes in SQL Server
• Clustered vs. Nonclustered
• Clustered index: index-organized table
• Nonclustered index: “just an index”
![Page 6: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/6.jpg)
Clustered Index
• Data is kept in index key order as it is inserted into pages
• Data in a clustered index is only stored on disk once (it’s the data from the table)
• A table without a clustered index is called a heap, with no order at all
![Page 7: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/7.jpg)
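Clustered Index Layout

Existing table:

| LastName | FirstName | Address | PhoneNumber |
| --- | --- | --- | --- |
| Gates | Bill | 101 Money Ln | (206)-555-1111 |
| Smith | John | 101 Anywhere Rd | (212)-566-1112 |
| Smith | John | 181 Uphill Way | (215)-555-2425 |
| Zuckerberg | Mark | 1 Hacker Way | (650)-555-9999 |

New record to be inserted:

| LastName | FirstName | Address | PhoneNumber |
| --- | --- | --- | --- |
| Ellison | Larry | 1 Oracle Way | (650)-555-1245 |

Table after the insert (the new row lands in key order):

| LastName | FirstName | Address | PhoneNumber |
| --- | --- | --- | --- |
| Ellison | Larry | 1 Oracle Way | (650)-555-1245 |
| Gates | Bill | 101 Money Ln | (206)-555-1111 |
| Smith | John | 101 Anywhere Rd | (212)-566-1112 |
| Smith | John | 181 Uphill Way | (215)-555-2425 |
| Zuckerberg | Mark | 1 Hacker Way | (650)-555-9999 |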
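
As a rough illustration of the ordered insert on this slide, here is a minimal Python sketch. It is not how SQL Server stores pages; it only models a table whose rows are always kept in key order. The names `clustered_table` and `new_row` are made up for the example, and the whole tuple stands in for the clustering key.

```python
import bisect

# Rows are (LastName, FirstName, Address, PhoneNumber).
# Assumption: the whole tuple stands in for the clustering key, so the
# list is always kept sorted, much like rows in a clustered index.
clustered_table = [
    ("Gates", "Bill", "101 Money Ln", "(206)-555-1111"),
    ("Smith", "John", "101 Anywhere Rd", "(212)-566-1112"),
    ("Smith", "John", "181 Uphill Way", "(215)-555-2425"),
    ("Zuckerberg", "Mark", "1 Hacker Way", "(650)-555-9999"),
]

new_row = ("Ellison", "Larry", "1 Oracle Way", "(650)-555-1245")

# insort finds the position that keeps the list sorted and inserts there,
# so the new row ends up ahead of "Gates" rather than at the end.
bisect.insort(clustered_table, new_row)

last_names = [row[0] for row in clustered_table]
print(last_names)
```

The cost of keeping order shows up on the write: every insert has to find its place first, which is the "writes become more expensive" penalty mentioned earlier.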
![Page 8: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/8.jpg)
Non-Clustered Index
• Duplicate copy of the data in the table
• Provides a pointer from the index to the table data
• The index does not change the order of the rows in the table itself
![Page 9: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/9.jpg)
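Non-Clustered Index Layout

Existing table:

| LastName | FirstName | Address | PhoneNumber |
| --- | --- | --- | --- |
| Gates | Bill | 101 Money Ln | (206)-555-1111 |
| Smith | John | 101 Anywhere Rd | (212)-566-1112 |
| Smith | John | 181 Uphill Way | (215)-555-2425 |
| Zuckerberg | Mark | 1 Hacker Way | (650)-555-9999 |

New record to be inserted:

| LastName | FirstName | Address | PhoneNumber |
| --- | --- | --- | --- |
| Ellison | Larry | 1 Oracle Way | (650)-555-1245 |

Table after the insert (the new row is added at the end):

| LastName | FirstName | Address | PhoneNumber |
| --- | --- | --- | --- |
| Gates | Bill | 101 Money Ln | (206)-555-1111 |
| Smith | John | 101 Anywhere Rd | (212)-566-1112 |
| Smith | John | 181 Uphill Way | (215)-555-2425 |
| Zuckerberg | Mark | 1 Hacker Way | (650)-555-9999 |
| Ellison | Larry | 1 Oracle Way | (650)-555-1245 |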
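
One way to picture the layout on this slide is a table whose rows stay in arrival order plus a separate structure that points back into it. The Python sketch below models that idea only; the names `heap`, `index`, `insert`, and `lookup` are hypothetical, and the sorted list of `(key, position)` pairs is a stand-in for a real index structure.

```python
import bisect

# The table itself: rows stay in the order they arrived.
heap = [
    ("Gates", "Bill", "101 Money Ln", "(206)-555-1111"),
    ("Smith", "John", "101 Anywhere Rd", "(212)-566-1112"),
    ("Smith", "John", "181 Uphill Way", "(215)-555-2425"),
    ("Zuckerberg", "Mark", "1 Hacker Way", "(650)-555-9999"),
]

# Assumed index on LastName: a sorted list of (key, row position) pairs.
# The position plays the role of the pointer back to the table data.
index = sorted((row[0], pos) for pos, row in enumerate(heap))


def insert(row):
    """Append the row to the table and add a pointer to the index."""
    heap.append(row)
    bisect.insort(index, (row[0], len(heap) - 1))


def lookup(last_name):
    """Find matching rows through the index instead of scanning the table."""
    start = bisect.bisect_left(index, (last_name, -1))
    matches = []
    for key, pos in index[start:]:
        if key != last_name:
            break
        matches.append(heap[pos])
    return matches


insert(("Ellison", "Larry", "1 Oracle Way", "(650)-555-1245"))
print(heap[-1][0])           # the new row sits at the end of the table
print(len(lookup("Smith")))  # both Smith rows are found via the index
```

Every insert now touches two structures, which is where the extra write cost and extra storage come from.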
![Page 10: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/10.jpg)
So Why All This Talk About Indexes?
![Page 11: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/11.jpg)
Data Warehouse Queries
• Data warehouses have a lot of data
• Querying a lot of data can take a really long time
• Processing data row by row may not be the most efficient way to perform aggregations
![Page 12: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/12.jpg)
Traditional Approaches To Improving Performance
• Partitioned Tables
• Indexed Views
• Data Compression
![Page 13: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/13.jpg)
Compression in SQL Server

Uncompressed Table

| LastName | FirstName | Address | PhoneNumber |
| --- | --- | --- | --- |
| Ellison | Larry | 1 Oracle Way | (650)-555-1245 |
| Gates | Bill | 101 Money Ln | (206)-555-1111 |
| Smith | John | 101 Anywhere Rd | (212)-566-1112 |
| Smith | John | 181 Uphill Way | (215)-555-2425 |
| Zuckerberg | Mark | 1 Hacker Way | (650)-555-9999 |

Row Compressed Table

| LastName | FirstName | Address | PhoneNumber |
| --- | --- | --- | --- |
| Ellison | Larry | 1 Oracle Way | (650)-555-1245 |
| Gates | Bill | 101 Money Ln | (206)-555-1111 |
| Smith | John | 101 Anywhere Rd | (212)-566-1112 |
| Smith | John | 181 Uphill Way | (215)-555-2425 |
| Zuckerberg | Mark | 1 Hacker Way | (650)-555-9999 |

Page Compressed Table

| LastName | FirstName | Address | PhoneNumber |
| --- | --- | --- | --- |
| `Ellison` | `Larry` | `1 ***c** W**` | `(650)-555-*245` |
| `G*t**` | `B***` | `*0* M**** **` | `*2***********` |
| `S***h` | `J***` | `*** ******** **` | `*************` |
| `*****` | `****` | `*8* Up**** ***` | `*************` |
| `Z********` | `****` | `* ******* ***` | `*************` |
![Page 14: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/14.jpg)
Introducing Columnstore Indexes (SQL 2012)
• Data is stored in columns, as opposed to rows
• This allows a much higher rate of compression
• Columns not used in a query are simply not scanned, nor returned
• Recommended practice is to add most columns in a table to the index
![Page 15: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/15.jpg)
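Original Table

| Fn | Ln | AreaCode | Phone | StNum | StName | StType | City | State |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A | Disney | 661 | 872-4547 | 111 | Wilson | Dr | Bakersfield | CA |
| Al | Disney | 530 | 778-3737 | 222 | Main | St | Lewiston | CA |
| Amy | Disney | 209 | 577-5824 | 410 | Park | Av | Santa Rosa | CA |
| Anita | Disney | 559 | 642-4472 | 89 | Ahwahnee | St | San Diego | CA |
| Anita | Disney | 209 | 966-4472 | 781 | Mariposa | Dr | Napa | CA |
| Ann | Disney | 949 | 830-1883 | 3 | Amato | Ct | Yountville | CA |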
Split in Columns
• Fn: A, Al, Amy, Anita, Anita, Ann
• Ln: Disney, Disney, Disney, Disney, Disney, Disney
• AreaCode: 661, 530, 209, 559, 209, 949
• Phone: 872-4547, 778-3737, 577-5824, 642-4472, 966-4472, 830-1883
• StNum: 111, 222, 410, 89, 781, 3
• StName: Wilson, Main, Park, Ahwahnee, Mariposa, Amato
• StType: Dr, St, Av, St, Dr, Ct
• City: Bakersfield, Lewiston, Santa Rosa, San Diego, Napa, Yountville
• State: CA, CA, CA, CA, CA, CA
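Columnstore Compressed
• Fn: `A`, `*l`, `*my`, `*nita`, `*****`, `***`
• Ln: `Disney`, `******`, `******`, `******`, `******`, `******`
• AreaCode: `661`, `530`, `2*9`, `***`, `***`, `*4*`
• Phone: `872-4547`, `***-3*3*`, `***-****`, `6**-****`, `9**-****`, `**0-1***`
• StNum: `111`, `222`, `4*0`, `89`, `7**`, `3`
• StName: `Wilson`, `Ma**`, `P*rk`, `*hw***e*`, `***i****`, `***t*`
• StType: `Dr`, `St`, `Av`, `**`, `**`, `C*`
• City: `Bakersfield`, `L*wi*ton`, `S**** ****`, `*** Diego`, `Napa`, `Yountville`
• State: `CA`, `**`, `**`, `**`, `**`, `**`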
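
The column split on this slide can be sketched in a few lines of Python. The example below transposes the sample rows into columns and then applies simple run-length encoding. That encoding is an assumption chosen for illustration; it is not SQL Server's actual encoding, which is more elaborate. It is enough to show why values stored together by column compress well. The function name `run_length_encode` is made up for the example.

```python
from itertools import groupby

columns = ["Fn", "Ln", "AreaCode", "Phone", "StNum",
           "StName", "StType", "City", "State"]

rows = [
    ("A", "Disney", "661", "872-4547", "111", "Wilson", "Dr", "Bakersfield", "CA"),
    ("Al", "Disney", "530", "778-3737", "222", "Main", "St", "Lewiston", "CA"),
    ("Amy", "Disney", "209", "577-5824", "410", "Park", "Av", "Santa Rosa", "CA"),
    ("Anita", "Disney", "559", "642-4472", "89", "Ahwahnee", "St", "San Diego", "CA"),
    ("Anita", "Disney", "209", "966-4472", "781", "Mariposa", "Dr", "Napa", "CA"),
    ("Ann", "Disney", "949", "830-1883", "3", "Amato", "Ct", "Yountville", "CA"),
]

# Transpose the rows: each column becomes one tuple of values.
column_store = dict(zip(columns, zip(*rows)))


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


encoded = {name: run_length_encode(values)
           for name, values in column_store.items()}

print(encoded["Ln"])     # a single run covers all six rows
print(encoded["State"])  # same here
print(encoded["Fn"])     # mostly distinct values, so little is saved
```

A query that only needs `State` would read one short encoded column and never touch the other eight, which is the "columns not used in a query are simply not scanned" point from the previous slide.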
![Page 16: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/16.jpg)
Columnar Data Storage
From Microsoft SIGMOD Paper
![Page 17: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/17.jpg)
So How are Columnstores So Much Faster?
• Very good compression ratio for column-oriented data
• Better use of memory
• Segment elimination skips large chunks of data
• Batch mode
  • Processes data in “batches” of about 1,000 rows rather than row by row
  • 7-40x CPU savings with batch mode
“The key to getting the best performance is to make sure your queries process the large majority of data in batch mode.”
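
Segment elimination is easier to see with a toy model. In the Python sketch below, each segment keeps the minimum and maximum of its values, and a range query skips any segment whose range cannot match. The class `Segment`, the helper names, and the tiny segment size of 4 are all assumptions for the example; it illustrates the idea and is not the engine's implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    values: list
    min_value: int
    max_value: int


def build_segments(values, segment_size):
    """Cut a column into fixed-size segments and record min/max for each."""
    segments = []
    for start in range(0, len(values), segment_size):
        chunk = values[start:start + segment_size]
        segments.append(Segment(chunk, min(chunk), max(chunk)))
    return segments


def count_in_range(segments, low, high):
    """Count values in [low, high], skipping segments that cannot match."""
    count = 0
    scanned = 0
    for segment in segments:
        if segment.max_value < low or segment.min_value > high:
            continue  # eliminated using metadata only, values never read
        scanned += 1
        count += sum(1 for value in segment.values if low <= value <= high)
    return count, scanned


# Hypothetical column of 16 ordered keys, cut into 4 segments of 4 values.
segments = build_segments(list(range(1, 17)), segment_size=4)
count, scanned = count_in_range(segments, low=14, high=15)
print(count, scanned)
```

Only one of the four segments is read here. The benefit depends on the data being laid out so that segment ranges do not overlap much; with randomly ordered values every segment would have to be scanned.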
![Page 18: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/18.jpg)
Columnstore All The Things?
• Awesome performance, so what’s the negative?
• Can’t update/insert in 2012
• Can only be a nonclustered index, so we are storing more data on disk
• Data types are somewhat limited
• One index per table
• Can’t be a sorted index
![Page 19: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/19.jpg)
Update Process (2012)
• Fact table already holds Partition 1, Partition 2, and Partition 3
• Data to be loaded goes into a staging table
• Build a columnstore index on the staging table
• Partition switch moves the data from staging into the fact table as Partition 4
![Page 20: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/20.jpg)
So Where To Use Columnstore Indexes?
• Only on large tables: fact tables, and dimension tables with more than 3 million rows
• Include every column
• Structure queries as star joins with grouping and aggregation
More details here
![Page 21: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/21.jpg)
Columnstore 2014
![Page 22: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/22.jpg)
Columnstore in 2014
• Fewer Data Type Limitations
• Updateable
• Can be Clustered Index
• New Archival Compression Mode
• Batch Mode Improvements
![Page 23: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/23.jpg)
Columnstore Trickle Updates (2014)
• Updates to the index are collected until they reach 2^20 (1,048,576) rows
• The tuple mover then moves them into the index
• This is the process when loading 102,399 rows or fewer
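
The trickle-load flow on this slide can be modeled roughly as follows. This Python sketch collects inserted rows in a row-oriented buffer, closes the buffer once it reaches a row limit, and lets a separate step convert closed buffers into columns. The class and attribute names are hypothetical, and the tuple mover is reduced to a plain method call; in the real engine it runs in the background.

```python
ROWGROUP_LIMIT = 2 ** 20  # 1,048,576 rows, the limit quoted on the slide


class ColumnstoreSketch:
    def __init__(self, rowgroup_limit=ROWGROUP_LIMIT):
        self.rowgroup_limit = rowgroup_limit
        self.open_buffer = []           # row-oriented, accepts new rows
        self.closed_buffers = []        # full, waiting for the tuple mover
        self.compressed_rowgroups = []  # column-oriented storage

    def insert(self, row):
        """Trickle insert: rows collect until the buffer reaches the limit."""
        self.open_buffer.append(row)
        if len(self.open_buffer) >= self.rowgroup_limit:
            self.closed_buffers.append(self.open_buffer)
            self.open_buffer = []

    def run_tuple_mover(self):
        """Turn each closed buffer into columns (compression omitted here)."""
        while self.closed_buffers:
            rows = self.closed_buffers.pop(0)
            self.compressed_rowgroups.append(list(zip(*rows)))


# A tiny limit keeps the demonstration small.
store = ColumnstoreSketch(rowgroup_limit=3)
for i in range(7):
    store.insert((i, i * 10))
store.run_tuple_mover()

print(len(store.compressed_rowgroups))  # two full groups were converted
print(store.open_buffer)                # one row is still waiting
```

Rows in the open buffer remain queryable in the real system, so readers see both the compressed groups and the rows still waiting to be moved.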
![Page 24: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/24.jpg)
Columnstore Bulk Insert
![Page 25: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/25.jpg)
Columnstore Updates (2014)
• Bulk inserts go through a special API
• Updates are processed as inserts and deletes, so they are an expensive operation
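
To see why an update counts as a delete plus an insert, consider this small Python model. The compressed rows are never changed in place: an update marks the old row as deleted and adds the new version as a fresh row. All names here are hypothetical, and the dictionary and set are simple stand-ins for compressed storage and a record of deleted rows.

```python
# Stand-in for rows already stored in compressed form: row id -> row.
compressed_rows = {
    0: ("Gates", "Bill"),
    1: ("Smith", "John"),
}
deleted_row_ids = set()  # rows that are logically deleted
new_rows = []            # newly inserted rows, not yet compressed


def update(row_id, new_row):
    """An update is modeled as a logical delete followed by an insert."""
    deleted_row_ids.add(row_id)
    new_rows.append(new_row)


def scan():
    """Readers skip deleted rows and also see the newly inserted ones."""
    live = [row for row_id, row in compressed_rows.items()
            if row_id not in deleted_row_ids]
    return live + new_rows


update(1, ("Smith", "Jane"))
print(scan())
print(len(compressed_rows))  # the old row is still physically stored
```

The old version keeps taking up space until the data is rebuilt, and every scan has to filter it out, which is part of why frequent updates are costly.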
![Page 26: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/26.jpg)
Columnstore Compression Effect
[Chart: Columnstore Compression. Series: No CS, Clustered CS, Archival CS; x-axis 1 to 7; y-axis 0 to 300]
[Chart: Columnstore Archival Compression. Series: Clustered CS, Archival CS; x-axis 1 to 7; y-axis 0 to 80]
• Average space savings of columnstore versus no compression—69%
• Average space savings of columnstore Archival versus regular columnstore—29%
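
As a back-of-the-envelope check, the two averages can be combined to estimate how archival columnstore compares with no compression at all. This assumes the two percentages compound, which is only approximately true for averages taken over different tables, so treat the result as a rough estimate.

```python
columnstore_savings = 0.69     # columnstore vs. no compression (from the slide)
archival_extra_savings = 0.29  # archival vs. regular columnstore (from the slide)

# Fraction of the original size left after each step.
remaining = (1 - columnstore_savings) * (1 - archival_extra_savings)
combined_savings = 1 - remaining

formatted = f"{combined_savings:.1%}"
print(formatted)  # → 78.0%
```

So on these numbers, archival compression would leave about 22% of the uncompressed size.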
![Page 27: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/27.jpg)
Columnstore 2014 Demo
![Page 28: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/28.jpg)
What Do We Do Differently in 2014
• Best practices are mostly the same
• Batch mode gets enhanced and gains more query types
• No need to worry about dropping and rebuilding indexes: just append data
• Still focus on large tables where data is not frequently updated
• Archival compression is good for old, unused data
![Page 29: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/29.jpg)
Questions
![Page 30: In memory columnstore indexes--make your data warehouse](https://reader035.vdocuments.site/reader035/viewer/2022062300/557d55a6d8b42abf3d8b46ac/html5/thumbnails/30.jpg)
Contact [email protected]
Joedantoni.wordpress.com
@jdanton
http://bit.ly/SQLColumnstore -- Slides, Resources