SQL Server 2012 Beyond Relational Performance and Scale
Pragmatic Works SQL Server 2012 Webinar presentation
![Page 1: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/1.jpg)
Beyond Relational Performance and Scale in SQL Server 2012
Michael Rys
Principal Program Manager
@SQLServerMike
![Page 2: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/2.jpg)
My favorite Beyond Relational Application
Structured and unstructured Search
Related/“Semantic” Search
![Page 3: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/3.jpg)
Beyond Relational Data
Building and maintaining applications with relational and non-relational data is hard
Pain Points
• Complex integration
• Duplicated functionality
• Compensation for unavailable services
Goals
• Reduce the cost of managing all data
• Simplify the development of applications over all data
• Provide management and programming services for all data
![Page 4: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/4.jpg)
What is the Beyond Relational Mission?
Efficient storage for all data
• Tables, XML, Spatial, Documents, Digital Media, Scientific Records, Factoids…
Rich data processing capabilities for all applications
• Data formats and content natively understood for rich application and user experience
• Consistent application model and data constructs to ease application development, migration and long-term retention
Rich capabilities and services over all data
• Provide rich services, e.g.,
  • Query and reason over data and extracted semantics
  • Search across structural impedance of different data formats
  • Integrated backup/restore for all data
![Page 5: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/5.jpg)
Beyond Relational Story
[Diagram labels: Structured Data; Query; T-SQL; B-trees; Files; Manageability; Availability; Programmability]
![Page 6: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/6.jpg)
Beyond Relational Story
[Diagram labels: Structured Data; Unstructured Data; Query; Search; T-SQL; B-trees; Files; Manageability; Availability; Programmability]
![Page 7: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/7.jpg)
Beyond Relational Story
[Diagram labels: Structured Data; Unstructured Data; Semi-structured Data/XML; Query and Type Operations; Search; XQuery, Spatial ops; T-SQL/Data Types; Win32; Spatial, XML, HierarchyID; B-trees; Files; Filestream; XML, FTS, Spatial Indices; Manageability; Availability; Programmability]
![Page 8: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/8.jpg)
Beyond Relational Story
[Diagram labels: Structured Data; Unstructured Data; Semi-structured Data/XML; Query and Type Operations; Search; Semantic; XQuery, Spatial ops; T-SQL/Data Types; Win32; Spatial, XML, HierarchyID; B-trees; Files; Filestream; XML, FTS, Spatial Indices; Manageability & Availability; Programmability]
[Platform layers: Efficient Storage for BR Data; Rich Query and Search Services over all Data; Rich Data Programming Capabilities]
![Page 9: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/9.jpg)
Beyond Relational in SQL Server 2012
Address important customer requests for capabilities and rich services for Rich Unstructured Data (RUDS)
• Scale up for storage and search
• Easy use of and access to unstructured data from all applications
• Rich insight into unstructured data to make better decisions
We deliver what you asked for to build spatial-aware applications
• Advanced 2D spatial
• Make spatial pervasive across the platform
• Improve performance and scale
Service Broker message broadcast
![Page 10: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/10.jpg)
Rich Unstructured Data Performance and Scale
Scale up storage and search to 100 million to 500 million documents
• Multiple containers for FILESTREAM scale-up
• Improved scale-up for search
![Page 11: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/11.jpg)
Rich Unstructured Data & Services Ecosystem
[Diagram labels, grouped as far as the layout can be recovered]
Rich services: Fulltext Search; Semantic Similarity Search
Scale-up solutions: Database with multiple containers (Disk1, Disk2, Disk3)
SQL apps / database applications: transactional access; Blobs; DB FileStreams; FileTable
Windows apps: SMB share files/folders; FileStream API; streaming Win32 access
Customer application: SQL RBS API (Remote BLOB Storage) with Azure lib, Centera lib, and SQL FILESTREAM lib, targeting Azure, Centera, and SQL DB
Integrated backup/replication/AlwaysOn; integrated administration
![Page 12: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/12.jpg)
FILESTREAM
Storage attribute on VARBINARY(MAX)
• Works with integrated FTS
• Unstructured data stored directly in the file system (requires NTFS)
• Dual programming model
  • T-SQL (same as SQL BLOB)
  • Win32 streaming APIs with T-SQL transactional semantics
• Data consistency
• Integrated manageability
  • Backup/restore
  • Administration
• Size limit is the file system volume size
• SQL Server security stack
[Diagram: Store BLOBs in DB + File System (Application, BLOB, DB)]
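A minimal sketch of the dual programming model described on this slide, assuming an instance with FILESTREAM enabled and a database that already has a FILESTREAM filegroup. The table and column names (dbo.BlobDocs, DocId, Title, Content) are hypothetical and not from the presentation.

```sql
-- Hypothetical table; requires a FILESTREAM filegroup in the current database.
CREATE TABLE dbo.BlobDocs
(
    -- A FILESTREAM table needs a unique, non-null ROWGUIDCOL column.
    DocId   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    Title   NVARCHAR(260)    NOT NULL,
    -- The BLOB bytes live in the NTFS container rather than in data pages.
    Content VARBINARY(MAX)   FILESTREAM NULL
);

-- T-SQL access looks the same as for an ordinary VARBINARY(MAX) column.
INSERT INTO dbo.BlobDocs (Title, Content)
VALUES (N'hello.txt', CAST('Hello, FILESTREAM' AS VARBINARY(MAX)));

-- For Win32 streaming, a client fetches the logical path and a transaction
-- context inside a transaction, then opens the file through the FILESTREAM API.
BEGIN TRANSACTION;
SELECT Content.PathName()                   AS FilePath,
       GET_FILESTREAM_TRANSACTION_CONTEXT() AS TxContext
FROM dbo.BlobDocs
WHERE Title = N'hello.txt';
COMMIT TRANSACTION;
```

In a real client the transaction stays open while the file handle is in use; the commit here only closes out the sketch.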
![Page 13: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/13.jpg)
FILETABLE Overview
FileTable: A Table of Files/Directories
User created Table with a fixed schema
contains FILESTREAM and File Attributes
Each row represents a File or a Directory
System defined constraints maintain the tree integrity
File/Directory hierarchy view through a Windows Share
Supports Win32 APIs for File/Directory Management
DB Storage is Transparent to Win32 applications
SMB level of application compatibility
Virtual network name (VNN) path support for transparent Win32 application failover
[Diagram: FileTable folder hierarchy]
FILESTREAM share: MSSQLSERVER
Database directories: Private Docs (Database1), Office Docs (Database2)
FileTable directories: LogFiles (FileTable), Documents (FileTable), Media (FileTable)
Below each FileTable directory: user-defined directory structure
Example path: \\my_machine\MSSQLSERVER\Office Docs\Documents
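As a rough illustration of the hierarchy above, the following sketch creates a FileTable whose directory would appear under the database directory on the FILESTREAM share. It assumes the database already has a FILESTREAM filegroup; the database name Database2 and the directory names echo the diagram, while the table name dbo.DocumentStore is hypothetical.

```sql
-- Assumption: Database2 exists and has a FILESTREAM filegroup.
-- Non-transacted access and a database directory name are prerequisites
-- for reaching FileTable data through the Windows share.
ALTER DATABASE Database2
SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'Office Docs');
GO

USE Database2;
GO

-- The schema is fixed by the system; only the directory name and the
-- file name collation are chosen here.
CREATE TABLE dbo.DocumentStore AS FILETABLE
WITH
(
    FILETABLE_DIRECTORY = 'Documents',
    FILETABLE_COLLATE_FILENAME = database_default
);
GO

-- UNC path of the FileTable root directory.
SELECT FileTableRootPath('dbo.DocumentStore') AS RootPath;

-- Each row is one file or directory; files copied into the share show up here.
SELECT name,
       is_directory,
       cached_file_size,
       file_stream.GetFileNamespacePath(1) AS FullPath
FROM dbo.DocumentStore;
```

GO is a batch separator understood by client tools such as sqlcmd and Management Studio, not a T-SQL statement.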
![Page 14: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/14.jpg)
Some FILESTREAM/FileTable performance tips
• Reading bigger buffers gives better performance
• Volumes hosting FILESTREAM/FileTable data should have 8.3 name generation and LastAccessTime disabled
• FILESTREAM/FileTable containers should reside on dedicated volumes
• Having one volume per FILESTREAM/FileTable container enables space management at the volume level
• “Magic” SMB buffer size = ~60 KB; another “good” value is 480 KB
• ROWGUID unique index for aligned partitioning for FILESTREAM
• Antivirus programs should be configured to quarantine infected files rather than delete them
• If using compressed volumes, use a cluster size of 4 KB
![Page 15: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/15.jpg)
FILESTREAM Read Performance (Remote)
[Chart: throughput (Mbps, axis 0 to 900) by BLOB size (240 KB, 480 KB, 1 MB, 2 MB, 4 MB, 8 MB). Series: Filestream Win32 (filesystem) access, Filestream T-SQL, Varbinary, and Filesystem Win32 access gain (%).]
Measured with SQL Server 2008
![Page 16: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/16.jpg)
FILESTREAM Write Performance (Remote)
[Chart (Insert): throughput (Mbps, axis -200 to 600) by BLOB size (240 KB, 480 KB, 1 MB, 2 MB, 4 MB, 8 MB). Series: Filestream Win32 (filesystem) access, Filestream T-SQL, Varbinary, and Filesystem Win32 access gain (%).]
Measured with SQL Server 2008
![Page 17: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/17.jpg)
Unstructured Data Scale-up
Multiple containers for FILESTREAM data
SQL Server 2008 R2
• Only one storage container per FILESTREAM filegroup
• Limits storage capacity scaling and I/O scaling
SQL Server 2012
• Support for multiple storage containers per filegroup
• DDL changes to the CREATE/ALTER DATABASE statements
• Ability to set MAXSIZE for the containers
• DBCC SHRINKFILE EMPTYFILE support
Scaling flexibility
• Storage scaling by adding additional storage drives
• I/O scaling with multiple spindles
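The DDL changes listed above might be used roughly as follows. This is a sketch under assumptions: the database name MediaArchive, the logical file names, the drive letters, and the size caps are all hypothetical, and the target folders' parent directories are assumed to exist.

```sql
-- Two FILESTREAM containers in one filegroup, placed on separate volumes.
CREATE DATABASE MediaArchive
ON PRIMARY
    (NAME = MediaArchive_data, FILENAME = 'D:\Data\MediaArchive.mdf'),
FILEGROUP MediaArchive_fs CONTAINS FILESTREAM
    (NAME = MediaArchive_fs1, FILENAME = 'E:\FS\MediaArchive_fs1', MAXSIZE = 500 GB),
    (NAME = MediaArchive_fs2, FILENAME = 'F:\FS\MediaArchive_fs2', MAXSIZE = 500 GB)
LOG ON
    (NAME = MediaArchive_log, FILENAME = 'D:\Log\MediaArchive.ldf');
GO

-- Storage scaling: add a third container on another drive later.
ALTER DATABASE MediaArchive
ADD FILE (NAME = MediaArchive_fs3, FILENAME = 'G:\FS\MediaArchive_fs3', MAXSIZE = 500 GB)
TO FILEGROUP MediaArchive_fs;
GO

-- Move the contents of one container to the others in the same filegroup,
-- for example before retiring a drive.
USE MediaArchive;
GO
DBCC SHRINKFILE (N'MediaArchive_fs1', EMPTYFILE);
```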
![Page 18: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/18.jpg)
Unstructured Data : Multiple containers
Use of multiple spindles for achieving better I/O Scalability
![Page 19: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/19.jpg)
RUDS Scale-up: FILESTREAM Perf/Scale
Improved performance of T-SQL and file I/O access
• Various enhancements to improve read/write throughput
• 5-fold increase in read throughput
• Linear scaling with a large number of concurrent threads
[Charts: two panels, each labeled 2012]
![Page 20: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/20.jpg)
Full Text Search Improvements in SQL Server 2012
Improved performance and scale:
• Scale-up to 350M documents
• iFTS query performance 7-10 times faster than in SQL Server 2008
• Worst-case iFTS query response times < 3 sec for corpus
• On par with or better than the main database search competitors
New functionality:
• Property search
• Customizable NEAR
• New word breakers: updated the existing word breakers, added Czech and Greek
Innovation in search: Semantic Similarity Search
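The two new query features named above can be sketched as follows. The table dbo.Docs, its key DocId, and its column Content are hypothetical; the sketch assumes Content is a VARBINARY(MAX) document column with a full-text index, and that the index is associated with a search property list that registers the Title property.

```sql
-- Customizable NEAR: both terms must occur within 5 terms of each other,
-- and TRUE requires them to appear in the order given.
SELECT DocId
FROM dbo.Docs
WHERE CONTAINS(Content, 'NEAR((performance, scale), 5, TRUE)');

-- Property search: match the term only inside the document's Title property
-- rather than anywhere in the document body.
SELECT DocId
FROM dbo.Docs
WHERE CONTAINS(PROPERTY(Content, 'Title'), 'spatial');
```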
![Page 21: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/21.jpg)
Full Text Search Performance & Scale Improvements
Architectural improvements
• Improved internal implementation
• Queries no longer block index updates
• Improved query plans: better plans for common queries
  • Fulltext predicate folding
  • Parallel plan execution
Index and query tested at scale on up to 350 million documents with < ~2 sec response
• ~3X better w/o DML and ~9X better with DML throughput
• Scale easily with increasing number of connections
![Page 22: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/22.jpg)
Scale-up: Full-Text Search
Queries over a 350M-document database with random DML running in the background. Beats SQL Server 2005 by a scale factor of more than 2x, with on average 60x better throughput.
[Chart series: 2012; 2005/8; 2005/8 vs 2012]
![Page 23: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/23.jpg)
Scale-up: Full-Text Search
Query avgExecTime (ms) under varying numbers of connections (50 to 2000 users) for a customer playback benchmark
[Chart series: 2012; 2005/8; 2005/8 vs 2012]
![Page 24: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/24.jpg)
Performance and Scale for Spatial Applications
• Support persisted computed spatial columns
• New geodetic SRID for faster calculations
• Improved implementation of operations
  • Faster spatial index creation for point data (4 to 5 times faster)
  • Faster point data queries
  • Optimized STBuffer, lower memory footprint
  • Faster “secondary” filter step
• Improved default spatial indexing scheme and new hints
  • AutoGrid
  • Query window grid density hint
• Spatial index compression
• Improved index-aware query plans
  • Nearest Neighbor
  • Optimized spatial query plan for STDistance and STIntersects like queries
![Page 25: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/25.jpg)
Support Persisted Computed Columns
Convert 2 columns (latitude, longitude) to geography
alter table MyTable
add geo as (geography::Point(lat, lon, 4326)) persisted
![Page 26: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/26.jpg)
Spatial Reference ID (SRID)
• Each spatial object has an associated SRID
• SRID is the “locale” for spatial objects; it determines
  • Coordinate system
  • Measurements
  • Projection semantics
  • Geoid dimensions
• Only objects of the same SRID can be combined in operations
• SRID for GEOMETRY (default: 0)
  • User-defined, no impact on operational semantics
• SRID for GEOGRAPHY (default: WGS 84)
  • Impacts operational semantics
  • 390 predefined SRIDs based on the European Petroleum Survey Group list: select * from sys.spatial_reference_systems
  • SQL Server 2012: we added the Microsoft-specified UnitSphere SRID 104001 for a spherical globe!
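To make the SRID rules concrete, here is a small sketch that looks up two reference systems and compares a distance under each. The coordinates are arbitrary sample points chosen for illustration, not values from the presentation.

```sql
-- Look up the default geography SRID (4326, WGS 84) and the new unit sphere.
SELECT spatial_reference_id, authority_name, unit_of_measure, well_known_text
FROM sys.spatial_reference_systems
WHERE spatial_reference_id IN (4326, 104001);

-- The same two latitude/longitude pairs under each SRID.
DECLARE @a4326 geography = geography::Point(47.6, -122.3, 4326);
DECLARE @b4326 geography = geography::Point(45.5, -122.7, 4326);
DECLARE @aUnit geography = geography::Point(47.6, -122.3, 104001);
DECLARE @bUnit geography = geography::Point(45.5, -122.7, 104001);

SELECT @a4326.STDistance(@b4326) AS DistanceWgs84,       -- meters on the WGS 84 ellipsoid
       @aUnit.STDistance(@bUnit) AS DistanceUnitSphere,  -- relative to a sphere of radius 1
       @a4326.STDistance(@bUnit) AS MixedSrid;           -- NULL: the SRIDs do not match
```

The last column shows the "same SRID" rule: operations across different SRIDs return NULL rather than raising an error.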
![Page 27: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/27.jpg)
Spatial Indexing Basics
In general, split predicates in two
• Primary filter finds all candidates, possibly with false positives (but never false negatives)
• Secondary filter removes false positives
The index provides our primary filter
Original predicate is our secondary filter
Some tweaks to this scheme
• Sometimes possible to skip secondary filter
[Diagram: objects A, B, C, D, E; primary filter (index lookup) keeps A, B, D; secondary filter (original predicate) keeps A, B]
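One way to observe the two filter stages is to compare the Filter() method, which may answer from the index alone, with the exact STIntersects() predicate. This is only a sketch: the table dbo.Parcels, its geometry column geom, and the query window are hypothetical, and a spatial index on geom is assumed.

```sql
-- Hypothetical query window in SRID 0 (the geometry default).
DECLARE @window geometry =
    geometry::STGeomFromText('POLYGON((0 0, 100 0, 100 100, 0 100, 0 0))', 0);

-- Primary filter only: with a spatial index available, Filter() can return
-- false positives but never misses a true match.
SELECT COUNT(*) AS Candidates
FROM dbo.Parcels AS p
WHERE p.geom.Filter(@window) = 1;

-- Primary plus secondary filter: the exact answer.
SELECT COUNT(*) AS Matches
FROM dbo.Parcels AS p
WHERE p.geom.STIntersects(@window) = 1;
```

The closer Matches is to Candidates, the less work the secondary filter has to do for this window.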
![Page 28: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/28.jpg)
Spatial index tessellation
Better and more continuous coverage
[Diagram: the same object tessellated with 64, 128, and 256 cells, distinguishing fully contained cells from partially contained cells]
![Page 29: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/29.jpg)
Auto Grid Spatial Index
New spatial index tessellations:
• geometry_auto_grid
• geography_auto_grid
Uses 8 grid levels instead of the previous 4
No GRIDS parameter needed (or available)
• Fixed at HLLLLLLL
Default number of cells per object:
• 8 for geometry
• 12 for geography
More stable performance
• for windows of different size
• for data with different spatial density
For default values:
• Up to 2x faster for longer queries > 500 ms
  • More efficient primary filter
  • Fewer rows returned
• 10 ms slower for very fast queries < 50 ms
  • Increased tessellation time, which is constant
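A sketch of creating an index with the new tessellation, assuming a hypothetical table dbo.Places. The CELLS_PER_OBJECT value simply restates the geography default given above; it can be omitted.

```sql
-- Hypothetical table; a spatial index requires a clustered primary key.
CREATE TABLE dbo.Places
(
    PlaceId INT IDENTITY PRIMARY KEY,
    Name    NVARCHAR(100) NOT NULL,
    Pos     GEOGRAPHY     NOT NULL
);

-- Auto grid: no GRIDS clause is specified, since the levels are fixed.
CREATE SPATIAL INDEX six_Places_Pos
ON dbo.Places (Pos)
USING GEOGRAPHY_AUTO_GRID
WITH (CELLS_PER_OBJECT = 12);
```

For a geometry column the counterpart is GEOMETRY_AUTO_GRID, which additionally needs a BOUNDING_BOX option.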
![Page 30: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/30.jpg)
Spatial Index Performance
New grid gives much more stable performance for query windows of different size
Better grid coverage gives fewer high peaks
![Page 31: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/31.jpg)
DEMO
Indexing and Performance
![Page 32: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/32.jpg)
Query window number of cells
Typical spatial query performance
Optimal value (theoretical) is somewhere between two extremes
[Chart label: time needed to process false positives]
Default values:
• 512 - Geometry AUTO grid
• 768 - Geography AUTO grid
• 1024 - MANUAL grids
SELECT * FROM table t WITH (SPATIAL_WINDOW_MAX_CELLS=256)
WHERE t.geom.STIntersects(@window)=1;
![Page 33: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/33.jpg)
Query Window Hinting (SQL Server 2012)
• SELECT * FROM table t WITH (SPATIAL_WINDOW_MAX_CELLS=1024)
  WHERE t.geom.STIntersects(@window)=1
• Used if an index is chosen (does not force an index)
• Overrides the default (512 for geometry, 768 for geography)
• Rule of thumb:
  • Higher value makes primary filter phase longer but reduces work in secondary filter phase
  • Set higher for dense spatial data
  • Set lower for sparse spatial data
![Page 34: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/34.jpg)
Query Hinting
demo
![Page 35: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/35.jpg)
Spatial Index Compression
CREATE SPATIAL INDEX idxGeog ON table(geography column)
USING GEOGRAPHY_GRID
WITH ( DATA_COMPRESSION = page | row );
On the basis of internal tests, with compression:
- 40%-50% smaller
- Queries range from 20% faster to 15% slower
- Per-partition compression setting is not supported.
![Page 36: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/36.jpg)
Additional Query Processing Support
• Index intersection
  • Enables efficient mixing of spatial and non-spatial predicates
• Matching
  • New in SQL Server 2012: Nearest Neighbor query
  • Distance queries: convert to STIntersects
  • Commutativity: a.STIntersects(b) = b.STIntersects(a)
  • Dual: a.STContains(b) = b.STWithin(a)
• Multiple spatial indexes on the same column
  • Various bounding boxes, granularities
• Outer references as window objects
  • Enables spatial join to use one index
![Page 37: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/37.jpg)
Spatial Nearest Neighbor
Main scenario
• Give me the closest 5 Italian restaurants
Execution plan
• SQL Server 2008/2008 R2: table scan
• SQL Server 2012: uses spatial index
Specific query pattern required
SELECT TOP(5) *
FROM Restaurants r
WHERE r.type = 'Italian'
  AND r.pos.STDistance(@me) IS NOT NULL
ORDER BY r.pos.STDistance(@me)
![Page 38: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/38.jpg)
Nearest Neighbor Performance in SQL Server 2012
demo
![Page 39: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/39.jpg)
Nearest Neighbor Performance
NN query vs best current workaround (sort all points in 10km radius)
*Average time for NN query is ~236ms
Find the closest 50 business points to a specific location (out of 22 million in total)
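The slide does not show the workaround query itself. One plausible shape for "sort all points in a 10 km radius" is sketched below as an assumption, with a hypothetical table dbo.Businesses, a geography column pos, and an arbitrary sample location.

```sql
-- Arbitrary sample location (latitude, longitude, SRID 4326).
DECLARE @me geography = geography::Point(47.6205, -122.3493, 4326);

-- Assumed pre-2012 workaround: restrict to a fixed 10 km radius so the
-- spatial index can be used, then sort only those candidates by distance.
SELECT TOP (50) b.*
FROM dbo.Businesses AS b
WHERE b.pos.STDistance(@me) <= 10000   -- meters for SRID 4326
ORDER BY b.pos.STDistance(@me);
```

The weakness of this approach is the fixed radius: too small and fewer than 50 rows come back, too large and many rows are sorted needlessly. The nearest neighbor pattern avoids having to pick one.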
![Page 40: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/40.jpg)
Spatial Tips on index settings
Some best practice recommendations (YMMV):
• Start out with the new default tessellation
• Point data: always use HIGH for all 4 levels. CELLS_PER_OBJECT is not relevant in this case.
• Simple, relatively consistent polygons: set all levels to LOW or MEDIUM, MEDIUM, LOW, LOW
• Very complex LineString or Polygon instances:
  • High number of CELLS_PER_OBJECT (often 8192 is best)
  • Setting all 4 levels to HIGH may be beneficial
• Polygons or line strings which have highly variable sizes: experimentation is needed.
• Rule of thumb for GEOGRAPHY: if MMMM is not working, try HHMM
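Two of these recommendations could be expressed as manual-grid indexes along the following lines. The tables dbo.Points and dbo.Regions and their geography columns are hypothetical, and each table is assumed to have a clustered primary key.

```sql
-- Point data: HIGH at all four levels.
CREATE SPATIAL INDEX six_Points_Pos
ON dbo.Points (Pos)
USING GEOGRAPHY_GRID
WITH (GRIDS = (LEVEL_1 = HIGH, LEVEL_2 = HIGH, LEVEL_3 = HIGH, LEVEL_4 = HIGH));

-- Very complex polygons: a high cell budget per object, here with the
-- HHMM densities suggested as a fallback for GEOGRAPHY.
CREATE SPATIAL INDEX six_Regions_Shape
ON dbo.Regions (Shape)
USING GEOGRAPHY_GRID
WITH (GRIDS = (LEVEL_1 = HIGH, LEVEL_2 = HIGH, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
      CELLS_PER_OBJECT = 8192);
```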
![Page 41: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/41.jpg)
What to do if my Spatial Query is slow?
• Make sure you are running SQL Server 2008 SP1, 2008 R2 or 2012
• Check query plan for use of index
• Make sure it is a supported operation
• Hint the index (and/or a different join type)
• Do not use a spatial index when there is a highly selective non-spatial predicate
• Run the index support procedure:
  • Assess effectiveness of primary filter (Primary_Filter_Efficiency)
  • Assess effectiveness of internal filter (Internal_Filter_Efficiency)
  • Redefine or define a new index with better characteristics
    • More appropriate bounding box for GEOMETRY
    • Better grid densities
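The diagnostic steps might look like this in practice, using the built-in procedure sp_help_spatial_geometry_index (a geography counterpart, sp_help_spatial_geography_index, also exists). The table dbo.Parcels, the index name six_Parcels_geom, and the sample window are hypothetical.

```sql
-- A representative query window to analyze against.
DECLARE @window geometry =
    geometry::STGeomFromText('POLYGON((0 0, 100 0, 100 100, 0 100, 0 0))', 0);

-- Reports index properties for the sample, including
-- Primary_Filter_Efficiency and Internal_Filter_Efficiency.
EXEC sp_help_spatial_geometry_index
     @tabname       = 'dbo.Parcels',
     @indexname     = 'six_Parcels_geom',
     @verboseoutput = 1,
     @query_sample  = @window;

-- Hinting the spatial index when the optimizer does not pick it.
SELECT p.*
FROM dbo.Parcels AS p WITH (INDEX(six_Parcels_geom))
WHERE p.geom.STIntersects(@window) = 1;
```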
![Page 42: SQL Server 2012 Beyond Relational Performance and Scale](https://reader038.vdocuments.site/reader038/viewer/2022102922/54be76d74a7959af118b456b/html5/thumbnails/42.jpg)
Related Content
Some Rich Unstructured Data Presentations (with further links):
http://www.slideshare.net/MichaelRys/sql-bits-brruds
http://www.slideshare.net/MichaelRys/filetable-and-semantic-search-in-sql-server-2012
http://www.sqlserverlaunch.com/WW/theater?sid=634
Some Spatial Presentations (with further links):
http://www.slideshare.net/MichaelRys/sqlbits-x-sql-server-2012-spatial
http://www.slideshare.net/MichaelRys/sqlbits-x-sql-server-2012-spatial-indexing
Forum: http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=1629&SiteID=1
Find Us Later At…
On Twitter: @SQLServerMike, @Spatial_Ed
Blogs: http://sqlblog.com/blogs/michael_rys, http://blogs.msdn.com/b/edkatibah/