san diego spug - bill baer is a senior technical product manager and microsoft certified master for...

Upload: jonah-skinner

Post on 22-Dec-2015


TRANSCRIPT

Page 1: San Diego SPUG -  Bill Baer is a Senior Technical Product Manager and Microsoft Certified Master for SharePoint in the SharePoint
Deep Dive Into Shredded Storage
Bill Baer, Microsoft
Chris Givens, Architecting Connected Systems

SPC416

Chris Givens, CEO ACS

@givenscj [email protected]

San Diego SPUG - www.sanspug.org

Chris’s Background
- BS Computer Science, Math, Business
- 5 years at IBM
- Microsoft Certified Trainer (MCT) since 2007
- CISSP, CCNP, JAVA, MCSD, SharePoint 4x
- CEO ACS, leading SharePoint courseware provider to Microsoft Certified Training centers
- Top selling titles in Development, BI and Search
- SharePoint Sr. Architect
  - eBay
  - General Atomics

Bill Baer is a Senior Technical Product Manager and Microsoft Certified Master for SharePoint in the SharePoint product group in Redmond, Washington. Having previously worked at Hewlett-Packard, he has a proven background in infrastructure engineering and enterprise deployments of SharePoint Products and Technologies. While at Hewlett-Packard, he was awarded the MVP award for his contributions in the Technology Solutions Group, now known as HP Enterprise Business, which encompasses server and storage hardware, technology consulting, and software sales.

Twitter @williambaer

LinkedIn /billbaer

TechNet /b/wbaer

Bill Baer (ˈbɛər)
Senior Product Marketing Manager

SharePoint Microsoft Corporation

www.wbaer.net

Agenda
- Structured and Unstructured data
- SharePoint Storage Architecture history
- Shredded Storage Overview
- Shredded Storage Components
- Testing your knowledge (be ready to vote!)
- Recommendations

Data
Structured, Unstructured

“on average 20% of data is structured, 80% is unstructured or semi-structured”

Unstructured Data
- No specific format or sequence
- Not tied to rules
- Unpredictable
- Examples: Text, Video, Audio, Images, Word, PowerPoint

Structured Data
- Organized in semantic chunks (entities)
- Tied to relationships and has attributes
- Associated with a defined schema
- All entities have the defined format
- Have a predefined length

Example: EDI

Data in SharePoint
- BLOB = Binary Large Object
- BLOB is the data stream associated with a file
- SharePoint file metadata and BLOBs are stored in SQL databases
- BLOBs do not participate in query operations
- Sample BLOB operations: Get, Put, Read range, etc.

SharePoint is built around the file
- Document libraries, Record Centers

BLOBs generally represent 80% of total content

SQL BLOBs
- Binary large objects stored in data tables (varbinary(MAX) in 2010 and 2013, image in 2007)
- Image was limited to 2GB
- Varbinary is virtually unlimited, but SharePoint still has a limit of 2GB in code

SQL BLOBs are the traditional method of storing and retrieving binary large objects with SharePoint

BLOB Storage Challenges
Storage
- SQL storage is usually more expensive
- SAN versus CAS stores

Performance
- Impacts load on SQL Server box

Policy requirements
- Expunge, BLOB immutability

Storage Evolution

SharePoint Storage History
SharePoint Portal Server 2001 (10.145.3941)
- Web Storage System

SharePoint Portal Server 2003 (11.0.5704.0)
- Relational Database Storage

SharePoint Server 2007
- External BLOB Storage (EBS)

SharePoint Server 2010
- Remote Blob Storage (RBS)

SharePoint Server 2013
- Shredded Storage (Awesome sauce)

External Blob Storage (EBS)
- Introduced in SharePoint 2007
- Runs parallel to the content database
- Requires COM interface (ISPExternalBinaryProvider) to coordinate data stores
- Uses simple semantics to recognize file Save and Open commands and invokes redirection calls to the BLOB store when it recognizes BLOB data streams.

Remote Blob Storage (RBS)
- RBS is the technology that allowed the file blobs to be saved outside of the database
- Feature of SQL Server

- Designed to delineate structured (metadata) and unstructured (BLOB data) data
- Allows for more optimized data tiering with solutions such as StorSimple (Microsoft owned company)
  - Commonly used files reside in memory/SSD
  - Rules push the data down to less expensive and less performant storage based on usage

Shredded Storage
Question?
- What is Shredded Storage?

Simple Answer
- A technology that breaks files apart into smaller chunks

Advanced Answer
- A platform for other higher level applications to take advantage of

Logic Behind The Technology
- Departure from storage of files as a monolithic stream or independently addressable BLOBs
- Distributes monolithic unstructured data into chunks
- Chunks are uniquely identified for recompiling data in monolithic stream to service user requests
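
The chunk-and-reassemble idea can be illustrated with a minimal Python sketch. It assumes fixed-size chunks and uses a position index plus a SHA-256 digest as the chunk identity. Both choices, and the names `shred` and `reassemble`, are illustrative assumptions and not SharePoint's actual algorithm (the knowledge-check slides later note that real shreds are not all the same size).

```python
import hashlib

def shred(data, chunk_size):
    """Split a byte stream into chunks, each tagged with its position and a digest."""
    chunks = []
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        piece = data[offset:offset + chunk_size]
        chunks.append((index, hashlib.sha256(piece).hexdigest(), piece))
    return chunks

def reassemble(chunks):
    """Rebuild the monolithic stream by concatenating chunks in position order."""
    ordered = sorted(chunks, key=lambda chunk: chunk[0])
    return b"".join(piece for _, _, piece in ordered)

# Usage: the chunks can arrive in any order and still recompile correctly.
original = bytes(range(256)) * 10
pieces = shred(original, 100)
restored = reassemble(list(reversed(pieces)))
```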

Goals

Shredded Storage Goals
- Reduce Storage
- Optimize Bandwidth
- Optimize File I/O
- Security

Reduce Storage
- Shredded storage will reduce storage requirements by only saving the parts of a file that have changed
- This prevents the entire file from being saved over and over again when versioning is enabled

In heavily collaborative and versioned environments, this storage saving is significant
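
A rough way to see the versioning saving is a toy store that keeps each distinct chunk once and records, per version, which chunks it needs. This is a sketch under stated assumptions: fixed-size chunks and hash-based chunk lookup are stand-ins chosen for brevity, and `VersionedShredStore` is a hypothetical name, not a SharePoint component.

```python
import hashlib

class VersionedShredStore:
    """Toy model of one versioned file whose versions share unchanged chunks."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.chunks = {}     # digest -> chunk bytes, each stored once
        self.versions = []   # one list of digests per saved version

    def save(self, data):
        """Save a new version and return how many bytes were newly written."""
        written = 0
        manifest = []
        for offset in range(0, len(data), self.chunk_size):
            piece = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(piece).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = piece
                written += len(piece)
            manifest.append(digest)
        self.versions.append(manifest)
        return written

    def read(self, version):
        """Rebuild the full content of a version (0-based index)."""
        return b"".join(self.chunks[d] for d in self.versions[version])

store = VersionedShredStore(chunk_size=4)
first_write = store.save(b"AAAABBBBCCCC")    # every chunk is new
second_write = store.save(b"AAAAXXXXCCCC")   # only the middle chunk changed
```

In this toy case the second version costs 4 bytes of new storage instead of 12, while both versions remain fully readable.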

Optimize Bandwidth
- Higher level applications (Office Client and Office Web Apps) take advantage of Shredded Storage and Cobalt’s bandwidth optimization
- Office clients will only save back the parts that have been changed, not the entire file
- Office Web Apps will request updates from multi-user sessions where parts of the file are locked and being edited

Optimize File I/O
- Reduce performance penalties related to partial file updates
- Smoother IO Patterns
- Ensure write costs are proportional to size of change

Two communication paths
- Client to WFE
- WFE to Database

WFE to Database
- Rather than send the entire file back to save, only the shreds that have changed are sent back for persistence
- This in turn generates smaller transaction log files on your database server (think disaster recovery)
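
To make "write cost proportional to the size of the change" concrete, the sketch below compares two versions of a file position by position and reports which shreds would need to be sent for persistence. It assumes fixed-size shreds at fixed offsets, which is a simplification, and `changed_shreds` is a hypothetical helper name.

```python
def changed_shreds(old, new, chunk_size):
    """Return the indexes of the fixed-size shreds that differ between two versions."""
    longest = max(len(old), len(new))
    count = -(-longest // chunk_size)  # ceiling division
    changed = []
    for i in range(count):
        start, end = i * chunk_size, (i + 1) * chunk_size
        if old[start:end] != new[start:end]:
            changed.append(i)
    return changed

old_version = b"a" * 10
new_version = b"a" * 5 + b"b" + b"a" * 4   # one byte edited in place
to_persist = changed_shreds(old_version, new_version, chunk_size=4)
bytes_sent = sum(len(new_version[i * 4:(i + 1) * 4]) for i in to_persist)
```

Here a one-byte edit to a 10-byte file results in a single 4-byte shred being written, not the whole file.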

Security
- Previous vulnerabilities to data leak are prevented using shredded storage!
- It is much more difficult to get a file out of the database using simple PowerShell scripts

And now… "Fort Knox"

[Diagram: Secure Shredded Store. Over SSL, Contoso Sales.pptx is split into 64KB chunks; each chunk is held as its own encrypted partition in AES256 Encrypted Storage.]

Customer Impact
Existing customers
- No downtime
- New and/or modified content pushed at runtime
- Asynchronous jobs update existing content

Securing content
- Data is chunked and encrypted uniquely on every chunk.
- Keys to chunks stored locally on ‘dedicated hardware’
- Access keys per customer/account refreshed regularly
- Certs and access data are stored separately from the data and require domain accounts specific to the customer

End <> End Security
Moving Data
- Strong SSL encryption for all Server/Client and Service/Service communications.

At Rest
- Network and domain isolation limits access to your environment
- BitLocker encryption guards against physical theft
- Secure Shredded Store guards against logical theft, encrypts individual blobs to limit the scope of access
- Secrets (the keys to the keys) are also encrypted-at-rest, held in a secure store, and updated frequently.
- MFA

Operational Access
- Time-bound approval to perform specific actions and access to customer data.
- Scoped access to only the minimal set of actions necessary for the task.
- Today, 10 engineers have standing access; we are driving this to zero.

Content Database Changes

Content Database Changes
Changes were needed to support Shredded Storage
- AllDocStreams -> DocStreams
- DocsToStreams (NEW)
- AllDocs
- AllDocVersions

- Non-relational binary large objects stored in dbo.DocStreams
- Enables logical transactional consistency between relational data and the associated non-relational file contents
- Smaller exaggerated storage utilization until T-log is purged

BLOB Sequence Numbers (BSN)
- BSNs are used to keep track of the sequence of each blob.
- BSN field (bigint) added to AllDocVersions, DocsToStreams and DocStreams tables.
- NextBSN stores the last BSN for each file.

Streams will be accessed from AllDocs/AllDocVersions -> DocsToStreams -> DocStreams.
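
The lookup path can be sketched with an in-memory SQLite database standing in for SQL Server. Only the table names and the `BSN`/`NextBSN` fields come from the slides; every other column (`DocId`, `Version`, `Position`, `Content`), the functions `save_version` and `read_version`, and the rule for reusing unchanged shreds are hypothetical simplifications. The real content database schema is considerably larger and is not documented here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AllDocs       (DocId TEXT PRIMARY KEY, NextBSN INTEGER);
    CREATE TABLE DocsToStreams (DocId TEXT, Version INTEGER, Position INTEGER, BSN INTEGER);
    CREATE TABLE DocStreams    (DocId TEXT, BSN INTEGER, Content BLOB);
""")

def save_version(conn, doc_id, version, shreds, previous=None):
    """Persist one version, writing only shreds that differ from `previous`.

    `previous` is the (BSN, content) list returned for the prior version.
    """
    row = conn.execute("SELECT NextBSN FROM AllDocs WHERE DocId = ?", (doc_id,)).fetchone()
    if row is None:
        conn.execute("INSERT INTO AllDocs VALUES (?, 0)", (doc_id,))
    last_bsn = row[0] if row else 0
    result = []
    for position, content in enumerate(shreds):
        if previous and position < len(previous) and previous[position][1] == content:
            bsn = previous[position][0]   # unchanged shred: point at the existing stream
        else:
            last_bsn += 1                 # changed or new shred: allocate the next BSN
            bsn = last_bsn
            conn.execute("INSERT INTO DocStreams VALUES (?, ?, ?)", (doc_id, bsn, content))
        conn.execute("INSERT INTO DocsToStreams VALUES (?, ?, ?, ?)",
                     (doc_id, version, position, bsn))
        result.append((bsn, content))
    conn.execute("UPDATE AllDocs SET NextBSN = ? WHERE DocId = ?", (last_bsn, doc_id))
    return result

def read_version(conn, doc_id, version):
    """Follow DocsToStreams -> DocStreams and concatenate the shreds in order."""
    rows = conn.execute(
        """SELECT s.Content
           FROM DocsToStreams m
           JOIN DocStreams s ON s.DocId = m.DocId AND s.BSN = m.BSN
           WHERE m.DocId = ? AND m.Version = ?
           ORDER BY m.Position""",
        (doc_id, version)).fetchall()
    return b"".join(r[0] for r in rows)

v1 = save_version(conn, "doc-1", 1, [b"AAAA", b"BBBB", b"CCCC"])
v2 = save_version(conn, "doc-1", 2, [b"AAAA", b"XXXX", b"CCCC"], previous=v1)
```

After the second save, `DocStreams` holds four rows (three from version 1 plus one replacement), and both versions can still be read back in full.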

Demo

Content Database Changes

Client Applications
Protocols, Impacts, Considerations

Overview (Windows Native SOAP Stack)

Windows Native SOAP Stack
- Design pattern based on wire contracts and loosely coupled systems
- Standards-based and interoperable

Consolidation of existing stacks
- MIG, WDPG (Windows)
- ATL Server (VC++)
- SOAP Toolkit (Office)
- SQL Server

WCF ‘light’
- Peer to WCF, not a replacement
- WCF does not layer on the Windows Native SOAP Stack
- Small, fast, minimal dependencies
- Windows layer 20

Win32 Developer APIs
- Public “Flat” C API
- No MFC, ATL, COM, C++

Why is Sapphire important?
Client operations
- Office Backstage
- Store, Share, Sync

FileWriteChunkSize versus FileReadChunkSize
- Download size versus partition size

CellStorage.svc
- WCF endpoint that manages download and upload of files to SharePoint
- Used by Office clients and OneDrive for Business
- API Layer that implements locking and coauth of documents

Demo

CellStorage.svc

Office Web Apps
OneNote.ashx
- Office documents are transformed into JSON arrays
- Each section of the document is distinguished and tracked separately
- This allows for the multi-user editing of OWA

Shredded Storage will create new partitions and shreds based on the users editing document sections

Demo

OneNote.ashx

Configuration

Configuration Parameters
FileWriteChunkSize
- The target size of the shreds of a file binary

FileReadChunkSize
- The size of the data returned from each Stored Procedure call to a file binary

FileWriteChunkSize
This value should not exceed 4MB
- Significant hit on I/O will occur

The value should not be set lower than 64K
- Optimal setting will be based on workload
- 1-4MB (Depending on performance testing, RBS, Dedup)
- OneDrive is set to 2MB
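
The bounds on this slide can be turned into a small pre-flight check. This is only a sketch of the guidance as stated: the 64KB and 4MB limits come from the slide, while `check_write_chunk_size` and its message wording are hypothetical and not part of any SharePoint API.

```python
KB = 1024
MB = 1024 * KB

def check_write_chunk_size(size_bytes):
    """Return a list of warnings for a proposed FileWriteChunkSize, per the slide guidance."""
    warnings = []
    if size_bytes < 64 * KB:
        warnings.append("below 64KB: the slides advise against going this low")
    if size_bytes > 4 * MB:
        warnings.append("above 4MB: the slides warn of a significant I/O hit")
    return warnings

ok = check_write_chunk_size(2 * MB)        # the value the slide quotes for OneDrive
too_small = check_write_chunk_size(32 * KB)
too_large = check_write_chunk_size(8 * MB)
```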

FileReadChunkSize
Recommendations (x = FileReadChunkSize as a percentage of average file size):
- x > 12.5% = normal operation
- 6% < x < 12.5% = 10% hit on read operations
- 3% < x < 6% = 20% hit on read operations
- x < 3% = 50% hit on read operations

Average size of out of box content database files is <64K

Beware: with too high a setting, OneDrive for Business will stop working
- ICsiError: csierrWebService_QuotaExceeded (0x662)

What is your average file size in your content databases?
- This average will drive your setting of FileReadChunkSize
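
The recommendation tiers above can be expressed as a simple lookup. The percentages are the ones quoted on the slide; how the exact boundary values (3%, 6%, 12.5%) are treated is an assumption, and `estimated_read_penalty` is a hypothetical helper for planning, not a measured model.

```python
KB = 1024
MB = 1024 * KB

def estimated_read_penalty(read_chunk_size, average_file_size):
    """Map FileReadChunkSize / average file size to the slide's estimated read hit (%)."""
    ratio = read_chunk_size / average_file_size
    if ratio > 0.125:
        return 0     # normal operation
    if ratio > 0.06:
        return 10
    if ratio > 0.03:
        return 20
    return 50

# Usage: a 64KB read chunk against content databases with different average file sizes.
penalties = {
    "256KB": estimated_read_penalty(64 * KB, 256 * KB),
    "1MB": estimated_read_penalty(64 * KB, 1 * MB),
    "2MB": estimated_read_penalty(64 * KB, 2 * MB),
    "4MB": estimated_read_penalty(64 * KB, 4 * MB),
}
```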

Shredded Storage Testing Framework

Shredded Storage Testing Framework
- Tool developed by Chris Givens with support from SharePoint ISVs
- http://shreddedstorage.codeplex.com/

Monitoring features include:
- Office Client to WFE
- WFE to SQL traffic
- SQL activities

Achieved with
- Fiddler integration
- SQL Profiler integration
- Supporting result tracking database

Demo

Shredded Storage Testing Framework

Knowledge Check

Question #1
Can You Disable Shredded Storage?

Answer #1
No, it cannot be disabled
- Setting the Write Chunk Size to 2GB will not disable it and will only cause performance issues
- Any other “unknown” means will destroy your farm

Question #2
TRUE or FALSE
If you set the File Write size to 128K, will the size of all the shreds be 128K except for the last?

Answer #2
FALSE
- The algorithms do not break up the shreds based solely on the File Write Chunk size
- In some cases the header and footer of the shred will be of varying size

Question #3
TRUE or FALSE
If you set the File Write size to X, will the size of all the shreds be less than X?

Answer #3
FALSE
- Similarly, the algorithms do not break up the shreds based solely on the File Write Chunk size
- The header and footer of the shred will be of varying size. This metadata does not count towards the File Write Chunk size

Question #4
Is a lower or higher FileReadChunkSize better for download speeds?

Answer #4
Higher
- Each chunk is retrieved with its own Stored Procedure call; the more calls you make, the more CPU and network activity is generated

But not TOO high
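
The reasoning behind this answer is simple arithmetic: the number of stored procedure round trips is roughly the file size divided by the read chunk size, rounded up. The sketch below shows that relationship only; `read_calls` is a hypothetical helper and ignores any per-call overhead or protocol detail.

```python
KB = 1024
MB = 1024 * KB

def read_calls(file_size, read_chunk_size):
    """Approximate number of stored procedure calls needed to read a whole file."""
    if read_chunk_size <= 0:
        raise ValueError("read_chunk_size must be positive")
    return -(-file_size // read_chunk_size)  # ceiling division

calls_small_chunks = read_calls(10 * MB, 64 * KB)   # many round trips
calls_large_chunks = read_calls(10 * MB, 1 * MB)    # far fewer round trips
```

For a 10MB file, moving from 64KB to 1MB reads cuts the call count from 160 to 10, which is why a higher value helps until it collides with the limits discussed elsewhere in the deck.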

Question #5
TRUE OR FALSE
Images in Word and PowerPoint files are broken into their own shreds

Answer #5
FALSE
- Images are not distinguished from other entities inside the Office file XML, therefore they are not shredded separately

Question #6
TRUE OR FALSE
Shredded Storage will apply to all instances of the same file binary (i.e., same file binary uploaded to multiple libraries)

Answer #6
FALSE
- Shredded storage works at an SPListItem level. Each time you upload a file to a document library, a new SPListItem is generated; therefore, there is no dedup across libraries
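
The per-item scoping can be pictured with a toy store whose shreds are keyed by the owning item as well as by content. This is an illustrative model only: `ContentDatabase`, the integer item ids, and the hash-based keys are assumptions made for the sketch and do not describe SharePoint internals.

```python
import hashlib

class ContentDatabase:
    """Toy store in which shreds belong to one list item and are never shared across items."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.shreds = {}        # (item_id, digest) -> chunk bytes
        self.next_item_id = 1

    def upload(self, data):
        """Each upload creates a new item, so its shreds get their own keys."""
        item_id = self.next_item_id
        self.next_item_id += 1
        for offset in range(0, len(data), self.chunk_size):
            piece = data[offset:offset + self.chunk_size]
            self.shreds[(item_id, hashlib.sha256(piece).hexdigest())] = piece
        return item_id

    def stored_bytes(self):
        return sum(len(piece) for piece in self.shreds.values())

db = ContentDatabase(chunk_size=64)
binary = bytes(range(200))
first_item = db.upload(binary)    # e.g. uploaded to one library
second_item = db.upload(binary)   # same binary uploaded to another library
```

Uploading the identical 200-byte binary twice stores 400 bytes in this model, mirroring the "no dedup across libraries" point.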

Question #7
TRUE OR FALSE
Changing the TITLE property of a versioned document will cause new shreds to get created

Answer #7
TRUE
- Even though the file does not change, a new set of shreds is created for the new file version!
- This is a side effect of the SharePoint platform and not a bug in shredded storage
- The Title property is special: SharePoint sets the property in the binary of the Office file upon modification (you didn’t change the file, but SharePoint did)

Question #8
What is the maximum FileWriteChunkSize at which everything still works?

Answer #8
8.25MB
- If you exceed this value, OneDrive for Business will error out with the following:
- ICsiError: csierrWebService_QuotaExceeded (0x662)

Recommendations
- Don’t modify FileWriteChunkSize without justification (keep less than 4MB)
- FileReadChunkSize should be proportional to your average file size (dependent on your workload)
- Test your RBS and storage vendor's hardware and software for acceptable performance

Summary
- Shredded Storage is AWESOME!
- Shredded Storage adds security to SharePoint
  - Fort Knox

- Read and Write chunk sizes will be different for workloads
- You cannot disable Shredded Storage
- Shredded Storage with the combination of RBS and File DeDup should always be tested for performance

Questions?
What do you want to know about Shredded Storage?

Events Evening Event – 7pm

Survey
- Don’t forget to fill out your survey!
- Session SPC416

Contact
Bill Baer
- Twitter: @williambaer
- Email: [email protected]

Chris Givens
- Twitter: @givenscj
- Email: [email protected]

© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.