Russ Houberg, Senior Technical Architect, MCM, KnowledgeLake, Inc.


Post on 24-Dec-2015


Architecting for Scale in SharePoint 2010
Russ Houberg
Senior Technical Architect, MCM
KnowledgeLake, Inc.

• Storage Architecture
• SQL Tuning Tidbits
• Remote BLOB Storage (Demo)
• Performance and Control
• Scalable Taxonomy Design (Demo)
• Search… A Complete Story
• The Big Picture: 10 million, 100 million, A BILLION Documents…

Scaling SP2010 from the Ground Up

Storage Architecture can make or break SharePoint Performance
• Poor storage performance can tank the whole SharePoint farm!

Can Be Tough to Estimate
• Use an extendable storage platform if possible

Wider is Better
• More spindles are always better than higher GB
• Avoid using a small number of large disks to increase storage capacity

Storage Architecture

TempDB, Search DBs, Content DBs
• Multiple Data Files in Primary File Group
• # Files = ½ to ¼ of CPU Cores | <= CPU Cores
• Separate to unique spindle sets if possible

• Pre-Allocate all Data Files, Including TempDB
• Estimate Projected DB Size and Divide by # Files to get the pre-allocation size for each file

• Leave “AutoGrow” enabled, but don’t rely on it
• Pre-Allocation to prevent AutoGrow
• Set AutoGrow to 10% or logical MB/GB value based on projected database Size
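
The file-count and pre-allocation rules above reduce to simple arithmetic. A minimal Python sketch, assuming a hypothetical 16-core SQL Server and a projected 400 GB database (both numbers are illustrative and not from the deck; `data_file_plan` is a made-up helper name):

```python
def data_file_plan(cpu_cores, projected_db_gb):
    """Apply the slide's rule of thumb: # files = 1/4 to 1/2 of CPU cores,
    never more than the core count, then divide the projected size evenly."""
    low = max(1, cpu_cores // 4)
    high = max(1, min(cpu_cores // 2, cpu_cores))
    plan = []
    for files in (low, high):
        # Pre-allocation size per file = projected DB size / number of files.
        plan.append({"files": files, "gb_per_file": projected_db_gb / files})
    return plan


# Illustrative inputs only (assumed, not taken from the presentation).
for option in data_file_plan(cpu_cores=16, projected_db_gb=400):
    print(option["files"], "files,", option["gb_per_file"], "GB pre-allocated per file")
```

With these assumed inputs the two ends of the range are 4 files of 100 GB each or 8 files of 50 GB each. AutoGrow stays enabled as a safety net either way.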

Storage Architecture

Data / Log File Spindle Priority

Storage Architecture

Priority  DB File                    RAID       IOPS          Optimization
1         TempDB Data                RAID 10    2 IOPS/GB     Write
2         TempDB Log                 RAID 10    2 IOPS/GB     Write
3         Content DB Log             RAID 10    2 IOPS/GB     Write
4         Crawl DB Log               RAID 10    2 IOPS/GB     Write
5         Crawl DB Data              RAID 10    2 IOPS/GB     Read/Write
6         Property DB Log            RAID 10    2 IOPS/GB     Write
7         Property DB Data           RAID 10    2 IOPS/GB     Read/Write
8         Services DB Log            RAID 10    2 IOPS/GB     Write
9         Services DB Data           [Depends]  [Depends]     [Depends]
10        Content DB Data (Collab)   RAID 10    0.75 IOPS/GB  Read/Write
11        Content DB Data (Archive)  RAID 5     0.75 IOPS/GB  Read
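
The IOPS/GB targets in the table translate directly into a per-volume IOPS budget. A short Python sketch, assuming hypothetical volume sizes (the 50 GB and 200 GB figures are illustrative; only the IOPS/GB ratios come from the table, and `required_iops` is a made-up helper name):

```python
# IOPS-per-GB targets copied from the priority table above.
# "Services DB Data" is omitted because the table lists it as [Depends].
IOPS_PER_GB = {
    "TempDB Data": 2.0,
    "TempDB Log": 2.0,
    "Content DB Log": 2.0,
    "Crawl DB Log": 2.0,
    "Crawl DB Data": 2.0,
    "Property DB Log": 2.0,
    "Property DB Data": 2.0,
    "Services DB Log": 2.0,
    "Content DB Data (Collab)": 0.75,
    "Content DB Data (Archive)": 0.75,
}


def required_iops(db_file, size_gb):
    """Return the IOPS the spindle set should deliver for one file type."""
    return IOPS_PER_GB[db_file] * size_gb


# Assumed sizes, for illustration only.
print(required_iops("TempDB Data", 50))                 # 2 IOPS/GB * 50 GB
print(required_iops("Content DB Data (Collab)", 200))   # 0.75 IOPS/GB * 200 GB
```

Under these assumptions a 50 GB TempDB data volume needs about 100 IOPS and a 200 GB collaboration content database about 150 IOPS.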

SQL Instant File Initialization
• Run SQL As Domain User with either…
  - Local Admin
  - Grant “Perform Volume Maintenance Tasks”

• TempDB Pre-Allocation to 10% of Largest DB
• SAN vs DAS vs NAS (Don’t Overshare!)
• Host Bus Adapter (HBA) Configuration
• NTFS Allocation Unit Size: 64K
• Enable Locked Pages in Memory (SQL Std.)
• Don’t skimp on RAM!

SQL Tuning Tidbits

Remote BLOB Storage (RBS)
• By default SharePoint stores Binary Large Objects (BLOBs) in the content database
• When enabled… Intercepts binary content (documents) and sends them to a BLOB store
• Microsoft provides the “local” FILESTREAM provider to allow for usage of the SQL Server local NTFS file system as a BLOB store.

RBS Background

Remote BLOB Storage

SharePoint 2003
What’s this ECM thing?
- Interesting workarounds
- API access was problematic

SharePoint 2007
SP1 Brings us EBS Provider
- BLOBs are orphaned during edit/save
- Orphan cleanup is resource intensive
- Externalization happens on the WFE (reduced RPS)
- Future support of EBS API is not guaranteed

SharePoint 2010
Long Live RBS
- Transactional consistency supports “VETO”
- Transactional consistency allows for UPDATE
- Orphan cleanup uses SQL Indexes
- Transparent to the SharePoint API
- RBS is the best option for future support

Remote BLOB Storage

[Diagram: RBS save flow. Components shown: SharePoint WFE, SharePoint Object Model, Relational Access, RBS Client Library, BLOB Store Provider Library, BLOB Store, SQL Server, ContentDB, ConfigDB]

1. Save Request
2. Enforce Business Logic
3. Save Blob
4. Write Blob
5. Return BLOB ID
6. Save Metadata & BLOB ID
7. Back to User
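
The seven steps of the save flow can be walked through with a toy model. This is only an in-memory Python sketch: `BlobStore`, `ContentDB`, and `save_document` are hypothetical names, the "business logic" check is an invented placeholder, and none of this is the real RBS client library API.

```python
import uuid


class BlobStore:
    """Stand-in for the external BLOB store behind the provider library."""

    def __init__(self):
        self._blobs = {}

    def write(self, data):
        blob_id = str(uuid.uuid4())
        self._blobs[blob_id] = data          # 4. Write Blob
        return blob_id                       # 5. Return BLOB ID

    def read(self, blob_id):
        return self._blobs[blob_id]


class ContentDB:
    """Stand-in for the content database: metadata plus a BLOB ID, no bytes."""

    def __init__(self):
        self.rows = {}

    def save(self, name, metadata, blob_id):
        self.rows[name] = {"metadata": metadata, "blob_id": blob_id}


def save_document(name, data, metadata, blob_store, content_db):
    # 1. Save Request arrives from the user.
    # 2. Enforce Business Logic (placeholder rule, assumed for illustration).
    if not data:
        raise ValueError("empty documents are rejected")
    # 3. Save Blob: hand the bytes to the BLOB store instead of the database.
    blob_id = blob_store.write(data)
    # 6. Save Metadata & BLOB ID in the content database.
    content_db.save(name, metadata, blob_id)
    # 7. Back to User.
    return blob_id


store, db = BlobStore(), ContentDB()
blob_id = save_document("contract.pdf", b"%PDF...", {"Year": 2010}, store, db)
print(db.rows["contract.pdf"]["blob_id"] == blob_id)
```

The point of the sketch is the split: the content database row carries only metadata and the BLOB ID, while the bytes live in the store.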

SQL Server 2008 R2
• Any Edition, even SQL Express R2

FILESTREAM RBS Provider (Current Version)
• http://go.microsoft.com/fwlink/?LinkId=177388

RBS Requirements

The FILESTREAM provider is supported by SharePoint Server 2010 only when it is used with SQL Server 2008 R2 or SQL Server 2008 R2 Express.
• Only “local commodity storage” (hard drive) is supported.
• Direct Attached Storage (DAS), Network Attached Storage (NAS), and Storage Area Network (SAN) are all considered to be “remote commodity storage” and are not supported by SharePoint 2010.

Any other 3rd Party RBS Provider is considered to be a “remote server” provider and SharePoint 2010 licensing requires that SQL Server 2008 R2 Enterprise Edition be implemented.

RBS Licensing and Limitations

Remote BLOB Storage

demo…

Performance and Control

SharePoint 2003
- Column Indexes were not possible
- Database Indexes were not supported

SharePoint 2007
- Column Indexes (10) could be configured via the UI
- End users could impact performance with poorly performing list views

SharePoint 2010
- Database optimizations allow far more items in a list
- Support for (20) Multi-Column Indexes
- Resource intensive operations can be limited or disallowed during production hours
  • Large query thresholds
  • Blocking Operations
  • Can be overridden via the Object Model
  • Can configure an unblocked “window”

SP2010 Boundaries – Now More Stuff!!!
• 30 Million Documents/Items in a List
• 5000 Item View/Query Result Size (Default for a reason)
• 100 Million Items in SharePoint Server 2010 Search
• 1 BILLION Items in FAST For SharePoint 2010 Index
• 250,000 Site Collections per Web Application
• 200GB Content DB Size (SOFT LIMIT)
  - Recommended for Collaboration content or Fast Backup/Restore SLA
• Content DB sizes up to 1TB are SUPPORTED for large single-site repositories and archives of non-collaborative content!
  - That’s 150 Million items in a single Site Collection in a single Content Database with RBS enabled (avg. 7KB metadata row)
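
The 150 Million figure follows from dividing the 1TB content database limit by the 7KB average metadata row. A quick Python check, assuming binary units (1TB = 1024^4 bytes, 1KB = 1024 bytes) and assuming that with RBS enabled the database holds essentially only metadata rows:

```python
TB = 1024 ** 4   # bytes in 1TB, binary units assumed
KB = 1024        # bytes in 1KB


def max_items(content_db_bytes, avg_row_bytes):
    """How many metadata rows of the given average size fit in the database."""
    return content_db_bytes // avg_row_bytes


items = max_items(1 * TB, 7 * KB)
print(f"{items:,} items")   # a little over 150 million
```

The result lands slightly above 150 million, which matches the rounded number on the slide. Indexes and other database overhead are ignored here, so treat it as an upper bound.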

Scalable Taxonomy Design

Enabling 100 Million
• Place large Collaboration Site Collections (20GB+) in their own content database
• Break Up Archive/Records Site Collections by Year or, if necessary, Content Type and Year
• AVOID Item Level ACLs!!!
  - Release to Metadata Based Folder Structures as a workaround
• Use Content Type Syndication to facilitate multiple Site Collections of the same type
• Use Content Organizer as a “Drop Zone”
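
The "break up by Year or Content Type and Year" rule amounts to a routing function from document metadata to a site collection. A minimal Python sketch of that idea follows; the `/sites/archive-...` paths and the `target_site_collection` helper are hypothetical, and this is not how Content Organizer rules are actually configured.

```python
from datetime import date


def target_site_collection(content_type, doc_date, split_by_content_type=False):
    """Pick a site collection path for an archived document.

    By default documents are partitioned by year only; pass
    split_by_content_type=True when a single year is too large.
    """
    if split_by_content_type:
        slug = content_type.lower().replace(" ", "-")
        return f"/sites/archive-{slug}-{doc_date.year}"
    return f"/sites/archive-{doc_date.year}"


# Illustrative documents (assumed values).
print(target_site_collection("Invoice", date(2010, 3, 15)))
print(target_site_collection("Purchase Order", date(2009, 11, 2), split_by_content_type=True))
```

Because each year (or content type and year) maps to its own site collection, each can live in its own content database and stay within the size limits.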

Scalable Taxonomy Design

Content Organization

demo…

Search… A Complete Story

SharePoint 2003
- WSS CAML Only
- SPS Shared Services yielded decent full text results

SharePoint 2007
- WSS 3.0 SiteDataQuery allowed search across lists/sites
- MOSS Search added Managed Properties
- FAST ESP for SharePoint was a late player

SharePoint 2010
- Microsoft SharePoint Foundation Search
  Site Collection Scope | No Redundancy | 10 Million
- Microsoft Search Server Express 2010
  Extended Features | No Redundancy | 10 Million
- Microsoft SharePoint 2010 Search / Search Server
  Extended Features | Scale Out | Redundancy | 100 Million
- Microsoft FAST Search Server 2010 for SharePoint
  Extreme Scale | Redundancy | Doc Processing Pipeline
  1 Billion documents! (per farm)
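
The item limits listed for the SharePoint 2010 search products can be read as a lookup from corpus size to viable product. A small Python sketch, using only the limits stated on the slide (the `candidate_products` helper is a made-up name, and real sizing depends on more than item count):

```python
# (product, approximate item limit) as listed on the slide.
SEARCH_TIERS = [
    ("Microsoft SharePoint Foundation Search", 10_000_000),
    ("Microsoft Search Server Express 2010", 10_000_000),
    ("Microsoft SharePoint 2010 Search / Search Server", 100_000_000),
    ("Microsoft FAST Search Server 2010 for SharePoint", 1_000_000_000),
]


def candidate_products(item_count):
    """Return the products whose stated limit covers the given corpus size."""
    return [name for name, limit in SEARCH_TIERS if item_count <= limit]


# Example: a hypothetical 50 million item corpus.
for name in candidate_products(50_000_000):
    print(name)
```

For the assumed 50 million items only SharePoint 2010 Search / Search Server and FAST Search Server remain as candidates.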

SharePoint Server 2010 / Search Server
• Multiple Crawl Servers (Scale Out/Redundancy)
• Crawl Servers comprised of stateless Crawlers
• Multiple Crawlers improve crawl performance
• Multiple Crawl DBs support more Crawlers
• Crawl DB is separated from Property DB
• Index is comprised of multiple Index Partitions that can be mirrored on different Query Servers
• Multiple Index Partitions improve Query Performance

Search… A Complete Story

Cool… What can it do?

Search… A Complete Story

FAST Search Server 2010 for SharePoint
• Extreme Scale and Performance
• Custom Relevancy and Navigation Tuning
• Tune Performance for content volume, query volume, crawl pipeline performance and query speed
• Uses SharePoint 2010 Query Servers
• Bolt on FAST Servers for additional processing
• Add server ROWS for query performance and high availability or COLUMNS for crawl performance
• Can scale to support 1 Billion items!

Search… A Complete Story

10 million, 100 million, 1 Billion

• Storage is the KEY to Performance
• RBS reduces Content DB Size and facilitates large repositories
• SharePoint governs end-user operations
• Content Type Publishing and Content Organization help balance database loading
• Search solutions now handle the entire range of corpus possibilities
• 10 million is easy, 100 million can be done, 1 BILLION is possible!

In Review…

http://www.houberg.net

@rhouberg

http://www.knowledgelake.com/resources

More…