Module 4
Designing Databases for Optimal Performance
Module Overview
• Guidelines for Designing Indexes
• Designing a Partitioning Strategy
• Designing a Plan Guide
• Designing Scalable Databases
Lesson 1: Guidelines for Designing Indexes
• Guidelines for Selecting a Clustered Index
• Guidelines for Selecting a Nonclustered Index
• Guidelines for Selecting a Filtered Index
• Guidelines for Selecting a Computed Column Index
• Guidelines for Selecting a Strategy for Index Compression
• Discussion: Using Indexing
• Create a clustered index on the frequently used columns
• Consider clustered index data types and column widths
• Consider the frequency of data changes
Guidelines for Selecting a Clustered Index
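One reason key width matters: when a table has a clustered index, each nonclustered index row stores the clustering key as its row locator, so a wide key is repeated in every nonclustered index. The following is a rough Python estimate, not a sizing tool. The row count and index count are made-up figures, and page and row overhead are ignored.

```python
# Rough estimate of the space a clustering key adds to nonclustered indexes.
# Assumption: each nonclustered index row stores one copy of the clustering
# key; page and row overhead are ignored, so real sizes will differ.

KEY_WIDTH_BYTES = {"int": 4, "bigint": 8, "uniqueidentifier": 16}


def locator_bytes(row_count, nonclustered_indexes, key_type):
    """Bytes spent repeating the clustering key across nonclustered indexes."""
    return row_count * nonclustered_indexes * KEY_WIDTH_BYTES[key_type]


rows, indexes = 10_000_000, 5  # hypothetical table
for key_type in ("int", "uniqueidentifier"):
    mib = locator_bytes(rows, indexes, key_type) / 2**20
    print(f"{key_type}: about {mib:.0f} MiB of clustering-key copies")
```

Under these assumptions, the 16-byte key costs four times as much as the 4-byte key across the same set of nonclustered indexes.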
[Figure: nonclustered index structure. The sys.sysindexes row (id, indid = 2, root) points to the root page (Page 12); the non-leaf level (Pages 37 and 28) points to the leaf-level pages (Pages 41, 51, 61, and 71), which hold key values such as Akers, Martin, and Smith together with row locators such as 4:706:01.]
Guidelines for Selecting a Nonclustered Index
• Consider performance gain versus maintenance cost
• Index on frequently used search arguments
• Consider nonclustered indexes for columns with high selectivity
• Consider placing nonclustered indexes on foreign key columns
• Choose a nonclustered index to cover the query
• Consider using included columns
• Consider using sys.sysindexes to gather information about an index
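The selectivity guideline can be made concrete with a small calculation. One common way to express selectivity is the ratio of distinct values to total rows. The sketch below uses that definition with made-up sample columns; it is an illustration in Python, not a query against SQL Server statistics.

```python
def selectivity(values):
    """Ratio of distinct values to total rows; 1.0 means every row is unique."""
    values = list(values)
    if not values:
        raise ValueError("selectivity is undefined for an empty column")
    return len(set(values)) / len(values)


# Hypothetical sample columns
order_ids = list(range(1, 1001))    # unique per row
status = ["open", "closed"] * 500   # only two distinct values

print(selectivity(order_ids))  # 1.0
print(selectivity(status))     # 0.002
```

A column like the hypothetical `status` returns half of the table for any single value, so a nonclustered index on it alone is unlikely to pay for its maintenance cost.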
• Create filtered indexes for heterogeneous data
• Create filtered indexes for subsets of data
• Use filtered indexes when columns contain well-defined subsets of data
• Include a small number of key or included columns in a filtered index definition
• Compare indexed views with filtered indexes
• When the filter predicate needs a data conversion, place the conversion operator (CAST or CONVERT) on the right side of the comparison
• Use referencing dependencies
Guidelines for Selecting a Filtered Index
• Assess benefits for common or important queries
• Reference only values of other columns in the same row in the computed column expression
• Assess performance cost against performance gain
• Choose a deterministic and precise computed column expression
• Use CLR functions in computed columns to restrict access
Guidelines for Selecting a Computed Column Index
• Compress nonclustered indexes individually
• Rebuild all the nonclustered indexes on the table to compress a heap
• Enable or disable ROW or PAGE compression online or offline
• Non-leaf-level pages do not receive page compression when compressing indexes
• Data compression is not available for data that is stored separately from the row, such as large-value data
• Avoid specifying out-of-range partitions
• Rebuild a heap to compress new pages allocated to the heap
• For individual partitions, set the compression type to NONE and for a list of partitions, set the type to ROW
• Compress tables with row size less than 8,060 bytes
Guidelines for Selecting a Strategy for Index Compression
• Is it necessary for every table to have a clustered index? Justify your answer.
• An Orders table has a clustered index on the InvoiceNumber (int) column. The most frequently executed queries use search arguments (SARGs) on the OrderDate (datetime) column. A nonclustered index has been created on the OrderDate column. What are the advantages and disadvantages of this clustered index?
Discussion: Using Indexing
• Overview of Partitioning
• Guidelines for Planning Partitioned Tables and Indexes
• Designing Partitions to Manage Subsets of Data
• Designing Partitions to Improve Query Performance
• Special Guidelines for Partitioned Indexes
• Discussion: Using Partitioning
Lesson 2: Designing a Partitioning Strategy
Overview of Partitioning
Partitioning helps to break a large table into multiple physical files without compromising the integrity or structure of the database
Advantages of Partitioning
• Partitioning makes large tables or indexes more manageable
• Partitioned tables and indexes support the same design and query features as standard tables and indexes
• Maintenance operations performed on subsets of data can be performed more efficiently
• Partitioning a table or index might improve query performance
When to Implement Partitioning
Implement partitioning when:
• The table contains, or is expected to contain, data that is used in different ways
• Queries or updates against the table are not performing as intended
• Maintenance costs exceed predefined maintenance periods
Guidelines for Planning Partitioned Tables and Indexes
Partition function
Defines how the rows of a table or index are mapped to partitions, based on the values of the partitioning column
Partition scheme
Maps each partition specified by the partition function to a filegroup
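The two objects can be modeled in a few lines. The sketch below is a Python model, not T-SQL: it imitates how the boundary values of `CREATE PARTITION FUNCTION ... AS RANGE LEFT | RIGHT FOR VALUES (...)` assign a value to a partition number, and how a partition scheme then maps that number to a filegroup. The boundary dates and filegroup names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import date


def partition_number(value, boundaries, range_side="LEFT"):
    """Return the 1-based partition for value, given sorted boundary values.

    RANGE LEFT:  a boundary value is the last value of the partition on its left.
    RANGE RIGHT: a boundary value is the first value of the partition on its right.
    """
    if range_side == "LEFT":
        return bisect_left(boundaries, value) + 1
    if range_side == "RIGHT":
        return bisect_right(boundaries, value) + 1
    raise ValueError("range_side must be 'LEFT' or 'RIGHT'")


def filegroup_for(value, boundaries, filegroups, range_side="LEFT"):
    """Model of a partition scheme: partition number -> filegroup name."""
    if len(filegroups) != len(boundaries) + 1:
        raise ValueError("n boundary values define n + 1 partitions")
    return filegroups[partition_number(value, boundaries, range_side) - 1]


boundaries = [date(2007, 1, 1), date(2008, 1, 1)]  # hypothetical boundaries
filegroups = ["FG_2006", "FG_2007", "FG_2008"]     # hypothetical names

print(partition_number(date(2007, 1, 1), boundaries, "LEFT"))             # 1
print(partition_number(date(2007, 1, 1), boundaries, "RIGHT"))            # 2
print(filegroup_for(date(2007, 6, 30), boundaries, filegroups, "RIGHT"))  # FG_2007
```

With RANGE RIGHT and boundaries on the first day of each year, the boundary value itself falls into the newer partition, so each partition holds exactly one calendar year. With RANGE LEFT, the first instant of the year would land in the previous year's partition.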
Designing Partitions to Manage Subsets of Data
Adding a table as a partition to an already existing partitioned table
Switching a partition from one partitioned table to another
Removing a partition to form a single table
Partitioning for Join Queries
Taking Advantage of Multiple Disk Drives
Controlling Lock Escalation Behavior
Designing Partitions to Improve Query Performance
• Partitioning Clustered Indexes
• Partitioning Nonclustered Indexes
• Memory Limitations and Partitioned Indexes
• Partitioning Unique Indexes
Special Guidelines for Partitioned Indexes
Discussion: Using Partitioning
• What problems does table partitioning solve? How?
• Explain how to create a partitioned table, identifying the T-SQL objects and statements involved
• Overview of Plan Guide
• Guidelines for Designing Plan Guides
• Designing Plan Guides for Parameterized Queries
• Discussion: Using Plan Guides
Lesson 3: Designing a Plan Guide
Plan guides in SQL Server are useful when a small subset of queries in a database application deployed from a third-party vendor is not performing as expected.
Plan guides influence optimization of queries by attaching query hints or a fixed query plan to them
Types of plan guides include:
• Object plan guide
• SQL plan guide
• Template plan guide
Overview of Plan Guide
• Attach query hints to a plan guide
• Attach a query plan to a plan guide
• Follow the plan guide matching requirements so that the statement text matches the guide
• Evaluate the plan guide effect on the plan cache
Guidelines for Designing Plan Guides
To obtain the parameterized form of a query and create a plan guide on it, perform the following steps:
1. Obtain the parameterized form of the query by executing the sp_get_query_template stored procedure
2. Create a plan guide of type TEMPLATE to force parameterization, if the query is not already being parameterized by SQL Server through sp_executesql or the PARAMETERIZATION FORCED database SET option
3. Create a plan guide of type SQL on the parameterized query
Designing Plan Guides for Parameterized Queries
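To see what "the parameterized form of a query" means in step 1, the toy Python sketch below replaces literals with numbered placeholders. It only imitates the idea: in SQL Server the real work is done by sp_get_query_template, and the regular expression, the `@0`, `@1` placeholder naming, and the sample queries here are assumptions of the sketch.

```python
import re

# Toy literal matcher: single-quoted strings (with '' escapes) or plain numbers.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Replace each literal with a numbered placeholder; return (template, literals)."""
    literals = []

    def _swap(match):
        literals.append(match.group(0))
        return "@" + str(len(literals) - 1)

    return _LITERAL.sub(_swap, sql), literals


# Hypothetical queries that differ only in their literal values
q1 = "SELECT * FROM Sales.Orders WHERE OrderID = 45639 AND Status = 'open'"
q2 = "SELECT * FROM Sales.Orders WHERE OrderID = 71 AND Status = 'closed'"

template, literals = parameterize(q1)
print(template)                         # SELECT * FROM Sales.Orders WHERE OrderID = @0 AND Status = @1
print(literals)                         # ['45639', "'open'"]
print(parameterize(q2)[0] == template)  # True
```

Both queries reduce to the same template, which is why a single plan guide created on the parameterized form can apply to every variation of the literal values.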
• What problems do plan guides solve? How?
Discussion: Using Plan Guides
• Guidelines for Scaling-Out Databases
• Overview of Federated Databases
• Selecting Federated Databases
• Overview of Scalable Shared Databases
• Guidelines for Selecting Scalable Shared Databases
• Overview of Replication
• Guidelines for Selecting Replication
• Overview of Database Mirroring
• Guidelines for Selecting Database Mirroring
• Discussion: Using Scalable Databases
Lesson 4: Designing Scalable Databases
Scale out to multiple database servers and instances
Scale out with redundancy
Scale up for improved performance
Guidelines for Scaling-Out Databases
SQL Server shares the database processing load across a group of servers that process database requests cooperatively. This cooperative group of servers is called a federation.
Single Server Tier
• There is one instance of SQL Server on the production server.
• The production data is stored in one database.
• Each table is typically a single entity.
• All connections are made to the single server, and all SQL statements are processed by the same instance of SQL Server.
Federated Server Tier
• There is one instance of SQL Server on each member server.
• Each member server has a member database, containing a copy of each table, with only the data relevant to that site.
• Distributed partitioned views are used to make it appear as if there was a full copy of the original table on each member server.
• The application layer must be able to direct the SQL statements to the member server that contains most of the data referenced by the statement.
Overview of Federated Databases
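The routing requirement on the application layer can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical federation in which each member server owns a contiguous range of CustomerID values; the server names and range boundaries are invented for the example.

```python
from bisect import bisect_right

# Hypothetical federation: (first CustomerID owned, member server name).
MEMBERS = [(1, "Server1"), (33000, "Server2"), (66000, "Server3")]
_STARTS = [start for start, _ in MEMBERS]


def member_for(customer_id):
    """Return the member server that owns a single CustomerID."""
    if customer_id < _STARTS[0]:
        raise ValueError("no member server owns this CustomerID")
    return MEMBERS[bisect_right(_STARTS, customer_id) - 1][1]


def route(customer_ids):
    """Pick the member server that holds most of the keys a statement references."""
    counts = {}
    for customer_id in customer_ids:
        server = member_for(customer_id)
        counts[server] = counts.get(server, 0) + 1
    if not counts:
        raise ValueError("the statement references no keys")
    return max(counts, key=counts.get)


print(member_for(33000))          # Server2
print(route([10, 40000, 45000]))  # Server2
```

Sending the statement to the server that already holds most of the referenced rows keeps the work local and leaves only the remaining rows to be fetched from other members through the distributed partitioned view.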
Symmetric Partitions
Symmetric partitions are effective when:
• Related data is put on the same member server
• Data is partitioned uniformly across the member servers
Asymmetric Partitions
Asymmetric partitions can:
• Improve the performance of databases that cannot be symmetrically partitioned
• Partition a large, existing system by using a series of iterative, asymmetric improvements
Distributed Partitioned Views
To use distributed partitioned views, consider the:
• Pattern of SQL statements executed by an application
• Relationships of the tables
• Frequency of SQL statements against the partitions
• SQL statement routing rules
Selecting Federated Databases
Scalable shared databases let you attach a read-only reporting database to multiple server instances over a storage area network (SAN)
Benefits
• Allows workload scale-out on reporting databases by using commodity servers and hardware
• Provides workload isolation
• Ensures identical views of reporting data from all servers
Limitations
• The database must be on a read-only volume
• The data files can be accessed only over a SAN
• The databases do not support database snapshots
Overview of Scalable Shared Databases
• Verify that the reporting servers and associated reporting database are running on identical platforms
• Update all reporting servers for a scalable shared database uniformly
• Limit your scalable shared database configurations to eight server instances per shared database
• Ensure that the reporting database has the same layout as the production database
• Use a single path for the reporting database and the production database
• Ensure that the scalable shared database is on a read-only volume that is accessible over your SAN from all the reporting servers
• Ensure that all the server instances use the same sort order
• Ensure that all the server instances use the same memory footprint
Guidelines for Selecting Scalable Shared Databases
Overview of Replication
Snapshot Replication
Distributes data exactly as it appears at a specific moment in time and does not monitor for updates to the data
Transactional Replication
Takes an initial snapshot. Subsequent data changes and schema modifications are delivered to the Subscriber as they occur
Merge Replication
Takes an initial snapshot. Subsequent data changes and schema modifications are tracked with triggers
Peer-to-Peer Replication
Provides a scale-out and high-availability solution by maintaining copies of data across multiple server instances
Snapshot Replication
• Create and secure the snapshot folder
• Estimate the disk space required to transfer and store snapshot files
• Schedule snapshots at off-peak hours
• Set up a mail-enabled user account in Active Directory Domain Services (AD DS)
Merge Replication
• Ensure that any SELECT and INSERT statements that reference published tables use column lists
• Filter out Timestamp columns during article validation
• Specify a value of TRUE for the @stream_blob_columns parameter of sp_addmergearticle
• Add a dummy UPDATE statement within a transaction
• Track changes when performing bulk updates
Transactional Replication
• Ensure adequate space for the transaction log
• Ensure adequate space for the distribution database
• Declare primary keys for each published table
• Consider the issues with using triggers
• Consider using large object (LOB) data types
Peer-to-Peer Replication
• Use each node for its own distribution database
• Avoid including tables in multiple peer-to-peer publications in a single publication database
• Enable publications for peer-to-peer replication before creating subscriptions
• Initialize subscriptions by using a backup
• Avoid using identity columns
Guidelines for Selecting Replication
Benefits
• Improved data protection
• Improved database availability
• Improved availability of the production database during upgrades
• Allows reporting on the mirror server
Working of Database Mirroring
[Figure: principal server, mirror server, and optional witness server, with the data flow between them.]
Overview of Database Mirroring
• Consider using the high-performance mode for disaster-recovery scenarios in which the principal and mirror servers are separated by a significant distance and where you do not want small errors to impact the principal server
• Consider using log shipping as an alternative to asynchronous database mirroring
• Consider setting the WITNESS property to OFF if the SAFETY property is set to OFF when you use Transact-SQL to configure high-performance mode
When the principal server fails, you can:
• Leave the database unavailable until the principal server becomes available
• Manually update the database and then begin a new database mirroring session
• Sparingly use forced service on the mirror server
Guidelines for Selecting Database Mirroring
Discussion: Using Scalable Databases
• Federated databases can increase the total storage and performance in extremely high-capacity or high-performance systems. What is the single key element necessary to ensure that a query is executed on the server that contains the appropriate data?
• What is the primary problem that scalable shared databases solve?
• A single table from the production database is required to be copied to a different database, on a different server instance. Select the best solution from the following options. Why? (A) Clustering, (B) Mirroring, (C) Replication
Lab 4: Designing Databases for Optimal Performance
• Exercise 1: Applying Optimization Techniques
• Exercise 2: Creating Plan Guides
• Exercise 3: Designing a Partitioning Strategy
Logon Information
Virtual machine: NYC-SQL1
User name: Administrator
Password: Pa$$w0rd
Estimated time: 60 minutes
You are a lead database administrator at QuantamCorp. You are working on the Human Resources Vacation and Sick Leave Enhancement (HR VASE) project that is designed to enhance the current HR system of your organization. This system is based on the QuantamCorp sample database in SQL Server 2008.
The main goals of the HR VASE project are as follows:
• Provide managers with current and historical information about employee vacation and sick-leave data.
• Provide permission to individual employees to view their vacation and sick-leave balances.
• Provide permission to selected employees in the HR department to view and update employee vacation and sick-leave data.
• Provide permission to the HR manager to view and update all data.
• Ensure that the application uses the database in an optimal way and optimize the performance of reports for managers and HR personnel.
You need to formulate a list of tasks required to ensure optimal query performance. Before finalizing the list, you need to verify the result of each task.
In this lab, you will examine the business requirements and identify different ways to improve performance. You will enhance the database performance by creating appropriate indexes, a plan guide, and a partition.
Lab Scenario
Lab Review
• What is the purpose of examining the database model, schema, data, metadata, and dynamic management views before you decide the course of action to improve query performance?
• What is a plan guide?
• You are developing a partitioning scheme for your application database. The table that you need to partition is sorted according to the date. Users usually access yearly data from that table. How would you design the partitioning scheme?
• You are working on partitioning a data warehouse table by using a column that has the datetime data type. Why would you use RIGHT as the RANGE parameter for the partition function?
Module Review and Takeaways
• Review Questions
• Real-world Issues and Scenarios
• List of Tools