sql server clustered index design for performance
7/27/2019 SQL Server Clustered Index Design for Performance
increasing and static in nature. The reason ever-increasing is so important has to do with
the range architecture I outlined earlier. If the values are not ever-increasing, then SQL
Server has to allocate space within existing ranges for those records rather than placing
them in new ranges at the end of the index.
If the values are not ever-increasing, then once the ranges fill up and a value comes in
that fits within a filled up index range, SQL Server will make room in an index by doing
a page split. Internally, SQL Server takes the filled up page and splits it into two separate
pages that have substantially more room at that point but take significantly more
resources to process. You can prepare for this eventuality by setting a fill factor of 70%
or so, which gives you 30% free space for incoming values.
The problem with this approach is that you continually have to "reindex" the clustered
index so it maintains a free space percentage of 30%. Reindexing the clustered index will
also cause heavy I/O load since it has to move the actual data itself and any non-clustered
indexes have to be rebuilt, adding greatly to maintenance time.
If the clustered index is ever-increasing, you will not have to rebuild the clustered index;
you can set a 100% fill factor on the clustered index, and at that point you will only need
to reindex the less-intensive, non-clustered indexes as time progresses, resulting in more
up time.
Ever-increasing values will only add entries to the end of the index and build new ranges
when necessary. Logical fragmentation will not exist since the new values are continually
added to the end of the index and the fill factor will be 100%. The higher the fill factor,
the more rows are stored on each page. Higher fill factors require less I/O, RAM and
CPU for queries. The smaller the data types you pick for the clustered index, the faster
the joins/queries will be. Also, since each non-clustered index contains the clustered
index key, the smaller the clustered index key, the smaller the non-clustered indexes will be.
The best data types for clustered indexes are generally pretty narrow: typically a
smallint, int, bigint or datetime. When datetime values are used as the clustering
index, they are usually the only column and are normally ever-increasing date
values that are often queried as range data. Generally, you should avoid compound
(multiple columns) clustered indexes except in the following situations: many-to-many
tables and SQL Server 2005 partitioned tables that have the partitioning column included
as part of the clustered index to allow for index alignment.
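As a minimal sketch of the guidance above, the following T-SQL creates a table clustered on a narrow, static, ever-increasing key. The table name dbo.OrderEvent, its columns and the index name are hypothetical; only the choice of a bigint identity key and the 100% fill factor come from the text.

```sql
-- Hypothetical table: the name and the non-key columns are illustrative assumptions.
CREATE TABLE dbo.OrderEvent
(
    OrderEventID bigint IDENTITY(1,1) NOT NULL,  -- narrow, static, ever-increasing
    EventDate    datetime     NOT NULL,
    Payload      varchar(200) NULL
);

-- Unique clustered index on the identity column. New rows always land at the
-- end of the index, so the pages can be packed completely (fill factor 100).
CREATE UNIQUE CLUSTERED INDEX CIX_OrderEvent
    ON dbo.OrderEvent (OrderEventID)
    WITH (FILLFACTOR = 100);
```

Declaring the index UNIQUE matters here as well: it keeps the key at 8 bytes, which every non-clustered index on the table then carries.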
Many-to-many tables and clustered indexes
Many-to-many tables are used for their extremely fast join capabilities and their ability to
allow for quick re-association of records, from one owning record to another. Consider
the following structure:
Customer
    CustomerID (bigint identity)    Name    Fieldn+
CustomerOrder
    CustomerID    OrderID
Orders
    OrderID (bigint identity)    Date    Fieldn+
The clustered indexes on Customer and Orders would be CustomerID and OrderID, respectively. The compound
clustered index on CustomerOrder would be CustomerID/OrderID. Here are the benefits with this structure:
* The joins are all based on clustered indexes (much faster than joins to
  non-clustered indexes).
* Moving an order to another customer only involves an update to the
  CustomerOrder table, which is very narrow, with only one clustered index.
  Therefore, it reduces the blocking that would occur if you had to update a wider
  table such as Orders.
* Use of a many-to-many table eliminates the need for some non-clustered indexes
  on the wider tables such as Customer/Orders. Hence, it reduces the maintenance
  time on the large tables.
One negative result of this approach is the fragmentation that occurs on the
CustomerOrder table. However, that should not be a big issue, since the table is relatively
narrow, has only two columns with narrow data types and only one clustered index. The
elimination of the non-clustered indexes, which would be needed on the Orders table if it
contained CustomerID, more than makes up for this cost.
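The structure above could be declared roughly as follows. This is a sketch, not the author's schema: the dbo schema, the constraint names, the varchar/datetime column types and the foreign keys are assumptions added for illustration.

```sql
CREATE TABLE dbo.Customer
(
    CustomerID bigint IDENTITY(1,1) NOT NULL,
    Name       varchar(100) NOT NULL,            -- assumed type
    CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerID)
);

CREATE TABLE dbo.Orders
(
    OrderID bigint IDENTITY(1,1) NOT NULL,
    [Date]  datetime NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID)
);

-- The narrow many-to-many table: two bigint columns and one compound
-- clustered index, no non-clustered indexes. Foreign keys are an assumption.
CREATE TABLE dbo.CustomerOrder
(
    CustomerID bigint NOT NULL REFERENCES dbo.Customer (CustomerID),
    OrderID    bigint NOT NULL REFERENCES dbo.Orders (OrderID),
    CONSTRAINT PK_CustomerOrder PRIMARY KEY CLUSTERED (CustomerID, OrderID)
);

-- Moving order 42 from customer 1 to customer 2 (illustrative values)
-- touches only the narrow table, never the wide Orders table.
UPDATE dbo.CustomerOrder
SET    CustomerID = 2
WHERE  CustomerID = 1 AND OrderID = 42;
```

Because the update changes the leading key column, the row physically moves within the clustered index, which is the source of the fragmentation mentioned above.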
Clustered indexes and partitioned tables in SQL Server 2005
Partitioned tables in SQL Server 2005 are tables that appear to be a single table on the
surface, but behind the scenes -- at the storage subsystem level -- they are actually
multiple partitions that can be spread across many filegroups. The table partitions are
spread across various filegroups based on the values in a single column. Partitioning
tables in this manner causes several side effects. I will just cover the basics here, to give
you some understanding of the factors involved. I recommend that you study partitioned
tables before attempting to implement them.
You can create a clustered index in this environment based on only one column.
But, if that one column is not the column the table is partitioned on, then the clustered
index is said to be non-aligned. If a clustered index is non-aligned, then any snapping
in/out (or merging) of partitions will require you to drop the clustered index along with
the non-clustered indexes and rebuild them from scratch. This is necessary because SQL
Server cannot tell what portions of the clustered/non-clustered indexes belong to which
table partitions. Needless to say, this will certainly cause system downtime.
The clustered index on a partitioned table should always contain the regular clustering
column, which is ever-increasing and static, as well as the column that is used for
partitioning the table. If the clustered index includes the column used for partitioning the
table, then SQL Server knows what portion of the clustered/non-clustered indexes belong
to which partition. Once a clustered index contains the column that the table is partitioned
on, then the clustered index is "aligned." Partitions can then be snapped in/out (and
merged) without rebuilding the clustered/non-clustered indexes, causing no downtime for
the system. Inserts/updates/deletes will also work faster, because those operations only
have to consider the indexes that reside on their particular partition.
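An aligned clustered index could look like the sketch below. All object names, the yearly boundary values and the use of a single filegroup are assumptions made to keep the example short; a real design would normally map partitions to several filegroups.

```sql
-- Partition rows by year on a datetime column (boundary values are illustrative).
CREATE PARTITION FUNCTION pfOrderYear (datetime)
AS RANGE RIGHT FOR VALUES ('20070101', '20080101');

-- For brevity every partition goes to PRIMARY; this is a simplification.
CREATE PARTITION SCHEME psOrderYear
AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.OrdersPartitioned
(
    OrderID   bigint IDENTITY(1,1) NOT NULL,
    OrderDate datetime NOT NULL,
    Amt       money    NOT NULL
) ON psOrderYear (OrderDate);

-- Aligned clustered index: the regular ever-increasing key (OrderID) plus the
-- partitioning column (OrderDate), created on the same partition scheme.
CREATE UNIQUE CLUSTERED INDEX CIX_OrdersPartitioned
    ON dbo.OrdersPartitioned (OrderID, OrderDate)
    ON psOrderYear (OrderDate);
```

This is one of the two cases where the text accepts a compound clustered index: a unique index on a partitioned table has to include the partitioning column in its key.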
Summary
SQL Server clustered indexes are an important part of database architecture and I hope
you've learned enough from this article to know why you need to carefully plan for them
from the very start. It is vital for the future health of your database that clustered indexes
be narrow, static and ever-increasing. Clustered indexes can help you achieve faster join
times and faster IUD operations and minimize blocking as the system becomes busy.
Finally, we covered how partitioned tables in SQL Server 2005 affect your choices for
the clustered index, what it means to "align" the clustered index with the partitions, and
why clustered indexes have to be aligned in order for the partitioned table concept to
work as intended. Keep watching for tips on non-clustered indexes (part two) coming in
February and optimal index maintenance (part three) in March.
Designing SQL Server non-clustered indexes for query optimization
Non-clustered indexes are bookmarks that allow SQL Server to find shortcuts to the data
you're searching for. Non-clustered indexes are important because they allow you to
focus queries on a specific subset of the data instead of scanning the entire table. We'll
address this critical topic by first hitting the basics, such as how clustered indexes interact
with non-clustered indexes, how to pick fields, when to use compound indexes and how
statistics influence non-clustered indexes.
The basics of non-clustered indexes in SQL Server
A non-clustered index consists of the chosen fields and the clustered index value. If the
clustered index is not defined as unique, then SQL Server will use a clustered index value
plus a uniqueness value. Always define your clustered indexes as unique -- if they are in
fact unique -- because it will result in a smaller clustered/non-clustered index size. If your
unique clustered index consists of an int and you create a non-clustered index on a year
column (defined as smallint), then your non-clustered index will contain an int and
smallint for every row in the table. The size would increase according to the data types
chosen. So the smaller the clustered/non-clustered index data types are, the smaller the
resulting index size will be, and the easier the index will be to maintain.
Choosing fields for non-clustered indexes
The first rule is to never include the clustered index key fields in the non-clustered index.
The field is already part of the clustered index, so it will always be used for queries. The
only time it makes sense to include any clustered index key in a non-clustered index is
when the clustered index is a compound index and the query is referencing the second,
third or higher field in the compound index.
Assume you have the following table:
ID (identity, clustered unique) DateFrom DateTo Amt DateInserted Description
Now assume you always run queries such as:
Example 1:
Select *
From tbl [t]
where t.DateFrom = '12/12/2006' and
t.DateTo = '12/31/2006' and
t.DateInserted = '12/01/2006'
At this point it makes sense to have a non-clustered index defined on DateFrom, DateTo
and DateInserted, since that will always give the best unique results.
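A possible definition of the table and of the compound index for Example 1 is sketched below. The column data types, the dbo schema and the constraint and index names are assumptions; the text only gives the column names and states that ID is the unique clustered key.

```sql
-- Assumed data types; only the column names come from the text.
CREATE TABLE dbo.tbl
(
    ID            int IDENTITY(1,1) NOT NULL,
    DateFrom      datetime     NOT NULL,
    DateTo        datetime     NOT NULL,
    Amt           money        NOT NULL,
    DateInserted  datetime     NOT NULL,
    [Description] varchar(200) NULL,
    CONSTRAINT PK_tbl PRIMARY KEY CLUSTERED (ID)
);

-- Compound non-clustered index matching the three equality predicates of
-- Example 1. The clustered key (ID) is carried in the index automatically,
-- so it is not listed here.
CREATE NONCLUSTERED INDEX IX_tbl_DateFrom_DateTo_DateInserted
    ON dbo.tbl (DateFrom, DateTo, DateInserted);
```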
Now assume you run multiple queries such as:
Example 2:
Select *
From tbl [t]
where t.datefrom = '12/12/2006' and
t.DateInserted = '12/01/2006'
Select *
From tbl [t]
where t.datefrom = '12/12/2006'
Select *
From tbl [t]
where t.DateTo = '12/31/2006'
Select *
From tbl [t]
where t.DateInserted = '12/01/2006'
Select *
From tbl [t]
where t.DateTo = '12/31/2006' and
t.DateInserted = '12/01/2006'
Select *
From tbl [t]
where t.id = 5 and t.DateTo = '12/31/2006'
and t.DateInserted = '12/01/2006'
Many people, at this point, would be tempted to create the following non-clustered
indexes:
1. DateFrom
2. DateTo
3. DateInserted
4. DateTo and DateInserted
5. DateFrom and DateInserted
6. ID, DateTo and DateInserted
You probably expect the index size to increase dramatically at this point, since you are
storing DateFrom in two separate locations, DateTo in three locations and DateInserted in
four locations. On top of this, you've stored the clustered index key in seven locations.
This approach increases I/O for insert, update and delete operations (also known as IUD
operations). Updates to the records must be written first to the clustered index data row.
Then every non-clustered index that holds a changed value has to be updated as well.
You should routinely ask yourself these questions:
Is the cost of additional I/O for IUD operations and maintenance worth the improved
query time?
Will the additional I/O and increased maintenance time outweigh any performance
boost I get on the queries?
What will give me the most unique results with the least overhead possible?
In this case, the best solution would be three non-clustered indexes as follows:
1. DateFrom
2. DateTo
3. DateInserted
Each field in this scenario is only stored once, except for the primary key which is stored
on all three non-clustered indexes. As a result, the index size is much smaller and will
require less I/O and less maintenance. SQL Server will query each of the non-clustered
indexes, depending on the criteria chosen, and then hash the results together. While this is
not as efficient as Example 1, it is much more efficient than defining the six separate
non-clustered indexes. Real world queries will more often match Example 2 rather than
being structured as Example 1.
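The three-index solution can be sketched as follows. The guarded CREATE TABLE only exists to make the script self-contained; the data types and all index names are assumptions.

```sql
-- Create the example table only if it is not there yet (assumed data types).
IF OBJECT_ID('dbo.tbl', 'U') IS NULL
    CREATE TABLE dbo.tbl
    (
        ID            int IDENTITY(1,1) NOT NULL,
        DateFrom      datetime     NOT NULL,
        DateTo        datetime     NOT NULL,
        Amt           money        NOT NULL,
        DateInserted  datetime     NOT NULL,
        [Description] varchar(200) NULL,
        CONSTRAINT PK_tbl PRIMARY KEY CLUSTERED (ID)
    );

-- One narrow index per search column. Each one stores its own column plus the
-- clustered key (ID), so each date column is stored only once across the indexes.
CREATE NONCLUSTERED INDEX IX_tbl_DateFrom     ON dbo.tbl (DateFrom);
CREATE NONCLUSTERED INDEX IX_tbl_DateTo       ON dbo.tbl (DateTo);
CREATE NONCLUSTERED INDEX IX_tbl_DateInserted ON dbo.tbl (DateInserted);
```

Whether the optimizer actually combines two of these indexes for a given query depends on the data and the statistics, so it is worth checking the execution plans of the Example 2 queries against realistic data.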
SQL Server statistics
Statistics tell SQL Server how many rows most likely match a given value. They give SQL
Server an idea of how "unique" a value is, information it then uses to determine whether
to use an index. By default, SQL Server automatically updates statistics whenever it
thinks approximately 20% of the records have changed. In SQL Server 2000, this is done
synchronously with the IUD operation, delaying the completion of the IUD operation
while the rows are sampled. In SQL Server 2005, you can have it sample either
synchronously with the IUD operation or asynchronously after the IUD operation is done.
The latter approach is better and will cause less blocking because locks will be released
sooner. I recommend turning off the database setting "Auto Update Statistics." This
setting will increase your server loads at the worst times. Instead of letting SQL Server
automatically keep statistics up to date, create a job that calls the command "update
statistics" and runs during your slowest time. You can pick your own sampling ratio
depending on how accurate you want the statistics to be.
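The settings and the job step described above might look like this. The database name SalesDb, the table dbo.tbl and the 25 percent sampling ratio are placeholders, not recommendations from the article.

```sql
-- Turn off automatic statistics updates for a (hypothetical) database.
ALTER DATABASE SalesDb SET AUTO_UPDATE_STATISTICS OFF;

-- Alternative on SQL Server 2005 and later: keep automatic updates,
-- but let them run asynchronously.
-- ALTER DATABASE SalesDb SET AUTO_UPDATE_STATISTICS ON;
-- ALTER DATABASE SalesDb SET AUTO_UPDATE_STATISTICS_ASYNC ON;

-- Body of a scheduled job step that runs during the quietest period.
-- Pick the sampling ratio according to how accurate the statistics must be.
UPDATE STATISTICS dbo.tbl WITH SAMPLE 25 PERCENT;

-- Most accurate, most expensive:
-- UPDATE STATISTICS dbo.tbl WITH FULLSCAN;

-- Or refresh every table in the current database that needs it:
-- EXEC sp_updatestats;
```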
Detailed statistics (the histogram) are only kept on the first column in any non-clustered index. What does this
mean in compound non-clustered indexes? It means SQL Server will use the first field to
determine whether an index should be used. Even if the second field in the compound
index will match 50% of the rows, the field still needs to be used to return the results (see
Example 3). Now, if the non-clustered index were split into two non-clustered indexes,
SQL Server might choose to use index 1, but not index 2. This is because the statistics on
index 2 may show that it will not benefit the query (see Example 4).
Example 3
Assume you have a compound, non-clustered index defined on DateFrom and Amt.
Statistics would only be kept on the DateFrom field within the index, and SQL Server
would have to seek (or scan) across both DateFrom and Amt. Since SQL Server has to
traverse more data, the query will be slower.
Example 4
Assume you have two non-clustered indexes: The first is defined on DateFrom and the
second is defined on Amt.
Statistics would be kept on both fields because they are separate indexes. SQL Server
will examine the statistics on DateFrom and decide to use that index. It will then examine
the Amt column and may decide -- based on the statistics -- that the index is not unique
enough and should be ignored. At this point, SQL Server would only need to traverse the
DateFrom field, rather than both DateFrom and Amt, resulting in a faster query.
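To see what statistics SQL Server holds for Examples 3 and 4, the histograms can be inspected with DBCC SHOW_STATISTICS. The sketch assumes a table dbo.tbl with DateFrom and Amt columns already exists; the index names are hypothetical.

```sql
-- Example 3: one compound index on (DateFrom, Amt).
CREATE NONCLUSTERED INDEX IX_tbl_DateFrom_Amt ON dbo.tbl (DateFrom, Amt);

-- The histogram of this index describes the leading column, DateFrom.
DBCC SHOW_STATISTICS ('dbo.tbl', IX_tbl_DateFrom_Amt) WITH HISTOGRAM;

-- Example 4: a separate index on Amt gets a histogram of its own.
CREATE NONCLUSTERED INDEX IX_tbl_Amt ON dbo.tbl (Amt);

DBCC SHOW_STATISTICS ('dbo.tbl', IX_tbl_Amt) WITH HISTOGRAM;
```

Comparing the two outputs shows which columns the optimizer has value-level distribution data for when it decides whether an index is selective enough.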
By using non-clustered indexes in SQL Server, you'll be able to focus queries on a data
subset. Use the guidelines described in this tip to determine if it's best to create multiple
non-clustered indexes or a compound non-clustered index. Also keep in mind the role of
statistics and how they impact non-clustered indexes: Statistics affect the choice between
using multiple non-clustered indexes and a compound non-clustered index in SQL
Server.
How to maintain SQL Server indexes for query optimization
Maintaining SQL Server indexes is an uncommon practice. If a query stops using
indexes, oftentimes a new non-clustered index is created that simply holds a different
combination of columns or the same columns. A detailed analysis of why SQL Server is
ignoring those indexes is rarely carried out.
Let's take a look at how clustered and non-clustered indexes are selected and why the query
optimizer might choose a table scan instead of a non-clustered index. In this tip, you'll
learn how page splits, fragmented indexes, table partitions and statistics updates affect the
use of indexes. Ultimately, you'll find out how to maintain SQL Server indexes so that
the query optimizer uses these indexes, and so these indexes are searched quickly.
Index selection
Clustered indexes are by far the easiest to understand in the area of index selection.
Clustered indexes are basically keys that reference each row uniquely. Even if you define
a clustered index and do not declare it as unique, SQL Server still makes the clustered
index unique behind the scenes by adding a 4-byte "uniqueifier" to it. The additional
"uniqueifier" increases the width of the clustered index, which causes increased
maintenance time and slower searches. Since clustered indexes are the key that identifies
each row, they are used in every query.
When we start talking about non-clustered indexes, things get confusing. Queries can
ignore non-clustered indexes for the following reasons:
1. High fragmentation: If an index is fragmented over 40%, the optimizer will
probably ignore the index because it's more costly to search a fragmented index
than to perform a table scan.
2. Uniqueness: If the optimizer determines that a non-clustered index is not very
unique, it may decide that a table scan is faster than trying to use the
non-clustered index. For example: If a query references a bit column (where bit = 1)
and the statistics on the column say that 75% of the rows are 1, then the optimizer
will probably decide a table scan will get the results faster versus trying to scan
over a non-clustered index.
3. Outdated statistics: If the statistics on a column are out of date, then SQL Server
can misjudge the benefit of a non-clustered index. Automatically updating
statistics doesn't just slow down your data modification scripts, but over time the
statistics also drift out of sync with the real distribution of the rows. Occasionally it's a
good idea to run sp_updatestats or UPDATE STATISTICS.
4. Function usage: SQL Server is unable to use indexes if a function is present in
the criteria. If you're referencing a non-clustered index column, but you're using a
function such as convert(varchar, Col1_Year) = 2004, then SQL Server cannot
use the index on Col1_Year.
5. Wrong columns: If a non-clustered index is defined on (col1, col2, col3) and
your query has a where clause, such as "where col2 = 'somevalue'", that index
won't be used. A non-clustered index can only be used if the first column in the
index is referenced within the where clause. A where clause, such as "where col3
= 'someval'", would not use the index, but a where clause, like "where col1 =
'someval'" or "where col1 = 'someval' and col3 = 'someval2'" would pick up the
index.
The index would not use col3 for its seek, since that column is not immediately after col1 in
the index definition. If you wanted col3 to have a seek occur in situations such as
this, then it is best if you define two separate non-clustered indexes, one on col1
and the other on col3.
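Points 4 and 5 can be tried out with a small script such as the one below. The temporary table #Sales, its columns' data types and the index names are made up for the illustration; on an empty table the plans are trivial, so compare the execution plans after loading representative data.

```sql
-- Hypothetical test table with the columns named in the list above.
CREATE TABLE #Sales
(
    SaleID    int IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    Col1_Year smallint    NOT NULL,
    col1      varchar(20) NOT NULL,
    col2      varchar(20) NOT NULL,
    col3      varchar(20) NOT NULL
);
CREATE NONCLUSTERED INDEX IX_Sales_Year     ON #Sales (Col1_Year);
CREATE NONCLUSTERED INDEX IX_Sales_c1_c2_c3 ON #Sales (col1, col2, col3);

-- Point 4: the function wraps the column, so no seek on IX_Sales_Year.
SELECT SaleID FROM #Sales WHERE CONVERT(varchar(4), Col1_Year) = '2004';

-- Rewritten so the bare column is compared: a seek becomes possible.
SELECT SaleID FROM #Sales WHERE Col1_Year = 2004;

-- Point 5: the leading column col1 is referenced, so the compound index can be
-- used; the seek is on col1 only, col3 is checked afterwards.
SELECT SaleID FROM #Sales WHERE col1 = 'someval' AND col3 = 'someval2';

-- The leading column is missing: no seek on the compound index.
SELECT SaleID FROM #Sales WHERE col3 = 'someval';

DROP TABLE #Sales;
```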
Page splits
To store data, SQL Server uses pages that are 8 KB data blocks. The amount of data filling
the pages is called the fill factor, and the higher the fill factor, the more full the 8 KB page
is. A higher fill factor means fewer pages will be required, resulting in less IO/CPU/RAM
usage. At this point, you might want to set all your indexes to 100% fill factor; however,
here is the gotcha: Once the pages fill up and a value comes in that fits within a filled-up
index range, then SQL Server will make room in an index by doing a "page split." In
essence, SQL Server takes the full page and splits it into two separate pages, which have
substantially more room at that point. You can account for this issue by setting a fill
factor of 70% or so. This allows 30% free space for incoming values. The problem with
this approach is that you continually have to "re-index" the index so that it maintains a
free space percentage of 30%.
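The trade-off can be put into rough numbers. The calculation below is a back-of-the-envelope sketch: it assumes 8,096 usable bytes per page (8,192 minus the 96-byte page header), a fixed 200-byte row and one million rows, and it ignores per-row overhead such as the slot array.

```sql
DECLARE @usable_bytes int, @row_bytes int, @total_rows bigint;
SET @usable_bytes = 8096;      -- 8 KB page minus the 96-byte header
SET @row_bytes    = 200;       -- assumed average row size
SET @total_rows   = 1000000;   -- assumed table size

-- Rows that fit on a page, and pages needed, at 100% and 70% fill factor.
SELECT ff.fill_factor,
       rows_per_page = (@usable_bytes * ff.fill_factor / 100) / @row_bytes,
       pages_needed  = CEILING(1.0 * @total_rows /
                       ((@usable_bytes * ff.fill_factor / 100) / @row_bytes))
FROM (SELECT 100 AS fill_factor UNION ALL SELECT 70) AS ff;
```

Under these assumptions a page holds 40 rows at 100% but only 28 at 70%, so the same table needs about 25,000 pages in the first case and roughly 35,700 in the second. That extra I/O is the price paid for avoiding page splits.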
Clustered index maintenance
Clustered indexes that are static and "ever-increasing" should have a fill factor of 100%.
Since the values are always increasing, pages will just be added to the end of the index
and virtually no fragmentation will occur. For a more detailed explanation, see part 1 of
this series, SQL Server clustered index design for performance. This index category does
not need to be re-indexed because it doesn't fragment.
Clustered indexes that are not static or not "ever-increasing" will experience
fragmentation and page splits as the data rows move around within the data pages. The
indexes in this category have to be re-indexed in order to keep fragmentation low and
allow queries to efficiently use the index.
When you re-index these clustered indexes, you have to decide what the fill factor should
be. Normally this is 70% to 80%, giving you 20% to 30% empty space for new records
coming into the page. The optimal settings for your environment will depend on how
often records shift around, how many records are inserted and how often re-indexing
occurs. The goal is to set a fill factor low enough so that by the time you reach your next
maintenance cycle, the pages are around 95% full, but not yet splitting, which happens
when they hit the 100% limit.
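One way to check how full and how fragmented the pages are before a maintenance cycle, and then to re-index with a chosen fill factor, is sketched here. The DMV sys.dm_db_index_physical_stats is available from SQL Server 2005 on; the index name PK_tbl, the table dbo.tbl and the 80% fill factor are placeholders.

```sql
DECLARE @dbid int;
SET @dbid = DB_ID();

-- Fragmentation and page fullness for every index in the current database.
-- SAMPLED mode is used because LIMITED mode does not report page fullness.
SELECT objectname = OBJECT_NAME(ps.object_id),
       indexname  = i.name,
       ps.index_id,
       ps.avg_fragmentation_in_percent,
       ps.avg_page_space_used_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(@dbid, NULL, NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.index_id > 0                     -- skip heaps
ORDER BY ps.avg_fragmentation_in_percent DESC;

-- Re-index one (hypothetical) clustered index, leaving 20% free space per page.
ALTER INDEX PK_tbl ON dbo.tbl REBUILD WITH (FILLFACTOR = 80);
```

If avg_page_space_used_in_percent is already close to 100 well before the next maintenance window, the fill factor is probably too high for that index; if it stays far below 95, it can likely be raised.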
Non-clustered index maintenance
Non-clustered indexes will always have data shifting around the pages. It's not quite as
big an issue as it is with clustered indexes -- the actual row data shifts with clustered
indexes, whereas only row pointers shift with non-clustered indexes. That said, the same
rules apply to non-clustered indexes as far as fill factors go. Again, the goal is to set a fill
factor low enough so that by the time you reach your next maintenance cycle, the pages
are only around 95% full.
Non-clustered indexes will always fragment, so to keep fragmentation in check you must constantly
monitor and maintain them.
Partitioned table index considerations
Partitioned tables allow data to be segregated into different partitions, depending on the
data in a column. Many tables are partitioned based on date ranges. Let's say your order
table is partitioned into years. Assuming the clustered index is aligned (see part 1 of this
series), then you could re-index the non-clustered indexes for, say, year 2000 at 100% fill
factor, since that data, technically, won't be shifting around. In this scenario, the year
2008 partition may have a fill factor of 70% on non-clustered indexes to allow for data
shifts, but the year 2000 will not have any shifts and can be re-indexed at 100% fill factor
so you optimize index seeks.
The same concept would apply to clustered indexes that are not static or not
ever-increasing. Clustered indexes with shifting data might be set to 70% fill factor for the
year 2008 partition and 100% fill factor for the year 2000.
SQL Server statistics
--- indexes with no entry in sys.dm_db_index_usage_stats (not used since the last restart)
declare @dbid int
select @dbid = db_id('Northwind')
select objectname = object_name(i.object_id)
, indexname = i.name, i.index_id
from sys.indexes i, sys.objects o
where objectproperty(o.object_id,'IsUserTable') = 1
and i.index_id NOT IN (select s.index_id
    from sys.dm_db_index_usage_stats s
    where s.object_id = i.object_id and
    i.index_id = s.index_id and
    database_id = @dbid)
and o.object_id = i.object_id
order by objectname, i.index_id, indexname asc
Rarely used indexes will appear in sys.dm_db_index_usage_stats just like heavily used
indexes. To find rarely used indexes, you look at columns such as user_seeks,
user_scans, user_lookups, and user_updates.
--- rarely used indexes appear first
declare @dbid int
select @dbid = db_id()
select objectname = object_name(s.object_id), s.object_id, indexname = i.name, i.index_id
, user_seeks, user_scans, user_lookups, user_updates
from sys.dm_db_index_usage_stats s,
sys.indexes i
where database_id = @dbid and objectproperty(s.object_id,'IsUserTable') = 1
and i.object_id = s.object_id
and i.index_id = s.index_id
order by (user_seeks + user_scans + user_lookups + user_updates) asc
(3) What is the cost of index maintenance vs. its benefit?
If a table is heavily updated and also has indexes that are rarely used, the cost of
maintaining the indexes could exceed the benefits. To compare the cost and benefit, you
can use the table valued function sys.dm_db_index_operational_stats as follows:
--- sys.dm_db_index_operational_stats
declare @dbid int
select @dbid = db_id()
select objectname = object_name(s.object_id), indexname = i.name, i.index_id
, reads = range_scan_count + singleton_lookup_count
, 'leaf_writes' = leaf_insert_count + leaf_update_count + leaf_delete_count
, 'leaf_page_splits' = leaf_allocation_count
, 'nonleaf_writes' = nonleaf_insert_count + nonleaf_update_count + nonleaf_delete_count
, 'nonleaf_page_splits' = nonleaf_allocation_count
from sys.dm_db_index_operational_stats (@dbid,NULL,NULL,NULL) s,
sys.indexes i
where objectproperty(s.object_id,'IsUserTable') = 1
and i.object_id = s.object_id
and i.index_id = s.index_id
order by reads desc, leaf_writes, nonleaf_writes
--- sys.dm_db_index_usage_stats
select objectname = object_name(s.object_id), indexname = i.name, i.index_id
, reads = user_seeks + user_scans + user_lookups
, writes = user_updates
from sys.dm_db_index_usage_stats s,
sys.indexes i
where objectproperty(s.object_id,'IsUserTable') = 1
and s.object_id = i.object_id
and i.index_id = s.index_id
and s.database_id = @dbid
order by reads desc
go
The difference between sys.dm_db_index_usage_stats and
sys.dm_db_index_operational_stats is as follows. Sys.dm_db_index_usage_stats counts
each access as 1, whereas sys.dm_db_index_operational_stats counts depending on the
operation, pages or rows.
(4) Do I have hot spots & index contention?
Index contention (e.g. waits for locks) can be seen in
sys.dm_db_index_operational_stats. Columns such as row_lock_count,
row_lock_wait_count, row_lock_wait_in_ms, page_lock_count, page_lock_wait_count,
page_lock_wait_in_ms, page_latch_wait_count, page_latch_wait_in_ms,
pageio_latch_wait_count, pageio_latch_wait_in_ms detail lock and latch contention in
terms of waits. You can determine the average blocking and lock waits by comparing
waits to counts as follows:
declare @dbid int
select @dbid = db_id()
Select dbid = database_id, objectname = object_name(s.object_id)
, indexname = i.name, i.index_id --, partition_number
, row_lock_count, row_lock_wait_count
, [block %] = cast(100.0 * row_lock_wait_count / (1 + row_lock_count) as numeric(15,2))
, row_lock_wait_in_ms
, [avg row lock waits in ms] = cast(1.0 * row_lock_wait_in_ms / (1 + row_lock_wait_count) as numeric(15,2))
from sys.dm_db_index_operational_stats (@dbid,NULL,NULL,NULL) s,
sys.indexes i
where objectproperty(s.object_id,'IsUserTable') = 1
and i.object_id = s.object_id
and i.index_id = s.index_id
order by row_lock_wait_count desc
The following report shows blocks in the [Order Details] table, index
OrdersOrder_Details. While blocks occur less than 2 percent of the time, when they do
occur, the average block time is 15.7 seconds.
It would be important to track this down using the SQL Profiler Blocked Process Report.
You can set the Blocked Process Threshold to 15 using sp_configure 'blocked process
threshold', 15. Afterwards, you can run a trace to capture blocks over 15 seconds.
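Setting the threshold could be scripted as below. This is a sketch: 'blocked process threshold' is an advanced option, so 'show advanced options' has to be enabled first, and the value of 15 seconds is simply the one used in the text.

```sql
-- Advanced options must be visible before the threshold can be changed.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Raise a blocked process report for any block lasting longer than 15 seconds.
EXEC sp_configure 'blocked process threshold', 15;
RECONFIGURE;
```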
The Profiler trace will include the blocked and blocking process. The advantage of
tracing for long blocks is the blocked and blocking details can be saved in the trace file
and can be analyzed long after the block disappears. Historically, you can see the
common causes of blocks. In this case the blocked process is the stored procedure
NewCustOrder. The blocking process is the stored procedure
UpdCustOrderShippedDate.
The caveat with Profiler Trace of Blocked Process Report is that in the case of stored
procedures, you cannot see the actual statement within the stored procedure that is
blocked. You do, however, get the stmtstart and stmtend offset that does identify the
statement blocked inside the stored procedure NewCustOrder. Using the above blocked
process report, you could extract the blocked statement out of the NewCustOrder stored
procedure by providing the sqlhandle, stmtstart and stmtend as follows:
declare @sql_handle varbinary(64),
@stmtstart int,
@stmtend int
Select @sql_handle = 0x3000050005d9f67ea8425301059700000100000000000000
Select @stmtstart = 920, @stmtend = 1064
select substring(qt.text, s.statement_start_offset/2,
    (case when s.statement_end_offset = -1
    then len(convert(nvarchar(max), qt.text)) * 2
    else s.statement_end_offset end - s.statement_start_offset)/2)
as "blocked statement"
,s.statement_start_offset
,s.statement_end_offset
,batch = qt.text
,qt.dbid
,qt.objectid
,s.execution_count
,s.total_worker_time
,s.total_elapsed_time
,s.total_logical_reads
,s.total_physical_reads
,s.total_logical_writes
from sys.dm_exec_query_stats s
cross apply sys.dm_exec_sql_text(s.sql_handle) as qt
where s.sql_handle = @sql_handle
and s.statement_start_offset = @stmtstart
and s.statement_end_offset = @stmtend
You can capture the actual blocked statement of a stored procedure in real time (as it is
occurring) using the following:
create proc sp_block_info
as
select t1.resource_type as [lock type]
,db_name(resource_database_id) as [database]
,t1.resource_associated_entity_id as [blk object]
,t1.request_mode as [lock req]          --- lock requested
,t1.request_session_id as [waiter sid]  --- spid of waiter
,t2.wait_duration_ms as [wait time]
,(select text from sys.dm_exec_requests as r  --- get sql for waiter
    cross apply sys.dm_exec_sql_text(r.sql_handle)
    where r.session_id = t1.request_session_id) as waiter_batch
,(select substring(qt.text, r.statement_start_offset/2,
    (case when r.statement_end_offset = -1
    then len(convert(nvarchar(max), qt.text)) * 2
    else r.statement_end_offset end - r.statement_start_offset)/2)
    from sys.dm_exec_requests as r
    cross apply sys.dm_exec_sql_text(r.sql_handle) as qt
    where r.session_id = t1.request_session_id) as waiter_stmt  --- statement blocked
,t2.blocking_session_id as [blocker sid]  -- spid of blocker
,(select text from sys.sysprocesses as p  --- get sql for blocker
    cross apply sys.dm_exec_sql_text(p.sql_handle)
    where p.spid = t2.blocking_session_id) as blocker_stmt
from
sys.dm_tran_locks as t1,
sys.dm_os_waiting_tasks as t2
where
t1.lock_owner_address = t2.resource_address
go
exec sp_block_info
(5) Could I benefit from more (or less) indexes?
Remembering that indexes involve both a maintenance cost and a read benefit, the overall
index cost benefit can be determined by comparing reads and writes. Indexes
allow us to avoid table scans; however, they do require maintenance to be kept up to
date. While it is easy to identify the fringe cases where indexes are not used, and the
rarely used cases, in the final analysis, index cost benefit is somewhat subjective. The
reason is that the number of reads and writes is highly dependent on the workload and
frequency. In addition, qualitative factors beyond the number of reads and writes can
include a highly important monthly management report or quarterly VP report in which
the maintenance cost is of secondary concern.
Writes of all indexes are performed for inserts, but there are no associated reads (unless
there are referential constraints). Besides select statements, reads are performed for
updates and deletes; writes are performed only if rows qualify. OLTP workloads have lots of
small transactions, frequently combining select, insert, update and delete operations.
Data Warehouse activity is typically separated into batch windows having a high
concentration of write activity, followed by an on-line window of read activity.
SQL Statement    Read    Write
Select           Yes     No
Insert           No      Yes, all indexes
Update           Yes     Yes, if row qualifies
Delete           Yes     Yes, if row qualifies
In general, you want to keep indexes to a functional minimum in a high transaction OLTP
environment due to high transaction throughput combined with the cost of index
maintenance and potential for blocking. In contrast, you pay for index maintenance once
during the batch window when updates occur for a data warehouse. Thus, data
warehouses tend to have more indexes to benefit their read-intensive on-line users.
In conclusion, an important new feature of SQL Server 2005 is Dynamic
Management Views (DMVs). DMVs provide a level of transparency that was not
available in SQL Server 2000 and can be used for diagnostics, memory and process
tuning, and monitoring. DMVs can be useful in answering practical questions such as
index usage, cost benefit of indexes, and index hot spots. Finally, DMVs are queryable
with SELECT statements but are not persisted to disk. Thus they reflect changing server
state information since the last SQL Server recycle.