sql server clustered index design for performance

7/27/2019 SQL Server Clustered Index Design for Performance


increasing and static in nature. The reason ever-increasing is so important has to do with the range architecture I outlined earlier. If the values are not ever-increasing, then SQL Server has to allocate space within existing ranges for those records rather than placing them in new ranges at the end of the index.

If the values are not ever-increasing, then once the ranges fill up and a value comes in that fits within a filled-up index range, SQL Server will make room in the index by doing a page split. Internally, SQL Server takes the filled-up page and splits it into two separate pages that have substantially more room at that point but take significantly more resources to process. You can prepare for this eventuality by setting a fill factor of 70% or so, which gives you 30% free space for incoming values.

The problem with this approach is that you continually have to "reindex" the clustered index so it maintains a free space percentage of 30%. Reindexing the clustered index will also cause heavy I/O load since it has to move the actual data itself and any non-clustered indexes have to be rebuilt, adding greatly to maintenance time.

If the clustered index is ever-increasing, you will not have to rebuild the clustered index; you can set a 100% fill factor on the clustered index, and at that point you will only need to reindex the less-intensive, non-clustered indexes as time progresses, resulting in more up time.

Ever-increasing values will only add entries to the end of the index and build new ranges when necessary. Logical fragmentation will not exist since the new values are continually added to the end of the index and the fill factor will be 100%. The higher the fill factor, the more rows are stored on each page. Higher fill factors require less I/O, RAM and CPU for queries. The smaller the data types you pick for the clustered index, the faster the joins/queries will be. Also, since each non-clustered index contains the clustered index key, the smaller the clustered index key, the smaller the non-clustered indexes will be.
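As a minimal sketch of the advice above, the following creates a narrow, static, ever-increasing clustered key with a 100% fill factor. The table and index names (dbo.OrderLog, CIX_OrderLog_OrderLogID) are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table; the names are illustrative and not taken from the article.
CREATE TABLE dbo.OrderLog
(
    OrderLogID int IDENTITY(1,1) NOT NULL,  -- narrow, static, ever-increasing key
    LoggedAt   datetime     NOT NULL,
    Detail     varchar(200) NULL
);

-- New rows always land at the end of the index, so the pages can be packed full.
CREATE UNIQUE CLUSTERED INDEX CIX_OrderLog_OrderLogID
    ON dbo.OrderLog (OrderLogID)
    WITH (FILLFACTOR = 100);
```

Declaring the index UNIQUE keeps the key at the width of the int column, which also keeps every non-clustered index on the table smaller.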

The best data types for clustered indexes are generally pretty narrow. In terms of data type, the key is typically a smallint, int, bigint or datetime. When a datetime value is used as the clustering key, it is normally the only column, and it holds ever-increasing date values that are often queried as ranges. Generally, you should avoid compound (multiple-column) clustered indexes except in the following situations: many-to-many tables and SQL Server 2005 partitioned tables that have the partitioning column included as part of the clustered index to allow for index alignment.

Many-to-many tables and clustered indexes

Many-to-many tables are used for their extremely fast join capabilities and their ability to allow for quick re-association of records, from one owning record to another. Consider the following structure:

Customer
CustomerID (bigint identity), Name, Fieldn+

CustomerOrder
CustomerID, OrderID

Orders
OrderID (bigint identity), Date, Fieldn+

The clustered indexes in this structure would be CustomerID on the Customer table and OrderID on the Orders table. The compound clustered index on the CustomerOrder table would be CustomerID/OrderID. Here are the benefits of this structure:

The joins are all based on clustered indexes (much faster than joins to non-clustered indexes).

Moving an order to another customer only involves an update to the CustomerOrder table, which is very narrow, with only one clustered index. Therefore, it reduces the blocking that would occur if you had to update a wider table such as Orders.

Use of a many-to-many table eliminates the need for some non-clustered indexes on the wider tables such as Customer/Orders. Hence, it reduces the maintenance time on the large tables.

One negative result of this approach is the fragmentation that occurs on the CustomerOrder table. However, that should not be a big issue, since the table is relatively narrow, has only two columns with narrow data types and only one clustered index. The elimination of the non-clustered indexes, which would be needed on the Orders table if it contained CustomerID, more than makes up for this cost.
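The structure above could be declared roughly as follows. This is a sketch only: the constraint names, the column sizes and the OrderDate column name (standing in for the article's Date column) are assumptions, and the "Fieldn+" columns are omitted.

```sql
CREATE TABLE dbo.Customer
(
    CustomerID bigint IDENTITY(1,1) NOT NULL,
    Name       varchar(100) NOT NULL,
    CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerID)
);

CREATE TABLE dbo.Orders
(
    OrderID   bigint IDENTITY(1,1) NOT NULL,
    OrderDate datetime NOT NULL,  -- assumed name for the article's "Date" column
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID)
);

-- Narrow link table with a single compound clustered index.
CREATE TABLE dbo.CustomerOrder
(
    CustomerID bigint NOT NULL REFERENCES dbo.Customer (CustomerID),
    OrderID    bigint NOT NULL REFERENCES dbo.Orders (OrderID),
    CONSTRAINT PK_CustomerOrder PRIMARY KEY CLUSTERED (CustomerID, OrderID)
);

-- Moving an order to another customer touches only the link table.
-- The key values 1, 2 and 100 are placeholders.
UPDATE dbo.CustomerOrder
SET    CustomerID = 2
WHERE  CustomerID = 1
  AND  OrderID = 100;
```

Because the update changes the clustered key of CustomerOrder, the row moves within that index, which is the source of the fragmentation mentioned above.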

Clustered indexes and partitioned tables in SQL Server 2005

Partitioned tables in SQL Server 2005 are tables that appear to be a single table on the surface, but behind the scenes -- at the storage subsystem level -- they are actually multiple partitions that can be spread across many filegroups. The table partitions are spread across various filegroups based on the values in a single column. Partitioning tables in this manner causes several side effects. I will just cover the basics here, to give you some understanding of the factors involved. I recommend that you study partitioned tables before attempting to implement them.

You can create a clustered index in this environment based on only one column. But, if that one column is not the column the table is partitioned on, then the clustered index is said to be non-aligned. If a clustered index is non-aligned, then any snapping in/out (or merging) of partitions will require you to drop the clustered index along with the non-clustered indexes and rebuild them from scratch. This is necessary because SQL Server cannot tell what portions of the clustered/non-clustered indexes belong to which table partitions. Needless to say, this will certainly cause system downtime.

The clustered index on a partitioned table should always contain the regular clustering column, which is ever-increasing and static, as well as the column that is used for partitioning the table. If the clustered index includes the column used for partitioning the table, then SQL Server knows what portion of the clustered/non-clustered indexes belong to which partition. Once a clustered index contains the column that the table is partitioned on, then the clustered index is "aligned." Partitions can then be snapped in/out (and merged) without rebuilding the clustered/non-clustered indexes, causing no downtime for the system. Inserts/updates/deletes will also work faster, because those operations only have to consider the indexes that reside on their particular partition.
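A minimal sketch of an aligned clustered index is shown below. All object names, the yearly boundary values and the use of the PRIMARY filegroup for every partition are assumptions made for illustration; a real design would normally map partitions to separate filegroups.

```sql
-- Partition by year on a datetime column (boundary values are illustrative).
CREATE PARTITION FUNCTION pfOrderYear (datetime)
    AS RANGE RIGHT FOR VALUES ('20070101', '20080101');

-- For simplicity every partition is placed on the PRIMARY filegroup.
CREATE PARTITION SCHEME psOrderYear
    AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.OrdersPartitioned
(
    OrderID   bigint IDENTITY(1,1) NOT NULL,  -- regular clustering column
    OrderDate datetime NOT NULL,              -- partitioning column
    Amt       money    NOT NULL
) ON psOrderYear (OrderDate);

-- Aligned: the clustered key includes the partitioning column and the index
-- is built on the same partition scheme as the table.
CREATE UNIQUE CLUSTERED INDEX CIX_OrdersPartitioned
    ON dbo.OrdersPartitioned (OrderID, OrderDate)
    ON psOrderYear (OrderDate);
```

The ever-increasing OrderID stays first in the key, and OrderDate is added so that each index row can be tied to a partition.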

Summary

SQL Server clustered indexes are an important part of database architecture and I hope you've learned enough from this article to know why you need to carefully plan for them from the very start. It is vital for the future health of your database that clustered indexes be narrow, static and ever-increasing. Clustered indexes can help you achieve faster join times and faster IUD operations and minimize blocking as the system becomes busy.

Finally, we covered how partitioned tables in SQL Server 2005 affect your choices for the clustered index, what it means to "align" the clustered index with the partitions, and why clustered indexes have to be aligned in order for the partitioned table concept to work as intended. Keep watching for tips on non-clustered indexes (part two) coming in February and optimal index maintenance (part three) in March.

Designing SQL Server non-clustered indexes for query optimization

Non-clustered indexes are bookmarks that allow SQL Server to find shortcuts to the data you're searching for. Non-clustered indexes are important because they allow you to focus queries on a specific subset of the data instead of scanning the entire table. We'll address this critical topic by first hitting the basics, such as how clustered indexes interact with non-clustered indexes, how to pick fields, when to use compound indexes and how statistics influence non-clustered indexes.

The basics of non-clustered indexes in SQL Server



A non-clustered index consists of the chosen fields and the clustered index value. If the clustered index is not defined as unique, then SQL Server will use the clustered index value plus a uniqueness value. Always define your clustered indexes as unique -- if they are in fact unique -- because it will result in a smaller clustered/non-clustered index size. If your unique clustered index consists of an int and you create a non-clustered index on a year column (defined as smallint), then your non-clustered index will contain an int and a smallint for every row in the table. The size would increase according to the data types chosen. So the smaller the clustered/non-clustered index data types are, the smaller the resulting index size will be, and the easier the indexes will be to maintain.

Choosing fields for non-clustered indexes

The first rule is to never include the clustered index key fields in the non-clustered index. The field is already part of the clustered index, so it will always be used for queries. The only time it makes sense to include any clustered index key in a non-clustered index is when the clustered index is a compound index and the query is referencing the second, third or higher field in the compound index.

Assume you have the following table:

ID (identity, clustered unique), DateFrom, DateTo, Amt, DateInserted, Description

Now assume you always run queries such as:

Example 1:

Select *
From tbl [t]
where t.DateFrom = '12/12/2006' and
t.DateTo = '12/31/2006' and
t.DateInserted = '12/01/2006'

At this point it makes sense to have a single non-clustered index defined on DateFrom, DateTo and DateInserted, since that combination will always give the most selective results.

Now assume you run multiple queries such as:

Example 2:

Select *
From tbl [t]
where t.DateFrom = '12/12/2006' and
t.DateInserted = '12/01/2006'

Select *
From tbl [t]
where t.DateFrom = '12/12/2006'

Select *
From tbl [t]
where t.DateTo = '12/31/2006'

Select *
From tbl [t]
where t.DateInserted = '12/01/2006'

Select *
From tbl [t]
where t.DateTo = '12/31/2006' and
t.DateInserted = '12/01/2006'

Select *
From tbl [t]
where t.ID = 5 and t.DateTo = '12/31/2006'
and t.DateInserted = '12/01/2006'

Many people, at this point, would be tempted to create the following non-clustered indexes:

1. DateFrom

2. DateTo

3. DateInserted

4. DateTo and DateInserted

5. DateFrom and DateInserted

6. ID, DateTo and DateInserted

You can expect the index size to increase dramatically at this point, since you are storing DateFrom in two separate locations, DateTo in three locations and DateInserted in four locations. On top of this, you've stored the clustered index key in seven locations. This approach increases I/O for insert, update and delete operations (also known as IUD operations). Updates to the records must be written first to the clustered index data row. Then every non-clustered index that contains a changed column has to be updated as well.

You should routinely ask yourself these questions:

Is the cost of additional I/O for IUD operations and maintenance worth the improved query time?

Will the additional I/O and increased maintenance time outweigh any performance boost I get on the queries?

What will give me the most unique results with the least overhead possible?

In this case, the best solution would be three non-clustered indexes as follows:

1. DateFrom

2. DateTo

3. DateInserted



Each field in this scenario is only stored once, except for the primary key, which is stored on all three non-clustered indexes. As a result, the index size is much smaller and will require less I/O and less maintenance. SQL Server will query each of the non-clustered indexes, depending on the criteria chosen, and then hash the results together. While this is not as efficient as Example 1, it is much more efficient than defining the six separate non-clustered indexes listed earlier. Real-world queries will more often match Example 2 than Example 1.
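The three-index solution could be created along these lines. The index names are hypothetical, and the sketch assumes the example table exists as dbo.tbl with ID as its unique clustered key.

```sql
-- One narrow non-clustered index per search column. The clustered key (ID)
-- is carried in each index automatically, so it is not listed here.
CREATE NONCLUSTERED INDEX IX_tbl_DateFrom     ON dbo.tbl (DateFrom);
CREATE NONCLUSTERED INDEX IX_tbl_DateTo       ON dbo.tbl (DateTo);
CREATE NONCLUSTERED INDEX IX_tbl_DateInserted ON dbo.tbl (DateInserted);
```

Comparing the execution plans of the Example 2 queries before and after creating these indexes is a reasonable way to check whether the optimizer actually combines them for a given workload.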

SQL Server statistics

Statistics tell SQL Server how many rows most likely match a given value. They give SQL Server an idea of how "unique" a value is, information it then uses to determine whether to use an index. By default, SQL Server automatically updates statistics whenever it thinks approximately 20% of the records have changed. In SQL Server 2000, this is done synchronously with the IUD operation, delaying the completion of the IUD operation while the rows are sampled. In SQL Server 2005, you can have it sample either synchronously with the IUD operation or asynchronously after the IUD operation is done. The latter approach is better and will cause less blocking because locks will be released sooner.

I recommend turning off the database setting "Auto Update Statistics." This setting will increase your server loads at the worst times. Instead of letting SQL Server automatically keep statistics up to date, create a job that calls the command "update statistics" and runs during your slowest time. You can pick your own sampling ratio depending on how accurate you want the statistics to be.
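The settings described above map onto statements like the following. This is a sketch: the database name Northwind, the table dbo.tbl and the 50 percent sampling ratio are placeholders to adapt to your environment.

```sql
-- Turn off automatic statistics updates for the database (the author's
-- recommendation; it only makes sense if a scheduled job takes over).
ALTER DATABASE Northwind SET AUTO_UPDATE_STATISTICS OFF;

-- Alternative on SQL Server 2005: keep automatic updates but run them
-- asynchronously, after the triggering operation has completed.
-- ALTER DATABASE Northwind SET AUTO_UPDATE_STATISTICS_ASYNC ON;

-- Statement for a job scheduled during the quietest period.
-- The sampling ratio trades accuracy against run time.
UPDATE STATISTICS dbo.tbl WITH SAMPLE 50 PERCENT;

-- Or refresh statistics for every table in the current database:
-- EXEC sp_updatestats;
```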

Detailed statistics (the histogram) are only kept on the first column in any non-clustered index. What does this mean in compound non-clustered indexes? It means SQL Server will use the first field to determine whether an index should be used. Even if the second field in the compound index will match 50% of the rows, the field still needs to be used to return the results (see Example 3). Now, if the non-clustered index were split into two non-clustered indexes, SQL Server might choose to use index 1, but not index 2. This is because the statistics on index 2 may show that it will not benefit the query (see Example 4).

Example 3

Assume you have a compound, non-clustered index defined on DateFrom and Amt. Statistics would only be kept on the DateFrom field within the index, and SQL Server would have to seek (or scan) across both DateFrom and Amt. Since SQL Server has to traverse more data, the query will be slower.

Example 4

Assume you have two non-clustered indexes: The first is defined on DateFrom and the second is defined on Amt. Statistics would be kept on both fields because they are separate indexes. SQL Server will examine the statistics on DateFrom and decide to use that index. It will then examine the Amt column and may decide -- based on the statistics -- that the index is not unique enough and should be ignored. At this point, SQL Server would only need to traverse the DateFrom field, rather than both DateFrom and Amt, resulting in a faster query.

By using non-clustered indexes in SQL Server, you'll be able to focus queries on a data subset. Use the guidelines described in this tip to determine if it's best to create multiple non-clustered indexes or a compound non-clustered index. Also keep in mind the role of statistics and how they impact non-clustered indexes: Statistics affect the choice between using multiple non-clustered indexes and a compound non-clustered index in SQL Server.

How to maintain SQL Server indexes for query optimization

Maintaining SQL Server indexes is an uncommon practice. If a query stops using indexes, oftentimes a new non-clustered index is created that simply holds a different combination of columns or the same columns. A detailed analysis of why SQL Server is ignoring the existing indexes is rarely carried out.

Let's take a look at how clustered and non-clustered indexes are selected and why the query optimizer might choose a table scan instead of a non-clustered index. In this tip, you'll learn how page splits, fragmented indexes, table partitions and statistics updates affect the use of indexes. Ultimately, you'll find out how to maintain SQL Server indexes so that the query optimizer uses these indexes, and so these indexes are searched quickly.

Index selection

Clustered indexes are by far the easiest to understand in the area of index selection. Clustered indexes are basically keys that reference each row uniquely. Even if you define a clustered index and do not declare it as unique, SQL Server still makes the clustered index unique behind the scenes by adding a 4-byte "uniqueifier" to it. The additional "uniqueifier" increases the width of the clustered index, which causes increased maintenance time and slower searches. Since clustered indexes are the key that identifies each row, they are used in every query.

When we start talking about non-clustered indexes, things get confusing. Queries can ignore non-clustered indexes for the following reasons:

1. High fragmentation. If an index is fragmented over 40%, the optimizer will probably ignore the index because it's more costly to search a fragmented index than to perform a table scan.

2. Uniqueness. If the optimizer determines that a non-clustered index is not very unique, it may decide that a table scan is faster than trying to use the non-clustered index. For example: If a query references a bit column (where bit = 1) and the statistics on the column say that 75% of the rows are 1, then the optimizer will probably decide a table scan will get the results faster versus trying to scan over a non-clustered index.

3. Outdated statistics. If the statistics on a column are out of date, then SQL Server can misjudge the benefit of a non-clustered index. Automatically updating statistics doesn't just slow down your data modification scripts; over time the sampled statistics also drift out of sync with the real distribution of the rows. Occasionally it's a good idea to run sp_updatestats or UPDATE STATISTICS.

4. Function usage. SQL Server is unable to use indexes if a function is present in the criteria. If you're referencing a non-clustered index column, but you're using a function such as convert(varchar, Col1_Year) = 2004, then SQL Server cannot use the index on Col1_Year.

5. Wrong columns. If a non-clustered index is defined on (col1, col2, col3) and your query has a where clause, such as "where col2 = 'somevalue'", that index won't be used. A non-clustered index can only be used if the first column in the index is referenced within the where clause. A where clause, such as "where col3 = 'someval'", would not use the index, but a where clause, like "where col1 = 'someval'" or "where col1 = 'someval' and col3 = 'someval2'" would pick up the index.

The index would not use col3 for its seek, since that column does not come directly after col1 in the index definition. If you wanted col3 to have a seek occur in situations such as this, then it is best if you define two separate non-clustered indexes, one on col1 and the other on col3.
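Reasons 4 and 5 can be illustrated with a short sketch. The table names (dbo.SalesByYear, dbo.t) and index names are hypothetical; the sketch assumes Col1_Year is a smallint column and that col1 and col3 hold character data.

```sql
-- Reason 4: function usage (assumed table dbo.SalesByYear).
CREATE NONCLUSTERED INDEX IX_SalesByYear_Year ON dbo.SalesByYear (Col1_Year);

-- The column is wrapped in a function, so the index cannot be used for a seek.
SELECT * FROM dbo.SalesByYear WHERE CONVERT(varchar(4), Col1_Year) = '2004';

-- The same filter written against the bare column can use the index.
SELECT * FROM dbo.SalesByYear WHERE Col1_Year = 2004;

-- Reason 5: wrong columns (assumed table dbo.t).
CREATE NONCLUSTERED INDEX IX_t_col1_col2_col3 ON dbo.t (col1, col2, col3);

-- The leading column is not referenced, so this index is not used for a seek.
SELECT * FROM dbo.t WHERE col2 = 'somevalue';

-- The leading column is referenced, so the index can be used; the seek is on
-- col1 only, and col3 is checked afterwards.
SELECT * FROM dbo.t WHERE col1 = 'someval' AND col3 = 'someval2';
```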

Page splits

To store data, SQL Server uses pages that are 8 KB data blocks. The amount of data filling the pages is called the fill factor, and the higher the fill factor, the more full the 8 KB page is. A higher fill factor means fewer pages will be required, resulting in less IO/CPU/RAM usage. At this point, you might want to set all your indexes to 100% fill factor; however, here is the gotcha: Once the pages fill up and a value comes in that fits within a filled-up index range, then SQL Server will make room in an index by doing a "page split." In essence, SQL Server takes the full page and splits it into two separate pages, which have substantially more room at that point. You can account for this issue by setting a fill factor of 70% or so. This allows 30% free space for incoming values. The problem with this approach is that you continually have to "re-index" the index so that it maintains a free space percentage of 30%.

Clustered index maintenance

Clustered indexes that are static and "ever-increasing" should have a fill factor of 100%. Since the values are always increasing, pages will just be added to the end of the index and virtually no fragmentation will occur. For a more detailed explanation, see part 1 of this series, SQL Server clustered index design for performance. This index category does not need to be re-indexed because it doesn't fragment.

Clustered indexes that are either not static or not "ever-increasing" will experience fragmentation and page splits as the data rows move around within the data pages. The indexes in this category have to be re-indexed in order to keep fragmentation low and allow queries to efficiently use the index.

When you re-index these clustered indexes, you have to decide what the fill factor should be. Normally this is 70% to 80%, giving you 20% to 30% empty space for new records coming into the page. The optimal settings for your environment will depend on how often records shift around, how many records are inserted and how often re-indexing occurs. The goal is to set a fill factor low enough so that by the time you reach your next maintenance cycle, the pages are around 95% full, but not yet splitting, which happens when they hit the 100% limit.
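One way to check how full and how fragmented the pages are before a maintenance cycle is the SQL Server 2005 function sys.dm_db_index_physical_stats. The query below is a sketch rather than a complete maintenance script; the index and table names in the rebuild statement (IX_tbl_DateFrom, dbo.tbl) and the 80% fill factor are placeholders.

```sql
DECLARE @dbid int;
SELECT @dbid = DB_ID();

-- SAMPLED mode is used because avg_page_space_used_in_percent is not
-- populated in the default LIMITED mode.
SELECT  OBJECT_NAME(ps.object_id) AS objectname,
        i.name                    AS indexname,
        ps.index_id,
        ps.avg_fragmentation_in_percent,
        ps.avg_page_space_used_in_percent
FROM    sys.dm_db_index_physical_stats(@dbid, NULL, NULL, NULL, 'SAMPLED') AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
WHERE   ps.index_id > 0   -- skip heaps
ORDER BY ps.avg_fragmentation_in_percent DESC;

-- Rebuild a shifting index and leave 20% free space on each leaf page.
ALTER INDEX IX_tbl_DateFrom ON dbo.tbl REBUILD WITH (FILLFACTOR = 80);
```

If avg_page_space_used_in_percent is already close to 100 well before the next maintenance window, that suggests the fill factor is too high for how quickly the data shifts.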

Non-clustered index maintenance

Non-clustered indexes will always have data shifting around the pages. It's not quite as big an issue as it is with clustered indexes -- the actual row data shifts with clustered indexes, whereas only row pointers shift with non-clustered indexes. That said, the same rules apply to non-clustered indexes as far as fill factors go. Again, the goal is to set a fill factor low enough so that by the time you reach your next maintenance cycle, the pages are only around 95% full.

Non-clustered indexes will always fragment, and to keep that fragmentation in check you must constantly monitor and maintain them.

Partitioned table index considerations

Partitioned tables allow data to be segregated into different partitions, depending on the data in a column. Many tables are partitioned based on date ranges. Let's say your order table is partitioned into years. Assuming the clustered index is aligned (see part 1 of this series), then you could re-index the non-clustered indexes for, say, year 2000 at 100% fill factor, since that data, technically, won't be shifting around. In this scenario, the year 2008 partition may have a fill factor of 70% on non-clustered indexes to allow for data shifts, but the year 2000 will not have any shifts and can be re-indexed at 100% fill factor so you optimize index seeks.

The same concept would apply to clustered indexes that are either not static or not ever-increasing. Clustered indexes with shifting data might be set to 70% fill factor for the year 2008 partition and 100% fill factor for the year 2000.

SQL Server statistics



declare @dbid int
Select @dbid = db_id('Northwind')

Select objectname = object_name(i.object_id)
, indexname = i.name, i.index_id
from sys.indexes i, sys.objects o
where objectproperty(o.object_id,'IsUserTable') = 1
and i.index_id NOT IN (select s.index_id
from sys.dm_db_index_usage_stats s
where s.object_id = i.object_id
and i.index_id = s.index_id
and database_id = @dbid)
and o.object_id = i.object_id
order by objectname, i.index_id, indexname asc

Rarely used indexes will appear in sys.dm_db_index_usage_stats just like heavily used indexes. To find rarely used indexes, you look at columns such as user_seeks, user_scans, user_lookups, and user_updates.

--- rarely used indexes appear first
declare @dbid int
select @dbid = db_id()

select objectname = object_name(s.object_id), s.object_id, indexname = i.name, i.index_id
, user_seeks, user_scans, user_lookups, user_updates
from sys.dm_db_index_usage_stats s,
sys.indexes i
where database_id = @dbid
and objectproperty(s.object_id,'IsUserTable') = 1
and i.object_id = s.object_id
and i.index_id = s.index_id
order by (user_seeks + user_scans + user_lookups + user_updates) asc

(3) What is the cost of index maintenance vs. its benefit?

If a table is heavily updated and also has indexes that are rarely used, the cost of maintaining the indexes could exceed the benefits. To compare the cost and benefit, you can use the table-valued function sys.dm_db_index_operational_stats as follows:

--- sys.dm_db_index_operational_stats
declare @dbid int
select @dbid = db_id()

select objectname = object_name(s.object_id), indexname = i.name, i.index_id
, reads = range_scan_count + singleton_lookup_count
, 'leaf_writes' = leaf_insert_count + leaf_update_count + leaf_delete_count
, 'leaf_page_splits' = leaf_allocation_count
, 'nonleaf_writes' = nonleaf_insert_count + nonleaf_update_count + nonleaf_delete_count
, 'nonleaf_page_splits' = nonleaf_allocation_count
from sys.dm_db_index_operational_stats (@dbid,NULL,NULL,NULL) s,
sys.indexes i
where objectproperty(s.object_id,'IsUserTable') = 1
and i.object_id = s.object_id
and i.index_id = s.index_id
order by reads desc, leaf_writes, nonleaf_writes

--- sys.dm_db_index_usage_stats
select objectname = object_name(s.object_id), indexname = i.name, i.index_id
, reads = user_seeks + user_scans + user_lookups
, writes = user_updates
from sys.dm_db_index_usage_stats s,
sys.indexes i
where objectproperty(s.object_id,'IsUserTable') = 1
and s.object_id = i.object_id
and i.index_id = s.index_id
and s.database_id = @dbid
order by reads desc
go

The difference between sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats is as follows. Sys.dm_db_index_usage_stats counts each access as 1, whereas sys.dm_db_index_operational_stats counts depending on the operation, pages or rows.

(4) Do I have hot spots & index contention?

Index contention (e.g. waits for locks) can be seen in sys.dm_db_index_operational_stats. Columns such as row_lock_count, row_lock_wait_count, row_lock_wait_in_ms, page_lock_count, page_lock_wait_count, page_lock_wait_in_ms, page_latch_wait_count, page_latch_wait_in_ms, pageio_latch_wait_count, pageio_latch_wait_in_ms detail lock and latch contention in terms of waits. You can determine the average blocking and lock waits by comparing waits to counts as follows:

declare @dbid int
select @dbid = db_id()

Select dbid = database_id, objectname = object_name(s.object_id)
, indexname = i.name, i.index_id --, partition_number
, row_lock_count, row_lock_wait_count
, [block %] = cast(100.0 * row_lock_wait_count / (1 + row_lock_count) as numeric(15,2))
, row_lock_wait_in_ms
, [avg row lock waits in ms] = cast(1.0 * row_lock_wait_in_ms / (1 + row_lock_wait_count) as numeric(15,2))
from sys.dm_db_index_operational_stats (@dbid,NULL,NULL,NULL) s,
sys.indexes i
where objectproperty(s.object_id,'IsUserTable') = 1
and i.object_id = s.object_id
and i.index_id = s.index_id
order by row_lock_wait_count desc

The following report shows blocks in the [Order Details] table, index OrdersOrder_Details. While blocks occur less than 2 percent of the time, when they do occur, the average block time is 15.7 seconds.

It would be important to track this down using the SQL Profiler Blocked Process Report. You can set the blocked process threshold to 15 seconds using sp_configure 'blocked process threshold', 15. Afterwards, you can run a trace to capture blocks over 15 seconds.
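A minimal sketch of that configuration step is shown below. It assumes the option is still hidden behind 'show advanced options' on your instance, which is why that setting is enabled first.

```sql
-- 'blocked process threshold' is an advanced option, so expose it first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Produce a blocked process report for any block lasting 15 seconds or more.
EXEC sp_configure 'blocked process threshold', 15;
RECONFIGURE;
```

The threshold only controls when the report is generated; a Profiler trace that includes the Blocked process report event still has to be running to capture it.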

The Profiler trace will include the blocked and blocking process. The advantage of tracing for long blocks is that the blocked and blocking details can be saved in the trace file and can be analyzed long after the block disappears. Over time, this history lets you see the common causes of blocks. In this case the blocked process is the stored procedure NewCustOrder. The blocking process is the stored procedure UpdCustOrderShippedDate.

The caveat with a Profiler trace of the Blocked Process Report is that in the case of stored procedures, you cannot see the actual statement within the stored procedure that is blocked. You do, however, get the stmtstart and stmtend offsets that identify the statement blocked inside the stored procedure NewCustOrder. Using the above blocked process report, you could extract the blocked statement out of the NewCustOrder stored procedure by providing the sqlhandle, stmtstart and stmtend as follows:

declare @sql_handle varbinary(64),
@stmtstart int,
@stmtend int

Select @sql_handle = 0x3000050005d9f67ea8425301059700000100000000000000
Select @stmtstart = 920, @stmtend = 1064

select substring(qt.text, s.statement_start_offset/2,
(case when s.statement_end_offset = -1
then len(convert(nvarchar(max), qt.text)) * 2
else s.statement_end_offset end - s.statement_start_offset)/2)
as "blocked statement"
, s.statement_start_offset
, s.statement_end_offset
, batch = qt.text
, qt.dbid
, qt.objectid
, s.execution_count
, s.total_worker_time
, s.total_elapsed_time
, s.total_logical_reads
, s.total_physical_reads
, s.total_logical_writes
from sys.dm_exec_query_stats s
cross apply sys.dm_exec_sql_text(s.sql_handle) as qt
where s.sql_handle = @sql_handle
and s.statement_start_offset = @stmtstart
and s.statement_end_offset = @stmtend

You can capture the actual blocked statement of a stored procedure in real time (as it is occurring) using the following:

create proc sp_block_info
as
select t1.resource_type as [lock type]
, db_name(resource_database_id) as [database]
, t1.resource_associated_entity_id as [blk object]
, t1.request_mode as [lock req] --- lock requested
, t1.request_session_id as [waiter sid] --- spid of waiter
, t2.wait_duration_ms as [wait time]
, (select text from sys.dm_exec_requests as r --- get sql for waiter
cross apply sys.dm_exec_sql_text(r.sql_handle)
where r.session_id = t1.request_session_id) as waiter_batch
, (select substring(qt.text, r.statement_start_offset/2,
(case when r.statement_end_offset = -1
then len(convert(nvarchar(max), qt.text)) * 2
else r.statement_end_offset end - r.statement_start_offset)/2)
from sys.dm_exec_requests as r
cross apply sys.dm_exec_sql_text(r.sql_handle) as qt
where r.session_id = t1.request_session_id) as waiter_stmt --- statement blocked
, t2.blocking_session_id as [blocker sid] --- spid of blocker
, (select text from sys.sysprocesses as p --- get sql for blocker
cross apply sys.dm_exec_sql_text(p.sql_handle)
where p.spid = t2.blocking_session_id) as blocker_stmt
from
sys.dm_tran_locks as t1,
sys.dm_os_waiting_tasks as t2
where
t1.lock_owner_address = t2.resource_address
go

exec sp_block_info

(5) Could I benefit from more (or less) indexes?

Remembering that indexes involve both a maintenance cost and a read benefit, the overall index cost benefit can be determined by comparing reads and writes. Reading an index allows us to avoid table scans; however, indexes do require maintenance to be kept up to date. While it is easy to identify the fringe cases where indexes are not used, and the rarely used cases, in the final analysis, index cost benefit is somewhat subjective. The reason is that the number of reads and writes is highly dependent on the workload and frequency. In addition, qualitative factors beyond the number of reads and writes can include a highly important monthly management report or quarterly VP report for which the maintenance cost is of secondary concern.

Writes to all indexes are performed for inserts, but there are no associated reads (unless there are referential constraints). Besides select statements, reads are performed for updates and deletes, and writes are performed if rows qualify. OLTP workloads have lots of small transactions, frequently combining select, insert, update and delete operations. Data warehouse activity is typically separated into batch windows having a high concentration of write activity, followed by an on-line window of read activity.

SQL Statement   Read   Write
Select          Yes    No
Insert          No     Yes, all indexes
Update          Yes    Yes, if row qualifies
Delete          Yes    Yes, if row qualifies

In general, you want to keep indexes to a functional minimum in a high-transaction OLTP environment due to high transaction throughput combined with the cost of index maintenance and potential for blocking. In contrast, for a data warehouse you pay for index maintenance once, during the batch window when updates occur. Thus, data warehouses tend to have more indexes to benefit their read-intensive on-line users.

In conclusion, an important new feature of SQL Server 2005 is Dynamic Management Views (DMVs). DMVs provide a level of transparency that was not available in SQL Server 2000 and can be used for diagnostics, memory and process tuning, and monitoring. DMVs can be useful in answering practical questions about index usage, the cost benefit of indexes, and index hot spots. Finally, DMVs can be queried with SELECT statements but are not persisted to disk. Thus they reflect server state information accumulated since the last SQL Server restart.

