SQL Server clustered index design for performance

Upload: andera4u

Post on 14-Apr-2018


  • 7/27/2019 SQL Server Clustered Index Design for Performance


    increasing and static in nature. The reason ever-increasing is so important has to do with the range architecture I outlined earlier. If the values are not ever-increasing, then SQL Server has to allocate space within existing ranges for those records rather than placing them in new ranges at the end of the index.

    If the values are not ever-increasing, then once the ranges fill up and a value comes in that fits within a filled-up index range, SQL Server will make room in the index by doing a page split. Internally, SQL Server takes the full page and splits it into two separate pages that have substantially more room at that point but take significantly more resources to process. You can prepare for this eventuality by setting a fill factor of 70% or so, which gives you 30% free space for incoming values.

    The problem with this approach is that you continually have to "reindex" the clustered index so it maintains a free space percentage of 30%. Reindexing the clustered index also causes heavy I/O load, since it has to move the actual data itself and any non-clustered indexes have to be rebuilt, adding greatly to maintenance time.
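    A minimal sketch of such a rebuild, assuming a hypothetical table dbo.Orders with a clustered index named CIX_Orders (both names are placeholders, not from the article) and SQL Server 2005 or later syntax. A fill factor is only applied when an index is built or rebuilt, which is why the free space has to be restored periodically:

```sql
-- Hypothetical names: dbo.Orders and CIX_Orders are placeholders.
-- Rebuild the clustered index so each leaf page is about 70% full,
-- leaving roughly 30% free for rows that arrive out of order.
ALTER INDEX CIX_Orders ON dbo.Orders
REBUILD WITH (FILLFACTOR = 70);
```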

    If the clustered index is ever-increasing, you will not have to rebuild the clustered index; you can set a 100% fill factor on the clustered index, and at that point you will only need to reindex the less-intensive, non-clustered indexes as time progresses, resulting in more uptime.

    Ever-increasing values will only add entries to the end of the index and build new ranges when necessary. Logical fragmentation will not exist, since the new values are continually added to the end of the index and the fill factor will be 100%. The higher the fill factor, the more rows are stored on each page. Higher fill factors require less I/O, RAM and CPU for queries. The smaller the data types you pick for the clustered index, the faster the joins/queries will be. Also, since each non-clustered index contains the clustered index key, the smaller the clustered index key, the smaller the non-clustered indexes will be.

    The best data types for clustered indexes are generally pretty narrow. In terms of data type size, that typically means a smallint, int, bigint or datetime. When a datetime value is used as the clustering key, it is usually the only column and is normally an ever-increasing date value that is often queried as range data. Generally, you should avoid compound (multiple-column) clustered indexes except in the following situations: many-to-many tables and SQL Server 2005 partitioned tables that have the partitioning column included as part of the clustered index to allow for index alignment.
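    As an illustration of a narrow, static, ever-increasing clustering key, here is a sketch of a table definition. The table and column names are invented for the example; the only point is that the clustered key is a single int identity column that never changes once assigned:

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE dbo.EventLog
(
    EventID  int IDENTITY(1,1) NOT NULL,  -- narrow (4 bytes), static, ever-increasing
    LoggedAt datetime          NOT NULL,
    Message  varchar(200)      NULL,
    CONSTRAINT PK_EventLog PRIMARY KEY CLUSTERED (EventID)
        WITH (FILLFACTOR = 100)           -- new rows always land at the end of the index
);
```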

    Many-to-many tables and clustered indexes

    Many-to-many tables are used for their extremely fast join capabilities and their ability to allow for quick re-association of records from one owning record to another. Consider the following structure:

    Customer: CustomerID (bigint identity), Name, Fieldn+

    CustomerOrder: CustomerID, OrderID

    Orders: OrderID (bigint identity), Date, Fieldn+

    The clustered indexes in this structure would be CustomerID on the Customer table and OrderID on the Orders table. The compound clustered index on the CustomerOrder table would be CustomerID/OrderID. Here are the benefits of this structure:

    - The joins are all based on clustered indexes (much faster than joins to non-clustered indexes).

    - Moving an order to another customer only involves an update to the CustomerOrder table, which is very narrow, with only one clustered index. Therefore, it reduces the blocking that would occur if you had to update a wider table such as Orders.

    - Use of a many-to-many table eliminates the need for some non-clustered indexes on the wider tables such as Customer/Orders. Hence, it reduces the maintenance time on the large tables.

    One negative result of this approach is the fragmentation that occurs on the CustomerOrder table. However, that should not be a big issue, since the table is relatively narrow, has only two columns with narrow data types and only one clustered index. The elimination of the non-clustered indexes, which would be needed on the Orders table if it contained CustomerID, more than makes up for this cost.
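    The structure described in this section could be sketched roughly as follows. The column types follow the outline given earlier; the constraint names, the foreign keys and the sample key values in the UPDATE are assumptions made for the example:

```sql
-- Sketch of the three tables; constraint names are hypothetical.
CREATE TABLE dbo.Customer
(
    CustomerID bigint IDENTITY(1,1) NOT NULL,
    Name       varchar(100)         NOT NULL,
    CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerID)
);

CREATE TABLE dbo.Orders
(
    OrderID bigint IDENTITY(1,1) NOT NULL,
    [Date]  datetime             NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID)
);

-- The many-to-many table: two narrow columns, one compound clustered index.
CREATE TABLE dbo.CustomerOrder
(
    CustomerID bigint NOT NULL REFERENCES dbo.Customer (CustomerID),
    OrderID    bigint NOT NULL REFERENCES dbo.Orders (OrderID),
    CONSTRAINT PK_CustomerOrder PRIMARY KEY CLUSTERED (CustomerID, OrderID)
);

-- Moving order 42 from customer 1 to customer 2 (sample values)
-- touches only the narrow CustomerOrder table, not Orders.
UPDATE dbo.CustomerOrder
SET CustomerID = 2
WHERE CustomerID = 1 AND OrderID = 42;
```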

    Clustered indexes and partitioned tables in SQL Server 2005

    Partitioned tables in SQL Server 2005 are tables that appear to be a single table on the surface, but behind the scenes -- at the storage subsystem level -- they are actually multiple partitions that can be spread across many filegroups. The table partitions are spread across various filegroups based on the values in a single column. Partitioning tables in this manner causes several side effects. I will just cover the basics here, to give you some understanding of the factors involved. I recommend that you study partitioned tables before attempting to implement them.

    You can create a clustered index in this environment based on only one column. But if that one column is not the column the table is partitioned on, then the clustered index is said to be non-aligned. If a clustered index is non-aligned, then any snapping in/out (or merging) of partitions will require you to drop the clustered index along with the non-clustered indexes and rebuild them from scratch. This is necessary because SQL Server cannot tell which portions of the clustered/non-clustered indexes belong to which table partitions. Needless to say, this will certainly cause system downtime.

    The clustered index on a partitioned table should always contain the regular clustering column, which is ever-increasing and static, as well as the column that is used for partitioning the table. If the clustered index includes the column used for partitioning the table, then SQL Server knows which portion of the clustered/non-clustered indexes belongs to which partition. Once a clustered index contains the column that the table is partitioned on, the clustered index is "aligned." Partitions can then be snapped in/out (and merged) without rebuilding the clustered/non-clustered indexes, causing no downtime for the system. Inserts/updates/deletes will also work faster, because those operations only have to consider the indexes that reside on their particular partition.
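    A minimal sketch of an aligned clustered index, assuming a hypothetical order table partitioned by year on an OrderDate column. The function, scheme, table and index names and the boundary dates are all invented for the example, and every partition is placed on the PRIMARY filegroup purely to keep the sketch short:

```sql
-- Hypothetical partition function: one partition per year boundary.
CREATE PARTITION FUNCTION pfOrderYear (datetime)
AS RANGE RIGHT FOR VALUES ('20070101', '20080101');

-- Hypothetical scheme: all partitions on PRIMARY for simplicity.
CREATE PARTITION SCHEME psOrderYear
AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.OrdersByYear
(
    OrderID   bigint IDENTITY(1,1) NOT NULL,
    OrderDate datetime             NOT NULL,
    Amt       money                NOT NULL
) ON psOrderYear (OrderDate);

-- Aligned clustered index: the regular ever-increasing key (OrderID)
-- plus the partitioning column (OrderDate), built on the same scheme.
CREATE UNIQUE CLUSTERED INDEX CIX_OrdersByYear
ON dbo.OrdersByYear (OrderID, OrderDate)
ON psOrderYear (OrderDate);
```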

    Summary

    SQL Server clustered indexes are an important part of database architecture and I hope you've learned enough from this article to know why you need to carefully plan for them from the very start. It is vital for the future health of your database that clustered indexes be narrow, static and ever-increasing. Clustered indexes can help you achieve faster join times and faster insert, update and delete (IUD) operations and minimize blocking as the system becomes busy.

    Finally, we covered how partitioned tables in SQL Server 2005 affect your choices for the clustered index, what it means to "align" the clustered index with the partitions, and why clustered indexes have to be aligned in order for the partitioned table concept to work as intended. Keep watching for tips on non-clustered indexes (part two) coming in February and optimal index maintenance (part three) in March.

    Designing SQL Server non-clustered indexes for query optimization

    Non-clustered indexes are bookmarks that allow SQL Server to find shortcuts to the data you're searching for. Non-clustered indexes are important because they allow you to focus queries on a specific subset of the data instead of scanning the entire table. We'll address this critical topic by first hitting the basics, such as how clustered indexes interact with non-clustered indexes, how to pick fields, when to use compound indexes and how statistics influence non-clustered indexes.

    The basics of non-clustered indexes in SQL Server


    A non-clustered index consists of the chosen fields and the clustered index value. If the clustered index is not defined as unique, then SQL Server will use a clustered index value plus a uniqueness value. Always define your clustered indexes as unique -- if they are in fact unique -- because it will result in a smaller clustered/non-clustered index size. If your unique clustered index consists of an int and you create a non-clustered index on a year column (defined as smallint), then your non-clustered index will contain an int and a smallint for every row in the table. The size would increase according to the data types chosen. So the smaller the clustered/non-clustered index data types are, the smaller the resulting index will be, and the less time its maintenance will take.
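    The int-plus-smallint case described above might look like the following sketch. The table and index names are invented; the comments only restate the storage sizes of the two data types:

```sql
-- Hypothetical table and index names.
CREATE TABLE dbo.Sales
(
    SaleID   int      NOT NULL,  -- 4 bytes
    SaleYear smallint NOT NULL,  -- 2 bytes
    Amt      money    NOT NULL
);

-- Declared UNIQUE, so no extra uniqueness value is added to the key.
CREATE UNIQUE CLUSTERED INDEX CIX_Sales ON dbo.Sales (SaleID);

-- Each row of this index stores SaleYear (smallint) plus the
-- clustered key SaleID (int), in addition to row overhead.
CREATE NONCLUSTERED INDEX IX_Sales_SaleYear ON dbo.Sales (SaleYear);
```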

    Choosing fields for non-clustered indexes

    The first rule is to never include the clustered index key fields in the non-clustered index. The key is already stored in every non-clustered index, so it is always available to queries. The only time it makes sense to include any clustered index key in a non-clustered index is when the clustered index is a compound index and the query is referencing the second, third or higher field in the compound index.

    Assume you have the following table:

    ID (identity, clustered unique), DateFrom, DateTo, Amt, DateInserted, Description

    Now assume you always run queries such as:

    Example 1:

    Select *
    From tbl [t]
    where t.DateFrom = '12/12/2006'
      and t.DateTo = '12/31/2006'
      and t.DateInserted = '12/01/2006'

    At this point it makes sense to have a non-clustered index defined on DateFrom, DateTo and DateInserted, since that will always give the most unique (most selective) results.
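    A sketch of the compound index for Example 1, assuming the table lives in the dbo schema and using an invented index name:

```sql
-- Hypothetical index name; dbo.tbl is the article's example table.
-- One compound index covering the three columns that Example 1
-- always filters on together.
CREATE NONCLUSTERED INDEX IX_tbl_DateFrom_DateTo_DateInserted
ON dbo.tbl (DateFrom, DateTo, DateInserted);
```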

    Now assume you run multiple queries such as:

    Example 2:

    Select *
    From tbl [t]
    where t.DateFrom = '12/12/2006'
      and t.DateInserted = '12/01/2006'

    Select *
    From tbl [t]
    where t.DateFrom = '12/12/2006'

    Select *
    From tbl [t]
    where t.DateTo = '12/31/2006'

    Select *
    From tbl [t]
    where t.DateInserted = '12/01/2006'

    Select *
    From tbl [t]
    where t.DateTo = '12/31/2006'
      and t.DateInserted = '12/01/2006'

    Select *
    From tbl [t]
    where t.ID = 5
      and t.DateTo = '12/31/2006'
      and t.DateInserted = '12/01/2006'

    Many people, at this point, would be tempted to create the following non-clustered indexes:

    1. DateFrom

    2. DateTo

    3. DateInserted

    4. DateTo and DateInserted

    5. DateFrom and DateInserted

    6. ID, DateTo and DateInserted

    You probably expect the index size to increase dramatically at this point, since you are storing DateFrom in two separate locations, DateTo in three locations and DateInserted in four locations. On top of this, you've stored the clustered index key in seven locations. This approach increases I/O for insert, update and delete operations (also known as IUD operations). Updates to the records must be written first to the clustered index data row. Then the non-clustered indexes have to be updated as well.

    You should routinely ask yourself these questions:

    - Is the cost of additional I/O for IUD operations and maintenance worth the improved query time?

    - Will the additional I/O and increased maintenance time outweigh any performance boost I get on the queries?

    - What will give me the most unique results with the least overhead possible?

    In this case, the best solution would be three non-clustered indexes as follows:

    1. DateFrom

    2. DateTo

    3. DateInserted


    Each field in this scenario is only stored once, except for the primary key, which is stored on all three non-clustered indexes. As a result, the index size is much smaller and will require less I/O and less maintenance. SQL Server will query each of the non-clustered indexes, depending on the criteria chosen, and then hash the results together. While this is not as efficient as Example 1, it is much more efficient than defining the six separate non-clustered indexes. Real-world queries will more often match Example 2 than Example 1.
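    The three single-column indexes recommended for Example 2 could be created along these lines. The index names are invented, and dbo.tbl is assumed to be the example table from the article:

```sql
-- Hypothetical index names on the article's example table.
-- Each column is indexed once; SQL Server can combine the indexes
-- as needed when a query filters on more than one of the columns.
CREATE NONCLUSTERED INDEX IX_tbl_DateFrom     ON dbo.tbl (DateFrom);
CREATE NONCLUSTERED INDEX IX_tbl_DateTo       ON dbo.tbl (DateTo);
CREATE NONCLUSTERED INDEX IX_tbl_DateInserted ON dbo.tbl (DateInserted);
```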

    SQL Server statistics

    Statistics tell SQL Server how many rows most likely match a given value. They give SQL Server an idea of how "unique" a value is, information it then uses to determine whether to use an index. By default, SQL Server automatically updates statistics whenever it thinks approximately 20% of the records have changed. In SQL Server 2000, this is done synchronously with the IUD operation, delaying the completion of the IUD operation while the rows are sampled. In SQL Server 2005, you can have it sample either synchronously with the IUD operation or asynchronously after the IUD operation is done. The latter approach is better and will cause less blocking because locks will be released sooner. I recommend turning off the database setting "Auto Update Statistics." This setting will increase your server loads at the worst times. Instead of letting SQL Server automatically keep statistics up to date, create a job that calls the command "update statistics" and runs during your slowest time. You can pick your own sampling ratio depending on how accurate you want the statistics to be.
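    The options discussed in this section map onto the following commands. This is a sketch only: MyDatabase and dbo.tbl are placeholder names, the 25 percent sample is an arbitrary choice, and the two database options are alternatives rather than steps to run together (the asynchronous option only has an effect while automatic updates remain on):

```sql
-- Alternative A (SQL Server 2005 and later): keep automatic updates
-- but let them run asynchronously, after the triggering operation.
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS_ASYNC ON;

-- Alternative B: turn automatic updates off and refresh statistics
-- from a scheduled job during the quietest period instead.
ALTER DATABASE MyDatabase SET AUTO_UPDATE_STATISTICS OFF;

-- Body of the scheduled job: pick a sampling ratio...
UPDATE STATISTICS dbo.tbl WITH SAMPLE 25 PERCENT;

-- ...or read every row for the most accurate statistics.
UPDATE STATISTICS dbo.tbl WITH FULLSCAN;
```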

    The statistics histogram is only kept on the first column of any non-clustered index. What does this mean for compound non-clustered indexes? It means SQL Server relies mainly on the first field to determine whether an index should be used. Even if the second field in the compound index will match 50% of the rows, that field still has to be read to return the results (see Example 3). Now, if the non-clustered index were split into two non-clustered indexes, SQL Server might choose to use index 1, but not index 2. This is because the statistics on index 2 may show that it will not benefit the query (see Example 4).

    Example 3

    Assume you have a compound, non-clustered index defined on DateFrom and Amt. Statistics would only be kept on the DateFrom field within the index, and SQL Server would have to seek (or scan) across both DateFrom and Amt. Since SQL Server has to traverse more data, the query will be slower.

    Example 4

    Assume you have two non-clustered indexes: the first is defined on DateFrom and the second is defined on Amt.

    Statistics would be kept on both fields because they are separate indexes. SQL Server will examine the statistics on DateFrom and decide to use that index. It will then examine the Amt column and may decide -- based on the statistics -- that the index is not unique enough and should be ignored. At this point, SQL Server would only need to traverse the DateFrom field, rather than both DateFrom and Amt, resulting in a faster query.
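    One way to see what statistics SQL Server actually holds for an index is DBCC SHOW_STATISTICS. The sketch below assumes the compound index from Example 3 exists under an invented name on dbo.tbl:

```sql
-- Hypothetical compound index on the article's example table.
CREATE NONCLUSTERED INDEX IX_tbl_DateFrom_Amt
ON dbo.tbl (DateFrom, Amt);

-- Returns three result sets: a header, a density vector and a
-- histogram. The histogram steps describe DateFrom, the leading
-- column of the index.
DBCC SHOW_STATISTICS ('dbo.tbl', IX_tbl_DateFrom_Amt);
```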

    By using non-clustered indexes in SQL Server, you'll be able to focus queries on a data subset. Use the guidelines described in this tip to determine whether it's best to create multiple non-clustered indexes or a compound non-clustered index. Also keep in mind the role of statistics and how they impact non-clustered indexes: statistics affect the choice between using multiple non-clustered indexes and a compound non-clustered index in SQL Server.

    How to maintain SQL Server indexes for query optimization

    Maintaining SQL Server indexes is an uncommon practice. If a query stops using indexes, oftentimes a new non-clustered index is created that simply holds a different combination of columns or the same columns. A detailed analysis of why SQL Server is ignoring those indexes is rarely carried out.

    Let's take a look at how clustered and non-clustered indexes are selected and why the query optimizer might choose a table scan instead of a non-clustered index. In this tip, you'll learn how page splits, fragmented indexes, table partitions and statistics updates affect the use of indexes. Ultimately, you'll find out how to maintain SQL Server indexes so that the query optimizer uses these indexes, and so these indexes are searched quickly.

    Index selection

    Clustered indexes are by far the easiest to understand in the area of index selection. Clustered indexes are basically keys that reference each row uniquely. Even if you define a clustered index and do not declare it as unique, SQL Server still makes the clustered index unique behind the scenes by adding a 4-byte "uniqueifier" to it. The additional "uniqueifier" increases the width of the clustered index, which causes increased maintenance time and slower searches. Since clustered indexes are the key that identifies each row, they are used in every query.

    When we start talking about non-clustered indexes, things get confusing. Queries can ignore non-clustered indexes for the following reasons:

    1. High fragmentation: If an index is fragmented over 40%, the optimizer will probably ignore the index because it's more costly to search a fragmented index than to perform a table scan.

    2. Uniqueness: If the optimizer determines that a non-clustered index is not very unique, it may decide that a table scan is faster than trying to use the non-clustered index. For example, if a query references a bit column (where bit = 1) and the statistics on the column say that 75% of the rows are 1, then the optimizer will probably decide a table scan will get the results faster than trying to scan over a non-clustered index.

    3. Outdated statistics: If the statistics on a column are out of date, then SQL Server can misjudge the benefit of a non-clustered index. Automatically updating statistics doesn't just slow down your data modification scripts; over time, the statistics also drift out of sync with the real distribution of the rows. Occasionally it's a good idea to run sp_updatestats or UPDATE STATISTICS.

    4. Function usage: SQL Server is unable to use indexes if a function is applied to the column in the criteria. If you're referencing a non-clustered index column, but you're using a function such as convert(varchar, Col1_Year) = 2004, then SQL Server cannot use the index on Col1_Year.

    5. Wrong columns: If a non-clustered index is defined on (col1, col2, col3) and your query has a where clause such as "where col2 = 'somevalue'", that index won't be used. A non-clustered index can only be used if the first column in the index is referenced within the where clause. A where clause such as "where col3 = 'someval'" would not use the index, but a where clause like "where col1 = 'someval'" or "where col1 = 'someval' and col3 = 'someval2'" would pick up the index.

       In the last case, the index would not use col3 for its seek, since that column does not come directly after col1 in the index definition. If you wanted col3 to have a seek occur in situations such as this, then it is best to define two separate non-clustered indexes, one on col1 and the other on col3.
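    Reasons 4 and 5 can be illustrated with a small sketch. The table, index names and literal values are invented for the example, and Col1_Year is assumed to be a smallint; the comments describe the behavior the list above predicts rather than a measured plan:

```sql
-- Hypothetical table and indexes.
CREATE TABLE dbo.Demo
(
    ID        int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    Col1_Year smallint    NOT NULL,
    col1      varchar(20) NOT NULL,
    col2      varchar(20) NOT NULL,
    col3      varchar(20) NOT NULL
);
CREATE NONCLUSTERED INDEX IX_Demo_Year     ON dbo.Demo (Col1_Year);
CREATE NONCLUSTERED INDEX IX_Demo_c1_c2_c3 ON dbo.Demo (col1, col2, col3);

-- Reason 4: the function wraps the column, so no seek on IX_Demo_Year.
SELECT ID FROM dbo.Demo WHERE CONVERT(varchar(4), Col1_Year) = '2004';

-- Rewritten to compare the bare column with a value of its own type.
SELECT ID FROM dbo.Demo WHERE Col1_Year = 2004;

-- Reason 5: the leading column col1 is referenced, so the compound
-- index can be used for a seek on col1.
SELECT ID FROM dbo.Demo WHERE col1 = 'someval' AND col3 = 'someval2';

-- The leading column is missing, so the compound index cannot be
-- used for a seek.
SELECT ID FROM dbo.Demo WHERE col3 = 'someval';
```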

    Page splits

    To store data, SQL Server uses pages, which are 8 KB data blocks. The amount of data filling the pages is called the fill factor, and the higher the fill factor, the fuller the 8 KB page is. A higher fill factor means fewer pages will be required, resulting in less I/O, CPU and RAM usage. At this point, you might want to set all your indexes to 100% fill factor; however, here is the gotcha: once the pages fill up and a value comes in that fits within a filled-up index range, SQL Server will make room in the index by doing a "page split." In essence, SQL Server takes the full page and splits it into two separate pages, which have substantially more room at that point. You can account for this issue by setting a fill factor of 70% or so. This allows 30% free space for incoming values. The problem with this approach is that you continually have to "re-index" the index so that it maintains a free space percentage of 30%.
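    To see how much fragmentation page splits have actually caused, one option in SQL Server 2005 and later is the sys.dm_db_index_physical_stats function. The query below is a sketch that follows the article's habit of putting the database ID in a @dbid variable; the choice of the 'LIMITED' scan mode is an assumption made to keep the check cheap:

```sql
DECLARE @dbid int;
SELECT @dbid = DB_ID();

-- Most fragmented indexes in the current database first.
SELECT objectname = OBJECT_NAME(ps.object_id),
       indexname  = i.name,
       ps.index_type_desc,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(@dbid, NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.index_id > 0            -- skip heaps
ORDER BY ps.avg_fragmentation_in_percent DESC;
```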

    Clustered index maintenance

    Clustered indexes that are static and "ever-increasing" should have a fill factor of 100%. Since the values are always increasing, pages will just be added to the end of the index and virtually no fragmentation will occur. For a more detailed explanation, see part 1 of this series, SQL Server clustered index design for performance. This index category does not need to be re-indexed because it doesn't fragment.

    Clustered indexes that are not static or not "ever-increasing" will experience fragmentation and page splits as the data rows move around within the data pages. The indexes in this category have to be re-indexed in order to keep fragmentation low and allow queries to efficiently use the index.

    When you re-index these clustered indexes, you have to decide what the fill factor should be. Normally this is 70% to 80%, giving you 20% to 30% empty space for new records coming into the page. The optimal settings for your environment will depend on how often records shift around, how many records are inserted and how often re-indexing occurs. The goal is to set a fill factor low enough so that by the time you reach your next maintenance cycle, the pages are around 95% full, but not yet splitting, which happens when they hit the 100% limit.
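    One way to check how full the pages have become before a maintenance cycle is the avg_page_space_used_in_percent column of sys.dm_db_index_physical_stats, which is only populated in the 'SAMPLED' and 'DETAILED' scan modes. The sketch below assumes a hypothetical table dbo.Orders and an 80% fill factor picked arbitrarily from the 70% to 80% range mentioned above:

```sql
DECLARE @dbid int, @objid int;
SELECT @dbid = DB_ID(), @objid = OBJECT_ID('dbo.Orders');  -- hypothetical table

-- Page fullness per index: values approaching 100 mean splits are near.
SELECT indexname = i.name,
       ps.avg_page_space_used_in_percent,
       ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(@dbid, @objid, NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id;

-- During the maintenance window, restore the free space on every
-- index of the table.
ALTER INDEX ALL ON dbo.Orders REBUILD WITH (FILLFACTOR = 80);
```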

    Non-clustered index maintenance

    Non-clustered indexes will always have data shifting around the pages. It's not quite as big an issue as it is with clustered indexes -- the actual row data shifts with clustered indexes, whereas only row pointers shift with non-clustered indexes. That said, the same rules apply to non-clustered indexes as far as fill factors go. Again, the goal is to set a fill factor low enough so that by the time you reach your next maintenance cycle, the pages are only around 95% full.

    Non-clustered indexes will always fragment, so you must monitor and maintain them regularly to keep that fragmentation under control.

    Partitioned table index considerations

    Partitioned tables allow data to be segregated into different partitions, depending on the data in a column. Many tables are partitioned based on date ranges. Let's say your order table is partitioned by year. Assuming the clustered index is aligned (see part 1 of this series), then you could re-index the non-clustered indexes for, say, year 2000 at 100% fill factor, since that data, technically, won't be shifting around. In this scenario, the year 2008 partition may have a fill factor of 70% on non-clustered indexes to allow for data shifts, but the year 2000 will not have any shifts and can be re-indexed at 100% fill factor so you optimize index seeks.

    The same concept would apply to clustered indexes that are not static or not ever-increasing. Clustered indexes with shifting data might be set to 70% fill factor for the year 2008 partition and 100% fill factor for the year 2000.

    SQL Server statistics


    --- indexes that do not appear in sys.dm_db_index_usage_stats
    declare @dbid int
    select @dbid = db_id('Northwind')

    select objectname = object_name(i.object_id)
         , indexname = i.name, i.index_id
    from sys.indexes i, sys.objects o
    where objectproperty(o.object_id, 'IsUserTable') = 1
      and i.index_id not in (select s.index_id
                             from sys.dm_db_index_usage_stats s
                             where s.object_id = i.object_id
                               and i.index_id = s.index_id
                               and database_id = @dbid)
      and o.object_id = i.object_id
    order by objectname, i.index_id, indexname asc

    Rarely used indexes will appear in sys.dm_db_index_usage_stats just like heavily used indexes. To find rarely used indexes, you look at columns such as user_seeks, user_scans, user_lookups and user_updates.

    --- rarely used indexes appear first
    declare @dbid int
    select @dbid = db_id()

    select objectname = object_name(s.object_id), s.object_id
         , indexname = i.name, i.index_id
         , user_seeks, user_scans, user_lookups, user_updates
    from sys.dm_db_index_usage_stats s, sys.indexes i
    where database_id = @dbid
      and objectproperty(s.object_id, 'IsUserTable') = 1
      and i.object_id = s.object_id
      and i.index_id = s.index_id
    order by (user_seeks + user_scans + user_lookups + user_updates) asc

    (3) What is the cost of index maintenance vs. its benefit?

    If a table is heavily updated and also has indexes that are rarely used, the cost of maintaining the indexes could exceed the benefits. To compare the cost and benefit, you can use the table-valued function sys.dm_db_index_operational_stats as follows:

    --- sys.dm_db_index_operational_stats
    declare @dbid int
    select @dbid = db_id()

    select objectname = object_name(s.object_id), indexname = i.name, i.index_id
         , reads = range_scan_count + singleton_lookup_count
         , 'leaf_writes' = leaf_insert_count + leaf_update_count + leaf_delete_count
         , 'leaf_page_splits' = leaf_allocation_count
         , 'nonleaf_writes' = nonleaf_insert_count + nonleaf_update_count + nonleaf_delete_count
         , 'nonleaf_page_splits' = nonleaf_allocation_count
    from sys.dm_db_index_operational_stats (@dbid, NULL, NULL, NULL) s, sys.indexes i
    where objectproperty(s.object_id, 'IsUserTable') = 1
      and i.object_id = s.object_id
      and i.index_id = s.index_id
    order by reads desc, leaf_writes, nonleaf_writes

    --- sys.dm_db_index_usage_stats
    select objectname = object_name(s.object_id), indexname = i.name, i.index_id
         , reads = user_seeks + user_scans + user_lookups
         , writes = user_updates
    from sys.dm_db_index_usage_stats s, sys.indexes i
    where objectproperty(s.object_id, 'IsUserTable') = 1
      and s.object_id = i.object_id
      and i.index_id = s.index_id
      and s.database_id = @dbid
    order by reads desc
    go

    The difference between sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats is as follows: sys.dm_db_index_usage_stats counts each access as 1, whereas sys.dm_db_index_operational_stats counts depending on the operation, pages or rows.

    (4) Do I have hot spots & index contention?

    Index contention (e.g. waits for locks) can be seen in sys.dm_db_index_operational_stats. Columns such as row_lock_count, row_lock_wait_count, row_lock_wait_in_ms, page_lock_count, page_lock_wait_count, page_lock_wait_in_ms, page_latch_wait_count, page_latch_wait_in_ms, pageio_latch_wait_count and pageio_latch_wait_in_ms detail lock and latch contention in terms of waits. You can determine the average blocking and lock waits by comparing waits to counts as follows:

    declare @dbid int
    select @dbid = db_id()

    select dbid = database_id, objectname = object_name(s.object_id)
         , indexname = i.name, i.index_id    --, partition_number
         , row_lock_count, row_lock_wait_count
         , [block %] = cast(100.0 * row_lock_wait_count / (1 + row_lock_count) as numeric(15,2))
         , row_lock_wait_in_ms
         , [avg row lock waits in ms] = cast(1.0 * row_lock_wait_in_ms / (1 + row_lock_wait_count) as numeric(15,2))
    from sys.dm_db_index_operational_stats (@dbid, NULL, NULL, NULL) s, sys.indexes i
    where objectproperty(s.object_id, 'IsUserTable') = 1
      and i.object_id = s.object_id
      and i.index_id = s.index_id
    order by row_lock_wait_count desc

    The following report shows blocks in the [Order Details] table, index OrdersOrder_Details. While blocks occur less than 2 percent of the time, when they do occur, the average block time is 15.7 seconds.

    It would be important to track this down using the SQL Profiler Blocked Process Report. You can set the Blocked Process Threshold to 15 using sp_configure 'blocked process threshold', 15. Afterwards, you can run a trace to capture blocks over 15 seconds.
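    Spelled out as a script, the threshold change might look like the sketch below. It assumes the option is treated as an advanced option on your instance, so advanced options are switched on first:

```sql
-- 'blocked process threshold' is an advanced option, so make
-- advanced options visible before changing it.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Raise a blocked process report for any block lasting 15 seconds or more.
EXEC sp_configure 'blocked process threshold', 15;
RECONFIGURE;
```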

    The Profiler trace will include the blocked and blocking process. The advantage of tracing for long blocks is that the blocked and blocking details can be saved in the trace file and can be analyzed long after the block disappears. Historically, you can see the common causes of blocks. In this case the blocked process is the stored procedure NewCustOrder. The blocking process is the stored procedure UpdCustOrderShippedDate.

    The caveat with the Profiler trace of the Blocked Process Report is that in the case of stored procedures, you cannot see the actual statement within the stored procedure that is blocked. You do, however, get the stmtstart and stmtend offsets that identify the statement blocked inside the stored procedure NewCustOrder. Using the above blocked process report, you could extract the blocked statement out of the NewCustOrder stored procedure by providing the sqlhandle, stmtstart and stmtend as follows:

    declare @sql_handle varbinary(64),
            @stmtstart int,
            @stmtend int

    select @sql_handle = 0x3000050005d9f67ea8425301059700000100000000000000
    select @stmtstart = 920, @stmtend = 1064

    --- offsets are zero-based byte positions in Unicode text, while
    --- substring is one-based, hence the /2 and the + 1
    select substring(qt.text, (s.statement_start_offset / 2) + 1,
             ((case when s.statement_end_offset = -1
                    then len(convert(nvarchar(max), qt.text)) * 2
                    else s.statement_end_offset end - s.statement_start_offset) / 2) + 1)
             as "blocked statement"
         , s.statement_start_offset
         , s.statement_end_offset
         , batch = qt.text
         , qt.dbid
         , qt.objectid
         , s.execution_count
         , s.total_worker_time
         , s.total_elapsed_time
         , s.total_logical_reads
         , s.total_physical_reads
         , s.total_logical_writes
    from sys.dm_exec_query_stats s
    cross apply sys.dm_exec_sql_text(s.sql_handle) as qt
    where s.sql_handle = @sql_handle
      and s.statement_start_offset = @stmtstart
      and s.statement_end_offset = @stmtend

    You can capture the actual blocked statement of a stored procedure in real time (as it is occurring) using the following:

    create proc sp_block_info
    as
    select t1.resource_type as [lock type]
         , db_name(resource_database_id) as [database]
         , t1.resource_associated_entity_id as [blk object]
         , t1.request_mode as [lock req]              --- lock requested
         , t1.request_session_id as [waiter sid]      --- spid of waiter
         , t2.wait_duration_ms as [wait time]
         , (select text from sys.dm_exec_requests as r    --- get sql for waiter
            cross apply sys.dm_exec_sql_text(r.sql_handle)
            where r.session_id = t1.request_session_id) as waiter_batch
         , (select substring(qt.text, r.statement_start_offset / 2,
                     (case when r.statement_end_offset = -1
                           then len(convert(nvarchar(max), qt.text)) * 2
                           else r.statement_end_offset end - r.statement_start_offset) / 2)
            from sys.dm_exec_requests as r
            cross apply sys.dm_exec_sql_text(r.sql_handle) as qt
            where r.session_id = t1.request_session_id) as waiter_stmt    --- statement blocked
         , t2.blocking_session_id as [blocker sid]    --- spid of blocker
         , (select text from sys.sysprocesses as p    --- get sql for blocker
            cross apply sys.dm_exec_sql_text(p.sql_handle)
            where p.spid = t2.blocking_session_id) as blocker_stmt
    from sys.dm_tran_locks as t1,
         sys.dm_os_waiting_tasks as t2
    where t1.lock_owner_address = t2.resource_address
    go

    exec sp_block_info

    (5) Could I benefit from more (or fewer) indexes?

    Remembering that indexes involve both a maintenance cost and a read benefit, the overall index cost benefit can be determined by comparing reads and writes. Indexes allow us to avoid table scans; however, they require maintenance to be kept up to date. While it is easy to identify the fringe cases where indexes are not used, and the rarely used cases, in the final analysis, index cost benefit is somewhat subjective. The reason is that the number of reads and writes is highly dependent on the workload and frequency. In addition, qualitative factors beyond the number of reads and writes can include a highly important monthly management report or quarterly VP report in which the maintenance cost is of secondary concern.

    Writes of all indexes are performed for inserts, but there are no associated reads (unless there are referential constraints). Besides select statements, reads are performed for updates and deletes, and writes are performed if rows qualify. OLTP workloads have lots of small transactions, frequently combining select, insert, update and delete operations. Data warehouse activity is typically separated into batch windows having a high concentration of write activity, followed by an on-line window of read activity.

    SQL Statement   Read   Write
    Select          Yes    No
    Insert          No     Yes, all indexes
    Update          Yes    Yes, if row qualifies
    Delete          Yes    Yes, if row qualifies

    In general, you want to keep indexes to a functional minimum in a high-transaction OLTP environment due to high transaction throughput combined with the cost of index maintenance and potential for blocking. In contrast, you pay for index maintenance once during the batch window when updates occur for a data warehouse. Thus, data warehouses tend to have more indexes to benefit their read-intensive on-line users.

    In conclusion, an important new feature of SQL Server 2005 is Dynamic Management Views (DMVs). DMVs provide a level of transparency that was not available in SQL Server 2000 and can be used for diagnostics, memory and process tuning, and monitoring. DMVs can be useful in answering practical questions such as index usage, cost benefit of indexes, and index hot spots. Finally, DMVs are queryable with SELECT statements but are not persisted to disk. Thus they reflect changing server state information since the last SQL Server recycle.
