김종구 대리 기술지원부 마이크로소프트

57

Upload: colin

Post on 11-Feb-2016

80 views

Category:

Documents


0 download

DESCRIPTION

SQL 서버 성능 문제 해결 및 Locking Internals. 김종구 대리 기술지원부 마이크로소프트. 강사 소개. 김종구 / 마이크로소프트 기술지원부 (2000 ~ 현재 ) Infrastructure RRE SQL Support Engineer KT NeOSS, Auction, 삼성생명 등 주요 사이트 기술지원 “SQL Memory Architecture” 등 TechNet 세미나 진행. 목적. 성능 관련 문제 발생 시 데이터 수집을 위한 PSSDIAG 툴 소개 - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: 김종구 대리 기술지원부 마이크로소프트
Page 2: 김종구 대리 기술지원부 마이크로소프트

김종구 대리기술지원부마이크로소프트

SQL 서버 성능 문제 해결 및Locking Internals

Page 3: 김종구 대리 기술지원부 마이크로소프트

강사 소개

• 김종구 / 마이크로소프트 기술지원부 (2000 ~ 현재 )

• Infrastructure RRE • SQL Support Engineer• KT NeOSS, Auction, 삼성생명 등

주요 사이트 기술지원• “SQL Memory Architecture” 등

TechNet 세미나 진행

Page 4: 김종구 대리 기술지원부 마이크로소프트

목적• 성능 관련 문제 발생 시 데이터 수집을 위한

PSSDIAG 툴 소개• 성능 문제 해결을 위한 전반적인 방법론 소개 및 관련

지식 전달• Locking Internal 에 대한 지식 전달

Page 5: 김종구 대리 기술지원부 마이크로소프트

대상 기술범위

• PSSDIAG Tool• Read80Trace Tool• SQL Server Performance Troubleshooting Methodology• Locking Internals

Page 6: 김종구 대리 기술지원부 마이크로소프트

이 주제를 이해하는 데 필요한 지식

Level 200Level 200

• Performance 문제 해결을 위한 기본 데이터에 대한 이해 - sysprocesses 의 waittype 및 lastwaittype - blocking Monitoring방법 - 성능 로그 사용법 및 주요 counter - Profiler 사용법

Page 7: 김종구 대리 기술지원부 마이크로소프트

목차

• PSSDiag • SQL Server Performance Troubleshooting• SQL Server Locking Internals and Troubleshooting

Page 8: 김종구 대리 기술지원부 마이크로소프트

PSSDiag

Page 9: 김종구 대리 기술지원부 마이크로소프트

What is PSSDiag?

• Wrapper around data collection APIs commonly used in PSS, particularly SQL Server Support (Profiler, blocking script, Perfmon/Sysmon, SQLDIAG, event logs, etc)

• Designed to provide double-click simplicity and reduce user error

• Get all the needed data the first time, collected at the same time

• See KB 830232 for information and download location

Page 10: 김종구 대리 기술지원부 마이크로소프트

Components• GUI configuration utility (PSSDiagConfig.EXE)

– Configure types of data to collect– Save configuration in XML document– Can also manage the collector service

• Collector (PSSDiag.EXE)– Collector app, consumes configuration file created by GUI– Can run as a service or a console app

Page 11: 김종구 대리 기술지원부 마이크로소프트

GUI configuration utility• Typical use:

1. Select the target version of SQL Server2. Supply authentication mode and relevant info3. Select and configure the diagnostics you want to collect4. Click Start

• The GUI configures and starts the collector service for you• Any output messages from the collector will be displayed in the GUI• Diagnostic files will be written to the output folder (.\OUTPUT by default)

Page 12: 김종구 대리 기술지원부 마이크로소프트

Collector (PSSDiag.EXE)• Can run as a console app or as a service (GUI always

runs it as a service)• Can compress files using NTFS or ZIP compression via

/Cn parameter• Logging out while running as a console will stop data

collection• Works fine from Terminal Services session• Set the output folder via /O parameter• To uninstall, run PSSDiag /U to uninstall the service (if

you’ve installed it), then delete the files extracted from the archive

Page 13: 김종구 대리 기술지원부 마이크로소프트

Collector (PSSDiag.EXE)• Optional command line params:

– /Cn – (/C1 for NTFS background compression, /C2 for ZIP compression at shutdown)

– /Nn – erase, overwrite, or rename output folder– /B YYYYMMDD_HH:MM:SS – Start time– /E YYYYMMDD_HH:MM:SS – Automatic shutdown time– /G – Generic mode. Disables SQL Server-centric mode to permit

collection on machines without SQL Server installed– /R – register as a service– /U – deregister service

Page 14: 김종구 대리 기술지원부 마이크로소프트

Collecting Data From a Clustered SQL Instance

• Two options here:1. Use the default machine name (“.”) when running on a cluster node,

and PSSDiag will collect data from all SQL Server virtual servers on the cluster

2. Supply a virtual SQL Server name for the machine name (leave the instance name set to “*”) and PSSDiag will collect from that virtual server only

Page 15: 김종구 대리 기술지원부 마이크로소프트

Collecting Data from a Remote Server• Supply the machine name when starting the GUI• Configuration is disabled when connecting to a remote machine• Output files will be written to remote machine• Output path must exist or be creatable• Run the collector as a console app if you wish to capture diagnostic files

to the local machine:– Profiler trace is always a server-side collection

• Output path must exist on local machine and the SQL Server• Never capture Profiler to a UNC or network drive• .TRC files will be copied to local OUTPUT dir on shutdown

– Other data types (blocker, perfmon, etc) will be captured wherever the collector console app is running

Page 16: 김종구 대리 기술지원부 마이크로소프트

Scheduling Collection Start/Stop• When running as a service:

– Schedule an NT job to run PSSDiagControl START to start the service– Schedule an NT job to run PSSDiagControl STOP to stop it

• When running as a console app:– Schedule the console app to start via the NT scheduler– Use /E or /B parameters if start/stop time and day (or relative time) is known– Schedule an NT job to create a file named PSSDiag.STOP in the output folder

• Don’t KILL or you will leave Profiler trace running on SQL - see KB 283786 for manually stop and delete trace.

Page 17: 김종구 대리 기술지원부 마이크로소프트

Collecting Data for Extended Periods• Run as a service so that you can log out of the console (the

GUI needn’t keep running)• Can schedule start/stop times and delete/rename old output

folder• Consider using /C1 to minimize space used by rolled

over .TRC and .BLG files• The fact that .TRC must always be collected on the server

makes remote collection not very effective for minimizing disk space use on the server

Page 18: 김종구 대리 기술지원부 마이크로소프트

PSSDiag Impact on Server Performance• Impact of PSSDiag.EXE itself is negligible• Perf impact of collection equals sum of costs of diagnostics

being collected• Generally dominated by cost of Profiler tracing (use the

“Detailed Performance” trace template only when you actually need it)

• Blocking detection script and Perfmon shouldn’t have a significant impact

Page 19: 김종구 대리 기술지원부 마이크로소프트

Troubleshooting• Intent is to save time – don’t let it become an obstacle• Main collector log is ##PSSDiag.LOG• All console output also written to application event log• Some Perfmon counter errors are normal (e.g. “Could not

add counter: XYZ. - The specified object is not found on the system.”)

• All scripts (and profiler trace) are started via osql.exe. Script output files are .OUT files with names like “##server__Run_sp_trace.OUT”

Page 20: 김종구 대리 기술지원부 마이크로소프트

- Using PSSDIAG Tool

DemoDemo

Page 21: 김종구 대리 기술지원부 마이크로소프트

SQL Server PerformanceTroubleshooting

Page 22: 김종구 대리 기술지원부 마이크로소프트

Agenda• Methodology• Resource bottlenecks• Determining your bottleneck• Which queries are responsible• Tuning the identified queries

Page 23: 김종구 대리 기술지원부 마이크로소프트

Methodology• System performance is the result of aggregate

performance of all queries• At a high level what type of bottleneck does

system have• Find the queries using the most of that resource• Is resource being used efficiently• Always another bottleneck

Page 24: 김종구 대리 기술지원부 마이크로소프트

Common DB Bottlenecks• Synchronization (Locks/Latches)

– 224453 is good KB for this• CPU

– Single query/single CPU– Single query/parallel– Aggregate query load over all CPUs

• IO– Insufficient memory or poor access path?

• Memory– SQL Server throttles the number of concurrently executing queries with

sorts/hashes

Page 25: 김종구 대리 기술지원부 마이크로소프트

Performance Monitor

• Synchronization– Locks: Lock Waits/sec, Lock Wait Time (ms)– Latches: Total Latch Wait Time, Latch Waits/sec

• CPU– Sustained rates at 75+ percent– Compiles/sec, Recompiles/sec

Page 26: 김종구 대리 기술지원부 마이크로소프트

Performance Monitor II• IO

– SQL Server View• Page Reads/sec• Readahead pages/sec• Checkpoint & Lazywrites/sec• fn_virtualfilestats

– Operating System View• Avg Disk sec/Read or Write• Disk Queue Length is often NOT a good indicator

• Memory– Memory Grants Pending– Max/Granted Workspace Memory

Page 27: 김종구 대리 기술지원부 마이크로소프트

DBCC SQLPERF(WAITSTATS)

• Number of waits & total wait time for each waittype• Example

– PAGEIOLATCH_SH 64.0 7748.0 761.0– PAGEIOLATCH_UP 24.0 2381.0 10.0– PAGEIOLATCH_EX 21.0 2274.0 60.0

• KB 822101 for description of the various waittypes• Take delta between snapshots, or clear with DBCC

SQLPERF(WAITSTATS, CLEAR)

Page 28: 김종구 대리 기술지원부 마이크로소프트

SQL Server Trace (Profiler)• Use the sp_trace procedures instead of the GUI

– Significantly less performance impact– Won’t “drop” events if rate is high

• Write trace files to fast drive(s)• Configure what you want to trace in GUI and use File

– Script Trace option• PSS prefers PSSDiag option for one step collection

Page 29: 김종구 대리 기술지원부 마이크로소프트

Analyzing Trace Data• Use GUI option to sort by a column• Use fn_trace_gettable to load and query the data• Problems

– Time consuming and generally requires you to have a specific problem in mind

– Individual queries identified may not be relevant to the problem

– Too manual—easy to miss things

Page 30: 김종구 대리 기술지원부 마이크로소프트

Introducing Read80Trace• All text is “normalized” to remove comments, white space &

parameters• Database only stores the text of the first “unique” entry• Detail data is loaded in normalized format to facilitate joins,

reduce redundant data– Connections– Batches/UniqueBatches– Statements/UniqueStatements– Plans/UniquePlans

• See KB 887057 for download location

Page 31: 김종구 대리 기술지원부 마이크로소프트

DemoDemo

• Using Read80trace Tool

- Queries using the most CPU- Query that changes execution plans- Comparing “good” trace with “bad” trace

Page 32: 김종구 대리 기술지원부 마이크로소프트

Looking at Specific Queries• Does performance change correlate with plan

differences– Different execution plan– Different amount of work performed

• Majority of bad plans caused by poor cardinality estimates– Use STATISTICS PROFILE to find the problematic part of

plan

Page 33: 김종구 대리 기술지원부 마이크로소프트

Using STATISTICS PROFILE

• Everything except Rows/Executes is compile time information

• Executes column reflects parallelism– For example, scan that executes 4 times

• Compare Rows with (EstimateRows * EstimateExecutes)– Find most deeply nested operator where the error

originates; it propagates up the tree from there

Page 34: 김종구 대리 기술지원부 마이크로소프트

DemoDemo

- Using STATISTICS PROFILE

Page 35: 김종구 대리 기술지원부 마이크로소프트

Acceptable Cardinality Error• Reasonable margin of error depends on operator

– Loop joins – within 2x range pretty reasonable– Merge join – 5x is reasonable– Hash joins – size of build input (first table below join) affects hash table

memory size. Probe input (second table) doesn’t matter much– Sorts – size affects memory grant and 2x is reasonable

• Differences in estimates may not be bugs– Should the optimizer do better given the available statistics

Page 36: 김종구 대리 기술지원부 마이크로소프트

Cardinality Estimation• Histograms contain most useful information for predicates with literal

– au_lname = ‘Smith’– StartTime BETWEEN ‘2003-01-01’ AND ‘2003-12-31’

• Density information– ColA = ‘x’ and ColB = ‘y’– Equijoins– Equality predicates with variables– Auto create statistics only creates single column statistics

• Other estimates usually based on fixed selectivity estimates– Percentage based on the comparison operator– See Inside SQL Server 2000 for table of values

Page 37: 김종구 대리 기술지원부 마이크로소프트

Auto Statistics• Samples a percentage of the data

– Minimum sampling ~4MB of data– Maximum a function of rows in table

• If you have issues with bad plans– Update statistics (sampled) – if this fixes it then histogram is probably

out of date or auto update not triggered soon enough for your query– If fullscan required to fix a problem use DBCC SHOW_STATISTICS to

see how much the density values differ

Page 38: 김종구 대리 기술지원부 마이크로소프트

Limitations to Consider• T-SQL variable (as opposed to parameter)

– Value not known at compile time so can’t use histogram• Builtin functions

– No statistics available• Multi-statement table-valued functions

– No statistics available• Table variables

– No statistics available– Temp table & recompile uses statistics

Page 39: 김종구 대리 기술지원부 마이크로소프트

SQL Server Locking Internals and Troubleshooting

Page 40: 김종구 대리 기술지원부 마이크로소프트

IntroductionIntroduction • UMS Scheduling and Workers• What is a SQL Resource• How SQL Server really waits on a ‘resource’• Blocking• Crabbing• Fetch Rates• Physical vs Logical Protection• Scans and Lock Classes• Lock Escalation

Page 41: 김종구 대리 기술지원부 마이크로소프트

UMS Scheduling• User Mode Scheduling• Precise Resource Usage• Preemptive vs Non-Preemptive

Page 42: 김종구 대리 기술지원부 마이크로소프트

SQL Server Workers• What is a worker?• Worker Pool• Request bound to Worker for (Life Time) - Example• Target Setting - Example• Not Dynamic• Division of Workers

Page 43: 김종구 대리 기술지원부 마이크로소프트

Connection Bound To Scheduler

• Assignment: Scheduler with fewest users• Life time: Bound for connection life time• Workload Matters

Page 44: 김종구 대리 기술지원부 마이크로소프트

A SQL Resource

• Reader / Writer• Waiters list• FIFO Maintained

Page 45: 김종구 대리 기술지원부 마이크로소프트

Blocked / Blocking• Blocked• Worker tied up in block scenario• Attentions / Query Timeout (Blocking)• Diagnostics

– sysprocesses– syslockinfo– dbcc opentran - Example– Profiler and SQLTrace– xact_abort– Performance Monitor– dbcc sqlperf(waitstats)– dbcc pss – Example– PSSDiag

Page 46: 김종구 대리 기술지원부 마이크로소프트

Resource Crabbing

• Maintains Data Stability• Acquire Next• Release Previous

Page 47: 김종구 대리 기술지원부 마이크로소프트

Client Fetch Rate Matters• Sending Results• How Crabbing Applies• Mobile Links• Lock Scope• Never Perfect• Preemptive Network Writes

Page 48: 김종구 대리 기술지원부 마이크로소프트

Batch Size Can Matter• Touching Several Objects• External Logic

– XProcs – COM Objects

• The Transaction Log– Sector Alignment– Flush to LSN

Page 49: 김종구 대리 기술지원부 마이크로소프트

DTC Locking• All Under One Roof• SQL 7.0 Behavior• SQL 2000 Behavior• Deadlocking• SPID = -1 or SPID = -2

Page 50: 김종구 대리 기술지원부 마이크로소프트

Physical versus Logical Protection

• Physical = Latch– Reading Page Into Memory– Writing A Page To Disk– Inserting A Row– Internal Data Structures

• Logical = Lock• Inserting / Modifying A Row

Page 51: 김종구 대리 기술지원부 마이크로소프트

Scans (the SDES)Lock Classes• What is an SDES?• SDES and Query Plan - Example• Parallel – Multiple Scans• What is a Lock Class?• Lock Class and Lock Escalation

Page 52: 김종구 대리 기술지원부 마이크로소프트

Lock Escalation• What is Lock Escalation?• Lock Escalation Paths• Escalation Boundaries• Monitoring Lock Escalation

– Performance Monitor– Profiler

• Work Tables• Escalation Failures• Trace Flag -T1211• Batch Size Matters

Page 53: 김종구 대리 기술지원부 마이크로소프트

Helpful Tidbits• Identity Property• Update with Variable Assignment• Attentions / Query Cancels• Blocked Scheduler Workers• Open, Fetch, and Close• Implicit Transactions• MAXDOP = 1• Locking Hints• Isolation Levels

Page 54: 김종구 대리 기술지원부 마이크로소프트

세션 요약

• 성능관련 문제 해결을 위한 데이터 수집 툴인 PSSDIAG 에 대한 사용 방법 소개

• 성능문제 파악을 위한 주요 방법론• Locking Internal 에 대한 개념

Page 55: 김종구 대리 기술지원부 마이크로소프트

참고 자료 (1)• TechEd 2004 Korea - SQL Server Performance Toolbox (by 하성희 )• http://msdn.microsoft.com/library/en-us/tsqlref/ts_sys-

p_3kmr.asp• “SQL Server 2000 Performance Tuning with Waits

and Queues” SQL Magazine (January 2004) By Tom Davidson

• “Inside SQL Server 2000” by Kalen Delaney• “Microsoft SQL Server 2000 Performance Tuning

Technical Reference “ by Edward Whalen• http://sqldev.net/misc/waittypes.htm

Page 56: 김종구 대리 기술지원부 마이크로소프트

참고자료 (2)• Q224587 Troubleshooting Application Performance with SQL Server • Q251004 How to Monitor SQL Server 2000 Blocking • Q224453 Understanding and Resolving SQL Server 7.0 Blocking Problem • Q243588 Troubleshooting Performance of Ad-Hoc Queries • Q244455 Definition of Sysprocesses Waittype and Lastwaittype Fields for SQL

Server 7.0 • Q271509 INF: How to Monitor SQL Server 2000 Blocking• Q224587 INF: Troubleshooting Application Performance with SQL Server • Q243589 INF: Troubleshooting Slow-running Queries• Q243586 INF: Troubleshooting Stored Procedure Recompilation• Q822101 The waittype and lastwaittype fields in the sysprocesses table• Q255596 sp_lock2 Returns Additional Data • Q283696 Job to Monitor SQL Server 2000 Performance and Activity• Q283784 How to View SQL Server 2000 Activity Data• Q262499 INF: Using Output Parameters with sp_executesql• Q283786 INF: How to Monitor SQL Server 2000 Traces

Page 57: 김종구 대리 기술지원부 마이크로소프트