GDC’99: Performance Tuning with Intel® Graphics Tools

Larry Wickstrom, Sr. Software Engineer
Judith Stanley, Application Engineer
Intel Corporation
March 17, 1999


Page 1: Performance Tuning with Intel® Graphics Tools

GDC’99

Larry Wickstrom, Sr. Software Engineer
Judith Stanley, Application Engineer

Intel Corporation
March 17, 1999

Page 2: Purpose

To provide two tools that give more performance information than you can get anywhere else!

Page 3: Tuning D3D App. Perf. Using IPEAK/GPT

Finding FPS problems in your Game
Measuring concurrency in your Game
Pinpointing performance thru API logging in your Game

Page 4: The Tool Family Tree

[Diagram: software stack, top to bottom]
Your Game
DirectX*
GFX Driver
Intel® Graphics Hardware

[Tools shown alongside the stack]
VTune Analyzer
IPEAK GPT
Intel® Graphics Profiler in VTune™ Analyzer 4.0

*Third party marks and brands are the property of their respective owners

Page 5: Half-Life* FPS Demo

GPT finds frame rate problems

* Other brands and names are property of their respective owners.

Page 6: GPT and the Graphics Pipeline

GPT Intercepts DX6.1: DirectDraw* and Direct3D Immediate Mode*
– Retained Mode* partially supported
– OpenGL* planned

[Diagram: Application -> GPT interceptor -> API (DirectDraw*/Direct3D IM*) -> Driver -> Graphics Controller -> Display]

Page 7: Now let’s take control...

Remove Graphics Load
– To measure load balance of CPU vs Graphics

Remove Parallelism
– To measure concurrency

Page 8: Measuring Load Balance

GPT Can Remove...

[Diagram: Application -> GPT interceptor -> API (DirectDraw*/Direct3D IM*) -> Driver -> Graphics Controller -> Display]

Page 9: Measuring Load Balance

GPT Can Remove...
– Graphics Controller

[Diagram: Application -> GPT interceptor -> API (DirectDraw*/Direct3D IM*) -> Driver; Graphics Controller and Display set apart from the pipeline]

Page 10: Measuring Load Balance

GPT Can Remove...
– Driver CPU Load
– Graphics Controller

[Diagram: Application -> GPT interceptor -> API (DirectDraw*/Direct3D IM*); Driver set apart from the pipeline]

Page 11: Measuring Load Balance

GPT Can Remove...
– API CPU Load
– Driver CPU Load
– Graphics Controller

[Diagram: Application -> GPT interceptor; API (DirectDraw*/Direct3D IM*) set apart from the pipeline]

Page 12: Measuring Load Balance

GPT Can Remove...
– API CPU Load
– Driver CPU Load
– Graphics Controller
… and keep the App happy

[Diagram: Application -> GPT interceptor]

Page 13: Comparison of NULL API to Normal

[Chart: fps over Time; curves: Unmodified, NULL API; the gap between the curves is labeled "API Overhead"]

Page 14: Comparison of NULL API to Normal

[Chart: fps over Time; curves: Unmodified, NULL API; annotation: "App Bound"]

Page 15: Comparison of NULL API to Normal

[Chart: fps over Time; curves: Unmodified, NULL API]

Page 16: What can be inferred...

If performance increases dramatically
– Too much graphics
– Too little app
   – add more AI/Physics/...

If performance doesn’t increase
– Too much App
   – could HW do more?
– Too little graphics
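
The inference above can be sketched as a small helper that compares the frame rate of an unmodified run with a NULL API run. This is only an illustration: the function name, the `dramatic_ratio` threshold of 1.5, and the verdict strings are assumptions, since the slides say only "dramatically" and give no number.

```python
def classify_null_api_run(fps_unmodified, fps_null_api, dramatic_ratio=1.5):
    """Compare an unmodified run against a NULL API run.

    dramatic_ratio is a hypothetical cut-off for "increases dramatically";
    pick a value that suits your own game.
    """
    if fps_unmodified <= 0 or fps_null_api <= 0:
        raise ValueError("frame rates must be positive")

    # How many times faster the game runs once graphics work is removed.
    speedup = fps_null_api / fps_unmodified

    # Per-frame time (ms) that disappeared with the API, driver,
    # and graphics controller load.
    removed_ms = 1000.0 / fps_unmodified - 1000.0 / fps_null_api

    if speedup >= dramatic_ratio:
        verdict = "graphics-heavy"   # too much graphics / too little app
    else:
        verdict = "app-bound"        # too much app / too little graphics
    return speedup, removed_ms, verdict


if __name__ == "__main__":
    print(classify_null_api_run(30.0, 90.0))
    print(classify_null_api_run(30.0, 32.0))
```

A "graphics-heavy" verdict suggests there is CPU headroom for more AI or physics; an "app-bound" verdict suggests asking whether the hardware could take over more of the work.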

Page 17: Concurrency

Parallel Performance
– CPU & GC work at same time

Serial Performance
– CPU waits on GC, vice versa

[Diagram: two pairs of CPU and 3D HW timelines, one pair for each case]
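
To make the two cases concrete, here is a deliberately idealized model, not something taken from the slides: it assumes a frame needs a fixed amount of CPU time and a fixed amount of graphics controller (GC) time, that perfect parallelism hides the shorter of the two, and that full serialization adds them. The function names are hypothetical.

```python
def frame_time_ms(cpu_ms, gc_ms, parallel):
    """Idealized per-frame time for CPU work and GC work.

    parallel=True  : CPU and GC overlap fully, the longer one sets the pace.
    parallel=False : each waits on the other, so the times add up.
    """
    if cpu_ms < 0 or gc_ms < 0:
        raise ValueError("times must be non-negative")
    return max(cpu_ms, gc_ms) if parallel else cpu_ms + gc_ms


def fps(frame_ms):
    """Convert a per-frame time in milliseconds to frames per second."""
    return 1000.0 / frame_ms


if __name__ == "__main__":
    for cpu, gc in [(10.0, 10.0), (18.0, 2.0)]:
        par = fps(frame_time_ms(cpu, gc, parallel=True))
        ser = fps(frame_time_ms(cpu, gc, parallel=False))
        print(f"cpu={cpu} ms gc={gc} ms parallel={par:.1f} fps serial={ser:.1f} fps")
```

In this model a balanced load gains the most from concurrency, while a heavily one-sided load shows almost no difference between the parallel and serial cases.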

Page 18: Measuring Concurrency

GPT Can Introduce Locks here

[Diagram: Application -> GPT interceptor -> API (DirectDraw*/Direct3D IM*) -> Driver -> Graphics Controller -> Display, with a break mark (//) where the lock is introduced]

Page 19: Measuring Concurrency

GPT Can Introduce Locks here that Serialize CPU & Graphics Hardware activity here

[Diagram: Application -> GPT interceptor -> API (DirectDraw*/Direct3D IM*) -> Driver -> Graphics Controller -> Display, with break marks (//) at the two points referred to above]

Page 20: Comparison of Serialize to Normal

[Chart: fps over Time; curves: Unmodified, Serial; the gap between the curves is labeled "Concurrency"]

Page 21: Comparison of Serialize to Normal

[Chart: fps over Time; curves: Unmodified, Serial; annotation: "Lack of Concurrency"]

Page 22: What can be Inferred...

If Serial << Normal
– Good. Wider gap means more concurrency

If Serial == Normal
– Application isn’t benefiting from CPU/GC concurrency
   – App is causing CPU & GC to serialize
   – Extreme load imbalance
      – Either no graphics load, or no CPU load
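
A minimal sketch of this comparison, assuming you have an average frame rate from a normal run and from a run with GPT serialization enabled. The function name, the 5% `tolerance` used to decide that the two rates are "equal", and the verdict strings are assumptions for illustration only.

```python
def classify_serialize_run(fps_normal, fps_serial, tolerance=0.05):
    """Compare a normal run against a serialized run.

    Returns the relative gap and a verdict. tolerance is a hypothetical
    cut-off below which Serial is treated as equal to Normal.
    """
    if fps_normal <= 0 or fps_serial <= 0:
        raise ValueError("frame rates must be positive")

    # Fraction of the normal frame rate lost when CPU and GC are serialized.
    gap = (fps_normal - fps_serial) / fps_normal

    if gap <= tolerance:
        # Either the app already serializes CPU and GC,
        # or the load is extremely one-sided.
        verdict = "no-concurrency-benefit"
    else:
        verdict = "concurrent"
    return gap, verdict


if __name__ == "__main__":
    print(classify_serialize_run(60.0, 40.0))
    print(classify_serialize_run(60.0, 59.0))
```

A wider gap means the game depends more on overlap between the CPU and the graphics controller.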

Page 23: Half-Life* Load Balance/Concurrency Demo

GPT finds frame concurrency problems

Page 24: API Logging

[Screenshot annotations]
Direct3D calling DirectDraw
DirectDraw calling DirectDraw

Page 25: Coverage

Page 26: Duration (Frame Marking)

Page 27: Half-Life* Load Balance/Concurrency Demo

GPT pinpoints performance problems

Page 28: GPT Summary

GPT quickly finds FPS problems in your game
GPT measures Concurrency & Load Balance
GPT pinpoints API level performance problems

Page 29: Intel® Graphics Profiling Capability of VTune™ Performance Analyzer 4.0

What Is It?
What’s It Do?
Show Me How...

Page 30: VTune™ Performance Analyzer 4.0

What Is It?

System monitoring
Software execution examination
Dynamic simulation and analysis

Page 31: Intel® Graphics Profiling Capability

What Is It?

Integrated into VTune™ Performance Analyzer 4.0
3D application profiling

Page 32: The Tool Family Tree

What Is It?

[Diagram: software stack, top to bottom]
Your Game
DirectX*
GFX Driver
Intel Graphics Hardware

[Tools shown alongside the stack]
VTune Analyzer
IPEAK GPT
Intel® Graphics Profiler in VTune™ Analyzer 4.0

Page 33: Architecture

What Is It?

Select and view events
– Intel® Graphics Hardware Driver
– Intel® Graphics Chip

[Diagram: system block diagram. Component labels: CPU, L2 Cache, Chip Set, Sys Mem, PCI Bus, AGP, Intel740™ Driver, Intel® Graphics Accelerator (3D Pipe with Setup and Pix Fill, 2D Engine), Local Vid Memory. Event labels: Frames/Sec, CPU Utilization, State Changes, Tri/Sec and Utilization, Pix/Sec and Utilization]

Page 34: Analyze Intel® Graphics Hardware

What’s It Do?

Maximum fill rate
Clocks app sits idle
3D Clocks can be recovered

[Chart: bars for Pixel/Clock, 3D Busy, Cmd Stream, and Idle Clocks; vertical axis from 0 to 70,000,000]

Page 35: Watch Intel® Graphics D3D*/OpenGL* Drivers

What’s It Do?

Total time in driver
Duty cycle for average triangle
Frames per second
Total time in each driver callback

Page 36: Reports Bottlenecks

What’s It Do?

Triangle packet size
CPU/Intel740™ chip concurrency
Locks to render targets

Page 37: Get Started

Show Me How...

Profile your app with VTune™ Analyzer 4.0
Look for hot-spots
Look at HW/Driver Counter graphs
Find the problem then “drill down” to the CPU time frame

Page 38: Find the Bottleneck

Show Me How...

Serialization vs Concurrency
– The CPU sits idle (waits for HW)
– The graphics HW sits idle (CPU busy)

Why?
– Improperly placed 2D instructions
– Triangle-at-a-time methodology

[Diagram: "Driver Duty Cycle" timeline across "One Frame"]
Processor: GfxHW Drv | Light/Transform/Game Control | GfxHW Drv | Light/Transform…
Gfx HW: Raster Triangles | Raster Triangles...

Page 39: Demo: Guess What the Bottleneck Is?

Show Me How...

Page 40: What to Look For

Show Me How...

An app requires triple buffering...
An app requires MipMapping…
You can gather 3D statistics…

Page 41: Demo: Guess What the Bottleneck Is?

Show Me How...

Page 42: Summary

Intel® Graphics Profiler is a new capability of VTune™ Analyzer 4.0
Intel Graphics Profiler monitors graphics HW and driver performance
What you learn can apply to other graphics hardware
Usage: find the problem, then drill down!

Page 43: Support & Information

IPEAK GPT
– Questions, comments - [email protected]
– IPEAK Web site
   – http://developer.intel.com/design/ipeak

Intel® Graphics Profiling Capability in VTune™ Analyzer 4.0
– http://intel.com/vtune
– http://developer.intel.com/design/graphics/swdev/index.htm

Download the Demo!!!

Page 44: BACKUP

Page 45: Installation

Included in VTune™ 4.0 Analyzer Installation
– Select the Intel740™ Graphics Accelerator counters at the component installation configuration menu

Enabling Graphics Profiling
– Under “Configure”, under “Options” and “Sampling”, select “Chronology Objects”
– Enable the Intel® Graphics Counters (Intel740™ Graphics Accelerator)
– Double click on the Intel740 Chip Counters in the same menu to configure individual counters
– Finally, under “Sampling”, select “Advanced” and enable “Collect Chronology Data”

OA Profiler Capability is Included with VTune 4.0 Analyzer

Page 46: Tuning your App with the OA Tool Set: Accounting for lost clocks

Intel® Graphics Hardware
– Maximum fill rate is 1 Pix/Clock - 66Meg Pixels/Sec
– Clocks between Cmd_Stream_Busy and 66M are clocks the Intel740™ chip sits idle
– 3D Clocks not producing a pixel potentially can be recovered by modifying application code

[Chart: bars for Pixel/Clock, 3D Busy, Cmd Stream, and Idle Clocks; vertical axis from 0 to 70,000,000]

D3D*/OpenGL* Driver
– Total number of CPU clocks used by the driver
– Duty cycle for average triangle sizes listed in the SUG can be used to predict where your game should be running
– Totals from each callback can be observed to narrow down bottlenecks
– Typical bottlenecks: triangle packet size, CPU/Intel740™ chip concurrency, locks to render targets
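
The clock accounting described on this slide can be sketched as follows. The sketch assumes three per-second counter readings (pixels produced, 3D busy clocks, command stream busy clocks) and treats the slide's round figure of 66M clocks per second as exact; the function and key names are hypothetical and are not VTune counter names.

```python
CLOCKS_PER_SEC = 66_000_000  # slide figure: 1 pixel/clock gives 66M pixels/sec


def account_for_clocks(pixels, busy_3d, cmd_stream_busy, clocks=CLOCKS_PER_SEC):
    """Split one second of graphics clocks into the buckets from the slide.

    pixels          : pixels produced in the interval
    busy_3d         : clocks during which the 3D engine was busy
    cmd_stream_busy : clocks during which the command stream was busy
    """
    if not (0 <= pixels <= busy_3d <= cmd_stream_busy <= clocks):
        raise ValueError("expected 0 <= pixels <= busy_3d <= cmd_stream_busy <= clocks")
    return {
        # Clocks between Cmd_Stream_Busy and 66M: the chip sits idle.
        "idle_clocks": clocks - cmd_stream_busy,
        # 3D clocks that did not produce a pixel: potentially recoverable.
        "recoverable_3d_clocks": busy_3d - pixels,
        # Fraction of the maximum fill rate actually achieved.
        "fill_efficiency": pixels / clocks,
    }


if __name__ == "__main__":
    print(account_for_clocks(20_000_000, 35_000_000, 50_000_000))
```

Large idle counts point at the application or driver starving the chip, while a large recoverable count points at 3D work that is not turning into pixels.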

Page 47: What Intel® Graphics Counters Tell About Your App

% 2D or 3D Cmd Stream Busy
– Total amount of time the graphics hardware is in use

% 3D Fill Engine Busy vs % 3D Fill Engine Stall
– If very high, app can be fill rate limited (very large tris)
– Contrast the two to see if the engine is busy but stalled, which indicates it is either waiting for pixel data or waiting for info to finish a pixel calculation

% 3D Pipeline Busy
– If higher than % 3D Fill Engine Busy, indicates too many small triangles (setup limited)

Graphics Counters Correlate GFX Hardware Events
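
One way to turn these rules of thumb into a checklist is sketched below. The slide gives no numeric thresholds, so `high_pct` and `stall_pct` are assumed values, and the function name and returned tags are hypothetical.

```python
def interpret_busy_counters(pipeline_busy_pct, fill_busy_pct, fill_stall_pct,
                            high_pct=80.0, stall_pct=50.0):
    """Apply the slide's rules of thumb to three percentage counters.

    high_pct and stall_pct are illustrative thresholds, not Intel guidance.
    Returns a list of short tags describing what the counters suggest.
    """
    findings = []

    # Very high fill engine busy: possibly fill rate limited (very large tris).
    if fill_busy_pct >= high_pct:
        findings.append("fill-rate-limited")
        # Busy but stalled: waiting for pixel data or for info needed
        # to finish a pixel calculation.
        if fill_stall_pct >= stall_pct:
            findings.append("fill-stalled")

    # Pipeline busier than the fill engine: many small triangles, setup limited.
    if pipeline_busy_pct > fill_busy_pct:
        findings.append("setup-limited")

    return findings


if __name__ == "__main__":
    print(interpret_busy_counters(90.0, 40.0, 5.0))
    print(interpret_busy_counters(85.0, 95.0, 60.0))
```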

Page 48: What Intel® Graphics Counters Tell About Your App

Pixels (Z Tested) and (Z Failed)
– Number of pixels processed by the gfx card
– Z failed / Z tested gives % Z buffer depth

Z Writes to Z Buffer
– Counts the number of 16-bit Z writes

Pixel Reads from Render Buffer
– You can check what % of your scene gets alpha blended when contrasted with Pixel Writes

Color Calculator Stalled by Color Read
– If this is high, alpha blending could be causing a bottleneck for local memory bandwidth

Counters Used in Combination Uncover Added Information
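
The ratios mentioned on this slide are simple to compute once the raw counts are exported. The sketch below assumes plain integer counts; the function names are hypothetical, and the alpha-blend estimate rests on the assumption that every blended pixel costs one render-buffer read and one write.

```python
def percent(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def z_fail_pct(z_failed, z_tested):
    """Share of Z-tested pixels that failed the test (Z failed / Z tested)."""
    return percent(z_failed, z_tested)


def alpha_blended_pct(pixel_reads, pixel_writes):
    """Rough share of written pixels that were read back for blending.

    Assumes one render-buffer read per alpha-blended pixel.
    """
    return percent(pixel_reads, pixel_writes)


def z_write_bytes(z_writes):
    """Bytes written to the Z buffer, given that each Z write is 16 bits."""
    return z_writes * 2


if __name__ == "__main__":
    print(z_fail_pct(250, 1000))
    print(alpha_blended_pct(300, 1200))
    print(z_write_bytes(1000))
```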

Page 49: What Intel® Graphics Counters Tell About Your App

“Triangles Processed” & “Triangles Rendered”
– Triangles per second. A large discrepancy indicates zero pixel triangles

“AGP Texture Data Bytes Read”
– This is AGP bandwidth being used for textures in bytes.

“Texture Cache Busy” & “Texture Cache Fetch Stall”
– All texel data goes through the texture cache so this indicates texture usage.
– Texture Cache Fetch Stall - very high indicates AGP texture bandwidth is overrun - need mipmapping.

What You Learn Can Apply to Other Graphics Hardware
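
As a small illustration of the first two counters, the sketch below computes the fraction of processed triangles that never produced a pixel and converts a texture byte count into a bandwidth figure. The function names are hypothetical, and what counts as a "large" discrepancy is left to the reader because the slide gives no number.

```python
def zero_pixel_fraction(triangles_processed, triangles_rendered):
    """Fraction of processed triangles that were not rendered."""
    if triangles_processed <= 0:
        return 0.0
    if triangles_rendered > triangles_processed:
        raise ValueError("rendered count cannot exceed processed count")
    return (triangles_processed - triangles_rendered) / triangles_processed


def agp_texture_mb_per_sec(bytes_read, seconds):
    """AGP texture traffic in megabytes (10**6 bytes) per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_read / seconds / 1_000_000


if __name__ == "__main__":
    print(zero_pixel_fraction(1000, 750))
    print(agp_texture_mb_per_sec(50_000_000, 2.0))
```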