
GPU Tutorial
이윤진
Computer Game, Fall 2007
Fifth week of November and first week of December, 2007

Contents
Introduction to GPU
High-level shading languages
GPU applications

Introduction to GPU
이윤진
Computer Game, Fall 2007
November 26, 2007

Slide Credits
Marc Olano (UMBC)
◦ SIGGRAPH 2006 Course notes
David Luebke (University of Virginia)
◦ SIGGRAPH 2005, 2007 Course notes
Mark Kilgard (NVIDIA Corporation)
◦ SIGGRAPH 2006 Course notes
Rudolph Balaz and Sam Glassenberg (Microsoft Corporation)
◦ PDC 05
Randy Fernando and Cyril Zeller (NVIDIA Corporation)
◦ I3D 2005

America's Army

GPU
GPU: Graphics Processing Unit
◦ Designed for real-time graphics
◦ Present in almost every PC
◦ Increasing realism and complexity

Growth of GPU (NVIDIA)
Performance metrics
◦ Since 2000, the amount of horsepower applied to processing 3D vertices and fragments has been growing at a staggering rate

Computational Power
GPUs are fast…
◦ 3.0 GHz Intel Core2 Duo (Woodcrest Xeon 5160):
  • Computation: 48 GFLOPS peak
  • Memory bandwidth: 21 GB/s peak
  • Price: $874 (chip)
◦ NVIDIA GeForce 8800 GTX:
  • Computation: 330 GFLOPS observed
  • Memory bandwidth: 55.2 GB/s observed
  • Price: $599 (board)
GPUs are getting faster, faster
◦ CPUs: 1.4× annual growth
◦ GPUs: 1.7× (pixels) to 2.3× (vertices) annual growth
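To put the quoted figures side by side, here is a small Python sketch that computes the GPU-to-CPU ratios and compounds the annual growth rates. The numbers are the ones on the slide; the dictionary and function names are illustrative only. Note that the slide compares a CPU peak figure with a GPU observed figure, so the ratios are rough.

```python
# Figures quoted on the slide; names are illustrative, not from the deck.
cpu = {"gflops": 48.0, "bandwidth_gbs": 21.0, "price_usd": 874.0}
gpu = {"gflops": 330.0, "bandwidth_gbs": 55.2, "price_usd": 599.0}


def ratio(key):
    """GPU figure divided by CPU figure for one metric."""
    return gpu[key] / cpu[key]


def gflops_per_dollar(part):
    """Throughput per dollar for one part."""
    return part["gflops"] / part["price_usd"]


def compound(annual_rate, years):
    """Relative performance after `years` of constant annual growth."""
    return annual_rate ** years


for label, rate in (("CPU", 1.4), ("GPU pixels", 1.7), ("GPU vertices", 2.3)):
    print(label, "after 5 years:", round(compound(rate, 5), 2), "x")
print("compute ratio:", round(ratio("gflops"), 2))
print("bandwidth ratio:", round(ratio("bandwidth_gbs"), 2))
```

Compounding is what makes the gap matter: a modest difference in annual rate turns into an order-of-magnitude difference within a few years.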


Computational Power
Why are GPUs getting faster so fast?
◦ Arithmetic intensity
  • The specialized nature of GPUs makes it easier to use additional transistors for computation
◦ Economics
  • The multi-billion dollar video game market is a pressure cooker that drives innovation to exploit this property

Flexible and Precise
Modern GPUs are deeply programmable
◦ Programmable pixel, vertex, and geometry engines
◦ Solid high-level language support
Modern GPUs support “real” precision
◦ 32-bit floating point throughout the pipeline
  • High enough for many (not all) applications
  • Vendors committed to double precision soon
◦ DX10-class GPUs add 32-bit integers
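One way to see why 32-bit floating point is enough for many but not all applications: a 32-bit float has a 24-bit significand, so integers above 2^24 are no longer all representable. The sketch below uses Python's standard `struct` module to round a 64-bit Python float to 32 bits; the helper name `to_float32` is illustrative.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 2**24 is representable in 32 bits, 2**24 + 1 is not.
exact = to_float32(16777216.0)
rounded = to_float32(16777217.0)

# Decimal fractions such as 0.1 also pick up a larger error in 32 bits.
error_32 = abs(to_float32(0.1) - 0.1)

print(exact, rounded, error_32)
```

Long accumulations and large index arithmetic are the kinds of workloads where this limit shows up, which is why double precision and 32-bit integers matter for non-graphics uses.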

GPU Fundamentals: Graphics Pipeline
A simplified graphics pipeline
◦ Note that pipe widths vary
◦ Many caches, FIFOs, and so on not shown

[Diagram: CPU (Application) feeds the GPU stages Transform & Light → Assemble Primitives → Rasterize → Shade → Video Memory (Textures). Data passed between stages: Vertices (3D); Xformed, Lit Vertices (2D); Screenspace triangles (2D); Fragments (pre-pixels); Final Pixels (Color, Depth). Also labeled: Graphics State, Render-to-texture.]


GPU Fundamentals: Modern Graphics Pipeline
Programmable vertex processor!
Programmable pixel processor!
[Diagram: the same pipeline with a Vertex Processor and a Fragment Processor as stages. Stage labels: Application (CPU), Vertex Processor, Rasterize, Fragment Processor, Video Memory (Textures). Data labels: Vertices (3D); Xformed, Lit Vertices (2D); Screenspace triangles (2D); Fragments (pre-pixels); Final Pixels (Color, Depth). Also labeled: Graphics State, Render-to-texture.]

GPU Fundamentals: Modern Graphics Pipeline
[Diagram labels: Assemble Primitives, Geometry Processor]
Programmable primitive assembly!
More flexible memory access!

GPU Pipeline: Transform
Vertex processor (multiple in parallel)
◦ Transform from “world space” to “image space”
◦ Compute per-vertex lighting
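The two jobs of the transform stage can be sketched on the CPU in a few lines of Python. This is an illustration under stated assumptions, not shader code: the matrix layout (row-major, OpenGL-style perspective), the Lambert lighting term, and all function names are choices made for this example.

```python
import math


def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style perspective projection matrix (row-major)."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform_vertex(position, model_view, projection):
    """World space -> eye space -> clip space -> normalized device coords."""
    eye = mat_vec(model_view, list(position) + [1.0])
    clip = mat_vec(projection, eye)
    return [clip[i] / clip[3] for i in range(3)]


def diffuse(normal, light_dir):
    """Per-vertex Lambert term: clamp(N . L, 0, 1) for unit vectors."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
```

Each vertex is processed independently with the same matrices, which is what lets the hardware run many vertex processors in parallel.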

GPU Pipeline: Assemble Primitives
Geometry processor
◦ How the vertices connect to form a primitive
◦ Per-primitive operations
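As a rough illustration of both bullets, the Python sketch below groups an index list into triangles and then applies one per-primitive operation. Back-face culling by signed area is used here only as a familiar example of a per-primitive operation; the slide does not name a specific one, and the function names are hypothetical.

```python
def assemble_triangles(vertices, indices):
    """Group an index list into triangles, three indices at a time."""
    if len(indices) % 3 != 0:
        raise ValueError("a triangle list needs a multiple of 3 indices")
    return [tuple(vertices[i] for i in indices[k:k + 3])
            for k in range(0, len(indices), 3)]


def signed_area(tri):
    """Signed area of a 2D triangle; positive means counter-clockwise."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def cull_back_faces(triangles):
    """Per-primitive operation: keep counter-clockwise triangles only."""
    return [t for t in triangles if signed_area(t) > 0.0]


quad = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = assemble_triangles(quad, [0, 1, 2, 0, 3, 2])
front = cull_back_faces(triangles)
```

The second triangle in this index list is wound clockwise, so it is assembled but then removed by the per-primitive test.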

GPU Pipeline: Rasterize
Rasterizer
◦ Convert geometric representation (vertex) to image representation (fragment)
  • Pixel + associated data: color, depth, stencil, etc.
◦ Interpolate per-vertex quantities across pixels
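A minimal software rasterizer makes both bullets concrete: it decides which pixel centers a triangle covers and interpolates a per-vertex value across them with barycentric weights. This is a sketch of one common approach (edge functions); real hardware differs in many details, and all names are illustrative.

```python
def edge(a, b, p):
    """Edge function: positive when p lies to the left of the edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(v0, v1, v2, c0, c1, c2, width, height):
    """Return {(x, y): value} for pixel centers covered by a CCW triangle.

    c0..c2 are per-vertex scalars (for example one color channel) that
    are interpolated with barycentric weights.
    """
    area = edge(v0, v1, v2)
    fragments = {}
    if area <= 0:
        return fragments  # degenerate or clockwise triangle
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            w0 = edge(v1, v2, p)
            w1 = edge(v2, v0, p)
            w2 = edge(v0, v1, p)
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                fragments[(x, y)] = (w0 * c0 + w1 * c1 + w2 * c2) / area
    return fragments


frags = rasterize((0, 0), (4, 0), (0, 4), 1.0, 0.0, 0.0, 4, 4)
```

Each entry in `frags` plays the role of a fragment: a pixel address plus interpolated data that the shading stage will consume.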

GPU Pipeline: Shade
Fragment processors (multiple in parallel)
◦ Compute a color for each pixel
◦ Optionally read colors from textures (images)
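The shading step can be sketched as a function from a fragment's interpolated inputs to a color, with an optional texture read. The example below assumes nearest-texel sampling and simple per-channel modulation; both are choices made for illustration, and the function names are hypothetical.

```python
def sample_nearest(texture, u, v):
    """Nearest-texel lookup; texture is a list of rows, (u, v) in [0, 1]."""
    height, width = len(texture), len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


def shade_fragment(base_color, uv, texture):
    """Modulate the interpolated color by the texel, channel by channel."""
    texel = sample_nearest(texture, uv[0], uv[1])
    return tuple(b * t for b, t in zip(base_color, texel))


white, black = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)
checker = [[white, black],
           [black, white]]
lit = shade_fragment((0.8, 0.4, 0.4), (0.25, 0.25), checker)
dark = shade_fragment((0.8, 0.4, 0.4), (0.75, 0.25), checker)
```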

GPU Parallelism
GeForce 7900 GTX

GPU Programming
Simplified computational model
◦ Consistent as hardware changes
All stages SIMD
Fixed conversion / remapping between each stage
[Diagram labels: Buffer, Vertex (stream), Geometry (stream), Fragment (array)]

Example
Vertex shader

void main() {
    gl_FrontColor = gl_Color;
    gl_Position = gl_ProjectionMatrix * gl_ModelViewMatrix * gl_Vertex;
}

Pixel shader

void main() {
    gl_FragColor = gl_Color;
}


Vertex Shader
One element in / one out
No communication
Can select fragment address
Input:
◦ Vertex data (position, normal, color, …)
◦ Shader constants, texture data
Output:
◦ Required: Transformed clip-space position
◦ Optional: Colors, texture coordinates, normals (data you want passed on to the pixel shader)
Restrictions:
◦ Can’t create new vertices

Pixel Shader
Biggest computational resource
One element in / 0 – 1 out
Cannot change destination address
No communication
Input:
◦ Interpolated data from vertex shader
◦ Shader constants, texture data
Output:
◦ Required: Pixel color (with alpha)
◦ Optional: Can write additional colors to multiple render targets
Restrictions:
◦ Can’t read and write the same texture simultaneously
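The "one element in / 0 – 1 out" and "cannot change destination address" rules can be modeled with a small CPU-side driver in Python. The driver owns the pixel address; the shader sees only the fragment's inputs and either returns a color or discards. The alpha-test shader and all names here are hypothetical examples, not part of the original slides.

```python
def run_pixel_shader(shader, fragments, framebuffer):
    """Apply `shader` to each fragment independently.

    Each fragment is ((x, y), inputs). The shader returns a color or
    None (discard). The destination (x, y) is fixed by the rasterizer;
    the shader never sees it and cannot change it.
    """
    for (x, y), inputs in fragments:
        color = shader(inputs)
        if color is not None:
            framebuffer[(x, y)] = color
    return framebuffer


def alpha_test_shader(inputs):
    """Hypothetical shader: discard nearly transparent fragments."""
    r, g, b, a = inputs["color"]
    return None if a < 0.5 else (r, g, b, a)


fragments = [
    ((0, 0), {"color": (1.0, 0.0, 0.0, 1.0)}),
    ((1, 0), {"color": (0.0, 1.0, 0.0, 0.1)}),
]
framebuffer = run_pixel_shader(alpha_test_shader, fragments, {})
```

Because no invocation can read another's output or pick its own address, the fragments can be processed in any order or all at once.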

Example
Vertex shader

void main() {
    vec4 v = vec4(gl_Vertex);
    v.z = 0.0;  // flatten the model onto the z = 0 plane
    gl_Position = gl_ProjectionMatrix * gl_ModelViewMatrix * v;
}

Pixel shader

void main() {
    gl_FragColor = vec4(0.8, 0.4, 0.4, 1.0);
}

http://www.lighthouse3d.com/opengl/glsl/

Geometry Shader
One element in / 0 to ~100 out
◦ Limited by hardware buffer sizes
Like vertex:
◦ No communication
◦ Can select fragment address
Input:
◦ Entire primitive (point, line, or triangle)
◦ Optional: Adjacency
Output:
◦ Zero or more primitives (a homogeneous list of points/lines or triangles)
Restrictions:
◦ Allow parallel processing but preserve serial order
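The input/output contract above can be imitated on the CPU: call a function once per primitive, let it emit zero or more primitives of a single type, and concatenate results in input order. In this Python sketch the output cap of 100 simply mirrors the "~100" on the slide, and the line-widening shader is a hypothetical example.

```python
def run_geometry_shader(shader, primitives, max_out=100):
    """Call `shader` once per input primitive and concatenate the
    outputs in input order (parallel processing, serial order)."""
    out = []
    for prim in primitives:
        emitted = list(shader(prim))
        if len(emitted) > max_out:
            raise ValueError("shader exceeded the output limit")
        out.extend(emitted)
    return out


def widen_line(line, half_width=0.5):
    """Hypothetical shader: turn a 2D line segment into two triangles."""
    (x0, y0), (x1, y1) = line
    dx, dy = x1 - x0, y1 - y0
    length = (dx * dx + dy * dy) ** 0.5
    if length == 0.0:
        return []  # degenerate input: emit zero primitives
    nx, ny = -dy / length * half_width, dx / length * half_width
    a, b = (x0 + nx, y0 + ny), (x0 - nx, y0 - ny)
    c, d = (x1 + nx, y1 + ny), (x1 - nx, y1 - ny)
    return [(a, b, c), (c, b, d)]


lines = [((0.0, 0.0), (2.0, 0.0)), ((1.0, 1.0), (1.0, 1.0))]
wide = run_geometry_shader(widen_line, lines)
```

The first line becomes two triangles and the degenerate second line produces nothing, which covers both ends of the "0 to N out" range.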

Geometry Shader
Applications
◦ Fur/fins, procedural geometry/detailing
◦ Data visualization techniques
◦ Wide lines and strokes, …

Multiple Passes
Communication
◦ None in one pass
◦ Arbitrary read addresses between passes
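The usual way to exploit this rule is to "ping-pong" between buffers: each pass reads freely from the previous pass's output and writes a fresh buffer. The Python sketch below uses a three-tap blur as the per-pass kernel; the kernel and the function names are illustrative assumptions.

```python
def blur_pass(src):
    """One pass: every output element may read any address of the
    previous pass's buffer (here its two neighbours), but never the
    buffer currently being written."""
    n = len(src)
    return [(src[max(i - 1, 0)] + src[i] + src[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]


def run_passes(data, passes):
    """Ping-pong: the output of pass k is the read-only input of pass k+1."""
    read = list(data)
    for _ in range(passes):
        write = blur_pass(read)
        read = write
    return read


once = run_passes([0.0, 0.0, 3.0, 0.0, 0.0], 1)
twice = run_passes([0.0, 0.0, 3.0, 0.0, 0.0], 2)
```

Information spreads by one neighbour per pass, so values that need to travel far across the buffer need several passes.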


Example
Image Space Silhouette Extraction Using Graphics Hardware [Wang 2005]
[Figure panels: Depth buffer, Normal buffer, Silhouettes, Creases, Final result]
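The general idea of finding silhouettes in image space can be sketched as a second pass over the depth buffer that marks pixels where depth jumps sharply. The Python below is a generic depth-discontinuity test with a hand-picked threshold; it is not the method of the cited paper, whose details are not given in these slides.

```python
def depth_edges(depth, threshold):
    """Mark pixels whose depth differs from any 4-neighbour by more
    than `threshold`. `depth` is a list of rows; returns rows of 0/1."""
    h, w = len(depth), len(depth[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and 0 <= ny < h:
                    if abs(depth[y][x] - depth[ny][nx]) > threshold:
                        edges[y][x] = 1
    return edges


# Background at depth 1.0, an object at depth 0.2 on the right.
row_edges = depth_edges([[1.0, 1.0, 0.2, 0.2]], 0.5)
```

The same neighbourhood test applied to a normal buffer would pick up creases, where orientation changes abruptly even though depth does not.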

GPU Applications
Bump/Displacement mapping
[Figure panels: Height map, Diffuse light without bump, Diffuse light with bump]
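How a height map changes diffuse lighting can be sketched in Python: derive a perturbed normal from the height map's slope, then evaluate the usual N · L term with it. The central-difference normal, the `scale` parameter, and the function names are assumptions made for this illustration.

```python
import math


def bump_normal(height, x, y, scale=1.0):
    """Unit normal from central differences of a height map (list of
    rows), clamping sample positions at the borders."""
    h, w = len(height), len(height[0])
    dhdx = (height[y][min(x + 1, w - 1)] - height[y][max(x - 1, 0)]) * 0.5
    dhdy = (height[min(y + 1, h - 1)][x] - height[max(y - 1, 0)][x]) * 0.5
    n = (-scale * dhdx, -scale * dhdy, 1.0)
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)


def diffuse(normal, light_dir):
    """Lambert term: clamp(N . L, 0, 1) for unit vectors."""
    return max(0.0, sum(a * b for a, b in zip(normal, light_dir)))


flat = [[0.0, 0.0, 0.0]] * 3
ramp = [[0.0, 1.0, 2.0]] * 3
light = (0.0, 0.0, 1.0)
without_bump = diffuse(bump_normal(flat, 1, 1), light)
with_bump = diffuse(bump_normal(ramp, 1, 1), light)
```

The geometry stays flat in both cases; only the normal used for lighting changes, which is what distinguishes bump mapping from displacement mapping.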

GPU Applications
Volume texture mapping

GPU Applications
Cloth simulation

GPU Applications
Real-time rendering
Image processing
General purpose GPU (GPGPU)
…

Contents
Introduction to GPU
High-level shading languages
GPU applications

GPU Applications
Soft Shadows
Percentage-closer soft shadows [Fernando 2005]
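Two ingredients of percentage-closer soft shadows can be sketched in Python: a similar-triangles estimate of penumbra width, and percentage-closer filtering over a window of the shadow map. This is a simplified CPU illustration; the blocker search step is omitted, and the function names, parameters, and window shape are assumptions, not the reference implementation.

```python
def penumbra_width(d_receiver, d_blocker, light_size):
    """Similar-triangles estimate: the penumbra widens as the receiver
    moves farther from the blocker."""
    return (d_receiver - d_blocker) * light_size / d_blocker


def pcf(shadow_map, x, y, d_receiver, radius):
    """Percentage-closer filtering: fraction of shadow-map texels in a
    (2 * radius + 1)^2 window that do not occlude the receiver.
    Returns 1.0 for fully lit and 0.0 for fully shadowed."""
    h, w = len(shadow_map), len(shadow_map[0])
    lit = total = 0
    for j in range(y - radius, y + radius + 1):
        for i in range(x - radius, x + radius + 1):
            if 0 <= i < w and 0 <= j < h:
                total += 1
                if shadow_map[j][i] >= d_receiver:
                    lit += 1
    return lit / total


shadow_map = [[1.0, 1.0, 5.0, 5.0]]
soft = pcf(shadow_map, 1, 0, 3.0, 1)
```

In a full implementation the filter radius would be driven by the penumbra estimate, so shadows are sharp near contact and blur with distance.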