Advanced Graphics 2012-2013
Advanced Graphics
Introduction
Teacher
Marries van de Hoef
Under supervision of Arjan Egges and Frank van der Stappen
Teacher
Graduating Master student
– Graphics research since Bachelor
– Three graphics research projects in Master
Dynamic Lighting Architectures
Reprojection 3D & Head Tracking
Hybrid Deferred Rendering
Fast Dynamic Radiosity
Teacher
Graphics research internship at Nixxes
– Worked on a major AAA-title
Assisted the Graphics course twice
– Advised on practical assignment setup
– Gave multiple lectures
Much more I won’t bore you with
COURSE OVERVIEW
Focus
The Computer Graphics research area is very wide
– Computer animation
– Geometry
– Simulation
– Rendering
– Much more…
This course focuses purely on Rendering
Focus
The rendering research area can be divided into two extremes:
– Optimization for real-time performance
• Real-time global illumination
– Simulation of visual effects
• Realistic hair rendering
Focus
This course leans towards the real-time side, but also covers the principles of simulation
– Relevance to games (and game-tools)
Context of GPGPU programming
– Will become increasingly important
– Future of the GPU (and CPU?)
– Also useful for other areas
GPGPU = General-Purpose computing on the GPU
Course Format
Meetings:
– 8 lectures
– Guest lecture
– 8 paper presentation sessions
Attendance is mandatory
Two practical assignments
– Forum for questions
Grading
Grading rules:
40% - Paper presentation
10%+40% - Practical assignments
10% - Quality of attendance
Need at least 5.0 for presentation and practical assignments
COURSE DETAILS
Paper presentation
Form teams of two
– Post your team composition on the forum
– Deadline: Saturday February 9th
– Required to continue with the course
– Can’t find a partner? Notify me
Paper presentation
Paper and presentation slot are assigned
– Fully randomized
The paper subject builds on a lecture subject
Requirements and details on the website
– Read it before you start preparing
Website
Download papers from the course website
Secured with password
– Username: student
– Password: the course code of this course
– (both are lower case)
Practical Assignments
Have to be done individually
Focus on GPGPU-style programming
Additional features
– Large part of your grade
– Free to choose
– Creativity is encouraged
– Specialize in your own interest
Practical Assignments
Requirements
– C++
– Direct3D 11
– Visual Studio 2012
Works on DirectX 10 hardware and Windows 7
Work in BBL-175 or GameHall
Practical Assignments
Forum
– Ask questions
– Answer questions
– Saves a lot of time
– And headaches
FORUM!!!!
GPU Mandelbrot
GPU Mandelbrot
Just a warm-up
Create it from scratch
– Learn what’s really possible
A lot of work
– The deadline is soon
Read the assignment document
GPU Path Tracer
GPU Path Tracer
Practice advanced shader programming & learn the practical side of path tracing
Relevant for games
– GPGPU-style programming
– Rays are commonly used in other techniques
– Several smaller parts directly usable in games
Theoretically more complex than Mandelbrot
Forum
http://www.uu.nl/blackboard
Click on the Advanced Graphics course
– Post your paper presentation team
• In the Paper Presentation forum
– Ask/answer questions about the assignments…
– …or lectures
– Course announcements
Points
Gamification
Several ways to earn points during the course
Small competition amongst you and your friends
– Also converted to a small part of your grade
Points
Ways to earn points
– Questions about the previous lecture
• Each meeting will start with 10 random students each receiving 1 question. You have to study!
– Find errors in the slides
• Real mistakes, not disputable things or typos
– Help other students on the forum
• Top two helpers get points
Points
Conversion to points
– Quality of Attendance term (10%)
– Maximum QoA grade is 15 (out of 10)
• 5 bonus points compensate for any bad luck
– Maximum final grade is 10.5
Points
Calculation of QoA grade
Lecture questions always give 1 opportunity
– And only a point when you are correct (and present)
Other points earned increment both variables equally
uint point, opportunity;
float qoaGrade = point / (float)opportunity * 15;
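The QoA rules can be worked through with a small C++ sketch. The `Tally` struct and the helpers `lectureQuestion` and `bonusPoint` are hypothetical names introduced here for illustration; only the formula itself comes from the slide.

```cpp
#include <cassert>

// Quality-of-Attendance grade as given on the slide:
// points divided by opportunities, scaled to a maximum of 15.
float qoaGrade(unsigned int points, unsigned int opportunities)
{
    assert(opportunities > 0);  // assumption: at least one opportunity so far
    return points / static_cast<float>(opportunities) * 15.0f;
}

// Hypothetical bookkeeping for one student.
struct Tally {
    unsigned int points = 0;
    unsigned int opportunities = 0;
};

// A lecture question always counts as an opportunity,
// but only yields a point when answered correctly.
void lectureQuestion(Tally& t, bool correct)
{
    t.opportunities += 1;
    if (correct) t.points += 1;
}

// Other points (slide errors, forum help) increment both counters equally.
void bonusPoint(Tally& t)
{
    t.points += 1;
    t.opportunities += 1;
}
```

For example, one correct and one wrong lecture answer plus two other points gives 3 points out of 4 opportunities, so a QoA grade of 11.25. With the 10% weight, a perfect QoA grade of 15 contributes 1.5 instead of 1.0 to the final grade, which matches the stated maximum final grade of 10.5.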
Points
10 questions at the start of each meeting
– Be on time!
– No peeking and no telling
– Student selection is fully random
• You could be selected twice in one meeting
Procedure
– A student is selected (you)
– Stand up
– The slide with the question is shown
– Think for a few seconds and answer
– The correct answer is revealed
Points
Practice 3 questions
– Not for real
– Only easy questions
– Real questions will focus on facts (and insight)
– Hard questions (roughly equal difficulty)
• You must study!
– Thursday will be for real
• About the hardware part of this lecture
Practice Questions
Question 1
What is the first name of the teacher?
Question 2
Which one is not a graphics (related) API?
a) Direct3D
b) OpenGL
c) Havok
d) CUDA
e) Direct2D
Question 3
Given two perpendicular 3D vectors, how do you construct an orthogonal basis?
Questions
Advanced Graphics
Hardware
Parallel Data Processing
We need to process a lot of data for graphics
– Over 200 GB per second
– Using over 3 TFLOPS
Requires a different programming architecture
Get a feeling for how this works
– Important for writing fast graphics code
Parallel Data Processing
Requirements for massive data processing
– Massive processing power
– Massive bandwidth to/from memory
• Deal with memory latency
We can make do with simple processing units
– Limited instruction set
– Share instruction stream
Parallel Data Processing
Use architecture based on SIMD
– Single Instruction, Multiple Data
Good when working with Vector3 (add, multiply, etc.)
– Same instruction executed for all 3 components anyway
Can we go “wider”?
– What if we can execute an instruction for 8 components simultaneously?
Parallel Data Processing
GPUs don’t parallelize on Vector3 level
– But instead on pixel/vertex level
Execute the same shader instructions for all threads (pixels, vertices, etc.)
– Also called SIMT
Keeps functional units slim
– And parallelization manageable
Parallel Hardware
[Figure: 1 core, consisting of an instruction feeding unit, a 4×4 grid of functional units (F), and local memory]
Multiple functional units form a core
– up to 64
Multiple cores on a GPU
– up to 32
Conceptual number of threads executed simultaneously (warp)
– 16 (Intel)
– 32 (Nvidia – warp)
– 64 (AMD – wavefront)
(Disclaimer: many exceptions)
Not necessarily 1 thread per functional unit
Some GPUs execute multiple instructions simultaneously
Parallel Hardware
Consequences of thread-level SIMD
– Vector4 operations cost more than Vector3
– What about branching code?
Branches conflict with thread-level SIMD
– No good solution
– Execute both (diverging) branches for all threads
• Ignoring the results of the wrong branch
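A CPU-side C++ sketch can make the "execute both branches" behaviour concrete. It emulates a warp of 8 lanes (an assumed size) evaluating `y = (x < 0) ? -x : 2 * x`. The names `Lanes` and `divergentBranch` are made up for this illustration, and the loops only stand in for instructions that real hardware would issue to all lanes at once.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kWarpSize = 8;         // assumed warp size, for illustration only
using Lanes = std::array<float, kWarpSize>;  // one value per thread in the warp

// Emulates how a lockstep warp could evaluate
//     y = (x < 0) ? -x : 2 * x;
// Each loop below stands for ONE instruction issued to all lanes.
Lanes divergentBranch(const Lanes& x)
{
    std::array<bool, kWarpSize> takesThen{};
    Lanes thenResult{}, elseResult{}, y{};

    // Evaluate the condition for every lane.
    for (std::size_t lane = 0; lane < kWarpSize; ++lane)
        takesThen[lane] = x[lane] < 0.0f;

    // The "then" side runs on every lane, wanted or not.
    for (std::size_t lane = 0; lane < kWarpSize; ++lane)
        thenResult[lane] = -x[lane];

    // The "else" side also runs on every lane.
    for (std::size_t lane = 0; lane < kWarpSize; ++lane)
        elseResult[lane] = 2.0f * x[lane];

    // Keep the result of the branch each lane actually took; discard the other.
    for (std::size_t lane = 0; lane < kWarpSize; ++lane)
        y[lane] = takesThen[lane] ? thenResult[lane] : elseResult[lane];

    return y;
}
```

The cost is the sum of both sides of the branch. As the slide's "(diverging)" suggests, this penalty applies when threads within one warp disagree on the condition.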
Threading SIMD Example
How can we program using this architecture?
Calculate the sum of 16 numbers in a list
– Assume a warp size of 8
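The transcript does not show the intended solution. One common approach is a tree reduction, sketched below in plain C++ as a CPU emulation. The function name `warpSum16` is hypothetical, and the inner loop stands in for the 8 lanes executing the same instruction in lockstep.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kWarpSize = 8;  // warp size assumed on the slide

// Sums 16 numbers using 8 "threads" and a tree reduction.
// Each iteration of the outer loop is one lockstep step of the warp.
float warpSum16(std::array<float, 16> data)
{
    for (std::size_t stride = kWarpSize; stride >= 1; stride /= 2) {
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
            // Lanes at or beyond the stride are masked off in this step.
            if (lane < stride)
                data[lane] += data[lane + stride];
        }
    }
    return data[0];  // lane 0 ends up holding the total
}
```

This needs 4 lockstep steps (strides 8, 4, 2, 1) instead of 15 sequential additions. Within a step no lane reads an element that another lane writes, so the lanes really are independent.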
Memory Latency
Waiting for memory takes a very long time
– It takes 400-800 processor cycles!
– Optimized for throughput at the cost of latency
Don’t wait, do something else
– Work on another warp
– …repeat if that warp needs memory access as well
Extended form of CPU hyper-threading
Memory Latency
[Figure: execution timeline of Warps 1 to 4, each followed by a stall]
Memory Latency
Warp contexts are stored in local memory
– Contains state for all functional units
– Context size depends on the shader
If the shader requires many temporary registers
– Fewer warps, thus less latency hiding
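The trade-off between register use and the number of resident warps can be illustrated with a back-of-the-envelope C++ sketch. The model and all numbers are assumptions made for this example (including the 16-byte register size); they do not describe any specific GPU.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: local memory holds one context per resident warp,
// and a context stores the temporary registers of every thread in the warp.
std::size_t residentWarps(std::size_t localMemoryBytes,
                          std::size_t registersPerThread,
                          std::size_t warpSize)
{
    assert(registersPerThread > 0 && warpSize > 0);
    const std::size_t bytesPerRegister = 16;  // assumption: one float4 per register
    const std::size_t contextBytes = registersPerThread * bytesPerRegister * warpSize;
    return localMemoryBytes / contextBytes;   // integer division: whole contexts only
}
```

Under these assumptions, 64 KB of local memory and a warp size of 32 fit 32 warp contexts for a shader using 4 registers per thread, but only 8 for a shader using 16, leaving fewer warps to switch to when one stalls.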
[Figure: 1 core as before (instruction feeding, 4×4 functional units F, local memory), now with six warp contexts (Ctx) stored in local memory]
Bandwidth Bottleneck
Latency hiding is great! No more stalling, right?
– Still limited by the available bandwidth
Very common bottleneck
– Bandwidth is more precious than computing power!
– If you can avoid reading a texture by doing more calculations: do it!
CPU SIMD
CPU also has SIMD capabilities
– SSE instructions
– Not threaded SIMD like the GPU
Can operate on 4 floats simultaneously
– Vector4 + Vector4 is 1 operation
– Always on 4 floats, even if you use 3 floats
Simple SIMD Example
Calculate the sum of 16 numbers in a list
– Store the numbers in 4 SSE vectors
– Add the vectors
– Use special instructions to do internal addition
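The three steps can be sketched with SSE intrinsics in C++. Note one substitution: the "special instructions" on the slide probably refer to a horizontal add such as SSE3's `_mm_hadd_ps`, whereas this sketch uses SSE1 shuffles so that it builds without extra compiler flags. The function name `sum16` and the macro `SUM16_USE_SSE` are made up here, and a scalar fallback is included for non-x86 targets.

```cpp
#include <cassert>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SUM16_USE_SSE 1
#endif

// Sums 16 floats. 'values' must point to at least 16 floats;
// unaligned loads are used, so no alignment is assumed.
float sum16(const float* values)
{
    assert(values != nullptr);
#ifdef SUM16_USE_SSE
    // Step 1: store the numbers in 4 SSE vectors.
    const __m128 a = _mm_loadu_ps(values);
    const __m128 b = _mm_loadu_ps(values + 4);
    const __m128 c = _mm_loadu_ps(values + 8);
    const __m128 d = _mm_loadu_ps(values + 12);

    // Step 2: add the vectors, leaving 4 partial sums (s0, s1, s2, s3).
    const __m128 s = _mm_add_ps(_mm_add_ps(a, b), _mm_add_ps(c, d));

    // Step 3: "internal" (horizontal) addition of the 4 partial sums.
    const __m128 high = _mm_movehl_ps(s, s);       // (s2, s3, s2, s3)
    const __m128 pairs = _mm_add_ps(s, high);      // (s0+s2, s1+s3, ...)
    const __m128 second = _mm_shuffle_ps(pairs, pairs, _MM_SHUFFLE(1, 1, 1, 1));
    const __m128 total = _mm_add_ss(pairs, second); // lane 0 = s0+s2+s1+s3
    return _mm_cvtss_f32(total);
#else
    // Scalar fallback when SSE is unavailable.
    float total = 0.0f;
    for (int i = 0; i < 16; ++i) total += values[i];
    return total;
#endif
}
```

The vector additions in step 2 do the bulk of the work; the horizontal step at the end is the awkward part, since SIMD is designed for operations across vectors rather than within one.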
[Figure: four SSE vectors added together into one]
CPU SIMD
SSE caveats
– Expensive to move data in/out of SSE registers
– Storage requires memory alignment
New AVX instructions double SIMD width
– Operations on 8 floats simultaneously!
– How can we use this for graphics?
INTERACTING WITH THE GPU
Graphics API
Use Graphics API to access GPU
– Well defined programming interface
– Agreement between software developer and hardware vendor
Main Graphics APIs
– Direct3D
– OpenGL
Direct3D vs. OpenGL
Religious war
OpenGL used to be “behind” Direct3D, however…
– Rise of mobile devices and Mac
– Direct3D hasn’t had real new features since 2009
Direct3D
We will use Direct3D
– Logical choice after using XNA
– Easier for hardware requirements
– Object oriented structure
Direct3D
Important Direct3D11 objects
– DXGI
– ID3D11Device
– ID3D11DeviceContext
Matrices
– Can use both Left/Right handed coordinate systems
– HLSL (Shader) default is column-major matrices
– DirectXMath/XNA is always row-major matrices
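The practical consequence of the row-major versus column-major mismatch can be shown with plain C++ arrays, without using DirectXMath itself. The names `Mat4Storage`, `readRowMajor`, `readColumnMajor`, and `transposed` are hypothetical helpers for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Mat4Storage = std::array<float, 16>;  // 16 floats as they sit in memory

// Element (row, col) when the 16 floats are interpreted as row-major.
float readRowMajor(const Mat4Storage& m, std::size_t row, std::size_t col)
{
    assert(row < 4 && col < 4);
    return m[row * 4 + col];
}

// Element (row, col) when the SAME floats are interpreted as column-major.
float readColumnMajor(const Mat4Storage& m, std::size_t row, std::size_t col)
{
    assert(row < 4 && col < 4);
    return m[col * 4 + row];
}

// Reorders the storage so that a column-major reader sees the matrix
// that a row-major writer intended.
Mat4Storage transposed(const Mat4Storage& m)
{
    Mat4Storage t{};
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            t[col * 4 + row] = m[row * 4 + col];
    return t;
}
```

The same 16 floats read under the other convention yield the transposed matrix. This is why CPU-side matrices are commonly transposed before being copied into a constant buffer, or why the HLSL side is declared `row_major` instead.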
Below Direct3D
[Figure: software stack from the Game through D3D11, the User Mode Driver, DXGI, and the Kernel Mode Driver, to the GPU over PCI Express]
GPU Architecture
Graphics API requires order preservation
– But the GPU is massively parallel!
[Figure: the workload passes through parallel stages (Vertex Shader, Rasterizer, Pixel Shader, …), with a reorder buffer after each stage]
GRAPHICS PIPELINE
Pipeline overview – XNA/D3D9
– Input Assembler
– Vertex Shader
– Primitive Assembler
– Rasterizer
– Pixel Shader
– Output Merger
[Figure: pipeline diagram with the stages IA, VS, PA, RS, PS, and OM, connected to memory]
Pipeline overview – D3D10
– Input Assembler
– Vertex Shader
– Primitive Assembler
– Geometry Shader
– Stream Output
– Rasterizer
– Pixel Shader
– Output Merger
[Figure: pipeline diagram with the stages IA, VS, PA, GS, SO, RS, PS, and OM, connected to memory]
Pipeline overview – D3D11
– Input Assembler
– Vertex Shader
– Primitive Assembler
– Hull Shader
– Tessellator
– Domain Shader
– Geometry Shader
– Stream Output
– Rasterizer
– Pixel Shader
– Output Merger
– Compute Shader
[Figure: pipeline diagram with the stages IA, VS, PA, HS, TS, DS, GS, SO, RS, PS, OM, and CS (PA appears twice), connected to memory]
Rasterizer
Hierarchical and parallel rasterization
– Test for each rectangle/pixel if inside triangle
Interwoven with hierarchical early-z testing
– Test depth buffer before executing pixel shader
– Be careful when changing depth/depth settings
Rasterizer
Projection is not a linear transformation
– Divide by w component
– After interpolation over triangle
– Perspective division is always done!
Perspective correct interpolation
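A minimal C++ sketch of perspective-correct interpolation along one edge, assuming two vertices with attribute values `a0`, `a1` and clip-space w values `w0`, `w1`, and a screen-space interpolation parameter `t`. The function names are made up for this example; the idea is to interpolate `a/w` and `1/w` linearly in screen space and divide afterwards.

```cpp
#include <cassert>

// Naive screen-space interpolation: wrong under perspective unless w0 == w1.
float screenLinear(float a0, float a1, float t)
{
    return (1.0f - t) * a0 + t * a1;
}

// Perspective-correct interpolation: a/w and 1/w are linear in screen space,
// so both are interpolated first and the division by (1/w) comes last.
float perspectiveCorrect(float a0, float w0, float a1, float w1, float t)
{
    assert(w0 > 0.0f && w1 > 0.0f);  // assumption: both vertices in front of the camera
    const float oneOverW = (1.0f - t) / w0 + t / w1;
    const float aOverW   = (1.0f - t) * a0 / w0 + t * a1 / w1;
    return aOverW / oneOverW;
}
```

For instance, with `a0 = 0`, `w0 = 1`, `a1 = 1`, `w1 = 3`, the screen-space midpoint `t = 0.5` gets the attribute value 0.25 rather than 0.5, because the far vertex covers less of the screen per unit of distance.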
Rasterizer
Vertex shader outputs “homogeneous clip space”
After perspective division: Normalized Device Coordinates
Viewport transform brings to pixel coordinates
– Applies 0.5 pixel offset (DX10+)
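The chain from clip space to pixel coordinates can be sketched in C++ as follows. The struct and function names are hypothetical; the `Viewport` fields mirror those of `D3D11_VIEWPORT`, and the formulas follow the Direct3D convention as I understand it (y pointing down in pixel space).

```cpp
#include <cassert>

struct Float4 { float x, y, z, w; };
struct Float3 { float x, y, z; };

// Mirrors the fields of D3D11_VIEWPORT (illustrative stand-in, not the real type).
struct Viewport {
    float topLeftX, topLeftY, width, height, minDepth, maxDepth;
};

// Perspective division: homogeneous clip space -> Normalized Device Coordinates.
Float3 clipToNdc(const Float4& clip)
{
    assert(clip.w != 0.0f);
    return { clip.x / clip.w, clip.y / clip.w, clip.z / clip.w };
}

// Viewport transform: NDC (x, y in [-1, 1], z in [0, 1]) -> pixel coordinates.
Float3 ndcToPixel(const Float3& ndc, const Viewport& vp)
{
    return {
        vp.topLeftX + (ndc.x + 1.0f) * 0.5f * vp.width,
        vp.topLeftY + (1.0f - ndc.y) * 0.5f * vp.height,  // NDC y is up, pixel y is down
        vp.minDepth + ndc.z * (vp.maxDepth - vp.minDepth)
    };
}
```

In this continuous coordinate system, the centre of pixel (i, j) lies at (i + 0.5, j + 0.5) in Direct3D 10 and later, which is how I read the 0.5 pixel offset mentioned on the slide.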
Pixel Shader
Always executed in groups of 2x2
– Required for easy uv derivatives (for mipmapping)
– Some pixels might not be visible
– Small/thin triangles are a waste!
The shader compiler is an aggressive optimizer
– But don’t count on it
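Why 2x2 groups make uv derivatives easy can be sketched in C++: the derivative is a simple difference between neighbouring pixels of the quad. The names below are hypothetical, the quad layout is an assumption, and the mip level formula is a common textbook approximation for a square texture rather than a description of any specific hardware.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

struct Float2 { float u, v; };
struct Derivatives { Float2 ddx, ddy; };

// Assumed quad layout: [0] top-left, [1] top-right, [2] bottom-left, [3] bottom-right.
// Derivatives are finite differences between neighbouring pixels in the quad.
Derivatives quadDerivatives(const std::array<Float2, 4>& uv)
{
    return {
        { uv[1].u - uv[0].u, uv[1].v - uv[0].v },  // change per pixel in x
        { uv[2].u - uv[0].u, uv[2].v - uv[0].v }   // change per pixel in y
    };
}

// Approximate mip level: log2 of the largest texel footprint per pixel.
float mipLevel(const Derivatives& d, float textureSize)
{
    assert(textureSize > 0.0f);
    const float lenX = std::sqrt(d.ddx.u * d.ddx.u + d.ddx.v * d.ddx.v) * textureSize;
    const float lenY = std::sqrt(d.ddy.u * d.ddy.u + d.ddy.v * d.ddy.v) * textureSize;
    return std::max(0.0f, std::log2(std::max(lenX, lenY)));
}
```

Because all four pixels of the quad must run to provide these neighbours, a triangle that covers only one pixel of a quad still pays for four shader invocations, which is why small or thin triangles are wasteful.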
Compute Shader
Additional features
– Write at random positions in UAV (Unordered)
– Synchronization primitives
– Some atomic operations
– Thread Group Shared Memory
Threading SIMD program
Forum
Reminder:
– Check out the forum
– Work on first practical assignment
– Make paper presentation teams
• Post it on the forum
FORUM FORUM FORUM FORUM FORUM