introduction to realtime ray tracing course 41 philipp slusallek peter shirley bill mark gordon...

48
Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Upload: angelica-rodgers

Post on 03-Jan-2016

221 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Introduction to Realtime Ray Tracing

Course 41

Introduction to Realtime Ray Tracing

Course 41

Philipp Slusallek Peter Shirley

Bill Mark Gordon Stoll Ingo Wald

Page 2: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Hardware for Realtime Ray TracingHardware for Realtime Ray Tracing

• Custom Hardware for Realtime Ray Tracing

– Characteristics and requirements

– RPU Design and Implementation

• GPU + Recursion + Custom Traversal HW

– Programming Model

– FPGA Prototype

– Performance and Scalability

Page 3: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Ray Tracing on CPUsRay Tracing on CPUs

• Characteristics

– Commodity, well understood HW

– High FP performance, yet still too slow

– Limited parallelism, bulky clusters

– Poor silicon usage (e.g. cache)

• Outlook

– Multi-core designs are coming

– Will still take too long

Page 4: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Ray Tracing on GPUsRay Tracing on GPUs

• Characteristics

– Very high raw FP performance

– High degree of parallelism

– Fast development cycle

• Stream programming model

– Still too limited for efficient ray tracing

• No support for recursion

• Limited memory access

Page 5: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Ray Tracing Characteristics: kd-Tree TraversalRay Tracing Characteristics: kd-Tree Traversal

• One-dimensional computation along ray

– Compute location of d relative to t_min / t_max

– Iterate or recurse with updated t_max / t_max

t_min

t_max

d

t_min

t_max

dsplit

t_min

t_max d

split

Near: t_min< t_max < d Both: t_min < d < t_max Far: d < t_min < t_max

Page 6: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Ray Tracing Characteristics: kd-Tree TraversalRay Tracing Characteristics: kd-Tree Traversal

• Inner traversal looptmp = node.split – ray.origin

d = tmp * 1/ray.direction

near = d > t_min

far = d < t_max

if (near & far) push(node.far, d, t_max)

if (near) iterate(node.near, t_min, d)

else iterate(node.far, d, t_max)

• Advantages of using kd-trees

– Simple and fast traversal & building algorithm

– Robust & very good handling of large scenes

t_min

t_max

d

split

Page 7: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Ray Tracing Characteristics: kd-Tree TraversalRay Tracing Characteristics: kd-Tree Traversal

• Traversal Processing

– 50-80 k-D steps per ray @ 10 instructions/step many instructions many clock cycles

– Serial dependency low pipeline efficiency, stalls, latency

– Limited but flexible control flow and memory access

Custom HW unit

– One clock tick per traversal step (fully pipelined)

– Up to 100:1 improvement

Page 8: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Ray Tracing Characteristics: IntersectionRay Tracing Characteristics: Intersection

• Intersection computation

– Triggered by traversal at every leaf node

• Called with: ray and address of geometry

– Option 1: Custom hardware [SaarCOR’05]

– Option 2: Software on programmable processor

• Can be implemented efficiently

• Enables arbitrary programmable primitives

Do not use costly dedicated hardware

Page 9: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Ray Tracing Characteristics: ShadingRay Tracing Characteristics: Shading

• Shading computation

– Triggered by finished ray traversal

• Called with: ray, hit point, shader-id, address of parameters

– Characteristics:

• General-purpose computation, many 3-/4-vectors

• Needs support for efficient texture and memory access

• Needs support for arbitrary recursive tracing rays

– E.g. support dependent ray tracing

Main feature of ray tracing: Do not put limits on it

Page 10: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Ray Tracing Characteristics: CoherenceRay Tracing Characteristics: Coherence

• Ray coherence

– Neighboring primary rays

• Traverse highly similar kd-node in same order

• Often hit same geometric primitives

• Often execute the same shader, access same textures, …

– Similar for shadow rays to one light source

– Often (but not always) applies for secondary rays

HW should take advantage of this coherence

Page 11: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Previous WorkPrevious Work

• SaarCOR I

– Fixed function ray tracing chip [GH’05]

Page 12: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU ApproachRPU Approach

• Take GPUs as basis and core component

– Highly parallel, highly efficient

• Improve programming model

– Add efficient recursion, conditionals

– Add memory access options

• Add custom traversal unit

– Slave to RPU

– Performs indirect, data dependent functions calls

Page 13: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU DesignRPU Design

Shader Processing Units (SPU)- General purpose computation

- For shading, geometry, lighting computations

- Operates on 4-component vectors

- Integer and float

- Dual issue, split vector

- GPU-like instruction set

- Arbitrary read/write

- Texture addressing mode

- No texture filtering SW

Page 14: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU DesignRPU Design

Shader Processing Units (SPU) Custom Ray Traversal Unit (TPU)

- Efficient traversal of k-D trees

- Communicates with SPU over dedicated registers

Page 15: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU DesignRPU Design

Shader Processing Units (SPU) Custom Ray Traversal Unit (TPU) Multi-Threading

- Increases usage of HW resources

- Hides latency due to

- Memory access

- Instruction dependencies

- Long traversal operations

- Separate thread pool for SPU & TPU

- Software scheduling (compiler)

- No overhead for switching threads

- Increases resources (mainly register file)

Page 16: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU DesignRPU Design

Shader Processing Units (SPU) Custom Ray Traversal Unit (TPU) Multi-Threading Chunking

- SIMD execution (SPUs & TPUs)

- Takes advantage of coherence

- Reduces hardware complexity

- Can combine of memory requests

- Reduces external bandwidth

- Must allow for incoherence

- Chunks may split at conditionals

- Inactive sub-chunk put on stack

- Masked execution

- Worst case: serial computation

Page 17: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU DesignRPU Design

Shader Processing Units (SPU) Custom Ray Traversal Unit (TPU) Multi-Threading Chunking Mailbox Processing (MPU)

Per thread caching mechanism Avoids multiple processing

of same kd-tree entry (e.g. triangle) 10x performance for some scenes

Page 18: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU ArchitectureRPU Architecture

SPUSPUSPU

SPUSPU

PCI/AGP/PCI-XINT ERFACE

SPU

DDR-RAM 1

VM U

T PUT -CACHE

READ ONLY

SPUR4-CACHE

R/W

M PUL-CACHE

READ ONLY

M EM -IF(...)

DDR-RAM N

T hread GeneratorDM A-IF

RPU 1

(...)

RPU Chip 1FastChip Inter-

Connect

RPU 2

SPUSPUSPU

SPUSPUSPU

T PUT -CACHE

READ ONLY

SPUR4-CACHE

R/W

M PUL-CACHE

READ ONLY

RPU N

Page 19: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RegisterStack

VEC4-ALU,Mul, Add, ...

I0 to I3 A =(A0,A1,A2,A3)

R0 to R15

ORG, DIR,(SID,MIN, MAX, ROOT)

C

C0 ... C15

C0 ... C15

P0 to P15

set of parameter registers

hardware managed register stack, only topmost registers are visible

output registers to control the TPU

input, output and address registers for memory control

register stack frame

SPU Vector RegistersSPU Vector Registers

• All registers have 4- component (float or integer)

• R0 to R15: General registers

– Index into a HW managed register stack

– Allows for single-cycle function call

• P0 to P15: shader parameters

• I0 to I3: data read from memory

• A = (A0,A1,A2,A3)

– Memory addressing

• ORG, DIR, ...

– TPU communication registers

Page 20: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Instruction Set of SPUInstruction Set of SPU

Ray traversal operation trace

Conditional instructions (paired) if <condition> jmp label if <condition> call <fun> If <condition> return

Dual issue (pairing) 3/1 and 2/2 arithmetic splitting Arithmetic + load Arithmetic + conditional jump,

call, return

Short vector instruction set mov, add, mul, mad, frac dph2, dp3, dph3, dp4

Input modifiers Swizzeling, negation, masking Multiply with power of 2

Special operations (modifiers) rcp, rsq, sat

Fast 2D texture lookups texload, texload4x

Read from and write to memory load, load4x, store

Page 21: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Instruction Set of SPUInstruction Set of SPU

Ray traversal operation trace

Conditional instructions (paired) if <condition> jmp label if <condition> call <fun> If <condition> return

Dual issue (pairing) 3/1 and 2/2 arithmetic splitting Arithmetic + load Arithmetic + conditional jump,

call, return

Short vector instruction set mov, add, mul, mad, frac dph2, dp3, dph3, dp4

Input modifiers Swizzeling, negation, masking Multiply with power of 2

Special operations (modifiers) rcp, rsq, sat

Fast 2D texture lookups texload, texload4x

Read from and write to memory load, load4x, store

Page 22: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Instruction Set of SPUInstruction Set of SPU

Ray traversal operation trace

Conditional instructions (paired) if <condition> jmp label if <condition> call <fun> If <condition> return

Dual issue (pairing) 3/1 and 2/2 arithmetic splitting Arithmetic + load Arithmetic + conditional jump,

call, return

Short vector instruction set mov, add, mul, mad, frac dph2, dp3, dph3, dp4

Input modifiers Swizzeling, negation, masking Multiply with power of 2

Special operations (modifiers) rcp, rsq, sat

Fast 2D texture lookups texload, texload4x

Read from and write to memory load, load4x, store

Page 23: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Instruction Set of SPUInstruction Set of SPU

Ray traversal operation trace

Conditional instructions (paired) if <condition> jmp label if <condition> call <fun> If <condition> return

Dual issue (pairing) 3/1 and 2/2 arithmetic splitting Arithmetic + load Arithmetic + conditional jump,

call, return

Short vector instruction set mov, add, mul, mad, frac dph2, dp3, dph3, dp4

Input modifiers Swizzeling, negation, masking Multiply with power of 2

Special operations (modifiers) rcp, rsq, sat

Fast 2D texture lookups texload, texload4x

Read from and write to memory load, load4x, store

Page 24: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Instruction Set of SPUInstruction Set of SPU

Ray traversal operation trace

Conditional instructions (paired) if <condition> jmp label if <condition> call <fun> If <condition> return

Dual issue (pairing) 3/1 and 2/2 arithmetic splitting Arithmetic + load Arithmetic + conditional jump,

call, return

Short vector instruction set mov, add, mul, mad, frac dph2, dp3, dph3, dp4

Input modifiers Swizzeling, negation, masking Multiply with power of 2

Special operations (modifiers) rcp, rsq, sat

Fast 2D texture lookups texload, texload4x

Read from and write to memory load, load4x, store

Page 25: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Instruction Set of SPUInstruction Set of SPU

Ray traversal operation trace

Conditional instructions (paired) if <condition> jmp label if <condition> call <fun> If <condition> return

Dual issue (pairing) 3/1 and 2/2 arithmetic splitting Arithmetic + load Arithmetic + conditional jump,

call, return

Short vector instruction set mov, add, mul, mad, frac dph2, dp3, dph3, dp4

Input modifiers Swizzeling, negation, masking Multiply with power of 2

Special operations (modifiers) rcp, rsq, sat

Fast 2D texture lookups texload, texload4x

Read from and write to memory load, load4x, store

Page 26: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Ray Triangle IntersectionUnit-Triangle TestRay Triangle IntersectionUnit-Triangle Test

; load triangle transformation load4x A.y,0

; transform ray dp3_rcp R7.z,I2,R3 dp3 R7.y,I1,R3 dp3 R7.x,I0,R3 dph3 R6.x,I0,R2 dph3 R6.y,I1,R2 dph3 R6.z,I2,R2

; compute hit distance mul R8.z,-R6.z,S.z + if z <0 return

; barycentric coordinates mad R8.xy,R8.z,R7,R6 + if or xy (<0 or >=1) return

; hit if u + v < 1 add R8.w,R8.x,R8.y + if w >=1 return

; hit distance closer than last one? add R8.w,R8.z,-R4.z + if w >=0 return

; save hit information mov SID,I3.x + mov MAX,R8.z mov R4.xyz,R8 + return

InputArithmetic (dot products)Multi-issue (arith. & cond.)

Page 27: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Read Instruction

Read 3 Source Registers

Swizzeling mov R0,R1* mov R2,R3

* mov R0,R2

Masking

Writeback

*

Memory Access

Writeback I0 – I3

* * *+ + +

+

ClampThread ControlBranching

StackControl

RCP,RSQ

Writeback

Masking

Shader Processing UnitPipeliningShader Processing UnitPipelining

Page 28: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

• ↨: Direct function calls

• ↔: Indirect function calls via TPU

...

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

raysSPU Processing

TPU / MPU Processing

... TPU/MPU

shadow

rays

Page 29: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

Page 30: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

• Threads are started for each pixel

• Registers initialized from an input stream

– 2D Hilbert curve generator sampling the screen

– Memory stream for multi-pass

• Shader computes ray

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

Page 31: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

• Threads are started

• Registers initialized from an input stream

– 2D Hilbert curve generator sampling the screen

– Memory stream for multi-pass

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

Page 32: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

• Shooting Primary Rays

– Ray traversal performed onthe TPU

– Started in top-level kd-tree

– Intersector transforms ray into local coordinate system

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

top-level kd-tree

Page 33: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

top-level kd-tree

Page 34: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

top-level kd-tree

Page 35: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

• Shooting Primary Rays (II)

– Transformed ray traversed through object kd-tree on TPU

– Geometry intersection performed on programmable SPU

– Programmable geometry: triangles, spheres, bicubic splines, quadrics, …

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

object-level kd-tree

Page 36: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

object-level kd-tree

Page 37: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

• Surface shading performed on programmable SPU

– Surface shader is called directly from primary shader

– Arguments passed on HW stack

– May trace secondary rays at any time: reflection, refraction, …

– Writing shaders is easy due to global access to the scene and physically-based computation

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

Page 38: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

RPU Programming ModelRPU Programming Model

• Light properties and illumination can be abstracted using function calls

• Illumination shader iterates over all light sources

• For each light source a Light source shader is called

PrimaryRay

Shader

TPU/MPU

Top-LevelObject

Intersector

TPU/MPU

Surface/BRDFShader

LightSourceShader

LightingShader

TPU/MPU

LightSourceShader

GeometryIntersector

primary

ray

secondary

rays

TPU/MPU

shadow

rays

Page 39: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Prototype ImplementationPrototype Implementation

SPUSPUSPU

SPUSPU

PCI/AGP/PCI-XINT ERFACE

SPU

DDR-RAM 1

VM U

T PUT -CACHE

READ ONLY

SPUR4-CACHE

R/W

M PUL-CACHE

READ ONLY

M EM -IF(...)

DDR-RAM N

T hread GeneratorDM A-IF

RPU 1

(...)

RPU Chip 1FastChip Inter-

Connect

RPU 2

SPUSPUSPU

SPUSPUSPU

T PUT -CACHE

READ ONLY

SPUR4-CACHE

R/W

M PUL-CACHE

READ ONLY

RPU N

Page 40: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

PrototypePerformancePrototypePerformance

• FPGA prototype

– Xilinx Virtex II 6000

– 128 MB DDR-RAM at 350 MB/s

– PCI bus for up-/download (no VGA)

• Single RPU at only 66 MHz

– Up to 4 million rays per second

– Up to 20 fps @ 512x384

– Same ray tracing performance as Intel P4 @ 2.66 GHz

Page 41: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

ScalabilityScalability

• Larger Chunk Size

– Less ray coherence

– More data is accessed

– Increased cache bandwidth

– Larger caches

Page 42: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

ScalabilityScalability

• Larger Chunk Size

• Multiple RPUs on a Chip

– Limited by

• VLSI technology

• Memory bandwidth

– FPGA prototype versus current GPUs

• Floating point units 50x

• Memory bandwidth 100x

• Clock rate 7x

RPU

RPU RPU

RPU

RAMRAM

Page 43: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

ScalabilityScalability

• Larger Chunk Size

• Multiple RPUs on a Chip

• Multiple chips on a board

– Fast interconnect for data exchange

– Cache sizes accumulate

– Managed through virtual memory [Schmittler’2003]

– Limited through external bandwidth due to scene changes

RAMRAM RPU C HIP RAMRAM RPU C HIP

RPU board

Page 44: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

ScalabilityScalability

• Larger Chunk Size

• Multiple RPUs on a Chip

• Multiple chips on a board

• Multiple boards in a PC

– Similar to today’s PC clusters in a much smaller form factor

RAMRAM RPU C HIP RAMRAM RPU C HIP

RPU board

RAMRAM RPU C HIP RAMRAM RPU C HIP

RPU board

RAMRAM RPU C HIP RAMRAM RPU C HIP

RPU board

RAMRAM RPU C HIP RAMRAM RPU C HIP

RPU board

Page 45: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

VideoVideo

Page 46: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Future WorkFuture Work

• Support for fully dynamic scenes

– Vertex shader + building kd-trees

• Efficient photon mapping

– kd-tree construction + kNN filtering

• OpenRT-API [Dietrich’03]

• ASIC prototype

Page 47: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald

Questions?Questions?

http://graphics.cs.uni-sb.de

http://www.OpenRT.de

http://www.SaarCOR.de

Page 48: Introduction to Realtime Ray Tracing Course 41 Philipp Slusallek Peter Shirley Bill Mark Gordon Stoll Ingo Wald