gpu architecture - rochester institute of...

GPU Architecture

Chris Vuong, Long Pham

Agenda

1. What is a GPU?

a. Dedicated vs Integrated GPUs

b. GPU structure vs CPU

2. How does a GPU work?

3. History & Evolution of GPUs

a. Background

b. 1980’s

c. 1990’s

d. 2000’s

e. 2010’s and beyond

4. OpenGL vs DirectX

5. Recent and Future Trends

1. What is a GPU?

- A graphics processing unit.

- Accelerates the creation of images.

- Used in embedded systems, mobile phones, desktops, workstations, and game consoles.

a. Dedicated Card vs Integrated Card

A dedicated card:

- Interfaces with the motherboard through an expansion slot such as PCIe or AGP

- Is easily replaceable or upgradeable

- Has its own RAM

- Produces much more heat than integrated graphics processors (IGPs)

Multiprocessor Structure:

- N multiprocessors with M cores each

- SIMD (Single Instruction, Multiple Data): cores share an instruction unit with the other cores in the same multiprocessor

- Shared memory, constant cache, and texture cache
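The multiprocessor structure above can be pictured with a small Python sketch. It is only a model, not how any real GPU is programmed: the function names `run_multiprocessor` and `run_gpu` are hypothetical, and the "cores" are simulated with an ordinary loop. What it shows is the SIMD idea: one instruction is issued once per multiprocessor, and each of its M cores applies it to a different data element.

```python
from typing import Callable, List


def run_multiprocessor(instruction: Callable[[int], int],
                       lane_data: List[int]) -> List[int]:
    """Model one multiprocessor: the shared instruction unit issues
    `instruction` once, and every core applies it to its own element."""
    return [instruction(value) for value in lane_data]


def run_gpu(instruction: Callable[[int], int],
            data: List[int],
            cores_per_mp: int) -> List[int]:
    """Split `data` into groups of `cores_per_mp` elements and hand each
    group to one (simulated) multiprocessor."""
    results: List[int] = []
    for start in range(0, len(data), cores_per_mp):
        group = data[start:start + cores_per_mp]
        results.extend(run_multiprocessor(instruction, group))
    return results


# Every element goes through the same instruction (doubling, here).
doubled = run_gpu(lambda x: x * 2, [1, 2, 3, 4, 5], cores_per_mp=2)
print(doubled)
```

Because all cores in a group share one instruction, they cannot diverge: the only thing that differs between them is the data.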

2. How does a GPU work?

How is a pixel drawn on the screen?

Example: 1 million triangles * 100 pixels per triangle * 10 lights * 4 cycles per light computation = 4 billion cycles
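The arithmetic in this example can be checked with a few lines of Python. The workload figures are the ones from the slide; the 1 GHz clock and the assumption that the work divides perfectly across cores are illustrative additions, not part of the original example.

```python
# Workload figures taken from the example above.
triangles = 1_000_000
pixels_per_triangle = 100
lights = 10
cycles_per_light = 4

total_cycles = triangles * pixels_per_triangle * lights * cycles_per_light
print(f"{total_cycles:,} cycles")  # 4,000,000,000 cycles

# Assumption (not from the slides): a hypothetical 1 GHz clock and
# perfect scaling, i.e. the work divides evenly across all cores.
CLOCK_HZ = 1_000_000_000


def seconds_for_workload(cores: int) -> float:
    """Time to finish the whole workload on `cores` ideal cores."""
    return total_cycles / (CLOCK_HZ * cores)


print(seconds_for_workload(1))    # 4.0 seconds on a single core
print(seconds_for_workload(100))  # 0.04 seconds on 100 cores
```

Under these assumptions a single core needs 4 seconds for the workload, which is far too slow for interactive graphics. This is why the work is spread across many cores.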

3. History & Evolution of GPUs

a) Background Information

b) 1980’s

c) 1990’s

d) 2000’s

e) 2010’s and beyond

f) Trends

a) Background Information

- Graphics pipeline: The stages through which the graphics data is sent

+ Usually consists of CPU software + GPU cores

+ 3D coordinates => 2D pixel space

+ Stages in between: Geometry, Rendering

- Adopted by major GPU manufacturers such as NVIDIA and ATI

- Early GPUs performed only the rendering stage of the pipeline

- Later, GPUs took over more of the pipeline's stages
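The "3D coordinates => 2D pixel space" step can be sketched as follows. The slides do not say which projection is used, so this assumes a simple pinhole camera at the origin looking down the +z axis. The function name `project_to_pixel` and its parameters are hypothetical.

```python
from typing import Tuple


def project_to_pixel(point: Tuple[float, float, float],
                     width: int,
                     height: int,
                     focal_length: float = 1.0) -> Tuple[int, int]:
    """Map a 3D point in camera space to 2D pixel coordinates.

    Assumes a pinhole camera at the origin looking along +z, with the
    visible range of x/z and y/z being [-1, 1] when focal_length is 1.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")

    # Geometry stage: perspective divide shrinks distant points.
    ndc_x = focal_length * x / z
    ndc_y = focal_length * y / z

    # Viewport mapping: [-1, 1] becomes [0, width] and [0, height].
    # Pixel rows grow downward, so the y axis is flipped.
    pixel_x = (ndc_x + 1.0) * 0.5 * width
    pixel_y = (1.0 - ndc_y) * 0.5 * height
    return int(pixel_x), int(pixel_y)


print(project_to_pixel((0.0, 0.0, 1.0), 640, 480))  # (320, 240)
print(project_to_pixel((1.0, 1.0, 2.0), 640, 480))  # (480, 120)
```

A point on the viewing axis lands in the center of the screen. Points farther away (larger z) move toward the center, which produces the perspective effect.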

Early GPU Pipeline

b) 1980’s

- GPUs were “integrated frame buffers”

- IBM Professional Graphics Controller (PGC, also called PGA)

+ One of the first 2D/3D video cards for the PC

+ Despite its mass-market failings, became pivotal in GPU evolution

- By 1987, more features were being added to early GPUs

- Emergence of Silicon Graphics Inc. (SGI)

+ Developed the IRIS GL API, the predecessor of OpenGL (released in 1992)

c) 1990’s

- Generation 0:

+ SGI’s RealityEngine

+ Cheap Hardware & Games Combo

+ Performance improvements

- Generation I:

+ 3dfx Voodoo (1996)

c) 1990’s (continued)

- Generation II: Breakthroughs in the field

+ Released cards could perform the entire pipeline

+ Used Accelerated Graphics Port (AGP) in place of PCI

+ New graphics features

+ Propelled computer gaming and GPU hardware markets

+ Still left room for improvement: the pipeline was fixed-function

3dfx Voodoo (1996)

- 1 million transistors

- 4 MB of 64-bit DRAM

- Core clock 50 MHz

NVIDIA’s GeForce 256 (1999)

- 23 million transistors

- 32 MB of 128-bit DRAM

- Core clock 120 MHz

d) 2000’s

- Generation III:

+ GeForce 3, Radeon 8500: First GPUs with programmable pipeline

+ Still limited in programmability

- Generation IV:

+ 2002 - GeForce FX, Radeon 9700: Fully programmable

- Generation V:

+ GeForce 6, Radeon X800

Improved GPU Pipeline

d) 2000’s (continued)

- Generation VI:

+ GeForce 8 series (namely GeForce 8800): Unified shaders

+ SM (Streaming Multiprocessor): Performs vertex, pixel, and geometry calculations

- Generation VII:

+ Fermi architecture: More programmable

+ GPGPU (general-purpose computing on GPUs)

Parallelism in CPUs vs GPUs

CPUs

- Task parallelism

- Multiple tasks map to multiple threads

- Tasks run different instructions

- 10s of relatively heavyweight threads run on 10s of cores

- Each thread is managed and scheduled explicitly

- Each thread has to be individually programmed

GPUs

- Data parallelism

- SIMD model

- Same instruction on different data

- 10,000s of lightweight threads on 100s of cores

- Threads are managed and scheduled by hardware

- Programming is done for batches of threads (e.g., one pixel shader per group of pixels, or per draw call)
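The contrast above can be illustrated in plain Python. This is a CPU-only sketch, not GPU code: the task functions `load_level` and `mix_audio` and the toy shader `shade` are hypothetical names. The first half shows task parallelism (different functions, each submitted explicitly to its own thread). The second half shows the data-parallel style (one function written once and applied to every element of a batch).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


# --- CPU style: task parallelism -------------------------------------
# Each task runs different instructions and is submitted explicitly.
def load_level() -> str:
    return "level loaded"


def mix_audio() -> str:
    return "audio mixed"


def cpu_style() -> List[str]:
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(load_level), pool.submit(mix_audio)]
        return [future.result() for future in futures]


# --- GPU style: data parallelism -------------------------------------
# One small program (a toy "pixel shader") is written once and applied
# to every pixel in the batch. On a GPU the hardware would schedule one
# lightweight thread per pixel; here a loop stands in for that.
def shade(pixel: int) -> int:
    return min(255, pixel * 2)  # brighten, clamped to 8 bits


def gpu_style(pixels: List[int]) -> List[int]:
    return [shade(pixel) for pixel in pixels]


print(cpu_style())
print(gpu_style([10, 100, 200]))
```

In the first half the programmer names every task. In the second, the programmer writes only `shade` and leaves the mapping of work to threads to the runtime or hardware.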

Why Unify?
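One common argument for unifying shaders is load balancing, sketched below. All unit counts and workload sizes here are made up for illustration; none of them come from the slides. With separate vertex and pixel units, a frame takes as long as the busier group while the other group sits idle. With a unified pool, any unit can take any work.

```python
def fixed_time(vertex_work: float, pixel_work: float,
               vertex_units: int, pixel_units: int) -> float:
    """Frame time with dedicated units: the slower group sets the pace."""
    return max(vertex_work / vertex_units, pixel_work / pixel_units)


def unified_time(vertex_work: float, pixel_work: float,
                 total_units: int) -> float:
    """Frame time when every unit can run either kind of work."""
    return (vertex_work + pixel_work) / total_units


# Hypothetical chip: 4 vertex units + 8 pixel units, or 12 unified units.
# Geometry-heavy frame: the 8 pixel units are mostly idle in the fixed design.
print(fixed_time(120, 24, 4, 8))   # 30.0
print(unified_time(120, 24, 12))   # 12.0

# Pixel-heavy frame: now the 4 vertex units are mostly idle.
print(fixed_time(12, 240, 4, 8))   # 30.0
print(unified_time(12, 240, 12))   # 21.0
```

The model ignores scheduling overhead and assumes work divides evenly, so it shows only the direction of the effect, not real speedups.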

e) 2010’s and beyond

- GPUs now consist of highly parallel, programmable cores

+ Essentially multi-core, general-purpose processors

- New products exemplified this:

+ NVIDIA’s Fermi-based GTX 580

+ AMD’s Fusion (CPU + GPU = APU)

+ Intel’s Larrabee and Sandy Bridge CPUs with integrated GPUs

4. OpenGL vs DirectX

- Both APIs rely on the traditional graphics pipeline.

- OpenGL is purely a graphics API. DirectX is more than that: it also has tools for sound, music, input, networking, and multimedia.

- DirectX is exclusive to the Windows platform, whereas OpenGL is cross-platform.

- OpenGL is often claimed to be faster because of a smoother, more efficient pipeline, though actual performance depends on the driver, hardware, and workload.



5. Recent and Future Trends

- Moore’s Law applies to GPU transistor counts as well

- Growth in the number of transistors has slowed recently due to manufacturing constraints
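As a rough check against Moore’s Law (commonly stated as a doubling about every two years), the transistor counts quoted earlier for the 3dfx Voodoo (1996) and the GeForce 256 (1999) give an implied doubling time. Comparing two cards from different vendors is only an illustration, and the helper name `doubling_time_years` is hypothetical.

```python
import math


def doubling_time_years(count_start: float, count_end: float,
                        years: float) -> float:
    """Doubling time implied by exponential growth from count_start
    to count_end over `years` years."""
    doublings = math.log2(count_end / count_start)
    return years / doublings


# Figures from the slides: 1 million (1996) -> 23 million (1999).
voodoo_transistors = 1_000_000
geforce256_transistors = 23_000_000

implied = doubling_time_years(voodoo_transistors,
                              geforce256_transistors,
                              years=1999 - 1996)
print(f"{implied:.2f} years per doubling")  # about 0.66 years
```

For this one pair of cards, the transistor count doubled roughly every eight months, faster than the two-year rule of thumb.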


- Unified shader architecture (centered on a flexible processor core).

- Massively parallel stream processing.

- Greater programmability.

References:

http://mcclanahoochie.com/blog/wp-content/uploads/2011/03/gpu-hist-paper.pdf

http://www.cs.virginia.edu/~gfx/papers/pdfs/59_HowThingsWork.pdf

http://s09.idav.ucdavis.edu/talks/02_kayvonf_gpuArchTalk09.pdf


http://cs.nyu.edu/courses/fall15/CSCI-GA.3033-004/

http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

Images:

http://s08.idav.ucdavis.edu/luebke-nvidia-gpu-architecture.pdf

http://www.hardwarezone.com.sg/feature-nvidia-geforce-8800-gtx-gts-g80-worlds-first-dx10-gpu/embracing-unified-shader-architecture

https://www.cs.utah.edu/~jeffp/teaching/MCMD/S20-GPU.pdf

https://www.directron.com/blog/what-is-pcie/














