Adding GPU Computing to Computer Organization Courses Karen L. Karavanic Portland State University with David Bunde, Knox College and Jens Mache, Lewis & Clark College

Post on 23-Feb-2016


Page 1: Adding GPU Computing to Computer  Organization Courses

Adding GPU Computing to Computer Organization Courses

Karen L. Karavanic, Portland State University

with David Bunde, Knox College

and Jens Mache, Lewis & Clark College

Page 2: Adding GPU Computing to Computer  Organization Courses

Our Backgrounds in CUDA Education

• Karavanic (PSU)
– New course “Multicore Computing” in 2008
– “General Purpose GPU Computing” in 2010
– Mixed graduate/undergraduate

• Mache (Lewis & Clark)
– Special topics course in CUDA
– Project with students: “Game of Life” module

• Bunde (Knox)
– Modules for teaching CUDA within existing courses

• SC12 HPC Educators [Full-Day] Session
– An Educators Toolbox for CUDA

Page 3: Adding GPU Computing to Computer  Organization Courses

Why Teach Parallel Computing with GPUs?

• It is here
– Students have GPUs (on desk / on lap / in pocket)
– Inexpensive (no need to pay $$$ or to build)

• We see the future
– Massively parallel: 100s of cores
– Ahead of the curve (how many cores in your CPU?)

• We see pay-off
– Performance improvements
– Knowledge of computer architecture helps

Page 4: Adding GPU Computing to Computer  Organization Courses

Example CUDA program

Adding two vectors, A and B

N elements in A and B, and N threads

(without code to load arrays with data)

#include <stdlib.h>

#define N 256

// Kernel: each of the N threads adds one pair of elements.
__global__ void vecAdd(int *A, int *B, int *C) {
    int i = threadIdx.x;
    C[i] = A[i] + B[i];
}

int main(int argc, char **argv) {
    int size = N * sizeof(int);
    int *a, *b, *c, *devA, *devB, *devC;

    // Allocate the host arrays (loading them with data is omitted).
    a = (int*)malloc(size);
    b = (int*)malloc(size);
    c = (int*)malloc(size);

    // Allocate the device arrays.
    cudaMalloc((void**)&devA, size);
    cudaMalloc((void**)&devB, size);
    cudaMalloc((void**)&devC, size);

    // Copy the inputs from host to device.
    cudaMemcpy(devA, a, size, cudaMemcpyHostToDevice);
    cudaMemcpy(devB, b, size, cudaMemcpyHostToDevice);

    // Launch the kernel: 1 block of N threads.
    vecAdd<<<1, N>>>(devA, devB, devC);

    // Copy the result back to the host, then release all memory.
    cudaMemcpy(c, devC, size, cudaMemcpyDeviceToHost);
    cudaFree(devA); cudaFree(devB); cudaFree(devC);
    free(a); free(b); free(c);
    return 0;
}

Page 5: Adding GPU Computing to Computer  Organization Courses

Why teach GPUs in Computer Organization?

• “Feed me”
– Thread “execution” configuration (threads, blocks)
– Transfer CPU – GPU
– Explicit cache management

• “Conflict”
– Architecture leads to large penalties for naïve code
– Synchronization
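
The execution configuration named on this slide can be sketched with a vector add that spans several blocks. This is a minimal illustration, not code from the talk: the names `vecAddBlocks` and `vecAddOnGpu` and the choice of 256 threads per block are assumptions made for the example.

```cuda
#include <cuda_runtime.h>

// Each thread handles one element. The guard protects the last block,
// which may have more threads than remaining elements.
__global__ void vecAddBlocks(const int *A, const int *B, int *C, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global element index
    if (i < n) C[i] = A[i] + B[i];
}

// Hypothetical host helper: adds two n-element host arrays on the GPU.
// Returns the most recent CUDA runtime error, or cudaSuccess.
cudaError_t vecAddOnGpu(const int *a, const int *b, int *c, int n) {
    size_t size = (size_t)n * sizeof(int);
    int *devA = nullptr, *devB = nullptr, *devC = nullptr;

    cudaMalloc((void**)&devA, size);
    cudaMalloc((void**)&devB, size);
    cudaMalloc((void**)&devC, size);

    // "Feed me": inputs must be moved to the GPU before the kernel runs.
    cudaMemcpy(devA, a, size, cudaMemcpyHostToDevice);
    cudaMemcpy(devB, b, size, cudaMemcpyHostToDevice);

    // Execution configuration: enough blocks to cover n, rounded up.
    int threadsPerBlock = 256;  // assumed value, chosen for illustration
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    vecAddBlocks<<<blocks, threadsPerBlock>>>(devA, devB, devC, n);

    // This copy waits for the kernel to finish before reading devC.
    cudaMemcpy(c, devC, size, cudaMemcpyDeviceToHost);

    cudaFree(devA); cudaFree(devB); cudaFree(devC);
    return cudaGetLastError();
}
```

Unlike a single-block launch, this version works when the number of elements exceeds the per-block thread limit, at the cost of an index computation and a bounds check in every thread.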

Page 6: Adding GPU Computing to Computer  Organization Courses

Mache - Unit goals

• Idea of parallelism

• Benefits and costs of system heterogeneity

• Data movement and NUMA

• Generally, the effect of architecture on program performance

Page 7: Adding GPU Computing to Computer  Organization Courses

Bunde – Module Design

• Brief time: course has lots of other goals
– One 70-minute lab and parts of 2 lectures

• Relatively inexperienced students
– Some just out of CS 2
– Many didn’t know C or Unix programming

Page 8: Adding GPU Computing to Computer  Organization Courses

Bunde: Approach taken

• Introductory lecture
– GPUs: massively parallel, outside CPU, kernels, SIMD

• Lab illustrating features of CUDA architecture
– Data transfer time
– Thread divergence
– Memory types (next time)

• “Lessons learned” lecture
– Reiterate architecture
– Demonstrate speedup with Game of Life
– Talk about use in Top 500 systems

Page 9: Adding GPU Computing to Computer  Organization Courses

Bunde: Survey results: Good news

• Asked to describe CPU/GPU interaction:
– 9 of 11 mention both data movement and invoking kernel
– Another just mentions invoking the kernel

• Asked to explain experiment illustrating data movement cost:
– 9 of 12 say comparing computation and communication cost
– 2 more talk about comparing different operations
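
The slides do not include the lab code, so the following is only a sketch of how a comparison of communication and computation cost might be set up with CUDA events. The kernel `addOne` and the helper `timeCopyAndKernel` are hypothetical names introduced for this example.

```cuda
#include <cuda_runtime.h>

// Trivial kernel: very little computation per element moved.
__global__ void addOne(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Hypothetical helper: reports, in milliseconds, the time for the
// host-to-device copy and the time for the kernel that follows it.
void timeCopyAndKernel(const float *host, int n, float *copyMs, float *kernelMs) {
    size_t size = (size_t)n * sizeof(float);
    float *dev = nullptr;
    cudaMalloc((void**)&dev, size);

    cudaEvent_t start, afterCopy, afterKernel;
    cudaEventCreate(&start);
    cudaEventCreate(&afterCopy);
    cudaEventCreate(&afterKernel);

    cudaEventRecord(start);
    cudaMemcpy(dev, host, size, cudaMemcpyHostToDevice);  // communication
    cudaEventRecord(afterCopy);
    addOne<<<(n + 255) / 256, 256>>>(dev, n);             // computation
    cudaEventRecord(afterKernel);
    cudaEventSynchronize(afterKernel);  // wait until the kernel is done

    cudaEventElapsedTime(copyMs, start, afterCopy);
    cudaEventElapsedTime(kernelMs, afterCopy, afterKernel);

    cudaEventDestroy(start);
    cudaEventDestroy(afterCopy);
    cudaEventDestroy(afterKernel);
    cudaFree(dev);
}
```

Running the helper for several array sizes, and with kernels that do more or less work per element, lets students see when the transfer dominates the total time.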

Page 10: Adding GPU Computing to Computer  Organization Courses

Bunde: Survey results: Not so good news

• Asked to explain experiment illustrating thread divergence:
– 2 of 9 were correct
– 2 more seemed to understand, but misused terminology
– 3 more remembered performance effect, but said nothing about the cause
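
Since the cause of the effect was what students missed, a pair of kernels can make it concrete. This is a sketch under assumptions, not the lab's actual experiment: `pathA`, `pathB`, `branchPerThread`, `branchPerWarp`, and `runBranchKernels` are all names invented here. In the first kernel, neighbouring threads of one warp take different branches, so the warp has to execute both paths one after the other; in the second, every thread of a warp takes the same branch.

```cuda
#include <cuda_runtime.h>

// Two different, deliberately long computations (unsigned arithmetic wraps).
__host__ __device__ unsigned pathA(unsigned x) {
    for (int k = 0; k < 1000; ++k) x = x * 3u + 1u;
    return x;
}
__host__ __device__ unsigned pathB(unsigned x) {
    for (int k = 0; k < 1000; ++k) x = x * 5u + 7u;
    return x;
}

// Divergent: even and odd threads of the same warp choose different paths.
__global__ void branchPerThread(unsigned *out) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = (threadIdx.x % 2 == 0) ? pathA(i) : pathB(i);
}

// Uniform within a warp: the choice depends only on the warp's index.
__global__ void branchPerWarp(unsigned *out) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = ((threadIdx.x / warpSize) % 2 == 0) ? pathA(i) : pathB(i);
}

// Hypothetical helper: runs both kernels and copies back their results.
// Each output array must hold blocks * threads elements.
void runBranchKernels(unsigned *perThread, unsigned *perWarp,
                      int blocks, int threads) {
    size_t size = (size_t)blocks * threads * sizeof(unsigned);
    unsigned *dev = nullptr;
    cudaMalloc((void**)&dev, size);

    branchPerThread<<<blocks, threads>>>(dev);
    cudaMemcpy(perThread, dev, size, cudaMemcpyDeviceToHost);

    branchPerWarp<<<blocks, threads>>>(dev);
    cudaMemcpy(perWarp, dev, size, cudaMemcpyDeviceToHost);

    cudaFree(dev);
}
```

Both kernels do the same total amount of arithmetic, so timing each launch separately isolates the cost of divergence from the cost of the work itself.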

Page 11: Adding GPU Computing to Computer  Organization Courses

Conway’s Game of Life

• Rules
• Visual
• Demo
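
The rules themselves are not spelled out in the transcript. As a reminder, a live cell survives with two or three live neighbours, and a dead cell becomes live with exactly three. A kernel for one generation might look like the sketch below; it is not the module's code, the names `lifeStep` and `lifeStepOnGpu` are hypothetical, and treating cells beyond the edge as dead is an assumption made to keep the example short.

```cuda
#include <cuda_runtime.h>

// One thread per cell. Cells are 1 (live) or 0 (dead), stored row by row.
__global__ void lifeStep(const unsigned char *in, unsigned char *out,
                         int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;

    // Count live neighbours; anything off the grid counts as dead.
    int n = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0) continue;
            int nx = x + dx, ny = y + dy;
            if (nx >= 0 && nx < w && ny >= 0 && ny < h) n += in[ny * w + nx];
        }
    }

    unsigned char alive = in[y * w + x];
    out[y * w + x] = (n == 3 || (alive && n == 2)) ? 1 : 0;
}

// Hypothetical host helper: computes the next generation of a w x h grid.
void lifeStepOnGpu(const unsigned char *grid, unsigned char *next,
                   int w, int h) {
    size_t size = (size_t)w * h;
    unsigned char *devIn = nullptr, *devOut = nullptr;
    cudaMalloc((void**)&devIn, size);
    cudaMalloc((void**)&devOut, size);
    cudaMemcpy(devIn, grid, size, cudaMemcpyHostToDevice);

    dim3 block(16, 16);  // assumed block shape, chosen for illustration
    dim3 blocks((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    lifeStep<<<blocks, block>>>(devIn, devOut, w, h);

    cudaMemcpy(next, devOut, size, cudaMemcpyDeviceToHost);
    cudaFree(devIn);
    cudaFree(devOut);
}
```

Separate input and output buffers are used so that every thread reads the previous generation, regardless of the order in which threads run.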

Page 12: Adding GPU Computing to Computer  Organization Courses

Game of Life Module - Results

[Chart: responses on a scale from 1 = strongly disagree to 7 = strongly agree]

Page 13: Adding GPU Computing to Computer  Organization Courses

Game of Life Module - Results

[Chart: responses on a scale from 1 = strongly disagree to 7 = strongly agree]

Page 14: Adding GPU Computing to Computer  Organization Courses

Game of Life Module - Results

[Chart: responses on a scale from 1 = strongly disagree to 7 = strongly agree]

Page 15: Adding GPU Computing to Computer  Organization Courses

Conclusions

• Bunde:
– Unit was mostly successful, but thread divergence is a harder concept
– Students were interested in CUDA, and about half the class requested more of it

• Mache:
– What students say
  • It’s not easy, it’s worthwhile, more please
– What instructors think
  • We’ll do it again, focus, use new resources

• Bottom line: A brief introduction is possible even for students with limited background

Page 16: Adding GPU Computing to Computer  Organization Courses

Future Work

• Bunde
– Will add constant memory and a small assignment to next offering

• Mache and Karavanic
– Continuing collaboration for summer 2013 course at PSU

• Versions of CUDA & Hardware

Page 17: Adding GPU Computing to Computer  Organization Courses

Thank You

• We thank Barry Wilkinson for helpful input throughout our collaboration, and Julian Dale for his help in creating the GoL exercise and website. This material is based upon work supported by the National Science Foundation under grants 1044932, 1044299 and 1044973; by Intel; and by a PSU Miller Foundation Sustainability Grant.

• More information
– Game of Life Exercise: lclark.edu/~jmache/parallel
– Authors
  • Karen L. Karavanic, karavan at cs.pdx.edu
  • David Bunde, dbunde at knox.edu
  • Jens Mache, jmache at lclark.edu