CS 270 CS 270: Computer Organization Course Overview Instructor: Professor Stephen P. Carl

Upload: jade-barker

Post on 11-Jan-2016


TRANSCRIPT

Page 1: CS 270 CS 270: Computer Organization Course Overview Instructor: Professor Stephen P. Carl

CS 270

CS 270: Computer Organization

Course Overview

Instructor: Professor Stephen P. Carl

Page 2

Quote of the Day

640 Kbytes [of main memory] ought to be enough for anybody.

– Bill Gates, 1981

Page 3

Course Perspective

Most systems courses are builder-centric; this course is programmer-centric
  How one can become a more effective programmer by knowing more about the underlying system
  Enable you to:
    Write programs that are more reliable and efficient
    Incorporate features that require hooks into the OS
      – E.g., concurrency, signal handlers (in CS 428)
  Not just a course for dedicated hackers
    (But it might just bring out the hidden hacker in you)
  You won’t see most of this material in any other course

Page 4

Course Components

Lectures
  Higher-level concepts
Labs and Assignments
  The heart of the course
  1 - 3 weeks
  Provide in-depth understanding of an aspect of systems
  Programming and problem solving
Exams (2 + final)
  Test your understanding of concepts & mathematical principles

Page 5

Timeliness

Grace days
  4 for the course
  Covers scheduling crunch, out-of-town trips, illnesses, minor setbacks
  Save them until late in the term!
Lateness penalties
  Once grace days are used up, get penalized 10% per day
  Typically shut off all handins 4 days after due date
Catastrophic events
  Major illness, death in family, …
  Work with your professor on a plan for getting back on track
Advice
  Once you start running late, it’s really hard to catch up

Page 6

Cheating

What is cheating?
  Sharing code: either by copying, retyping, looking at, or supplying a copy of a file.
  Coaching: helping your friend to write a lab, line by line.
  Copying code from a previous course or from elsewhere on the WWW
    Only allowed to use code we supply, or from the CS:APP website
What is NOT cheating?
  Explaining how to use systems or tools.
  Helping others with high-level design issues.
Penalty for cheating:
  Case remanded to Honor Council.
Detection of cheating:
  We do check, and our tools for doing this are much better than you might think

Page 7

Policies: Attendance and Grading

Presence in lectures: strongly advised (cut warning after 2 unexcused absences)
Presence in design/help sessions: voluntary
Exams: weighted 15, 15, 15 (final)
Labs: completed/not completed
Guaranteed:
  > 90%: A
  > 80%: B
  > 70%: C

Page 8

Introduction: Themes and Concepts

A bird’s eye view of computer system organization

Processors and technological progress

Abstraction vs. Reality
  Internal data representations - it’s all just numbers
  Why knowing Assembly language is a Good Thing
  Computer storage is large but not infinite
  Asymptotic complexity is only part of the story
  Computers compute, but they also communicate

Page 9

5 main components of any Computing System

[Figure: block diagram of a computer, with a Sun Blade Computer pictured]

Computer
  Processor
    Control (“brain”)
    Datapath (“brawn”)
  Memory (where programs, data live when running)
  Devices
    Input: Keyboard, Mouse
    Output: Display, Printer
    Disk (where programs, data live when not running)

Page 10

5 main components of Computer Systems

Datapath
  Performs operations on signals moving through the CPU
Control Circuitry
  Routes signals into, through, and out of the CPU
Memory
  Volatile (RAM)
  Permanent Storage (hard drives, DVD-ROM, etc.)
Input devices
  Mouse, keyboard, etc.
Output devices
  Monitor, printers, etc.

Page 11

Page 12

March of Progress: Moore’s Law

Moore’s Law: the number of transistors on a single integrated circuit, also called chip density, doubles every 18 months

Applies to any kind of semiconductor chip: memory (RAM), microprocessors, GPUs, etc.

Trend first described by Intel co-founder Gordon Moore in 1965.

Page 13

The March of Progress: DRAM capacity

Size in bits of single-chip Dynamic Random Access Memory (DRAM):

year   size (Mbit)
1980   0.0625
1983   0.25
1986   1
1989   4
1992   16
1996   64
1998   128
2000   256
2002   512

• Now 1.4X/yr, or 2X every 2 years.
• 8000X since 1980!

Page 14

The March of Progress: Microprocessor Complexity

Transistors per chip:
  Intel 8088: < 50,000 (1979)
  Intel 80486: 1 million (1989)
  Sparc Ultra: 5.2 million
  Pentium Pro: 5.5 million (1995)
  PowerPC 620: 6.9 million
  Alpha 21264: 15 million
  Athlon (K7): 22 million
  Itanium 2: 41 million

Page 15

The March of Progress: Processor Performance

[Figure: performance measure (0 to 900) by year (1987 to 1997). Labeled points: IBM POWER 100, DEC Alpha 4/266, DEC Alpha 5/300, DEC Alpha 5/500, DEC Alpha 21264/600, and Intel P4 2000 MHz (Fall 2001). Trend: 1.54X/yr]

Page 16

The March of Progress: Dramatic Changes

° Memory
  • DRAM capacity: 64x size improvement in last decade.
° Processor
  • Speed: 100X performance in last decade.
° Disk
  • Capacity doubled every year since 1997: 250X in last decade.

Page 17

The March of Progress: Dramatic Changes

What will the state-of-the-art PC be when you graduate?

• Processor clock speed: 5000 MegaHertz (5.0 GigaHertz)
• Memory capacity: 4000 MegaBytes (4.0 GigaBytes)
• Disk capacity: 2000 GigaBytes (2.0 TeraBytes)

• Time to learn some new units!
  • Mega => Giga
  • Giga => Tera
  • Tera => Peta
  • Peta => Exa
  • Exa => Zetta
  • Zetta => Yotta = 10^24

Page 18

The Death of Progress? The Power Wall

Page 19

Multicore Processors

Page 20

Page 21

Computing Systems as Abstractions

Most CS courses emphasize abstraction
  Abstract data types (CS 157/257)
  Asymptotic analysis (CS 320)
Computer Systems are organized according to layers of abstraction.
  In general, abstraction helps engineers of all sorts to manage complexity
  Abstraction helps insulate programmers from differences between various hardware platforms

Page 22

Our Theme: Abstraction Is Important But Don’t Forget Reality

Abstractions have limits
  Especially in the presence of bugs
  Understanding details of the underlying implementations: sometimes, you just need to
Useful outcomes
  Become a more effective programmer!
    Find and eliminate bugs efficiently
    Understand and tune for program performance
  Preparation for later “systems” classes in CS
    Operating Systems, Networking

Page 23

Reality #1: Ints are not Integers, Floats are not Reals

Example 1: Is x^2 ≥ 0?
  Floats: Yes!
  Ints:
    40000 * 40000 --> 1600000000
    50000 * 50000 --> ??
Example 2: Is (x + y) + z = x + (y + z)?
  Unsigned & Signed Ints: Yes!
  Floats:
    (1e20 + -1e20) + 3.14 --> 3.14
    1e20 + (-1e20 + 3.14) --> ??

Page 24

Code Security Example

Similar to code found in FreeBSD’s implementation of getpeername

There are legions of smart people trying to find vulnerabilities in programs

    #include <string.h>   /* memcpy */

    /* Kernel memory region holding user-accessible data */
    #define KSIZE 1024
    char kbuf[KSIZE];

    /* Copy at most maxlen bytes from kernel region to user buffer */
    int copy_from_kernel(void *user_dest, int maxlen) {
        /* Byte count len is minimum of buffer size and maxlen */
        int len = KSIZE < maxlen ? KSIZE : maxlen;
        memcpy(user_dest, kbuf, len);
        return len;
    }

Page 25

Typical Usage

    #include <stdio.h>    /* printf */

    #define MSIZE 528

    void getstuff() {
        char mybuf[MSIZE];
        copy_from_kernel(mybuf, MSIZE);
        printf("%s\n", mybuf);
    }

Page 26

Malicious Usage

    #define MSIZE 528

    void getstuff() {
        char mybuf[MSIZE];
        copy_from_kernel(mybuf, -MSIZE);
        . . .
    }

Page 27

Computer Arithmetic

Arithmetic operations have important mathematical properties
  Cannot assume all the usual mathematical properties, due to finiteness of representations
  Integer operations satisfy “ring” properties
    Commutativity, associativity, distributivity
  Floating point operations satisfy “ordering” properties
    Monotonicity, values of signs
Observation
  Need to understand which abstractions apply in which contexts
  Important issues for compiler writers and serious application programmers

Page 28

Great (Grim?) Reality #2: You Need to Know Assembly

Chances are, you’ll never write a program in assembly
  Compilers are much better & more patient than you are
But understanding assembly is key to understanding the machine-level execution model.
  When bugs happen, the high-level language model breaks down
  Tuning program performance
    What optimizations are done/not done by the compiler?
    What are the sources of program inefficiency?
  Implementing system software
    Compiler has machine code as target
    Operating systems must manage process state
  Creating / fighting malware
    x86 assembly is the language of choice!

Page 29

Assembly Code Example

Time Stamp Counter
  Special 64-bit register in Intel-compatible machines
  Incremented every clock cycle
  Read with rdtsc instruction
Application
  Measure time (in clock cycles) required by a procedure:

    double t;
    start_counter();
    P();                 // function to be timed
    t = get_counter();
    printf("P required %f clock cycles\n", t);

Page 30

Code to Read Counter

Add a small amount of assembly code using GCC’s asm facility
This inserts assembly code into machine code generated by compiler:

    static unsigned cyc_hi = 0;
    static unsigned cyc_lo = 0;

    /* Set *hi and *lo to the high and low order bits of the cycle counter. */
    void access_counter(unsigned *hi, unsigned *lo)
    {
        asm("rdtsc; movl %%edx,%0; movl %%eax,%1"
            : "=r" (*hi), "=r" (*lo)
            :
            : "%edx", "%eax");
    }

Page 31

Grim Reality #3: “Random Access Memory” is a non-physical abstraction

Memory is not unbounded
  It must be allocated and managed
  Many applications are memory dominated
Bugs due to memory referencing errors are especially pernicious
  Effects are distant in both time (error may show up long after erroneous instruction executes) and space (effect may be outside the bounds of the executing program)
Memory performance is not uniform
  Cache and virtual memory effects can greatly affect program performance
  Adapting program to characteristics of memory system can lead to major speed improvements

Page 32

Example of a Memory Referencing Bug

    double fun(int i)
    {
        volatile double d[1] = {3.14};
        volatile long int a[2];
        a[i] = 1073741824; /* Possibly out of bounds */
        return d[0];
    }

    fun(0) –> 3.14
    fun(1) –> 3.14
    fun(2) –> 3.1399998664856
    fun(3) –> 2.00000061035156
    fun(4) –> 3.14, then segmentation fault

Page 33

Memory Referencing Bug Example

Explanation: location accessed by fun(i), for the same fun and results shown on the previous page

  i   contents
  4   Saved State
  3   MSB of d[0]
  2   LSB of d[0]
  1   a[1]
  0   a[0]

Page 34

Memory Referencing Errors

Unlike Java, C and C++ do not protect memory at all
  Out of bounds array references
  Invalid pointer values
  Abuses of malloc/free (e.g., memory leaks)
This usually leads to nasty bugs
  Whether or not bug has any effect depends on system and compiler
  Action at a distance
    Corrupted object logically unrelated to one being accessed
    Effect of bug may be first observed long after it is generated
How can I deal with this?
  Program in Scheme, Java or Python
  Understand what possible interactions may occur
  Use tools that detect referencing errors

Page 35

Memory System Performance Example

Hierarchical memory organization
Performance depends on access patterns
  Including how we step through multi-dimensional arrays

    void copyji(int src[2048][2048], int dst[2048][2048])
    {
        int i, j;
        // access in column-major order
        for (j = 0; j < 2048; j++)
            for (i = 0; i < 2048; i++)
                dst[i][j] = src[i][j];
    }

    void copyij(int src[2048][2048], int dst[2048][2048])
    {
        int i, j;
        // access in row-major order
        for (i = 0; i < 2048; i++)
            for (j = 0; j < 2048; j++)
                dst[i][j] = src[i][j];
    }

copyji is 21 times slower than copyij on Pentium 4

Page 36

The Memory Mountain

[Figure: read throughput (MB/s, 0 to 1200) as a function of stride (words, s1 to s15) and working set size (bytes, 2k to 8m), with regions labeled L1, L2, and Mem]

Pentium III Xeon, 550 MHz
16 KB on-chip L1 d-cache
16 KB on-chip L1 i-cache
512 KB off-chip unified L2 cache

Page 37

Reality #4: There’s more to system performance than asymptotic complexity

Constant factors matter too!

And even exact op count does not predict performance
  Easily see 10:1 performance range depending on how code was written
  Must optimize at multiple levels: algorithm, data representations, procedures, and loops
Must understand system to optimize performance
  How are programs compiled and executed
  How do we measure program performance and identify bottlenecks
  How do we improve performance without destroying code modularity and generality

Page 38

Example: Matrix Multiplication

Standard desktop computer, vendor compiler, using optimization flags
Both implementations have exactly the same operations count (2n^3)
What is going on?

[Figure: Matrix-Matrix Multiplication (MMM) on 2 x Core 2 Duo 3 GHz (double precision). y-axis: Gflop/s (0 to 50); x-axis: matrix size (0 to 9,000). Curves: Triple loop and Best code (K. Goto), separated by 160x]

Page 39

MMM Plot: Analysis

[Figure: Matrix-Matrix Multiplication (MMM) on 2 x Core 2 Duo 3 GHz. y-axis: Gflop/s (0 to 50); x-axis: matrix size (0 to 9,000)]

Memory hierarchy and other optimizations: 20x
Vector instructions: 4x
Multiple threads: 4x

Each speedup is due to taking advantage of increasingly complex system resources
Effect: fewer register spills, fewer L1/L2 cache misses, fewer TLB misses

Page 40

Reality #5: Computers do more than just execute programs

They need to get data in and out
  I/O system is critical to program reliability and performance
They communicate with each other over networks
  Many system-level issues arise in presence of network
    Concurrent operations by autonomous processes
    Coping with unreliable media
    Cross platform compatibility
    Complex performance issues

Page 41

Slide Acknowledgements

Slides to accompany textbook due to Bryant and O’Hallaron

Technology Trends slides due to Dr. XXX Carle of UC/Berkeley

Some graphs from Hennessy and Patterson: Computer Organization and Design, 4th Ed.