1 machine-level programming i: basics computer systems organization andrew case slides adapted from...

36
1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

Upload: bethanie-wood

Post on 23-Dec-2015

228 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

1

Machine-Level Programming I: Basics

Computer Systems OrganizationAndrew Case

Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

Page 2: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

2

x86 Processors Totally dominate laptop/desktop/server market

Evolutionary design Backwards compatible all the way to 8086, introduced in 1978 Added more features as time goes on

Complex instruction set computer (CISC) Many different instructions with many different formats By contrast, ARM architecture (in cell phones) is RISC (Reduced

Instruction Set Computers) Generally better for low power devices

Page 3: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

3

Intel x86 Evolution: Milestones

Name Transistors MHz 8086 (1978) 29K 5-10

First 16-bit processor. Basis for IBM PC & DOS 1MB address space

386 (1985) 275K 16-33 First 32 bit processor , referred to as IA32 Capable of running Unix

Pentium 4F (2004) 125M 2800-3800 First Intel 64-bit processor, referred to as x86-64

Core i7 (2008) 731M 2667-3333

We cover both IA32 and x86-64. Labs are done in IA32.

Instruction Set Architecture

Page 4: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

4

Intel x86 Processors: Overview

X86-64 / EM64t

X86-32/IA32

X86-16 8086

286

386486PentiumPentium MMX

Pentium III

Pentium 4

Pentium 4E

Pentium 4F

Core 2 DuoCore i7

IA: often redefined as latest Intel architecture

time

Architectures Processors

MMX

SSE

SSE2

SSE3

SSE4

Page 5: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

5

Ia64 - Itanium

Name Date Transistors Itanium 200110M

First shot at 64-bit architecture: first called IA64 Not 32-bit compatible (emulation)

Basic reason for failure

Page 6: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

6

x86 Clones: Advanced Micro Devices (AMD)

Developed x86-64, their own extension to 64 bits AMD implementation name: AMD64 Intel implementation name: Intel64/EM64T

Page 7: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

7

Today: Machine Programming I: Basics History of Intel processors and architectures C, assembly, machine code Assembly Basics: Registers, operands, move Intro to x86-64

Page 8: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

8

CPU

Assembly Programmer’s View

Execution context PC: Program counter

Address of next instruction Called “EIP” (IA32) or “RIP” (x86-64)

Registers Heavily used program data

Condition codes Info of recent arithmetic operation Used for conditional branching

PC Registers

Memory

Object CodeProgram DataOS Data

Addresses

Data

InstructionsConditionCodes

Page 9: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

9

Assembly Data Types “Integer” data of 1, 2, or 4 bytes

Represent either data value or address (untyped pointer)

Floating point data of 4, 8, or 10 bytes

No arrays or structures

Page 10: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

10

3 Kind of Assembly Operations Perform arithmetic on register or memory data

Add, subtract, multiplication…

Transfer data between memory and register Load data from memory into register Store register data into memory

Transfer control Unconditional jumps to/from procedures Conditional branches

Page 11: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

11

text

text

binary

binary

Compiler (gcc –S)

Assembler (gcc –c)

Linker

C program (p1.c p2.c)

Asm program (p1.s p2.s)

Object program (p1.o p2.o)

Executable program (p)

Static libraries (.a)

Turning C into Object Code

Code in files p1.c p2.c Compile with command: gcc –O1 p1.c p2.c -o p

Optimization level

Output file is p

Page 12: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

12

Compiling Into Assemblysum.c

int sum(int x, int y){ int t = x+y; return t;}

sum.s

sum: pushl %ebp movl %esp,%ebp movl 12(%ebp),%eax addl 8(%ebp),%eax popl %ebp ret

80483c4: 55 89 e5 8b 45 0c 03 45 08 5d c3

sum.o

gcc –c sum.c

gcc –S sum.c

gcc –c sum.s

objdump –d sum.o

(gdb) disassemble sum

Page 13: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

13

Compiling Into Assemblysum.c

int sum(int x, int y){ int t = x+y; return t;}

sum.s

sum: pushl %ebp movl %esp,%ebp movl 12(%ebp),%eax addl 8(%ebp),%eax popl %ebp ret

80483c4: 55 89 e5 8b 45 0c 03 45 08 5d c3

sum.o

Refer to register %eax

Refer to memoryat address %ebp+8

Page 14: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

14

Machine Instruction Example C Code

Add two signed integers Assembly

Add 2 4-byte integers “Long” words in GCC parlance Same instruction whether signed

or unsigned Operands:

x: Register %eaxy: Memory M[%ebp+8]t: Register %eax

–Return function value in %eax Object Code

3-byte instruction Stored at address 0x80483ca

int t = x+y;

addl 8(%ebp),%eax

0x80483ca: 03 45 08

Similar to expression: x += y

More precisely:int eax;

int *ebp;

eax += ebp[2]

Page 15: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

15

Today: Machine Programming I: Basics History of Intel processors and architectures C, assembly, machine code Assembly Basics: Registers, operands, move Intro to x86-64

Page 16: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

16

Integer Registers (IA32)%eax

%ecx

%edx

%ebx

%esi

%edi

%esp

%ebp

%ax

%cx

%dx

%bx

%si

%di

%sp

%bp

%ah

%ch

%dh

%bh

%al

%cl

%dl

%bl

16-bit virtual registers(backwards compatibility)

gene

ral p

urpo

se

accumulate

counter

data

base

source index

destinationindex

stack pointer

basepointer

Origin(mostly obsolete)

Page 17: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

17

Moving Data: IA32

movl Source, Dest:

Operand Types Immediate: Constant integer data

Example: $0x400, $-533 Like C constant, but prefixed with ‘$’

Register: One of 8 integer registers Example: %eax, %edx But %esp and %ebp reserved for special use Others have special uses for particular instructions

Memory: 4 consecutive bytes of memory at address given by register Example: (%eax)

%eax

%ecx

%edx

%ebx

%esi

%edi

%esp

%ebp

Page 18: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

18

movl Operand Combinations

No memory-to-memory instruction

movl

Imm

Reg

Mem

RegMem

RegMem

Reg

Source Dest C Analog

movl $0x4,%eax temp = 0x4;

movl $-147,(%eax) *p = -147;

movl %eax,%edx temp2 = temp1;

movl %eax,(%edx) *p = temp;

movl (%eax),%edx temp = *p;

Src,Dest

Page 19: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

19

Simple Memory Addressing Modes Normal (R) Mem[Reg[R]]

Register R specifies memory address

movl (%ecx),%eax

Displacement D(R) Mem[Reg[R]+D] Register R specifies start of memory region Constant displacement D specifies offset

movl 8(%ebp),%edx

Page 20: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

20

Using Simple Addressing Modes

void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0;}

swap:pushl %ebpmovl %esp,%ebppushl %ebx

movl 8(%ebp), %edxmovl 12(%ebp), %ecxmovl (%edx), %ebxmovl (%ecx), %eaxmovl %eax, (%edx)movl %ebx, (%ecx)

popl %ebxpopl %ebpret

Body

SetUp

Finish

Point to first argumentPoint to second argument

Page 21: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

21

Understanding Swap

0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp 0x104movl 8(%ebp), %edx # edx = xpmovl 12(%ebp), %ecx # ecx = ypmovl (%edx), %ebx # ebx = *xp (t0)movl (%ecx), %eax # eax = *yp (t1)movl %eax, (%edx) # *xp = t1movl %ebx, (%ecx) # *yp = t0

Page 22: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

22

Understanding Swap

0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

0x124

0x104

0x120

movl 8(%ebp), %edx # edx = xpmovl 12(%ebp), %ecx # ecx = ypmovl (%edx), %ebx # ebx = *xp (t0)movl (%ecx), %eax # eax = *yp (t1)movl %eax, (%edx) # *xp = t1movl %ebx, (%ecx) # *yp = t0

Page 23: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

23

Understanding Swap

0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

0x120

0x104

0x124

0x124

movl 8(%ebp), %edx # edx = xpmovl 12(%ebp), %ecx # ecx = ypmovl (%edx), %ebx # ebx = *xp (t0)movl (%ecx), %eax # eax = *yp (t1)movl %eax, (%edx) # *xp = t1movl %ebx, (%ecx) # *yp = t0

Page 24: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

24

456

Understanding Swap

0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

0x124

0x120

123

0x104movl 8(%ebp), %edx # edx = xpmovl 12(%ebp), %ecx # ecx = ypmovl (%edx), %ebx # ebx = *xp (t0)movl (%ecx), %eax # eax = *yp (t1)movl %eax, (%edx) # *xp = t1movl %ebx, (%ecx) # *yp = t0

Page 25: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

25

Understanding Swap

0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

123

456

Address0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

456

0x124

0x120

0x104

123

123

movl 8(%ebp), %edx # edx = xpmovl 12(%ebp), %ecx # ecx = ypmovl (%edx), %ebx # ebx = *xp (t0)movl (%ecx), %eax # eax = *yp (t1)movl %eax, (%edx) # *xp = t1movl %ebx, (%ecx) # *yp = t0

Page 26: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

26

456

456

Understanding Swap

0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

Address0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

456456

0x124

0x120

123

0x104

123

movl 8(%ebp), %edx # edx = xpmovl 12(%ebp), %ecx # ecx = ypmovl (%edx), %ebx # ebx = *xp (t0)movl (%ecx), %eax # eax = *yp (t1)movl %eax, (%edx) # *xp = t1movl %ebx, (%ecx) # *yp = t0

Page 27: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

27

Understanding Swap

0x120

0x124

Rtn adr

%ebp 0

4

8

12

Offset

-4

456

123

Address0x124

0x120

0x11c

0x118

0x114

0x110

0x10c

0x108

0x104

0x100

yp

xp

%eax

%edx

%ecx

%ebx

%esi

%edi

%esp

%ebp

456

0x124

0x120

0x104

123123

movl 8(%ebp), %edx # edx = xpmovl 12(%ebp), %ecx # ecx = ypmovl (%edx), %ebx # ebx = *xp (t0)movl (%ecx), %eax # eax = *yp (t1)movl %eax, (%edx) # *xp = t1movl %ebx, (%ecx) # *yp = t0

Page 28: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

28

General Memory Addressing Modes Most General Form

D ( Rb, Ri, S ) Mem[Reg[Rb]+S*Reg[Ri]+ D]

Special Cases(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]]D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D](Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]]

Constantdisplacement

Base registerIndex register

(no %esp)

Scale (1,2,4,8)

Page 29: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

29

Today: Machine Programming I: Basics History of Intel processors and architectures C, assembly, machine code Assembly Basics: Registers, operands, move Intro to x86-64

Page 30: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

30

Size of C objects on IA32 and x86-64

C Data Type Generic 32-bit Intel IA32 x86-64

unsigned 4 44

int 4 44

long int 4 48

char 1 11

short 2 22

float 4 44

double 8 88

char * 4 48– Or any other pointer

Page 31: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

31

%rsp

x86-64 Integer Registers

%eax

%ebx

%ecx

%edx

%esi

%edi

%esp

%ebp

%r8d

%r9d

%r10d

%r11d

%r12d

%r13d

%r14d

%r15d

%r8

%r9

%r10

%r11

%r12

%r13

%r14

%r15

%rax

%rbx

%rcx

%rdx

%rsi

%rdi

%rbp

Exte

nd e

xisti

ng re

gist

ers

Add 8 new ones

Page 32: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

32

Instructions

New instructions for 8-byte types: movl ➙ movq addl ➙ addq sall ➙ salq etc.

32-bit instructions that generate 32-bit results Set higher order bits of destination register to 0 Example: addl

Page 33: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

33

32-bit code for swap

void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0;} Body

SetUp

Finish

swap:pushl %ebpmovl %esp,%ebppushl %ebx

movl 8(%ebp), %edxmovl 12(%ebp), %ecxmovl (%edx), %ebxmovl (%ecx), %eaxmovl %eax, (%edx)movl %ebx, (%ecx)

popl %ebxpopl %ebpret

Page 34: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

34

64-bit code for swap

Operands passed in registers (why useful?) First (xp) in %rdi, second (yp) in %rsi 64-bit pointers

32-bit data (int) Data held in %eax and %edx (instead of %rax and %rdx) movl operation instead of movq

void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0;}

Body

SetUp

Finish

swap:

…movl (%rdi), %edxmovl (%rsi), %eaxmovl %eax, (%rdi)movl %edx, (%rsi)…ret

Page 35: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

35

64-bit code for long int swap

64-bit data Data held in registers %rax and %rdx movq operation

“q” stands for quad-word

void swap(long *xp, long *yp) { long t0 = *xp; long t1 = *yp; *xp = t1; *yp = t0;}

Body

SetUp

Finish

swap_l:

…movq (%rdi), %rdxmovq (%rsi), %raxmovq %rax, (%rdi)movq %rdx, (%rsi)…ret

Page 36: 1 Machine-Level Programming I: Basics Computer Systems Organization Andrew Case Slides adapted from Jinyang Li, Randy Bryant and Dave O’Hallaron

36

Machine Programming I: Summary History of Intel processors and architectures

Evolutionary design leads to many quirks and artifacts C, assembly, machine code

Compiler must transform statements, expressions, procedures into low-level instruction sequences

Assembly Basics: Registers, operands, move The x86 move instructions cover wide range of data movement

forms Intro to x86-64

A major departure from the style of code seen in IA32