introduction to computer systemsmsakr/15213-f09/lectures/class08.pdf · introduction to computer...
TRANSCRIPT
Carnegie Mellon
Introduction to Computer Systems15‐213, fall 20098th Lecture, Sep. 16th
Instructors:
Majd Sakr and Khaled Harras
Carnegie Mellon
Last Time
Arrays
Nested
Multi‐level
int val[5]; 1 5 2 1 3
x x + 4 x + 8 x + 12 x + 16 x + 20
int pgh[4][5];
int *univ[3]
Carnegie Mellon
Dynamic Nested Arrays
StrengthCan create matrix of any size
ProgrammingMust do index computation explicitly
PerformanceAccessing single element costly
Must do multiplication
int * new_var_matrix(int n){return (int *) calloc(sizeof(int), n*n);
}
int var_ele(int *a, int i, int j, int n)
{return a[i*n+j];
}
movl 12(%ebp),%eax # imovl 8(%ebp),%edx # aimull 20(%ebp),%eax # n*iaddl 16(%ebp),%eax # n*i+jmovl (%edx,%eax,4),%eax # Mem[a+4*(i*n+j)]
Carnegie Mellon
Dynamic Array MultiplicationPer iteration:
Multiplies: 3
2 for subscripts
1 for data
Adds: 4
2 for array indexing
1 for loop index
1 for data
/* Compute element i,k ofvariable matrix product */
int var_prod_ele(int *a, int *b,int i, int k, int n)
{int j;int result = 0;for (j = 0; j < n; j++)result +=a[i*n+j] * b[j*n+k];
return result;}
a b
i‐th row
j‐th columnx
Carnegie Mellon
Optimizing Dynamic Array MultiplicationOptimizations
Performed when set optimization level to -O2
Code MotionExpression i*n can be computed outside loop
Strength ReductionIncrementing j has effect of incrementing j*n+k by n
Operations count4 adds, 1 mult
{int j;int result = 0;for (j = 0; j < n; j++)result +=a[i*n+j] * b[j*n+k];
return result;}
{int j;int result = 0;int iTn = i*n;int jTnPk = k;for (j = 0; j < n; j++) {result +=a[iTn+j] * b[jTnPk];
jTnPk += n;}return result;
}
4 adds, 1 mult
4 adds, 3 mults
Carnegie Mellon
Today
Structures
Alignment
Unions
Carnegie Mellon
struct rec {int i;int a[3];int *p;
};
IA32 Assembly# %eax = val# %edx = rmovl %eax,(%edx) # Mem[r] = val
void set_i(struct rec *r,
int val){r->i = val;
}
Structures
ConceptContiguously‐allocated region of memory
Refer to members within structure by names
Members may be of different types
Accessing Structure Member
Memory Layouti a p
0 4 16 20
Carnegie Mellon
# %ecx = idx# %edx = rleal 0(,%ecx,4),%eax # 4*idxleal 4(%eax,%edx),%eax # r+4*idx+4
int *find_a(struct rec *r, int idx){return &r->a[idx];
}
Generating Pointer to Structure Memberstruct rec {int i;int a[3];int *p;
};i a p
0 4 16 20
r+4+4*idxr
Will disappearblackboard?
What does it do?
Carnegie Mellon
# %ecx = idx# %edx = rleal 0(,%ecx,4),%eax # 4*idxleal 4(%eax,%edx),%eax # r+4*idx+4
int *find_a(struct rec *r, int idx){return &r->a[idx];
}
Generating Pointer to Array Element
Offset of each structure member determined at compile time
Generating Pointer to Structure Memberstruct rec {int i;int a[3];int *p;
};i a p
0 4 16 20
r+4+4*idxr
Carnegie Mellon
struct rec {int i;int a[3];int *p;
};
# %edx = rmovl (%edx),%ecx # r->ileal 0(,%ecx,4),%eax # 4*(r->i)leal 4(%edx,%eax),%eax # r+4+4*(r->i)movl %eax,16(%edx) # Update r->p
void set_p(struct rec *r){r->p =&r->a[r->i];
}
Structure Referencing (Cont.)C Code
i a p
0 4 16 20
i a
0 4 16 20
Element i
What does it do?
Carnegie Mellon
Today
Structures
Alignment
Unions
Carnegie Mellon
Alignment
Aligned DataPrimitive data type requires K bytes
Address must be multiple of K
Required on some machines; advised on IA32
treated differently by IA32 Linux, x86‐64 Linux, and Windows!
Motivation for Aligning DataMemory accessed by (aligned) chunks of 4 or 8 bytes (system dependent)
Inefficient to load or store datum that spans quad word boundaries
Virtual memory very tricky when datum spans 2 pages
CompilerInserts gaps in structure to ensure correct alignment of fields
Carnegie Mellon
Specific Cases of Alignment (IA32)1 byte: char, …
no restrictions on address
2 bytes: short, …lowest 1 bit of address must be 02
4 bytes: int, float, char *, …lowest 2 bits of address must be 002
8 bytes: double, …Windows (and most other OS’s & instruction sets):lowest 3 bits of address must be 0002
Linux:lowest 2 bits of address must be 002i.e., treated the same as a 4‐byte primitive data type
12 bytes: long doubleWindows, Linux:lowest 2 bits of address must be 002i.e., treated the same as a 4‐byte primitive data type
Carnegie Mellon
Specific Cases of Alignment (x86‐64)1 byte: char, …
no restrictions on address
2 bytes: short, …lowest 1 bit of address must be 02
4 bytes: int, float, …lowest 2 bits of address must be 002
8 bytes: double, char *, …Windows & Linux:lowest 3 bits of address must be 0002
16 bytes: long doubleLinux:lowest 3 bits of address must be 0002i.e., treated the same as a 8‐byte primitive data type
Carnegie Mellon
struct S1 {char c;int i[2];double v;
} *p;
Satisfying Alignment with StructuresWithin structure:
Must satisfy element’s alignment requirement
Overall structure placementEach structure has alignment requirement K
K = Largest alignment of any element
Initial address & structure length must be multiples of K
Example (under Windows or x86‐64):K = 8, due to double element
c i[0] i[1] v3 bytes 4 bytes
p+0 p+4 p+8 p+16 p+24
Multiple of 4 Multiple of 8
Multiple of 8 Multiple of 8
Carnegie Mellon
Different Alignment Conventions
x86‐64 or IA32 Windows:K = 8, due to double element
IA32 LinuxK = 4; double treated like a 4‐byte data type
c i[0] i[1] v3 bytes 4 bytes
p+0 p+4 p+8 p+16 p+24
c i[0] i[1] v3 bytes
p+0 p+4 p+8 p+12 p+20
struct S1 {char c;int i[2];double v;
} *p;
Carnegie Mellon
Saving SpacePut large data types first
Effect (example x86‐64, both have K=8)
c i[0] i[1] v3 bytes 4 bytes
p+0 p+4 p+8 p+16 p+24
struct S1 {char c;int i[2];double v;
} *p;
struct S2 {double v;int i[2];char c;
} *p;
ci[0] i[1]v
p+0 p+8 p+16
Carnegie Mellon
Arrays of Structures
Satisfy alignment requirement for every element
struct S2 {double v;int i[2];char c;
} a[10];
ci[0] i[1]v
a+24 a+32 a+40
a[0]
a+0
a[1] a[2]
a+24 a+48 a+36
• • •
7 bytes
a+48
Carnegie Mellon
Accessing Array ElementsCompute array offset 12i
Compute offset 8 with structure
Assembler gives offset a+8Resolved during linking
i jv
a+12i a+12i+8
2 bytes
struct S3 {short i;float v;short j;
} a[10];
2 bytes
a[0]a+0
a[i]a+12i
• • • • • •
short get_j(int idx){return a[idx].j;
}
# %eax = idxleal (%eax,%eax,2),%eax # 3*idxmovswl a+8(,%eax,4),%eax
Carnegie Mellon
Today
Structures
Alignment
Unions
Carnegie Mellon
Union AllocationAllocate according to largest element
Can only use ones field at a time
union U1 {char c;int i[2];double v;
} *up; up+0 up+4 up+8
struct S1 {char c;int i[2];double v;
} *sp;
c i[0] i[1] v3 bits 4 bits
sp+0 sp+4 sp+8 sp+16 sp+24
ci[0] i[1]
v
Carnegie Mellon
typedef union {float f;unsigned u;
} bit_float_t;
float bit2float(unsigned u) {bit_float_t arg;arg.u = u;return arg.f;
}
uf
0 4
unsigned float2bit(float f) {bit_float_t arg;arg.f = f;return arg.u;
}
Using Union to Access Bit Patterns
Same as (float) u ? Same as (unsigned) f ?
Carnegie Mellon
So Far:
Structures
Alignment
Unions
struct rec {int i;int a[3];int *p;
};
Memory Layouti a p
0 4 16 20
c i[0] i[1] v3 bits 4 bits
p+0 p+4 p+8 p+16 p+24
struct S1 {char c;int i[2];double v;
} *p;
union U1 {char c;int i[2];double v;
} *up; up+0 up+4 up+8
ci[0] i[1]
v
Carnegie Mellon
SummaryArrays in C
Contiguous allocation of memory
Aligned to satisfy every element’s alignment requirement
Pointer to first element
No bounds checking
StructuresAllocate bytes in order declared
Pad in middle and at end to satisfy alignment
UnionsOverlay declarations
Way to circumvent type system