faculty.csuci.edu/peter.smith/162notes/notes101619.pdf


Comp 162 Notes Page 1 of 13 October 16, 2019

Wednesday, October 16, 2019

Topics for today

    Arrays and Indexed Addressing
    Global arrays
    Local arrays
    Buffer exploit attacks

Arrays and indexed addressing (section 6.4)

So far we have looked at scalars (int, char, bool) but no composite types. We look next at arrays and see why there is a need for some of the addressing modes we have not yet seen. The table on page 336 is a useful summary of all 8 of the addressing modes in Pep/9. So far we have seen

    Immediate
    Direct
    Stack-relative
    Stack-relative deferred

We will introduce the other 4 modes as needed.

Global arrays

Declarations: translations of global array declarations are not complicated; we just need to determine how many bytes the array occupies and possibly initialize memory locations. Here are some examples.

    High-level Language     Pep/9
    int A[4];               A: .block 8
    char B[12];             B: .block 12
    int C[]={2,4,1}         C: .word 2
                               .word 4
                               .word 1

Accessing elements: In order to access elements of global arrays we need to use a new addressing mode, indexed addressing (,x).


Consider the following loop that uses one of the example arrays above.

High-level language

    for (i=0; i<12; i++)
        input(B[i]);        // read 12 characters into array

Pep/9

            ldwx 0,i        ; index is initially zero
    top:    cpwx 12,i       ; more to read?
            brge next       ; branch if no more
            ldba charIn,d   ; get next character
            stba B,x        ; mode specifies a base address (B) and byte offset
            addx 1,i        ; increment index
            br   top
    next:

In indexed mode (,x) the operand is

    memory[operand specifier + register X]

(This is why register A and register X, otherwise interchangeable, are not equivalent.) In the code above we read characters into locations B, B+1, B+2 and so on.

When we are dealing with arrays in which each element is one byte long (as in the char array B), the byte offset of an array element (its distance from the beginning of the array in memory) is the same as the index of the element. So B[5] is 5 bytes from the beginning of the array, B[11] is 11 bytes from the beginning of the array, and so on.

However, if each element of the array is larger than one byte then we have to distinguish between the array index (the high-level language view) and the byte offset (used in calculating the address at the machine level). Generally, if each element occupies k bytes then array element i is k*i bytes from the beginning of the array. In the case of arrays of integers, where we allocate 2 bytes for each element (see example arrays A and C), we have to double the index to get the appropriate byte offset. Compare the following diagrams of arrays B and A.


    Byte offset    Array B    Array A
    0              B[0]       A[0]
    1              B[1]
    2              B[2]       A[1]
    3              B[3]
    4              B[4]       A[2]
    5              B[5]
    6              B[6]       A[3]
    7              B[7]
    8              B[8]
    9              B[9]
    10             B[10]
    11             B[11]

The following example shows the consequences of this in coding a loop to read integers into array A.

High-level language

    for (j=0; j<4; j++)
        input(A[j]);        // read 4 integers into array

Pep/9 (version 1)

            ldwx 0,i
    top:    cpwx 4,i        ; more to read?
            brge done       ; branch if no more
            aslx            ; turn index into byte offset (double it)
            deci A,x        ; for use in accessing the array
            asrx            ; and then back into index (halve it)
            addx 1,i
            br   top
    done:


Here is another way we could translate the same loop.

Pep/9 (version 2)

            ldwx 0,i
    top:    cpwx 8,i        ; Reg X will contain byte offset 0,2,4,6
            brge done       ; so if 8 or greater, we are finished
            deci A,x        ; access the array
            addx 2,i        ; offset is incremented by 2
            br   top
    done:
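The relationship between the two versions can be checked with a small model. The following Python sketch is not Pep/9 code. It uses hypothetical helper names (`indexed_address`, `offsets_version1`, `offsets_version2`) and an arbitrarily chosen base address for A, and it only tracks the value in register X to show that both loops visit the same byte offsets.

```python
# Illustrative model only: the names and the base address are assumptions,
# not part of Pep/9 or of the notes.

ELEMENT_SIZE = 2      # bytes per int in Pep/9
A_BASE = 0x0100       # hypothetical address of global array A


def indexed_address(operand_specifier, x):
    """Address used by indexed mode (,x): operand specifier + register X."""
    return (operand_specifier + x) & 0xFFFF


def offsets_version1(count):
    """Version 1: X holds the index; aslx doubles it just before the access."""
    offsets = []
    x = 0                       # ldwx 0,i
    while x < count:            # cpwx 4,i / brge done
        x <<= 1                 # aslx: index -> byte offset
        offsets.append(x)       # deci A,x
        x >>= 1                 # asrx: byte offset -> index
        x += 1                  # addx 1,i
    return offsets


def offsets_version2(count):
    """Version 2: X holds the byte offset and steps by the element size."""
    offsets = []
    x = 0                                   # ldwx 0,i
    while x < count * ELEMENT_SIZE:         # cpwx 8,i / brge done
        offsets.append(x)                   # deci A,x
        x += ELEMENT_SIZE                   # addx 2,i
    return offsets


if __name__ == "__main__":
    print(offsets_version1(4))
    print([hex(indexed_address(A_BASE, x)) for x in offsets_version2(4)])
```

Both functions produce the offsets 0, 2, 4, 6. Version 2 reaches them without the aslx/asrx pair, so it executes two fewer instructions on every pass through the loop.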

Another self-modifying program

Before indexing and index registers were invented, self-modifying programs were one way to process an array. Consider the following self-modifying program to output the contents of a 5-element array.

            ldwx 5,i        ; count of array elements is in register X
    top:    deco table,d
            subx 1,i
            breq done       ; branch if no more to output
            ldwa 4,d        ; this three-instruction
            adda 2,i        ; sequence modifies bytes 4 and 5, which contain the
            stwa 4,d        ; operand of deco
            br   top
    done:   stop
    table:  .word 2
            .word 3
            .word 5
            .word 7
            .word 11
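To see what the three-instruction sequence does, the following Python sketch models only the part of the program that matters: the two operand bytes of deco (bytes 4 and 5) and the table. It is a simplified model under stated assumptions, not a Pep/9 simulator. The program is assumed to be loaded at address 0, which puts table at address 25 (eight 3-byte instructions plus the 1-byte stop), and the helper names are hypothetical.

```python
# Simplified model of the self-modifying loop; not a full Pep/9 simulator.
# Assumption: the program is loaded at address 0, so `table` starts at byte 25.

TABLE_ADDR = 8 * 3 + 1          # eight 3-byte instructions + 1-byte stop


def read_word(memory, addr):
    """Read a 16-bit big-endian word, the byte order Pep/9 uses."""
    return (memory[addr] << 8) | memory[addr + 1]


def write_word(memory, addr, value):
    """Write a 16-bit big-endian word."""
    memory[addr] = (value >> 8) & 0xFF
    memory[addr + 1] = value & 0xFF


def run_self_modifying(values):
    """Return (numbers output, final operand stored in the deco instruction)."""
    memory = bytearray(64)
    for i, v in enumerate(values):                 # table: .word ...
        write_word(memory, TABLE_ADDR + 2 * i, v)
    write_word(memory, 4, TABLE_ADDR)              # operand of `deco table,d`

    output = []
    x = len(values)                                # ldwx 5,i
    while True:
        operand = read_word(memory, 4)             # operand currently in deco
        output.append(read_word(memory, operand))  # deco <operand>,d
        x -= 1                                     # subx 1,i
        if x == 0:                                 # breq done
            break
        a = read_word(memory, 4)                   # ldwa 4,d
        a += 2                                     # adda 2,i
        write_word(memory, 4, a)                   # stwa 4,d
    return output, read_word(memory, 4)
```

In this model the operand starts at 25 and ends at 33, the address of the last table entry, so the instruction has been permanently altered by the time the program stops.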

There is no need for such tricks with indexing.

Our two examples illustrated sequential processing through an array. Random access to an array is also accomplished using indexing. In the following, a user selects elements of array vector to output.

    input(j)
    while (j >= 0) {
        output(vector[j])
        input(j)
    }


In Pep/9 this could be

            deci j,d
            brlt quit
    top:    ldwx j,d        ; index entered by user
            aslx            ; turned into byte offset
            deco vector,x   ; access array
            deci j,d        ; get another j value
            brge top        ; repeat if not negative
    quit:

Processing order in a loop

Sometimes it might be faster to process an array from the last element back to the first. Consider the following two loops, each of which sums in register A the contents of the 36-element integer array T. The byte offset of the last element is 70. The one on the left sums T[35]+T[34]+…+T[0], the one on the right sums T[0]+T[1]+…+T[35].

            ldwa 0,i                    ldwa 0,i
            ldwx 70,i                   ldwx 0,i
    L:      adda T,x            L:      cpwx 70,i
            subx 2,i                    brgt end
            brge L                      adda T,x
                                        addx 2,i
                                        br   L
                                end:
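A rough way to compare the two loops is to count the instructions each one executes. The Python sketch below models only the loop control (the additions themselves are not performed), and the function names are hypothetical.

```python
# Illustrative instruction counts for the two summing loops. Only the loop
# control is modelled; the function names are assumptions for this sketch.

def count_backward(last_offset):
    """Left-hand loop: X runs from last_offset down to 0 in steps of 2."""
    executed = 2                    # ldwa 0,i / ldwx last_offset,i
    x = last_offset
    while True:
        executed += 3               # adda T,x / subx 2,i / brge L
        x -= 2
        if x < 0:                   # brge falls through once X is negative
            break
    return executed


def count_forward(last_offset):
    """Right-hand loop: X runs from 0 up to last_offset in steps of 2."""
    executed = 2                    # ldwa 0,i / ldwx 0,i
    x = 0
    while True:
        executed += 2               # cpwx last_offset,i / brgt end
        if x > last_offset:
            break
        executed += 3               # adda T,x / addx 2,i / br L
        x += 2
    return executed
```

For the 36-element array T (last byte offset 70) this model gives 110 instructions executed by the left-hand loop and 184 by the right-hand loop.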

Because subx itself sets the status bits, we have a built-in way to check when register X passes zero; no separate cpwx is needed, so the loop on the left is shorter.

Local arrays

Consider function example with local array table.

    int example(int A, char B)
    {
        int counter;
        int table[3];
        .
        .
    }

If this function is called as in

    T = example(V,'*');

then after the subprogram allocates the 8 bytes of local space (2 for counter and 6 for table), the stack might be depicted as follows (top of stack first).

    counter                     (2 bytes)
    table                       (6 bytes)
    Return address
    The asterisk
    V                           (2 bytes)
    Space for returned value    (2 bytes)

How do we access the elements of table? Suppose the complete example function is

    int example(int A, char B)
    {
        int counter;
        int table[3];
        /* fill table from input */
        for (counter=0; counter<3; counter++)
            input(table[counter]);
    }

A Pep/9 translation of example is:

    example: subsp 8,i      ; for locals
             ldwx 0,i
             stwx 0,s       ; counter=0
    loop:    cpwx 3,i       ; counter < 3?
             brge done      ; branch if input loop finished
             aslx           ; make counter into byte offset
             deci 2,sx      ; new mode - stack indexing
             ldwx 0,s
             addx 1,i
             stwx 0,s       ; increment the loop variable
             br   loop
    done:    addsp 8,i      ; deallocate locals
             ret


The new mode (,sx) is stack-relative indexing. The operand is

    memory[operand specifier + SP + register X]

This is a natural combination of the ,s and ,x modes we have already seen. Thus, in the case of

    deci 2,sx

we are accessing the array that starts 2 bytes down from the top of the stack and using register X to select a particular element within the array, that is

    memory[2 + SP + register X]
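The address arithmetic for ,sx can be sketched in a few lines of Python. This is an illustrative model only: the SP value is an arbitrary assumption, the helper names are hypothetical, and memory is represented as a sparse dictionary from word address to value rather than as individual bytes.

```python
# Illustrative model of stack-relative indexing (,sx). The SP value and the
# helper names are assumptions made for this sketch.

def stack_indexed_address(operand_specifier, sp, x):
    """Address used by ,sx mode: operand specifier + SP + register X."""
    return (operand_specifier + sp + x) & 0xFFFF


def fill_table(inputs, sp=0xFB00):
    """Mimic the example function: store each input at table[counter]."""
    memory = {}                         # word address -> value (sparse model)
    for counter, value in enumerate(inputs[:3]):
        x = counter << 1                # aslx: counter -> byte offset
        memory[stack_indexed_address(2, sp, x)] = value   # deci 2,sx
    return memory
```

With SP assumed to be FB00, the three elements of table land at FB02, FB04 and FB06, directly after the 2-byte counter at FB00.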

A buffer exploit in Pep/9

We can exploit the lack of array bound checking to cause Pep/9 to execute arbitrary code. One way to do this is to overflow a local array on the stack and overwrite the return address. On returning from the subroutine, control can be transferred to a section of memory above the SP that still contains our input.

Source Code

            call x          ; first call: regular input
            call x          ; second call: to demonstrate exploit
            stop
    ;
    ; Subroutine x reads a zero-terminated sequence of integers into
    ; a local array then outputs them in reverse order
    ;
    ; because there is no bound checking, the number sequence can overwrite
    ; the return address which is next to it on the stack.
    ; Our input can be executable instructions stored in
    ; the array. We overwrite the return address with the address of the
    ; start of the sequence so that when the RET instruction executes,
    ; control is transferred to the instruction sequence we input.
    ;
    x:      subsp 12,i      ; for local temp and 5-element array AR
            ldwx 0,i
    loop:   deci 0,s
            breq output     ; see if number input into temp is terminator
            ldwa 0,s
            stwa 2,sx       ; non-terminator stored in array
            addx 2,i
            br   loop
    output: subx 2,i        ; outputting array in reverse order
    loop2:  deco 2,sx
            ldba '\n',i     ; newline
            stba charOut,d  ; is output after each number
            subx 2,i
            brge loop2      ; branch if more to output
            addsp 12,i
            ret
            .end


Program run

    First call    input:   1 2 3 0
                  output:  3 2 1
    Second call   input:   14336 1848 7 14336 1792 -1149 0
                  output:  -1149 1792 14336 7 1848 14336 777

Where does the 777 come from?

The stack during a call of x can be depicted

    temp
    AR
    Return address


    Stack during first call of x        Stack during second call of x

    temp            00 00               temp            00 00
    AR              00 01               AR              38 00
                    00 02                               07 38
                    00 03                               00 07
                    ?? ??                               38 00
                    ?? ??                               07 00
    Return address                      Return address  FB 83

In the second call of x, the numbers input by the user are the translation of

    deco 7,i            ; 38 00 07
    deco 7,i            ; 38 00 07
    deco 7,i            ; 38 00 07
    stop                ; 00
    <value of SP+2>     ; FB83
    0                   ; terminator for input
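The decimal values typed during the second call follow directly from these bytes. The Python sketch below shows one way to derive them. The helper names are hypothetical; the byte values and the address FB83 are the ones listed above.

```python
# Sketch: turn the exploit's instruction bytes into the decimal integers
# typed as input. Helper names are hypothetical; the byte values and the
# address FB83 come from the listing in the notes.

def to_signed16(word):
    """Interpret a 16-bit value as a signed decimal integer."""
    return word - 0x10000 if word & 0x8000 else word


def exploit_input(code_bytes, return_address):
    """Pack the code two bytes per array element, then append the new
    return address and the 0 terminator."""
    if len(code_bytes) % 2:
        raise ValueError("code must fill a whole number of 2-byte elements")
    words = [(code_bytes[i] << 8) | code_bytes[i + 1]
             for i in range(0, len(code_bytes), 2)]
    words.append(return_address)
    return [to_signed16(w) for w in words] + [0]


DECO_7_I = [0x38, 0x00, 0x07]           # deco 7,i
STOP = [0x00]                           # stop
code = DECO_7_I * 3 + STOP              # 10 bytes: exactly fills 5-element AR

if __name__ == "__main__":
    print(exploit_input(code, 0xFB83))
```

The result is 14336 1848 7 14336 1792 -1149 0, matching the second line of input in the program run. One constraint of this particular subroutine is that no 2-byte element of the injected code may be zero, since a zero would be taken as the input terminator.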

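We overwrite the return address with the address of our "exploit code" on the stack. When ret transfers control there, the three deco 7,i instructions each output 7, which produces the 777, and stop then ends the program.

Now we have seen 6 of the 8 addressing modes: i, d, x, s, sf, sx. Only two more to go!

Reading

Global and local arrays are discussed on pages 336 through 344.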



Review Questions

1. How many bytes does the Pep/9 version of int M[8][8] occupy?

2. Translate the following into Pep/9 assembly code

    char days [7][4] = { "Sat","Sun","Mon","Tue","Wed","Thu","Fri" };

3. The Pep/9 equivalent of int RAIN [365] contains the rainfall in Camarillo for each day in 2015.

   (a) Write assembly code that processes the array forwards and leaves the total rainfall in total.

   (b) Write assembly code that processes the array backwards and leaves the total rainfall in total.

   (c) Determine, for each of your answers to (a) and (b), the number of bytes the code occupies and the number of instructions executed when it runs.

4. Consider the two local variables in function X below

    void X()
    {
        int one[5], two[5];
    }

   Assuming that these are the only local variables in X, write code that copies the values in one to the corresponding elements in two.

5. A global character array S (30 bytes) contains a null-terminated string. The string may be shorter than 30 characters but will contain at least one non-null character. Function Y has a local character array

    void Y()
    {
        char copy[30];
    }

   Write code for Y that copies the string in S (including the null byte) to local variable copy.

6. An array can be added from first element to last or from last element to first. Is it possible for one to fail and the other succeed?


7. Suppose Pep/9 array T contains 100 2-byte integers. Write Pep/9 code that moves the contents as follows

    T[0] ← T[1], T[1] ← T[2] . . . T[98] ← T[99].

8. Consider array T of question 7. Suppose we also had a label on the second element thus

    T:  .block 2
    T2: .block 198

   How could this additional label be used to reduce the size of the code needed to perform the operation of question 7?

9. In a Pep/9 program, byte array M contains the text of an error message. For each of the following cases, write a Pep/9 loop that outputs the appropriate text:

   (a) the message is terminated by a null byte.
   (b) the integer variable MESSLEN contains the number of characters in the message.
   (c) the output is the message up to and including the first exclamation point.


Review Answers

1. 128 bytes (8*8*2)

2.
    days:   .ascii "Sat\x00"
            .ascii "Sun\x00"
            .ascii "Mon\x00"
            .ascii "Tue\x00"
            .ascii "Wed\x00"
            .ascii "Thu\x00"
            .ascii "Fri\x00"

3. (a)
                                    Times executed
            ldwa 0,i                1
            ldwx 0,i                1
    loop:   adda RAIN,x             365
            cpwx 728,i              365
            breq done               365
            addx 2,i                364
            br   loop               364
    done:   stwa total,d            1

    Space: 24 bytes    Instructions: 1826

   (b)
                                    Times executed
            ldwa 0,i                1
            ldwx 728,i              1
    loop:   adda RAIN,x             365
            subx 2,i                365
            brge loop               365
            stwa total,d            1

    Space: 18 bytes    Instructions: 1098

4.
            ldwx 8,i        ; byte offset of the last element
    loop:   ldwa 0,sx       ; one occupies bytes 0..9 from SP
            stwa 10,sx      ; two occupies bytes 10..19 from SP
            subx 2,i
            brge loop

5.
            ldwa 0,i
            ldwx 0,i
    loop:   ldba S,x        ; read from the global
            stba 0,sx       ; write to the local (copy starts at 0,s)
            breq done       ; finished if just copied the null byte
            addx 1,i
            br   loop
    done:

6. Yes. Suppose contents are { 30000, 30000, -30000 }: one order overflows, the other doesn't.
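Answer 6 can be checked with a small model of 16-bit signed addition. The Python sketch below is illustrative only; the function names are hypothetical and the overflow flag stands in for the V status bit that adda would set.

```python
# Sketch of 16-bit two's-complement addition; the names are hypothetical.

def add16(a, b):
    """Return (wrapped result, overflow) for 16-bit signed addition."""
    total = a + b
    overflow = total < -32768 or total > 32767
    wrapped = ((total + 0x8000) & 0xFFFF) - 0x8000
    return wrapped, overflow


def sum_in_order(values):
    """Accumulate left to right, reporting whether any addition overflowed."""
    acc, any_overflow = 0, False
    for v in values:
        acc, overflow = add16(acc, v)
        any_overflow = any_overflow or overflow
    return acc, any_overflow


if __name__ == "__main__":
    print(sum_in_order([30000, 30000, -30000]))   # first to last
    print(sum_in_order([-30000, 30000, 30000]))   # last to first
```

Summing first to last overflows at 30000 + 30000, while summing last to first never leaves the 16-bit range. In this model the wrapped final value is 30000 in both cases, so the difference shows up only in the overflow indication.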


7.
            ldwx 2,i        ; offset of T[1]
    loop:   ldwa T,x
            subx 2,i
            stwa T,x        ; T[i] = T[i+1]
            addx 4,i        ; offset of T[i+2]
            cpwx 198,i
            brle loop

8.
            ldwx 0,i
    loop:   ldwa T2,x       ; T[i+1]
            stwa T,x        ; T[i] = T[i+1]
            addx 2,i
            cpwx 196,i
            brle loop

9.

(a)
            ldwx 0,i
    L:      ldba M,x
            breq done       ; null byte reached
            stba charOut,d
            addx 1,i
            br   L
    done:   nop0

(b)
            ldwx 0,i
    L:      cpwx MESSLEN,d
            brge done       ; all characters output
            ldba M,x
            stba charOut,d
            addx 1,i
            br   L
    done:   nop0

(c)
            ldwx 0,i
    L:      ldba M,x
            stba charOut,d
            cpba '!',i
            breq done       ; last character output
            addx 1,i
            br   L
    done:   nop0