Stacks & Subroutines

Stacks

• A stack is a Last In First Out (LIFO) buffer holding a number of data items, usually implemented as a block of n consecutive bytes.
• The address of the last data item placed on the stack is tracked by the Stack Pointer (SP).
• Applications of stacks:
– Temporary storage of variables
– Temporary storage of program addresses
– Communication with subroutines
• The CPU needs this storage area because it has only a limited number of registers.

AVR Stack

• Stack addresses begin in high memory ($085F, for example) and move toward low memory ($085D, for example) as items are pushed, i.e. AVR stacks grow into low memory.
• Other CPUs might do this in the reverse order (grow into high memory).
• In the AVR, SP is a dedicated register used only for the stack.
• AVR stack item size:
– 1 byte for data.
– 2 bytes for addresses.

The Stack Pointer

• SP is a special-purpose register used for stack operations.
• The SP is 16 bits wide and is implemented as two registers, SPH and SPL.
• SPH holds the high byte of SP, while SPL holds the low byte.
• Both SPH and SPL are 8 bits wide.
• The stack pointer must be wide enough to address all of the RAM.
• In AVRs with more than 256 bytes of memory,
– SP is made of two 8-bit registers (SPL and SPH)
• In AVRs with less than 256 bytes of memory,
– SP is made of only the SPL register

Push & Pop

• The stack grows toward lower addresses as items are pushed onto it.
• The stack pointer tracks the top of the stack: on the AVR it holds the address of the next free location, just below the most recently pushed item.
• When an item is pushed,
– the new item is stored on the stack
– then the stack pointer is decreased to point to the next free location, one address lower
• When an item is popped,
– the stack pointer is increased to point to the most recently pushed item
– then that item is copied to the destination

Stack Push Operations

• The stack pointer (SP) points to the top of the stack (TOS).
• As we push data onto the stack, the data are saved where the SP points, and the SP is then decremented by one.
• As on many other microprocessors, notably x86 processors, the stack therefore grows toward lower addresses.
• To push a register onto the stack we use the PUSH instruction.
– PUSH Rr ;Rr can be any general purpose register
• For example,
– PUSH R10 ;store R10 onto stack and decrement SP

Stack Pop Operation

• Popping the contents of the stack back into a given register is the reverse of pushing.
• When the POP instruction is executed, the SP is incremented and the top location of the stack is copied back to the register.
• This is what makes the stack a LIFO (last in, first out) memory.
• To retrieve a byte of data from the stack we use the POP instruction.
– POP Rr ;Rr can be any general purpose register
• For example,
– POP R16 ;increment SP, and then load the top of stack into R16
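The push and pop rules above can be modelled in a few lines of C. This is only a sketch, not how the hardware is implemented: the names `AvrStack`, `stack_init`, `stack_push` and `stack_pop` are made up for illustration, and RAMEND is assumed to be $085F as on the ATmega32.

```c
#include <assert.h>
#include <stdint.h>

#define RAMEND   0x085Fu        /* assumed: last SRAM address (ATmega32) */
#define RAM_SIZE (RAMEND + 1u)  /* data space modelled as one flat array */

typedef struct {
    uint8_t  ram[RAM_SIZE];     /* data memory */
    uint16_t sp;                /* stack pointer */
} AvrStack;

/* Point SP at the last RAM location, as the initialization code does. */
void stack_init(AvrStack *s) { s->sp = RAMEND; }

/* PUSH: store the byte where SP points, then decrement SP. */
void stack_push(AvrStack *s, uint8_t value) {
    assert(s->sp < RAM_SIZE);
    s->ram[s->sp] = value;
    s->sp--;
}

/* POP: increment SP, then read the byte SP now points to. */
uint8_t stack_pop(AvrStack *s) {
    s->sp++;
    assert(s->sp < RAM_SIZE);
    return s->ram[s->sp];
}
```

Pushing 0x21 and then 0x66 leaves SP at $085D; the first pop returns 0x66 and the second returns 0x21, which is the last in, first out order described above.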

Initializing The Stack Pointer

• When the AVR is powered up, the SP register contains the value 0, which is the address of R0.
• Therefore, we must initialize the SP at the beginning of the program so that it points to somewhere in the internal SRAM.
• In the AVR, the stack grows from higher memory locations to lower memory locations.
• So, it is common to initialize the SP to the uppermost memory location.
• Different AVRs have different amounts of RAM.
• In the AVR assembler, RAMEND represents the address of the last memory location.
• So, if we want to initialize the SP so that it points to the last memory location, we can simply load RAMEND into the SP.
• Notice that SP is made of two registers, SPH and SPL.
• So we need to load the high byte of RAMEND into SPH, and the low byte of RAMEND into SPL.


        .INCLUDE "M32DEF.INC"
        .ORG 0

        ;initialize the SP to the last location of RAM
        ;R16 is used as temporary storage to initialize SP
        LDI R16, HIGH(RAMEND)
        OUT SPH, R16
        LDI R16, LOW(RAMEND)
        OUT SPL, R16
        …
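To make the HIGH/LOW split concrete, here is a small C sketch of the same four steps. The helper names are hypothetical, and RAMEND = $085F is an assumed value for the ATmega32 (the real value comes from M32DEF.INC).

```c
#include <assert.h>
#include <stdint.h>

#define RAMEND 0x085Fu  /* assumed ATmega32 value; M32DEF.INC defines the real one */

typedef struct {
    uint8_t sph;  /* high byte of SP */
    uint8_t spl;  /* low byte of SP */
} StackPointer;

/* Counterpart of the assembler's HIGH(): bits 15..8. */
uint8_t high_byte(uint16_t value) { return (uint8_t)(value >> 8); }

/* Counterpart of the assembler's LOW(): bits 7..0. */
uint8_t low_byte(uint16_t value) { return (uint8_t)(value & 0xFFu); }

/* Recombine SPH:SPL into one 16-bit address. */
uint16_t make_sp(uint8_t sph, uint8_t spl) {
    return (uint16_t)(((uint16_t)sph << 8) | spl);
}

/* Mirrors LDI/OUT SPH, LDI/OUT SPL, with r16 as the temporary register. */
StackPointer init_sp(uint16_t ramend) {
    StackPointer sp;
    uint8_t r16;
    r16 = high_byte(ramend);
    sp.sph = r16;
    r16 = low_byte(ramend);
    sp.spl = r16;
    assert(make_sp(sp.sph, sp.spl) == ramend);  /* the halves recombine */
    return sp;
}
```

With the assumed RAMEND, `init_sp(RAMEND)` gives SPH = 0x08 and SPL = 0x5F.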

The upper limit of the stack

• As mentioned earlier, we can define the stack anywhere in the general purpose memory.
• In the AVR, the stack can be as big as its RAM.
• Note that we must not define the stack in the register memory or in the I/O memory.
• So, the SP must be set to point above 0x60 (on the ATmega32, internal SRAM starts at 0x60).
• The stack content is important because it stores the information needed when we call a subroutine.
• Stack overflow occurs when the content of the stack grows beyond the space available for it.

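Example of Stack

• This example shows how the stack, the stack pointer, and the registers used change as each instruction executes. The SP values in the comments assume RAMEND = $085F.

        .INCLUDE "M32DEF.INC"
        .ORG 0
        LDI R16, HIGH(RAMEND)
        OUT SPH, R16
        LDI R16, LOW(RAMEND)
        OUT SPL, R16            ;SP = $085F
        LDI R31, 0
        LDI R20, 0x21
        LDI R22, 0x66
        PUSH R20                ;$21 stored at $085F, SP = $085E
        PUSH R22                ;$66 stored at $085E, SP = $085D
        LDI R20, 0
        LDI R22, 0
        POP R22                 ;SP = $085E, R22 = $66
        POP R31                 ;SP = $085F, R31 = $21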
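The example can be traced step by step with a tiny C model. This is an illustrative sketch only: `Cpu`, `push_reg`, `pop_reg` and `run_stack_example` are invented names, and RAMEND is assumed to be $085F.

```c
#include <assert.h>
#include <stdint.h>

#define RAMEND 0x085Fu  /* assumed: last SRAM address (ATmega32) */

typedef struct {
    uint8_t  r[32];             /* general purpose registers R0..R31 */
    uint8_t  ram[RAMEND + 1u];  /* data memory */
    uint16_t sp;                /* stack pointer */
} Cpu;

/* PUSH Rr: store the register at SP, then decrement SP. */
void push_reg(Cpu *c, unsigned reg) {
    assert(reg < 32u);
    c->ram[c->sp] = c->r[reg];
    c->sp--;
}

/* POP Rd: increment SP, then load the register from SP. */
void pop_reg(Cpu *c, unsigned reg) {
    assert(reg < 32u);
    c->sp++;
    c->r[reg] = c->ram[c->sp];
}

/* Same instruction sequence as the example program. */
void run_stack_example(Cpu *c) {
    c->sp = RAMEND;   /* LDI/OUT pairs: SP = RAMEND */
    c->r[31] = 0;     /* LDI R31, 0 */
    c->r[20] = 0x21;  /* LDI R20, 0x21 */
    c->r[22] = 0x66;  /* LDI R22, 0x66 */
    push_reg(c, 20);  /* PUSH R20 */
    push_reg(c, 22);  /* PUSH R22 */
    c->r[20] = 0;     /* LDI R20, 0 */
    c->r[22] = 0;     /* LDI R22, 0 */
    pop_reg(c, 22);   /* POP R22 */
    pop_reg(c, 31);   /* POP R31 */
}
```

After the run, R22 is 0x66 again, R31 has received the 0x21 that was originally in R20, R20 stays 0, and SP is back at RAMEND.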

Subroutine Basics

• A subroutine is a sequence of (usually consecutive) instructions that carries out a single specific function, or a number of related functions, needed by calling programs.
• A subroutine can be called from one or more locations in a program.
• Subroutines are often used to perform tasks that need to be performed frequently.
• This makes a program more structured in addition to saving memory space.
• In the AVR, the instructions for calling and returning from a subroutine are:
– RCALL – Relative Subroutine Call
– ICALL – Indirect Subroutine Call
– CALL – Direct Subroutine Call
– RET – Subroutine Return

Flow of Calling a Subroutine

• When a subroutine is called, the processor first saves the address of the instruction just below the CALL instruction on the stack, and then transfers control to the subroutine.
• When the subroutine finishes, the RET instruction at its end is executed: the address of the instruction below the CALL is loaded into the PC, and execution continues with that instruction.

Role of Stack in CALL Instruction

• The stack is used to temporarily store the return address when the CPU executes the CALL instruction.
• This is how the CPU knows where to resume when it returns from the called subroutine.
• Hence, we must be very careful when manipulating the stack contents.
• For AVRs whose program counter is not longer than 16 bits (e.g. ATmega128, ATmega32), the value of the program counter is broken into 2 bytes.
– The lower byte is pushed onto the stack first, and then the higher byte is pushed.
• For AVRs whose program counter is longer than 16 bits but not longer than 24 bits, the value of the program counter is broken up into 3 bytes.
– The lowest byte is pushed first, then the middle byte, and finally the highest byte.
• In both cases, the lower bytes are pushed first, so the higher bytes end up at the lower stack addresses.
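The way a CALL saves the return address and a RET restores it can be sketched in C for a 16-bit program counter. The names below are invented, and the byte order is an assumption of this sketch: the low byte is stored first, which matches the stack-content table near the end of this document, where the low byte of the return address sits at the higher address.

```c
#include <assert.h>
#include <stdint.h>

#define RAMEND 0x085Fu  /* assumed: last SRAM address (ATmega32) */

typedef struct {
    uint8_t  ram[RAMEND + 1u];  /* data memory */
    uint16_t sp;                /* stack pointer */
    uint16_t pc;                /* program counter (word address) */
} Cpu16;

void push_byte(Cpu16 *c, uint8_t value) { c->ram[c->sp] = value; c->sp--; }
uint8_t pop_byte(Cpu16 *c) { c->sp++; return c->ram[c->sp]; }

/* CALL: save the address of the next instruction, then jump. */
void do_call(Cpu16 *c, uint16_t target) {
    uint16_t return_address = (uint16_t)(c->pc + 2u);  /* CALL is two words long */
    assert(c->sp >= 1u && c->sp <= RAMEND);            /* room for two bytes */
    push_byte(c, (uint8_t)(return_address & 0xFFu));   /* low byte (assumed first) */
    push_byte(c, (uint8_t)(return_address >> 8));      /* high byte */
    c->pc = target;
}

/* RET: reload the saved address into the PC. */
void do_ret(Cpu16 *c) {
    uint8_t high = pop_byte(c);
    uint8_t low  = pop_byte(c);
    c->pc = (uint16_t)(((uint16_t)high << 8) | low);
}
```

Because RET simply pops two bytes, anything the subroutine pushes must be popped again before RET runs; otherwise the PC is loaded with data instead of the return address.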

CALL Instruction

• In this 4-byte (32-bit) instruction, 10 bits are used for the opcode and the other 22 bits, k21-k0, are used for the address of the target subroutine, just as in the JMP instruction.
• Therefore, CALL can reach subroutines located anywhere within the 4M address space of $000000-$3FFFFF for the AVR.
• To make sure that the AVR knows where to come back to after the called subroutine has executed, the microcontroller automatically saves on the stack the address of the instruction immediately below the CALL.
• When a subroutine is called, the processor saves the PC of the next instruction on the stack, transfers control to the subroutine, and begins fetching instructions from the new location. After the subroutine finishes, the RET instruction transfers control back to the caller. Every subroutine needs RET as its last instruction.
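The 4M figure follows directly from the 22 address bits. A short C check of the arithmetic (the function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct addresses that fit in the given number of bits. */
uint32_t addressable_locations(unsigned address_bits) {
    assert(address_bits < 32u);
    return (uint32_t)1 << address_bits;
}

/* Highest address that can be encoded with the given number of bits. */
uint32_t highest_address(unsigned address_bits) {
    return addressable_locations(address_bits) - 1u;
}
```

For 22 bits this gives 4,194,304 locations (4M) with a highest address of $3FFFFF; for a 16-bit program counter it gives 65,536 (64K).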

• Long Call to a Subroutine
• CALL k
– 0 ≤ k < 64K (device with 16-bit PC)
• Calls a subroutine anywhere within the entire program memory.
• The return address (of the instruction after the CALL) is stored on the stack.
• The stack pointer uses a post-decrement scheme during CALL.
• Flags affected : None
• CPU Cycle : 4
• Example :

        mov r16, r0             ;Copy r0 to r16
        call check              ;Call subroutine
        nop
        …
check:  cpi r16, $42            ;Check if r16 has a special value
        breq error              ;Branch if equal
        ret                     ;Return from subroutine
        …
error:  rjmp error              ;Infinite loop

Machine Code of CALL Instruction

RCALL Instruction

• Relative Call to a Subroutine
– PC ← PC + k + 1
• RCALL k
– -2K ≤ k < 2K
• Relative call to an address within PC – 2K + 1 and PC + 2K (words).
• The return address (of the instruction after the RCALL) is stored on the stack.
• In the assembler, labels are used instead of relative operands.
• For AVRs with program memory not exceeding 4K words (8K bytes), this instruction can address the entire memory from every address location.
• The stack pointer uses a post-decrement scheme during RCALL.
• Flags affected : None
• CPU Cycle : 3
• Example :

        rcall routine           ;Call subroutine
        …
routine: push r14               ;Save r14 on the stack
        …
        pop r14                 ;Restore r14
        ret                     ;Return from subroutine
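The relation PC ← PC + k + 1 and the range check that the assembler performs when it turns a label into k can be sketched in C. The function names are hypothetical, and the limits assume a 12-bit signed offset (-2048 to 2047 words); wrap-around at the ends of program memory is ignored here.

```c
#include <assert.h>
#include <stdint.h>

#define RCALL_K_MIN (-2048)  /* assumed: 12-bit signed offset, in words */
#define RCALL_K_MAX 2047

/* Target word address of an RCALL located at pc with offset k. */
uint32_t rcall_target(uint32_t pc, int k) {
    assert(k >= RCALL_K_MIN && k <= RCALL_K_MAX);
    return (uint32_t)((int64_t)pc + k + 1);
}

/* Returns 1 if an RCALL at pc can reach target, 0 if CALL is needed. */
int rcall_reaches(uint32_t pc, uint32_t target) {
    int64_t k = (int64_t)target - (int64_t)pc - 1;
    return k >= RCALL_K_MIN && k <= RCALL_K_MAX;
}
```

For example, an RCALL at word address $0100 with k = 5 lands on $0106, and a target 2049 words ahead of the RCALL is just out of reach.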

• There is no difference between RCALL and CALL in how the program counter is saved on the stack or in how the RET instruction works.
• The only difference is that the target address for CALL can be anywhere within the 4M address space of the AVR, while the target address of RCALL must lie within about 2K words on either side of the RCALL instruction.
• Many AVRs have as little as 4K of on-chip ROM.
• In such cases, using RCALL instead of CALL can save a number of bytes of program ROM space.

ICALL Instruction

• Indirect Call to a Subroutine
• ICALL
• Indirect call of a subroutine pointed to by the Z pointer register (16 bits) in the register file.
• The Z pointer register is 16 bits wide and allows calls to a subroutine within the lowest 64K words (128K bytes) of the program memory space.
• The stack pointer uses a post-decrement scheme during ICALL.
• This instruction is not available in all devices. (It is available in the ATmega32.)
• Flags affected : None
• CPU Cycle : 3
• Example :

        mov r30, r0             ;Set offset to call table
        icall                   ;Call routine pointed to by r31:r30
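The closest C counterpart to an indirect call through Z is a call through a function pointer chosen from a table at run time. The sketch below is only an analogy; the routines and the table are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*Routine)(int);

int add_one(int x)   { return x + 1; }
int times_two(int x) { return x * 2; }
int negate(int x)    { return -x; }

/* Call table: the index plays the role of the offset loaded into Z. */
static const Routine call_table[] = { add_one, times_two, negate };
#define CALL_TABLE_LEN (sizeof call_table / sizeof call_table[0])

/* Look up the target address, then call it indirectly. */
int indirect_call(size_t index, int argument) {
    assert(index < CALL_TABLE_LEN);
    Routine target = call_table[index];
    return target(argument);
}
```

`indirect_call(1, 5)` selects `times_two` and returns 10; the caller never names the routine directly, just as ICALL never encodes the target address in the instruction.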

RET Instruction

• Return from Subroutine
• RET
• Returns from a subroutine.
• The return address is loaded from the stack.
• The stack pointer uses a pre-increment scheme during RET.
• Flags affected : None
• CPU Cycle : 4
• Example :

        call routine            ;Call subroutine
        …
routine: push r14               ;Save r14 on the stack
        …
        pop r14                 ;Restore r14
        ret                     ;Return from subroutine

Programming Subroutines

• Why use subroutines?
– Code re-use
– Easier to understand code (readability)
– Divide and conquer
   • Complex tasks are easier when broken down into smaller tasks
– Simpler code debugging
• How do we call a subroutine in assembly?
– Place the parameters somewhere known
– CALL to jump to the subroutine
– RET to return
• Examples of subroutines:
– Convert binary to ASCII
– Convert Fahrenheit to Celsius
– Perform output to 7-segment
– Hex to 7-segment conversion

C code

        int sqr(int val);

        int main(void) {
            int a, b;
            a = 5;
            b = sqr(a);
            return 0;
        }

        int sqr(int val) {
            int sqval;
            sqval = val * val;
            return sqval;
        }

Assembly

        ;initialize SP
        LDI R16, HIGH(RAMEND)
        OUT SPH, R16
        LDI R16, LOW(RAMEND)
        OUT SPL, R16

        ;main routine
Main:   LDI R16, 5
        CALL sqr
Exit:   JMP Exit

        ;Subroutine for sqr
sqr:    MUL R16, R16            ;R1:R0 = R16 * R16
        MOV R16, R0
        MOV R17, R1
        RET
        ;Result is 16 bits, stored in R17:R16

Calling Many Subroutines from the Main Program

• In assembly language programming, it is common to have one main routine and many subroutines that are called from the main program.
• This allows us to make each subroutine into a separate module.
• Each module can be tested separately and then brought together with the main program.
• We can also CALL a subroutine from inside another subroutine; this is referred to as a nested subroutine.
• However, we have to keep track of the space left for the stack to avoid stack overflow.

Nested Subroutines

        ; main program
Main:   …
        CALL Sub1
N:      …
Exit:   JMP Exit

Sub1:   …
        CALL Sub2
M:      …
        RET

Sub2:   …
        RET
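The order in which return addresses are saved and restored in this nested case can be modelled with a small return stack in C. This is a sketch with invented names; the labels N and M stand for the return points in the listing, and the depth limit is arbitrary.

```c
#include <assert.h>

#define MAX_DEPTH 8  /* arbitrary limit for this sketch */

typedef struct {
    char labels[MAX_DEPTH];  /* saved return points, oldest first */
    int  depth;              /* number of saved return points */
} ReturnStack;

/* CALL: remember where to resume; exceeding the limit models stack overflow. */
void call_sub(ReturnStack *s, char return_label) {
    assert(s->depth < MAX_DEPTH);
    s->labels[s->depth++] = return_label;
}

/* RET: resume at the most recently saved return point. */
char ret_sub(ReturnStack *s) {
    assert(s->depth > 0);
    return s->labels[--s->depth];
}
```

Main calling Sub1 saves N, and Sub1 calling Sub2 saves M on top of it. The RET in Sub2 resumes at M and the RET in Sub1 resumes at N, so each extra nesting level costs one more return address of stack space.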

Passing Parameters to Subroutines

• Parameters may be passed to a subroutine by using:
– Data and address registers:
   • Efficient and position-independent.
   • Reduces the number of registers available for use by the programmer.
– Memory locations:
   • Similar to using static or global data in high-level languages.
   • Does not produce position-independent code and may produce unexpected side effects.
– Stacks:
   • This is the standard, general-purpose approach for parameter passing.
   • Similar to the approach used by several high-level languages, including C.

Passing Parameters in Registers

• The number to be squared is in R16.
• The result is returned in R17:R16.

        ;initialize SP
        LDI R16, HIGH(RAMEND)
        OUT SPH, R16
        LDI R16, LOW(RAMEND)
        OUT SPL, R16

        LDI R16, 5              ;number to be squared (5, as in the earlier example)
        CALL sqr
Exit:   JMP Exit

        ;Subroutine for sqr
sqr:    MUL R16, R16            ;R1:R0 = R16 * R16
        MOV R16, R0
        MOV R17, R1
        RET
        ;Result is 16 bits, stored in R17:R16
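The register convention of this example can be mimicked in C: an 8-bit input in R16, and a 16-bit result split across R17:R16 after the multiply leaves it in R1:R0. The `Regs` structure and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t r0, r1;    /* MUL leaves its 16-bit product in R1:R0 */
    uint8_t r16, r17;  /* parameter in R16, result in R17:R16 */
} Regs;

/* Models the sqr subroutine: MUL R16,R16 / MOV R16,R0 / MOV R17,R1. */
void sqr_regs(Regs *r) {
    uint16_t product = (uint16_t)((uint16_t)r->r16 * r->r16);
    r->r0  = (uint8_t)(product & 0xFFu);
    r->r1  = (uint8_t)(product >> 8);
    r->r16 = r->r0;
    r->r17 = r->r1;
    assert((uint16_t)(((uint16_t)r->r17 << 8) | r->r16) == product);
}

/* Reads the 16-bit result back out of R17:R16. */
uint16_t result16(const Regs *r) {
    return (uint16_t)(((uint16_t)r->r17 << 8) | r->r16);
}
```

With R16 = 5 the result is 25 (R17 = 0x00, R16 = 0x19). With R16 = 200 the product 40000 no longer fits in one byte, so R17 = 0x9C and R16 = 0x40, which is why the result needs a register pair.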

Passing Parameters in Memory

• The number to be squared is stored in sqrvarhigh (8 bits).
• The result is returned in sqrvarlow and sqrvarhigh (16 bits).

        .EQU sqrvarlow = 0x9A
        .EQU sqrvarhigh = 0x99
        ;initialize SP

        STS sqrvarhigh, R18     ;R18 holds the number to be squared
        CALL sqr
Exit:   JMP Exit

        ;Subroutine for sqr
sqr:    LDS R16, sqrvarhigh
        MUL R16, R16            ;R1:R0 = R16 * R16
        STS sqrvarlow, R0
        STS sqrvarhigh, R1
        RET

Parameter Passing on the Stack

• If we use registers to pass our parameters:
– We are limited to 32 parameters to/from any subroutine.
– We use up registers, so they are not available to our program.
• So, instead, we push the parameters onto the stack.
• Our conventions:
– Parameters are passed on the stack
– One return value can be provided in R16.

First Things First…

• Both the subroutine and the main program must know how many parameters are being passed!
– In C we would use a prototype:
   • int power (int number, int exponent);
– In assembly, you must take care of this yourself.
• Things to do:
– Push parameters onto the stack
– Access parameters on the stack using indexed addressing mode
– Draw the stack to keep track of subroutine execution
   • Parameters
   • Return address
– Clean the stack after a subroutine call

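Passing Parameters On The Stack

        ;initialize SP

        PUSH R16                ;first parameter (already in R16)
        LDI R16, 50
        PUSH R16                ;second parameter
        LDI R16, 20
        PUSH R16                ;third parameter
        CLR R16                 ;R16 will hold the sum
        IN ZL, SPL              ;Z = SP, taken before the CALL
        IN ZH, SPH
        CALL sum
Exit:   JMP Exit

        ;subroutine for sum
sum:    LDD R18, Z+1            ;third parameter (last pushed)
        ADD R16, R18
        LDD R18, Z+2            ;second parameter
        ADD R16, R18
        LDD R18, Z+3            ;first parameter
        ADD R16, R18
        RET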
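The stack layout this program relies on can be checked with a C model. It is a sketch under stated assumptions: the type and function names are invented, RAMEND is taken as $085F, the two return-address bytes are placeholders (only their size matters here), and the final clean-up step is added to illustrate the "clean the stack" rule rather than copied from the listing.

```c
#include <assert.h>
#include <stdint.h>

#define RAMEND 0x085Fu  /* assumed: last SRAM address (ATmega32) */

typedef struct {
    uint8_t  ram[RAMEND + 1u];  /* data memory */
    uint16_t sp;                /* stack pointer */
    uint16_t z;                 /* Z pointer */
    uint8_t  r16;               /* return value register */
} StackCpu;

void push8(StackCpu *c, uint8_t value) { c->ram[c->sp] = value; c->sp--; }
uint8_t pop8(StackCpu *c) { c->sp++; return c->ram[c->sp]; }

/* Body of the sum subroutine: add the bytes at Z+1, Z+2 and Z+3. */
void sum_sub(StackCpu *c) {
    assert(c->z + 3u <= RAMEND);
    for (unsigned offset = 1u; offset <= 3u; offset++) {
        c->r16 = (uint8_t)(c->r16 + c->ram[c->z + offset]);
    }
}

/* Caller side: push three parameters, call, then clean the stack. */
uint8_t call_sum(StackCpu *c, uint8_t p1, uint8_t p2, uint8_t p3) {
    push8(c, p1);
    push8(c, p2);
    push8(c, p3);
    c->r16 = 0;      /* CLR R16 */
    c->z = c->sp;    /* IN ZL, SPL / IN ZH, SPH */
    push8(c, 0);     /* CALL: two placeholder return-address bytes */
    push8(c, 0);
    sum_sub(c);
    (void)pop8(c);   /* RET pops the return address */
    (void)pop8(c);
    c->sp = (uint16_t)(c->sp + 3u);  /* remove the three parameters */
    return c->r16;
}
```

Because Z is copied from SP before the CALL, the return address lands below Z and the parameters stay at Z+1 to Z+3. With the hypothetical first parameter 10 and the listing's 50 and 20, the sum is 80.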

Writing Transparent Subroutines

• A transparent subroutine doesn’t change any registers except R16, R17, Y and Z.

• If we need more registers than this, we must save the register values when we enter the subroutine and restore them later.

• Where do we store them? – You guessed it: the stack.

Two Mechanisms For Passing Parameters

• By Value:
– The actual value of the parameter is transferred to the subroutine.
– This is the safest approach unless the parameter needs to be updated.
– Not suitable for large amounts of data.
• By Reference:
– The address of the parameter is transferred.
– This is necessary if the parameter is to be changed.
– Recommended for large volumes of data.

Passing By Value & Reference

• We pushed the value of NUM1, NUM2, and NUM3 on the stack.

• What if we want to change the input parameter values?

• For example, what if we want to write a subroutine that will multiply all of its arguments’ values by 2, actually changing the values in memory?

• We must pass the parameters by reference…
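The difference is easy to see in C, where passing by reference means passing a pointer. A minimal sketch with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* By value: the subroutine works on its own copy of the parameter. */
uint8_t doubled_by_value(uint8_t x) {
    x = (uint8_t)(x << 1);
    return x;
}

/* By reference: the subroutine receives the address and updates memory. */
void double_by_reference(uint8_t *x) {
    assert(x != NULL);
    *x = (uint8_t)(*x << 1);
}
```

If `first` holds $30, `doubled_by_value(first)` returns $60 but leaves `first` unchanged, whereas `double_by_reference(&first)` changes the variable itself to $60.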

Using Parameters Passed By Reference

        .include "m32def.inc"

        .CSEG
        .ORG 0x0000
        JMP RESET

RESET:  LDI R16, HIGH(RAMEND)
        OUT SPH, R16
        LDI R16, LOW(RAMEND)
        OUT SPL, R16

main:   LDI R16, $30
        STS first, R16
        LDI R16, $50
        STS second, R16
        LDI R16, $70
        STS third, R16

        LDI R16, low(first)     ;push the address of first
        PUSH R16
        LDI R16, high(first)
        PUSH R16
        LDI R16, low(second)    ;push the address of second
        PUSH R16
        LDI R16, high(second)
        PUSH R16
        LDI R16, low(third)     ;push the address of third
        PUSH R16
        LDI R16, high(third)
        PUSH R16
        CLR R16
        IN YL, SPL              ;Y = SP, taken before the CALL
        IN YH, SPH
        CALL double

Exit:   JMP Exit

double: LDD ZH, Y+1             ;Z = address of third
        LDD ZL, Y+2
        LD R0, Z
        LSL R0                  ;multiply by 2
        ST Z, R0
        LDD ZH, Y+3             ;Z = address of second
        LDD ZL, Y+4
        LD R0, Z
        LSL R0
        ST Z, R0
        LDD ZH, Y+5             ;Z = address of first
        LDD ZL, Y+6
        LD R0, Z
        LSL R0
        ST Z, R0
        RET

;NOTE: The stack is used to store the address of each variable instead of its value

        .DSEG
        .ORG 0x0100
first:  .BYTE 1
second: .BYTE 1
third:  .BYTE 1

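Stack content after CALL double (SP = 0x0857):

Address   Content
0x0857    (SP points here)
0x0858    00    return address, high byte
0x0859    20    return address, low byte
0x085A    01    high(third)
0x085B    02    low(third)
0x085C    01    high(second)
0x085D    01    low(second)
0x085E    01    high(first)
0x085F    00    low(first)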

Characteristics Of Good Subroutines

• Generality – can be called with any arguments
– Passing arguments on the stack does this.
• Transparency – you have to leave the registers as you found them, except for R16, R17, Y and Z.
• Readability – well documented.
– Provide proper comments in the source code.
• Re-entrancy – the subroutine can call itself if necessary (recursive function)
– This is done using stack frames.
