
1

A Comprehensive Compiler-Assisted Thread Abstraction for Resource-Constrained Systems

Alexander Bernauer, ETH Zürich
Kay Römer, University of Lübeck

Meng-Lin 2013/05/06

2

Outline

• Motivation
• Overview
• Translation Scheme
• Evaluation
• Conclusion

3

Outline

• Motivation
• Overview
• Translation Scheme
• Evaluation
• Conclusion

4

Multi-Tasking Paradigms

• Embedded systems have scarce memory
• Multi-tasking is challenging under these constraints
• Event-based programming is the most common approach today
• Thread-based programming is easier to manage

(Figure: event-based programming offers efficient execution, while cooperative threads offer comfortable development; the open question is how to get both.)

5-7

Threads vs. Events

(Figure slides: step-by-step comparison of the thread-based and event-based execution models.)

8

Motivation

• Event-based
  • Hard to manage a task's control flow
  • Tasks' contexts must be manually managed and preserved between events

• Thread-based
  • Sequential control flow
  • Tasks' contexts stored in automatic (local) variables

• But existing thread libraries
  • Provide incomplete thread semantics
  • Introduce a significant context overhead

• So…?

9

(Figure: event-based runtimes give efficient execution and runtime cooperative threads give comfortable development; compiler-assisted cooperative threads aim to provide both.)

10

Outline

• Motivation
• Overview
• Translation Scheme
• Evaluation
• Conclusion

11

Overview

• Propose a dedicated compiler, Ocram
  • Translates a thread-based application (T-code) into an equivalent event-based application (E-code)
  • https://github.com/copton/ocram
• Leverage platform abstraction layers (PAL) to bind Ocram to Contiki and TinyOS
• Show the feasibility of compiler-assisted threads for three different WSN applications
• Verify the correctness of the transformation
• Measure the resource costs of this abstraction compared to both native event-based implementations and thread libraries

12

System Overview

• Compiler-assisted threads
  • Translate thread-based programs (T-code) into equivalent event-based ones (E-code)

13

Equivalence

E-code is equivalent to T-code

if and only if

every possible observable behavior of the E-code

corresponds to

one possible observable behavior of the T-code
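
One compact way to write this condition, using OB(P) for the set of observable behaviors of a program P (OB is defined on the Correctness slide) and reading "corresponds to" as membership in that set:

\[
\text{E-code} \equiv \text{T-code} \iff \mathrm{OB}(\text{E-code}) \subseteq \mathrm{OB}(\text{T-code})
\]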

14

Outline

• Motivation
• Overview
• Translation Scheme
• Evaluation
• Conclusion

15

Translation Scheme

• Transform valid T-code into equivalent E-code
  • Data flow and control flow
• Non-interruptible functions are passed through unchanged
• Interruptible functions are translated into an intermediate representation (IR) of the E-code
  • Uniqueness of identifiers
  • while and for loops are replaced by if and goto
  • Two normal forms for interruptible calls (see the sketch below)
    • interruptable_call(parameters);
    • expression = interruptable_call(parameters);
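
A minimal sketch of what this normalization could look like, using a hypothetical interruptible wait_ms() call; the names and the exact IR shape are illustrative, not Ocram's actual output:

/* Hypothetical application helpers, assumed to exist (not Ocram APIs). */
extern int  running;
extern void toggle_led(void);
extern void wait_ms(unsigned ms);   /* stands in for an interruptible T-code API call */

/* T-code: a loop containing an interruptible call. */
void blink(void)
{
    while (running) {
        toggle_led();
        wait_ms(1000);              /* first normal form: interruptible_call(parameters); */
    }
}

/* Normalized IR (illustrative): the loop is rewritten with if and goto, so
 * every interruptible call sits at a plain statement boundary. */
void blink_ir(void)
{
loop:
    if (!running) goto done;
    toggle_led();
    wait_ms(1000);
    goto loop;
done:
    ;
}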

16

Data Flow

(Figure: in T-code, a task's data consists of the global state plus, per task, an instruction pointer and a stack used for context switches; in E-code, it consists of the global state plus, per task, a call-back function.)

17

Example of Data Flow

• For each interruptible function:
  1. Find the critical variables
  2. Generate the T-stack frame (sketched below)
     • Continuation (line 2)
     • Return value (line 3)
     • Parameters (line 4)
     • Critical variables (line 5)
     • Callees (lines 6-9)
  3. Generate the E-stack frame
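
A rough sketch of what such a generated frame could look like for a hypothetical interruptible function int f(int x) that may call interruptible functions g and h; the field names, types, and use of a union are illustrative assumptions, not Ocram's actual layout:

/* Illustrative frames for the callees (kept trivial here). */
typedef struct { void *continuation; int result; } g_frame_t;
typedef struct { void *continuation; int result; } h_frame_t;

/* Illustrative T-stack frame for f, mirroring the slide's list. */
typedef struct {
    void *continuation;   /* where f resumes after an interruptible call */
    int   result;         /* return value of f */
    int   x;              /* parameter */
    int   i;              /* a critical variable: live across a yield point */
    union {               /* callee frames; a union is assumed to suffice
                           * because only one callee is active at a time */
        g_frame_t g;
        h_frame_t h;
    } callees;
} f_frame_t;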

18

Control Flow

(Figure: in T-code, functions f1-f4 reach the API through ordinary function calls and returns; in E-code, their bodies are inlined into a single thread_n() function and control moves between them with goto, with the API calls issued from inside that function.)

19

Example of Control Flow

• Inline the bodies of all interruptible functions in each thread into one common thread execution function, which serves as a single event handler (see the sketch below)

(Figure: the Blinky example, with its Wait calls, annotated to distinguish interruptible calls of interruptible functions from interruptible calls of T-code API functions.)
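
A very rough sketch of the shape such a generated thread execution function might take for a Blinky-style thread; the frame variable, the labels, the pal_wait_ms() split-phase call, and the use of GCC's computed goto are all illustrative assumptions, not Ocram's actual output:

/* Hypothetical split-phase PAL call: completes later by invoking the callback. */
extern void pal_wait_ms(unsigned ms, void (*callback)(void));
extern void toggle_led(void);

static struct { void *continuation; } blink_frame;   /* simplified T-stack frame */

/* Single event handler: the bodies of the thread's interruptible functions are
 * inlined here, and execution resumes by jumping to the saved label. */
static void thread_blink(void)
{
    if (blink_frame.continuation)
        goto *blink_frame.continuation;      /* resume where we left off */

loop:
    toggle_led();
    blink_frame.continuation = &&after_wait; /* remember the resume point */
    pal_wait_ms(1000, thread_blink);         /* issue the split-phase API call */
    return;                                  /* yield back to the event loop */
after_wait:
    goto loop;
}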

20

Correctness

• Define observable behavior (OB) as the order of all API calls including the values of all input parameters

• Data flow transformation
  • Keeps OB unchanged
  • Preserves the effects of each statement

• Control flow transformation
  • Preserves the sequence of statement execution

21

Outline

• Motivation
• Overview
• Translation Scheme
• Evaluation
• Conclusion

22

Evaluation

• Executed via COOJA/MSPSim, logged by an extra COOJA plugin
• Applications (case studies)
  • Data Collection Application (DCA)
  • CoAP implementation (COAP)
  • Remote Procedure Calls (RPC)
• Variants
  • Event-based (NAT): native Contiki using protothreads, with only 2 bytes of overhead per thread
  • Run-time threads (TL): TinyThreads ported to Contiki
  • Compiler-assisted threads (GEN)

23

RAM

• Size of the data section (initialized) + bss section (uninitialized) + maximum stack size
• Almost the same data and a common bss
  • All variants use Contiki
• Equal maximum stack size
  • No runtime stack
• Larger bss for TL
  • Thread libraries are expensive
• GEN
  • About 1% more overhead compared to NAT

24

CPU cycles

• Number of CPU cycles
• TL's scheduler adds up to 12% more CPU cycles compared to NAT
• GEN uses only about 2% more CPU cycles than NAT
  • Fewer CPU cycles if the PAL is excluded

25

Text

• Size of the code (text) section
• Larger text for TL
  • Overhead of the thread scheduler
• Larger text for GEN
  • Overhead of the PAL
  • About 3% more than NAT

26

Text Per Thread

• Varying number of worker threads for RPC
• Lowest slope for TL
  • A general-purpose thread library: adding a thread only involves starting a new thread
• Slope for NAT
  • Originates from the protothreads
• Steeper slope for GEN
  • The PAL also originates from protothreads
  • The generated E-code pays extra for reentrant functions (which may not hold state in global variables)

27

RAM Per Thread

• Size of the data section (initialized) + bss section (uninitialized) + maximum stack size
• Varying number of worker threads for RPC
• Slope for GEN is the same as for NAT
• Steep slope for TL
  • Safety margin for each thread's stack
  • Must be general enough for all possible cases

28

Limitations

• Taking the address of an interruptible function is not allowed
  • Use case differentiation instead (see the sketch below)
• Interruptible functions may not be recursive, as recursion would make stack consumption undecidable
  • Uncommon in embedded systems
• No support for dynamic thread creation; threads must be statically assigned their thread start functions
  • WSN applications tend to have a fixed set of tasks
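
A small sketch of the "case differentiation" workaround, assuming two hypothetical interruptible functions read_sensor_a() and read_sensor_b(); it only illustrates the restriction, not Ocram's diagnostics:

/* Hypothetical interruptible functions (illustrative). */
extern int read_sensor_a(void);
extern int read_sensor_b(void);

/* Not allowed: taking the address of an interruptible function.
 *
 *     int (*reader)(void) = use_a ? read_sensor_a : read_sensor_b;
 *     value = reader();
 *
 * Workaround: differentiate the cases explicitly. */
int read_selected(int use_a)
{
    if (use_a)
        return read_sensor_a();
    else
        return read_sensor_b();
}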

29

Conclusion

• A comprehensive thread abstraction
  • +1% RAM, +2% CPU, +3% ROM
• Strengths
  • Hard work
• Weaknesses
  • Results might be susceptible to application dependence
  • Unable to provide timing guarantees

30

Q&A

Thank you for listening.

31

Reference

1. Mishra, S. and Yang, R., Thread-based vs event-based implementation of a group communication service, Parallel Processing Symposium, 1998

32

Back-up Slide

33

Processes and Threads

• Both processes and threads are independent sequences of execution.
• Threads (of the same process) run in a shared memory space, while processes run in separate memory spaces.

34

Process state

http://en.wikipedia.org/wiki/Process_state

35

Cooperative vs. Preemptive Threads

• Cooperative style
  • Context switching only occurs at yield points; once a thread is given control, it continues to run until it explicitly yields control or it blocks (see the sketch below)
  • The saved context only consists of the variables that are read after the yield point
  • Unable to guarantee timings or priorities

• Preemptive style
  • Context switching may step in and hand control from one thread to another at any time, usually based on priorities
  • The saved context consists of the thread's complete stack and all CPU registers
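
A minimal sketch of the cooperative style, assuming a hypothetical cooperative scheduler that provides a yield() primitive; none of the names refer to a specific OS API:

/* Hypothetical scheduler primitive and application helpers (illustrative). */
extern void yield(void);
extern int  sample_sensor(void);
extern void send_reading(int value);

/* A cooperative task keeps the CPU between yield points, so the scheduler can
 * only switch to another thread where yield() is called. */
void sensor_task(void)
{
    for (;;) {
        int reading = sample_sensor();   /* runs without being preempted */
        send_reading(reading);
        yield();                         /* explicit yield point: a context
                                          * switch may happen only here */
    }
}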

36

Event-based vs. Thread-based

• Event-based [1] is the most common approach today
  • A single-threaded event loop performs event demultiplexing and event handler dispatching in response to the occurrence of multiple events (see the sketch below)
  • No need to allocate memory for per-thread stacks

• Thread-based [1] is easier to manage
  • A separate thread is spawned for each event type, say etype, in the program; this thread waits for an event of type etype to occur and takes appropriate actions when it does
  • Requires contexts to be saved during context switches
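
A bare-bones sketch of such a single-threaded event loop; the event types, the wait_for_next_event() demultiplexer, and the handler names are illustrative assumptions:

/* Illustrative event representation and blocking demultiplexer. */
typedef enum { EV_TIMER, EV_RADIO_RX, EV_BUTTON } event_type_t;
typedef struct { event_type_t type; void *data; } event_t;

extern event_t wait_for_next_event(void);   /* blocks until an event occurs */
extern void handle_timer(void *data);
extern void handle_radio_rx(void *data);
extern void handle_button(void *data);

/* Single-threaded event loop: demultiplexes events and dispatches handlers.
 * Each handler runs to completion and must preserve its own state between events. */
int main(void)
{
    for (;;) {
        event_t ev = wait_for_next_event();
        switch (ev.type) {
        case EV_TIMER:    handle_timer(ev.data);    break;
        case EV_RADIO_RX: handle_radio_rx(ev.data); break;
        case EV_BUTTON:   handle_button(ev.data);   break;
        }
    }
}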

37

Trade-offs between Programming Paradigms

• The event-based paradigm is efficient since there is no need to allocate memory for per-thread stacks, but it does not scale well with growing complexity
  • Hard to manage the complex control flow of a task, which is formed by a causal chain of events and event handler functions
  • Tasks' contexts must be manually managed and preserved between events
• The thread-based paradigm remedies this situation with
  • Sequential control flow
  • Tasks' contexts that can be stored in automatic (local) variables
• But existing thread libraries
  • Provide incomplete thread semantics
  • Introduce a significant resource overhead by storing contexts in automatic (local) variables
  • Make it hard to estimate the maximum size of a stack supporting multiple threads

38

Control Flow

• The order in which the individual statements, instructions or function calls of an imperative or a declarative program are executed or evaluated

• http://en.wikipedia.org/wiki/Control_flow
• Labels
• Goto
• Subroutines/Functions

39

Reentrancy 

• A function is called reentrant if it can be interrupted in the middle of its execution and then safely called again ("re-entered") before its previous invocations complete execution.

• http://en.wikipedia.org/wiki/Reentrancy_(computing)
• Rules for reentrancy (see the sketch below)
  • Reentrant code may not hold any static (or global) non-constant data.
  • Reentrant code may not modify its own code.
  • Reentrant code may not call non-reentrant computer programs or routines.
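
A small illustration of the first rule, contrasting a non-reentrant function that keeps state in a static variable with a reentrant variant that keeps all state in caller-provided storage; the function names are illustrative:

/* Non-reentrant: the counter lives in static storage, so interleaved or
 * re-entered invocations interfere with each other. */
int next_id_nonreentrant(void)
{
    static int counter = 0;
    return ++counter;
}

/* Reentrant: all state is passed in by the caller and kept in automatic
 * (local) storage, so interrupted and re-entered calls stay independent. */
int next_id_reentrant(int *counter)
{
    return ++(*counter);
}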

40

Terminology

• Context (of a process, thread, ...)
  • The minimal set of data used by a task that must be saved to allow the task to be interrupted at a given time and continued, from the point where it was interrupted, at an arbitrary future time
  • Values of the CPU registers, process state, program counter, ...
• Automatic variable
  • A variable that is allocated and deallocated automatically when program flow enters and leaves the variable's context
• Instantiation
  • The creation of a real instance or particular realization of an abstraction or template, such as a class of objects or a computer process