11
Chapter 2: Process/Thread
Instructor: Hengming Zou, Ph.D.
In Pursuit of Absolute Simplicity 求于至简,归于永恒
22
Content
Processes
Threads
Inter-process communication
Classical IPC problems
Scheduling
33
Definition of A Process
Informal– A program in execution
– A running piece of code along with all the things the program can read/write
Formal– One or more threads in their own address space
Note that process != program
44
The Need for Process
What is the principal motivation for inventing processes?– To support multiprogramming
55
The Process Model
Conceptual viewing of the processes
Concurrency– Multiple processes seem to run concurrently
– But in reality only one active at any instant
Progress– Every process makes progress
66
Multiprogramming of 4 programs: programs in memory; conceptual view; timeline view
77
Process Creation
Principal events that cause process creation
1. System initialization
2. Execution of a process creation system call
3. User request to create a new process
88
Process Termination
Conditions which terminate processes
1. Normal exit (voluntary)
2. Error exit (voluntary)
3. Fatal error (involuntary)
4. Killed by another process (involuntary)
99
Process Hierarchies
Parent creates a child process
A child process can in turn create its own child processes
Process creation forms a hierarchy– UNIX calls this a "process group"
– Windows has no concept of such hierarchy, i.e. all processes are created equal
1010
Process States
Possible process states– running, blocked, ready
Transitions between states:
– running → blocked: process blocks for input
– running → ready: scheduler picks another process
– blocked → ready: input becomes available
– ready → running: scheduler picks this process
1111
Process Space
Also called Address Space
All the data the process uses as it runs
Passive (acted upon by the process)
Play analogy: – all the objects on the stage in a play
1212
Process Space
Is the unit of state partitioning– Each process occupies a different portion of the state of the computer
Main topic: – How multiple process spaces can share a single physical memory efficiently and safely
1313
Manage Process & Space
Who manages process & space?– The operating systems
How does OS achieve it?– By maintaining information about processes
– i.e. use process tables
1414
Fields of A Process Table
Registers, Program counter, Status word
Stack pointer, Priority, Process ID
Parent process, Process group, Signals
Time when process started
CPU time used, Children’s CPU time, etc.
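As an illustration only (the field names below are hypothetical, not the layout of any particular OS), a process-table entry can be pictured as a C struct:

    /* Hypothetical sketch of one process-table entry (PCB) */
    #include <sys/types.h>

    struct pcb {
        /* CPU state saved and restored on a context switch */
        unsigned long registers[16];
        unsigned long program_counter;
        unsigned long status_word;
        unsigned long stack_pointer;

        /* identity and scheduling */
        pid_t pid;
        pid_t parent_pid;
        pid_t process_group;
        int   priority;
        int   state;                   /* running, ready, or blocked */

        /* signals and accounting */
        unsigned long pending_signals;
        long start_time;               /* time when process started */
        long cpu_time_used;
        long children_cpu_time;
    };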
1515
Problems with Process
While supporting multiprogramming on shared hardware
A process is itself single-threaded!– i.e. a process can do only one thing at a time
– a blocking call renders the entire process unrunnable
1616
Threads
Invented to support multiprogramming at the process level
Manage OS complexity– Multiple users, programs, I/O devices, etc.
Each thread dedicated to do one task
1717
Thread
Sequence of executing instructions from a program – i.e. the running computation
Play analogy: one actor on stage in a play
1818
Threads
Processes decompose a mix of activities into several parallel tasks
Each job can work independently of the others
– e.g. job 1, job 2, and job 3, each handled by its own thread (thread 1, thread 2, thread 3)
1919
The Thread Model
(a) 3 processes, each with 1 thread; (b) 1 process with 3 threads
– threads run in user space; the kernel runs in kernel space
2020
The Thread Model
Some items shared by all threads in a process
Some items private to each thread
2121
Shared and Private Items
Per-process items: address space, global variables, open files, child processes, pending alarms, signals and signal handlers, accounting information
Per-thread items: program counter, registers, stack, state
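A minimal sketch of this sharing, using POSIX threads (the names shared_counter, private_counter, and worker are illustrative, not from the slides): globals are visible to every thread, while variables on a thread's own stack are private to it.

    #include <pthread.h>
    #include <stdio.h>

    int shared_counter = 0;            /* global: shared by all threads         */

    void *worker(void *arg) {
        int private_counter = 0;       /* lives on this thread's own stack      */
        for (int i = 0; i < 1000; i++) {
            private_counter++;         /* always ends at 1000                   */
            shared_counter++;          /* racy: threads interleave on this line */
        }
        printf("private=%d shared=%d\n", private_counter, shared_counter);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }

Compiled with -pthread, private_counter always reaches 1000 in each thread, while shared_counter may end below 2000 because the unsynchronized increments race.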
2222
Shared and Private Items
Each thread has its own stack
2323
A Word Processor w/3 Threads– an input thread, a display thread, and a backup thread
2424
Implementation of Thread
How many options to implement thread?
Implement in kernel space
Implement in user space
Hybrid implementation
2525
Kernel-Level Implementation
Completely implemented in kernel space
OS acts as a scheduler
OS maintains information about threads– In addition to processes
2626
Kernel-Level Implementation
2727
Kernel-Level Implementation
Advantage:– Easier to program
– Blocking one thread does not block the process
Problems:– Costly: need to trap into OS to switch threads
– OS space is limited (for maintaining thread info)
– Need to modify the OS!
2828
User-Level Implementation
Completely implemented in user space
A run-time system acts as a scheduler
Threads voluntarily cooperate– i.e. yield control to other threads
OS need not know existence of threads
2929
User-Level Implementation
3030
User-Level Implementation
Advantage:– Flexible, can be implemented on any OS
– Faster: no need to trap into OS
Problems:– Programming is tricky
– a blocking thread blocks the whole process!
3131
User-Level Implementation
How do we solve the problem that a blocking thread blocks the whole process?
Modify system calls to be non-blocking
Write a wrapper around blocking calls– i.e. issue the call only when it is safe to do so
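A sketch of such a wrapper (assuming a user-level run-time system with a hypothetical yield_to_other_thread() call; select() with a zero timeout is used to test whether read() would block):

    #include <sys/select.h>
    #include <unistd.h>

    extern void yield_to_other_thread(void);    /* hypothetical run-time system call */

    /* Only issue read() when it will not block; otherwise let another thread run. */
    ssize_t safe_read(int fd, void *buf, size_t n) {
        for (;;) {
            fd_set readable;
            struct timeval no_wait = {0, 0};    /* poll: do not wait at all */
            FD_ZERO(&readable);
            FD_SET(fd, &readable);
            if (select(fd + 1, &readable, NULL, NULL, &no_wait) > 0)
                return read(fd, buf, n);        /* data is ready, so read() is safe */
            yield_to_other_thread();            /* would block: run another user thread */
        }
    }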
3232
Scheduler Activations
A technique that solves the problem of blocking calls in user-level threads
Method: use upcalls
Goal – mimic functionality of kernel threads– gain performance of user space threads
3333
Scheduler Activations
Kernel assigns virtual processors to process
The runtime system allocates threads to processors
Blocking threads are handled by an OS upcall– i.e. the OS notifies the runtime system about blocking calls
3434
Scheduler Activations
Problem:
Reliance on kernel (lower layer) calling procedures in user space (higher layer)
Violates layered structure of OS design
3535
Hybrid Implementation
Can we have the best of both worlds– i.e. kernel-level and user-level implementations
While avoiding the problems of either?
Hybrid implementation
3636
Hybrid Implementation
User-level threads are managed by runtime systems
Kernel-level threads are managed by OS
Multiplexing user-level threads onto kernel-level threads
3737
Hybrid Implementation
3838
Multiple Threads
Can have several threads in a single address space– That is what threads were invented for
Play analogy: several actors on a single set– Sometimes interact (e.g. dance together)
– Sometimes do independent tasks
3939
Multiple Threads
Private state for a thread vs. global state shared between threads– What private state must a thread have?
– Other state is shared between all threads in a process
4040
Multiple Threads
Per-process items: address space, global variables, open files, child processes, pending alarms, signals and signal handlers, accounting information
Per-thread items: program counter, registers, stack, state
4141
Multiple Threads
Many programs are written as single-threaded processes
Making them multithreaded is very tricky– Can cause unexpected problems
4242
Multiple Threads
Conflicts between threads over the use of a global variable
4343
Multiple Threads
Many solutions:
Prohibit global variables
Assign each thread its private global variables
4444
Multiple Threads
Threads can have private global variables
4545
Cooperating Threads
Often we create threads to cooperate
Each thread handles one request
Each thread can issue a blocking disk I/O, – wait for I/O to finish
– then continue with next part of its request
4848
Cooperating Threads
Ordering of events from different threads is non-deterministic
– e.g. after 10 seconds, different threads may have gotten differing amounts of work done
thread A --------------------------------->
thread B - - - - >
thread C - - - - - - - - - - - >
4949
Cooperating Threads
Example– thread A: x=1
– thread B: x=2
Possible results?
Is 3 a possible output?– yes
5050
Cooperating Threads
3 is possible because assignment is not atomic and threads can interleave
if assignment to x is atomic– then only possible results are 1 and 2
5151
Atomic Operations
Atomic: indivisible– Either happens in its entirety without interruption or has yet to happen at all
No events from other threads can happen between the start & end of an atomic event
5252
Atomic Operations
On most machines, memory load & store are atomic
But many instructions are not atomic– e.g. a double-precision floating-point store on a 32-bit machine
– is two separate memory operations
5353
Atomic Operations
If you don’t have any atomic operations– you can’t make one
Fortunately, the hardware folks give us atomic operations– and we can build up higher-level atomic primitives from there
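As one concrete illustration (not from the slides), C11's <stdatomic.h> is one way a program can reach such hardware atomic operations; the names hits and record_hit are made up:

    #include <stdatomic.h>

    atomic_int hits = 0;               /* shared by many threads */

    void record_hit(void) {
        /* an atomic read-modify-write: no other thread's access can interleave */
        atomic_fetch_add(&hits, 1);
    }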
5454
Atomic Operations
Another example (i is shared):
– thread A: i = 0; while (i < 10) { i++ }; print "A finished"
– thread B: i = 0; while (i > -10) { i-- }; print "B finished"
Who will win?
5555
Atomic Operations
Is it guaranteed that someone will win?
What if threads run at exactly the same speed and start close together?
Is it guaranteed that it goes on forever?
5656
Atomic Operations
Arithmetic example– (initially y=10)
– thread A: x = y + 1;
– thread B: y = y * 2;
Possible results?
5757
Thread Synchronization
Must control the inter-leavings between threads– Or the results can be non-deterministic
All possible inter-leavings must yield a correct answer
5858
Thread Synchronization
Try to constrain the thread executions as little as possible
Controlling the execution and order of threads is called “synchronization”
5959
Gold Fish Problem
Problem definition:
Tracy and Peter want to keep a goldfish alive– By feeding the fish properly
If either sees that the fish is not fed– she/he goes to feed the fish
The fish must be fed once and only once each day
6060
Correctness Properties
Someone will feed the fish if needed
But never more than one person feeds the fish
6161
Solution #0
No synchronization– Peter and Tracy both run the same code:
if (noFeed) {
    feed fish
}
6262
Problem with Solution #0
3:00 Peter looks at the fish (not fed)
3:05 Tracy looks at the fish (not fed)
3:10 Peter feeds the fish
3:25 Tracy feeds the fish
The fish is overfed!
6363
Problem with Solution #0
Overfeeding occurred because:– They execute the same code at the same time
– i.e. if (noFeed) feed fish
This is called a race condition– Two or more threads try to access the same shared resource at the same time
6464
Critical Section
The shared resource or code is called a critical section– the region where race condition occurs
Access to it must be coordinated– i.e. only one process can access it at a time
6565
Critical Section
In solution #0, critical section is – “if (noFeed) feed fish”
Peter and Tracy must NOT be inside the critical section at the same time
6666
1st Type of Synchronization
Ensure only one person is in the critical section– i.e. only 1 person goes shopping at a time
This is called mutual exclusion:– only 1 thread is doing a certain thing at one time
– others are excluded
7373
Gold Fish Solution #1
Idea: – leave note that you’re going to check on the fish status, so other person doesn’t also feed
7474
Solution #1
Peter and Tracy both run:
if (noNote) {
    leave note
    if (noFeed) {
        feed fish
    }
    remove note
}
7575
Solution #1
Does this work?
If not, when could it fail?
Is solution #1 better than solution #0?
7676
Solution #2
Idea: change the order of “leave note” and “check note”
This requires labeled notes– otherwise you’ll see your own note
– and think it was the other person’s note
7777
Solution #2
Peter:
    leave notePeter
    if (no noteTracy) {
        if (noFeed) {
            feed fish
        }
    }
    remove notePeter

Tracy:
    leave noteTracy
    if (no notePeter) {
        if (noFeed) {
            feed fish
        }
    }
    remove noteTracy
7878
Solution #2
Does this work? – Yes, it solves the overfeeding problem
When could it fail?– It introduces a starvation problem: both can leave notes at the same time, each sees the other's note, and nobody feeds the fish
7979
Solution #3
Idea: – have a way to decide who will feed fish when both leave notes at the same time.
Approach:– Have Peter hang around to make sure job is done
8080
Solution #3
Peter:
    leave notePeter
    while (noteTracy) {
        do nothing
    }
    if (noFeed) {
        feed fish
    }
    remove notePeter

Tracy:
    leave noteTracy
    if (no notePeter) {
        if (noFeed) {
            feed fish
        }
    }
    remove noteTracy
8181
Solution #3
Peter's "while (noteTracy)" prevents him from running his critical section at the same time as Tracy
It also prevents Peter from running off without making sure that someone feeds the fish
8585
Solution #3
Correct, but ugly– Complicated (non-intuitive) to prove correct
Asymmetric – Peter and Tracy run different code
– This makes coding difficult
8686
Solution #3
Wasteful– Peter consumes CPU time while waiting for Tracy to remove note
Constantly checking some status while waiting on something is called busy-waiting– Very very bad
8787
Higher-Level Synchronization
What is the solution?– Raise the level of abstraction
Make life easier for programmers
8888
Higher-Level synchronization
Layers (bottom to top):
– Low-level atomic operations provided by hardware (i.e. load/store, interrupt enable/disable, test & set)
– High-level synchronization operations provided by software (i.e. semaphore, lock, monitor)
– Concurrent programs
8989
Lock (Mutex)
A Lock prevents another thread from entering a critical section– e.g. lock the fridge while you're shopping so that Peter and Tracy don't both go shopping
9090
Lock (Mutex)
Two operations
lock(): wait until the lock is free, then acquire it
    do {
        if (lock is free) {
            acquire lock; break
        }
    } while (1)
unlock(): release the lock
9191
Lock (Mutex)
Why was the "note" in Gold Fish solutions #1 and #2 not a good lock?
Does it meet the 4 conditions?
9292
Four Elements of Locking
Lock is initialized to be free
Acquire lock before entering critical section
Release lock when exiting critical section
Wait to acquire lock if another thread already holds it
9393
Solve “Too much fish” with lock
Peter and Tracy both run:
lock()
if (noFeed) {
    feed fish
}
unlock()
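A sketch of the same solution with POSIX threads (fish_lock and feed_if_needed are illustrative names; noFeed stands in for however the feeding status is actually stored):

    #include <pthread.h>

    pthread_mutex_t fish_lock = PTHREAD_MUTEX_INITIALIZER;
    int noFeed = 1;                             /* 1 means the fish has not been fed */

    void feed_if_needed(void) {
        pthread_mutex_lock(&fish_lock);         /* lock(): wait until the lock is free */
        if (noFeed) {
            /* feed fish */
            noFeed = 0;
        }
        pthread_mutex_unlock(&fish_lock);       /* unlock(): release the lock */
    }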
9494
Solve “Too much fish” with lock
But this prevents Tracy from doing other things while Peter is shopping– i.e. the critical section includes the shopping time
How to minimize the time the lock is held?
9595
Producer-Consumer Problem
Problem: – producer puts things into a shared buffer
– Consumer takes them out
– Need synchronization for coordinating producer and consumer
9696
Producer-Consumer Problem
Buffer between producer and consumer allows them to operate somewhat independently
9797
Producer-Consumer Problem
Otherwise must operate in lockstep– producer puts 1 thing in buffer, then consumer takes it out
– then producer adds another, then consumer takes it out, etc
9898
Producer-Consumer Problem
Coke machine example– delivery person (producer) fills machine with cokes
– students (consumers) buy cokes and drink them
– coke machine has finite space (buffer)
9999
Solution for PC Problem
What does producer do when buffer full?
What does consumer do when buffer empty?
The busy waiting solution in Solution 3 is unacceptable
100100
Solution for PC Problem
Use Sleep and Wakeup primitives
Consumer sleeps if buffer is empty– Wake up sleeping producers otherwise
Producer sleeps if buffer is full– Wake up sleeping consumers otherwise
101101
Sleep and Wakeup
#define N 100              /* # of slots in the buffer */
int count = 0;             /* # of items in the buffer */

void producer(void)
{
    int item;
    while (TRUE) {
        item = produce_item();
        if (count == N) sleep();
        insert_item(item);
        count = count + 1;
        if (count == 1) wakeup(consumer);
    }
}
102102
Sleep and Wakeup
void consumer(void)
{
    int item;
    while (TRUE) {
        if (count == 0) sleep();
        item = remove_item();
        count = count - 1;
        if (count == N - 1) wakeup(producer);
        consume_item(item);
    }
}
103103
What are the problems?
Producer-consumer problem with fatal race condition– Access to "count" is unconstrained
– A wakeup call could get lost: e.g. the consumer reads count == 0, is preempted before it sleeps, the producer inserts an item and calls wakeup (which is lost), and the consumer then sleeps forever
Solution is to use semaphores
104104
Semaphore
Semaphores are like a generalized lock
It has a non-negative integer value
Semaphore supports the following ops:– down, up
105105
Down Operation
Wait for semaphore to become positive
Then decrement semaphore by 1– Originally called “P” operation for the Dutch proberen
106106
Up Operation
Increment semaphore by 1– originally called “V”, for the Dutch verhogen
This wakes up a thread waiting in down– if there are any
107107
Semaphores
Can also set the initial value of semaphore
The key parts in down() and up() are atomic
Two down() calls at the same time can’t decrement the value below 0
108108
Binary Semaphores
Value is either 0 or 1
down() waits for value to become 1– then sets it to 0
up() sets value to 1– waking up waiting down (if any)
109109
Mutual Exclusion w/ Semaphore
Initial value is 1 (or more generally, N)
– down()
– <critical section>
– up()
Like lock/unlock, but more general
110110
Semaphores for Ordering
Usually (not always) initial value is 0
Example: thread A wants to wait for thread B to finish before continuing– Semaphore initialized to 0
– A: down(); then continue execution
– B: do task; then up()
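A sketch of this ordering pattern with POSIX semaphores, where sem_wait plays the role of down and sem_post of up (the thread names and the semaphore done are illustrative):

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    sem_t done;                         /* ordering semaphore, initialized to 0 */

    void *thread_b(void *arg) {
        printf("B: doing the task\n");  /* do task */
        sem_post(&done);                /* up(): wake A if it is already waiting */
        return NULL;
    }

    void *thread_a(void *arg) {
        sem_wait(&done);                /* down(): blocks until B has finished */
        printf("A: continuing after B\n");
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        sem_init(&done, 0, 0);          /* initial value 0: A must wait for B */
        pthread_create(&a, NULL, thread_a, NULL);
        pthread_create(&b, NULL, thread_b, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        sem_destroy(&done);
        return 0;
    }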
111111
PC Problem with Semaphores
Let’s solve the producer-consumer problem in semaphores
112112
Semaphore Assignments
mutex: – ensures mutual exclusion around code that manipulates buffer queue (initialized to 1)
full: – counts the # of full buffers (initialized to 0)
113113
Semaphore Assignments
empty: – counts the number of empty buffers (initialized to N)
Why do we need different semaphores for full and empty buffers?
114114
Solve PC with Semaphores
#define N 100              /* # of slots in the buffer */
typedef int semaphore;     /* a special kind of int */
semaphore mutex = 1;       /* controls access to critical region */
semaphore empty = N;       /* counts empty buffer slots */
semaphore full = 0;        /* counts full buffer slots */
115115
Solve PC with Semaphores
void producer(void)
{
    int item;
    while (TRUE) {
        item = produce_item();
        down(&empty);
        down(&mutex);
        insert_item(item);
        up(&mutex);
        up(&full);
    }
}
116116
Solve PC with Semaphores
void consumer(void)
{
    int item;
    while (TRUE) {
        down(&full);
        down(&mutex);
        item = remove_item();
        up(&mutex);
        up(&empty);
        consume_item(item);
    }
}
117117
Semaphore Assignments
Does the order of the down() function calls matter in the consumer (or the producer)?
Does the order of the up() function calls matter in the consumer (or the producer)?
118118
Solve PC with Semaphores
What (if anything) must change to allow multiple producers and/or multiple consumers?
What if there’s 1 full buffer, and multiple consumers call down(full) at the same time?
119119
Problem with Semaphore
Is there any problem with semaphore?
The order of down and up ops is critical– Improper ordering could cause deadlock
– i.e. programming in semaphore is tricky
How to make programming easier?
120120
Monitor
A programming language construct
A collection of procedures, variables, data structures
Access to it is guaranteed to be exclusive– By the compiler (not programmer)
121121
Monitor
Monitors use separate mechanisms for the two types of synchronization
use locks for mutual exclusion
use condition variables for ordering constraints
122122
Monitor
A monitor = a lock + the condition variables associated with that lock
123123
Condition Variables
Main idea: – make it possible for thread to sleep inside a critical section
Approach:– by atomically releasing the lock, putting the thread on a wait queue, and going to sleep
124124
Condition Variables
Each variable has a queue of waiting threads– threads that are sleeping, waiting for a condition
Each variable is associated with one lock
125125
Monitors
Monitor skeleton:
    monitor example
        integer i;
        condition c;
        procedure producer();
        …
        end;
        procedure consumer();
        …
        end;
    end monitor
126126
Ops on Condition Variables
wait(): – atomically release lock
– put thread on condition wait queue, go to sleep
– i.e. start to wait for wakeup
127127
Ops on Condition Variables
signal():– wake up a thread waiting on this condition variable if any
broadcast():– wake up all threads waiting on this condition variable if any
128128
Ops on Condition Variables
Note that thread must be holding lock when it calls wait() or signal()
To avoid problems when both threads are inside monitor– the signal() must be the last statement
129129
Ops on Condition Variables
What to do when a thread wakes up?
Two options:
Let the woken-up thread run– i.e. the signaler releases the lock
Let the signaling thread and the woken-up thread compete for the lock
130130
Mesa vs. Hoare Monitors
The first option is called Hoare Monitor – It gives special priority to the woken-up waiter
– signaling thread gives up lock
– woken-up waiter acquires lock
– signaling thread re-acquires lock after waiter unlocks
131131
Mesa vs. Hoare Monitors
The second option is Mesa Monitor– when waiter is woken, it must contend for the lock with other threads
– hence must re-check condition
– Whoever wins gets to run
132132
Mesa vs. Hoare Monitors
We’ll stick with Mesa monitors– as do most operating systems
– Because it is more flexible
133133
Programming with Monitors
List shared data needed to solve problem
Decide how many locks are needed
Decide which locks will protect which data
134134
Programming with Monitors
More locks allow different data to be accessed simultaneously– i.e. protecting finer-grained data
– but is more complicated
One lock usually enough in this class
137137
Programming with Monitors
Call wait() when thread needs to wait for a condition to be true;
Use a while loop to re-check condition after wait returns
Call signal when a condition changes that another thread might be interested in
138138
Producer-Consumer in Monitor
Variables:
numCokes (number of cokes in machine)
One lock to protect this shared data– cokeLock
fewer locks make the programming simpler – but allow less concurrency
139139
Producer-Consumer in Monitor
Ordering constraints:
Consumer must wait for producer to fill buffer if all buffers are empty
Producer must wait for consumer to empty a buffer slot if the buffer is completely full
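A sketch of these constraints written with a lock and condition variables in POSIX threads (Mesa-style, hence the while loops; MAX_COKES, notEmpty and notFull are names introduced here, while numCokes and cokeLock come from the previous slide):

    #include <pthread.h>

    #define MAX_COKES 100

    int numCokes = 0;
    pthread_mutex_t cokeLock = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t  notEmpty = PTHREAD_COND_INITIALIZER;
    pthread_cond_t  notFull  = PTHREAD_COND_INITIALIZER;

    void produce_coke(void) {
        pthread_mutex_lock(&cokeLock);
        while (numCokes == MAX_COKES)           /* re-check the condition after every wakeup */
            pthread_cond_wait(&notFull, &cokeLock);
        numCokes++;
        pthread_cond_signal(&notEmpty);         /* a waiting consumer may now proceed */
        pthread_mutex_unlock(&cokeLock);
    }

    void consume_coke(void) {
        pthread_mutex_lock(&cokeLock);
        while (numCokes == 0)
            pthread_cond_wait(&notEmpty, &cokeLock);
        numCokes--;
        pthread_cond_signal(&notFull);          /* a waiting producer may now proceed */
        pthread_mutex_unlock(&cokeLock);
    }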
140140
Producer-Consumer in Monitor
monitor ProducerConsumer
    condition full, empty;
    integer count;

    procedure insert(item: integer);
    begin
        if count == N then wait(full);
        insert_item(item);
        count++;
        if count == 1 then signal(empty);
    end;

    function remove: integer;
    begin
        if count == 0 then wait(empty);
        remove = remove_item;
        count--;
        if count == N-1 then signal(full);
    end;

    count := 0;
end monitor
141141
Producer-Consumer in Monitor
procedure producer;
begin
    while true do
    begin
        item = produce_item;
        ProducerConsumer.insert(item);
    end
end;

procedure consumer;
begin
    while true do
    begin
        item = ProducerConsumer.remove;
        consume_item(item);
    end
end;
145145
Condition Variable vs. Semaphore
Condition variables are more flexible than semaphores for ordering constraints
Condition variables: – can wait on an arbitrary condition
Semaphores: – wait only if the semaphore value == 0
147147
Problems with Monitor
Many languages do not support monitors
Don't solve the problem when multiple CPUs or computers are involved– This applies to semaphores too
Solution?– Message passing
148148
Message Passing
A high-level primitive for IPC
It uses two primitives: send and receive– Send(destination, &message)
– Receive(source, &message)
They are system calls (not language constructs)
Can either block or return immediately
149149
Message Passing
#define N 100               /* # of slots in the buffer */

void producer(void)
{
    int item;
    message m;                         /* message buffer */
    while (TRUE) {
        item = produce_item();
        receive(consumer, &m);         /* wait for an empty message */
        build_message(&m, item);       /* construct a message to send */
        send(consumer, &m);
    }
}
150150
Message Passing
void consumer(void)
{
    int item, i;
    message m;
    for (i = 0; i < N; i++) send(producer, &m);   /* send N empties */
    while (TRUE) {
        receive(producer, &m);
        item = extract_item(&m);
        send(producer, &m);                        /* send back an empty reply */
        consume_item(item);
    }
}
151151
Problems with Message Passing
Message passing has many challenging problems
Message loss– What happens when a message is lost?
Authentication– How to determine the identity of the sender?
Performance
152152
Barriers
Another synchronization primitive
Intended for a group of processes
All processes must reach the barrier for the application to proceed to the next phase
153153
Barriers
– (a) processes approaching a barrier
– (b) all processes but one blocked at the barrier
– (c) the last process arrives, and all are let through
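A sketch using POSIX barriers (pthread_barrier_t); NUM_WORKERS, worker and phase_work are illustrative names:

    #include <pthread.h>
    #include <stdio.h>

    #define NUM_WORKERS 4

    pthread_barrier_t phase_barrier;

    void phase_work(int phase, long id) {       /* placeholder for the real per-phase work */
        printf("worker %ld working on phase %d\n", id, phase);
    }

    void *worker(void *arg) {
        long id = (long)arg;
        for (int phase = 0; phase < 3; phase++) {
            phase_work(phase, id);
            /* nobody starts the next phase until all workers have reached the barrier */
            pthread_barrier_wait(&phase_barrier);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t[NUM_WORKERS];
        pthread_barrier_init(&phase_barrier, NULL, NUM_WORKERS);
        for (long i = 0; i < NUM_WORKERS; i++)
            pthread_create(&t[i], NULL, worker, (void *)i);
        for (int i = 0; i < NUM_WORKERS; i++)
            pthread_join(t[i], NULL);
        pthread_barrier_destroy(&phase_barrier);
        return 0;
    }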
154154
Implementing Locks
So far we have used locks extensively
We assumed that lock operations are atomic
But how is the atomicity of the lock itself implemented?
155155
Implementing Locks
Layers (bottom to top):
– Low-level atomic operations provided by hardware (i.e. load/store, interrupt enable/disable, test & set)
– High-level synchronization operations provided by software (i.e. semaphore, lock, monitor)
– Concurrent programs
Locks must be implemented on top of the hardware's atomic operations
156156
Use Interrupt Disable/Enable
On uniprocessor, operation is atomic as long as – context switch doesn’t occur in middle of operation
How does thread get context switched out?– interrupt
Prevent context switches at wrong time by preventing these events
157157
Use Interrupt Disable/Enable
With interrupt disable/enable to ensure atomicity, – why do we need locks?
User program could call interrupt disable – before entering critical section
– and call interrupt enable after leaving critical section
– and make sure not to call yield in the critical section
158158
Lock Implementation #1
Disable interrupts with busy waiting
159159
Lock Implementation #1
lock() {
disable interrupts
while (value != FREE) {
enable interrupts
disable interrupts
}
value = BUSY
enable interrupts
}
Why does lock() disable interrupts at the beginning of the function?
Why is it OK to disable interrupts in lock()'s critical section when it wasn't OK to disable interrupts while user code was running?
160160
Lock Implementation #1
unlock() {
disable interrupts
value = FREE
enable interrupts
}
Do we need to disable interrupts in unlock()?
161161
Problems with Interrupt Approach
Interrupt disable works on a uniprocessor– by preventing current thread from being switched out
But this doesn’t work on a multi-processor
Disabling interrupts on one processor doesn’t prevent other processors from running
Not acceptable (or provided) to modify interrupt disable to stop other processors from running
162162
Read-modify-write Instructions
Another atomic primitive
Use atomic load / atomic store instructions – remember Gold Fish solution #3
163163
Read-modify-write Instructions
Modern processors provide an easier way– with atomic read-modify-write instructions
Read-modify-write atomically – reads value from memory into a register
– Then writes new value to that memory location
164164
Read-modify-write Instructions
test_and_set– atomically writes 1 to a memory location (set)
– and returns the value that used to be there (test)
test_and_set(X) {
tmp = X
X = 1
return(tmp)
}
165165
Lock Implementation #2
Test & set with busy waiting (value is initially 0)
lock() {
while (test_and_set(value) == 1) {}
}
unlock() {
value = 0
}
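A concrete version of this implementation (a sketch, using C11's atomic_flag, whose test-and-set operation is guaranteed atomic):

    #include <stdatomic.h>

    atomic_flag value = ATOMIC_FLAG_INIT;        /* clear (0) means the lock is free */

    void lock(void) {
        /* atomic_flag_test_and_set: set the flag and return what it was before */
        while (atomic_flag_test_and_set(&value))
            ;                                    /* busy-wait while someone else holds it */
    }

    void unlock(void) {
        atomic_flag_clear(&value);               /* value = 0: release the lock */
    }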
166166
Lock Implementation #2
If lock is free (value = 0)– test_and_set sets value to 1 and returns 0,
– so the while loop finishes
If lock is busy (value = 1)– test_and_set doesn’t change the value and returns 1,
– so loop continues
167167
Strategy for Reducing Busy-Waiting
In methods 1 & 2, the waiting thread uses lots of CPU time just checking for the lock to become free
Better for it to sleep and let other threads run
168168
Lock Implementation #3
Interrupt disable, no busy-waiting
Waiting thread gives up processor so that other threads (e.g. thread with lock) can run more quickly
Someone wakes up thread when the lock is free
169169
Lock Implementation #3
lock() {
disable interrupts
if (value == FREE) {
value = BUSY
} else {
add thread to queue of threads waiting for this lock
switch to next runnable thread
}
enable interrupts
}
When should lock() re-enable interrupts before calling switch?
170170
Lock Implementation #3
unlock() {
disable interrupts
value = FREE
if (any thread is waiting for this lock) {
move waiting thread from waiting queue to ready queue
value = BUSY
}
enable interrupts
}
172172
Interrupt Disable/Enable Pattern
Enable interrupts before adding thread to wait queue?
lock() {
disable interrupts
...
if (lock is busy) {
enable interrupts
add thread to lock wait queue
switch to next run-able thread
}
When could this fail?
173173
Interrupt Disable/Enable Pattern
Enable interrupts after adding thread to wait queue, but before switching to next thread?
174174
Interrupt Disable/Enable Pattern
lock() {
disable interrupts
...
if (lock is busy) {
add thread to lock wait queue
enable interrupts
switch to next runnable thread
}
175175
Interrupt Disable/Enable Pattern
But this fails if an interrupt happens right after the thread enables interrupts– lock() adds thread to wait queue
– lock() enables interrupts
– interrupt causes preemption,
– i.e. switch to another thread.
176176
Interrupt Disable/Enable Pattern
Preemption moves thread to ready queue– Now thread is on two queues (wait and ready)!
Also, switch is likely to be a critical section– Adding thread to wait queue and switching to next thread must be atomic
177177
Solution
Waiting thread leaves interrupts disabled – when it calls switch
Next thread to run has the responsibility of– re-enabling interrupts before returning to user code
When waiting thread wakes up– it returns from switch with interrupts disabled from the last thread
178178
Invariant
All threads promise to have interrupts disabled when they call switch
All threads promise to re-enable interrupts after they get returned to from switch
179179
Example trace (each line labeled with the thread that executes it):
    B: yield() {
    B:     disable interrupts
    B:     switch
    A:     back from switch
    A:     enable interrupts
    A: }
    A: <user code runs>
    A: lock() {
    A:     disable interrupts
    A:     ...
    A:     switch
    B:     back from switch
    B:     enable interrupts
    B: }
    B: <user code runs>
    B: unlock() (move thread A to ready queue)
    B: yield() {
    B:     disable interrupts
    B:     switch
    A:     back from switch
    A:     enable interrupts
    A: }
180180
Lock Implementation #4
Test & set, minimal busy-waiting
Can’t implement locks using test & set without some amount of busy-waiting– but can minimize it
Idea:– use busy waiting only to atomically execute lock code
– Give up CPU if busy
181181
Lock Implementation #4
lock() {
while(test&set(guard)) {
}
if (value == FREE) {
value = BUSY
} else {
add thread to queue of threads waiting for this lock
switch to next runnable thread
}
guard = 0
}
182182
Lock Implementation #4
unlock() {
while (test&set(guard)) {
}
value = FREE
if (any thread is waiting for this lock) {
move waiting thread from waiting queue to ready queue
value = BUSY
}
guard = 0
}
183183
CPU Scheduling
How should one choose next thread to run? What are the goals of the CPU scheduler?
Bursts of CPU usage alternate with periods of I/O wait
184184
CPU Scheduling
(a) CPU-bound process (b) I/O bound process
185185
CPU Scheduling
Minimize average response time– average elapsed time to do each job
Maximize throughput of entire system– rate at which jobs complete in the system
Fairness– share CPU among threads in some “equitable” manner
186186
Scheduling Algorithm Goals
All systems
Fairness:– giving each process a fair share of the CPU
Policy enforcement:– seeing that stated policy is carried out
Balance:– keeping all parts of the system busy
187187
Scheduling Algorithm Goals
Batch systems
Throughput:– maximize jobs per hour
Turnaround time:– minimize time between submission and termination
CPU utilization:– keep the CPU busy all the time
188188
Scheduling Algorithm Goals
Interactive systems– Response time – respond to requests quickly
– Proportionality – meet users’ expectations
Real-time systems– Meeting deadlines – avoid losing data
– Predictability – avoid quality degradation in multimedia systems
189189
FCFS
First-Come, First-Served
FIFO ordering between jobs
No preemption (run until done)– thread runs until it calls yield() or blocks on I/O
– no timer interrupts
190190
FCFS
Pros and cons
+ simple
- short jobs get stuck behind long jobs
- what about the user’s interactive experience?
191191
FCFS
Example– job A takes 100 seconds
– job B takes 1 second
– time 0: job A arrives and starts
– time 0+: job B arrives
– time 100: job A ends (response time=100); job B starts
– time 101: job B ends (response time = 101)
average response time = 100.5
192192
Round Robin
Goal: – improve average response time for short jobs
Solution: – periodically preempt all jobs (especially long-running ones)
Is FCFS or round robin more “fair”?
193193
Round Robin
Example
job A takes 100 seconds
job B takes 1 second
time slice of 1 second – a job is preempted after running for 1 second
194194
Round Robin
– time 0: job A arrives and starts
– time 0+: job B arrives
– time 1: job A is preempted; job B starts
– time 2: job B ends (response time = 2)
– time 101: job A ends (response time = 101)
average response time = 51.5
195195
Round Robin
Does round-robin always achieve lower response time than FCFS?
196196
Round Robin
Pros and cons
+ good for interactive computing
- round robin has more overhead due to context switches
197197
Round Robin
How to choose time slice?
big time slice: – degrades to FCFS
small time slice: – each context switch wastes some time
198198
Round Robin
typically a compromise– 10 milliseconds (ms)
if context switch takes .1 ms– then round robin with 10 ms slice wastes 1% of CPU
199199
STCF
Shortest Time to Completion First
STCF: run whatever job has the least amount of work to do before it finishes or blocks for an I/O
200200
STCF
STCF-P: preemptive version of STCF
if a new job arrives that has less work than the current job has remaining– then preempt the current job in favor of the new one
201201
STCF
Idea is to finish the short jobs first
Improves response time of shorter jobs by a lot– Doesn’t hurt response time of longer jobs by too much
202202
STCF
STCF gives optimal response time among non-preemptive policies
STCF-P gives optimal response time among preemptive policies (and non-preemptive policies)
203203
STCF
Is the following job a “short” or “long” job?– while(1) {
– use CPU for 1 ms
– use I/O for 10 ms
– }
204204
STCF
Pros and cons
+ optimal average response time
- unfair. Short jobs can prevent long jobs from ever getting any CPU time (starvation)
- needs knowledge of future
205205
STCF
STCF and STCF-P need knowledge of future– it’s often very handy to know the future :-)
– how to find out the future time required by a job?
206206
Example
job A
compute for 1000 seconds
job B
compute for 1000 seconds
job C
while(1) {
use CPU for 1 ms
use I/O for 10 ms
}
207207
Example
C can use 91% of the disk by itself
A or B can each use 100% of the CPU
What happens when we run them together?
Goal: keep both CPU and disk busy
208208
FCFS
if A or B run before C,
they prevent C from issuing its disk I/O for – up to 2000 seconds
209209
Round Robin
with 100 ms time slice– CA---------B---------CA---------B---------...
– |--|
– C’s
– I/O
Disk is idle most of the time that A & B are running – about 10 ms disk time every 200 ms
210210
Round Robin
with 1 ms time slice– CABABABABABCABABABABABC...
– |--------| |--------|
– C’s C’s
– I/O I/O
C runs more often, so it can issue its disk I/O almost as soon as its last disk I/O is done
211211
Round Robin with 1ms
Disk is utilized almost 90% of the time
Little effect on A or B’s performance
General principle: – first start the things that can run in parallel
problem: – lots of context switches (and context switch overhead)
212212
STCF-P
Runs C as soon as its disk I/O is done– because it has the shortest next CPU burst
– CA-------CA-------------CA--------- ...
– |--------| |--------| |---------|
– C’s C’s C’s
– I/O I/O I/O
213213
Real-Time Scheduling
So far, we’ve focused on average-case analysis– average response time, throughput
Sometimes, the right goal is to get each job done before its deadline– it is irrelevant how far before its deadline the job completes
214214
Real-Time Scheduling
Video or audio output. – E.g. NTSC outputs 1 TV frame every 33 ms
Control of physical systems– e.g. auto assembly, nuclear power plants
215215
Real-Time Scheduling
This requires worst-case analysis
How do we do this in real life?
216216
EDF
Earliest-deadline first
Always run the job that has the earliest deadline– i.e. the deadline coming up next
If a new job arrives with an earlier deadline than the currently running job– preempt the running job and start the new one
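A sketch of the selection step in C (the job structure and ready list are hypothetical, just to make the rule concrete):

    #include <stddef.h>

    struct job {
        long deadline;                 /* absolute deadline, e.g. in milliseconds */
        long work_left;                /* remaining execution time */
    };

    /* Earliest-deadline-first: pick the ready job whose deadline comes up next. */
    struct job *edf_pick(struct job *ready[], size_t n) {
        struct job *best = NULL;
        for (size_t i = 0; i < n; i++)
            if (ready[i] != NULL && (best == NULL || ready[i]->deadline < best->deadline))
                best = ready[i];
        return best;                   /* a new arrival with an earlier deadline preempts */
    }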
217217
EDF
EDF is optimal– it will meet all deadlines if it’s possible to do so
218218
EDF
Example
job A: takes 15 seconds, deadline is 20 seconds after entering system
job B: takes 10 seconds, deadline is 30 seconds after entering system
job C: takes 5 seconds, deadline is 10 seconds after entering system
219219
EDF
Timeline (0 to 85 seconds) showing when A, B and C run; each '+' marks that job's deadline
220220
Schedulability in Real-Time Systems
Not all sets of tasks are schedulable in real-time systems
Given– m periodic events
– event i occurs with period Pi and requires Ci seconds of CPU time
Then the load can only be handled if
    C1/P1 + C2/P2 + … + Cm/Pm ≤ 1 (i.e. the sum of Ci/Pi over i = 1..m is at most 1)
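As an illustrative calculation (numbers chosen for this example, not from the slides): three periodic events with periods of 100, 200, and 500 ms that need 50, 30, and 100 ms of CPU time per occurrence give 50/100 + 30/200 + 100/500 = 0.50 + 0.15 + 0.20 = 0.85 ≤ 1, so the set is schedulable; a fourth event with a period of 1 second could be added only if it needs at most 150 ms of CPU per occurrence.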
221221
Scheduling in Batch Systems
First come first serve
Shortest job first
Shortest remaining time next
Three-level scheduling
222222
Scheduling in Batch Systems
An example of shortest job first scheduling
223223
Scheduling in Batch Systems
Three level scheduling
224224
Scheduling in Interactive Systems
Round-robin scheduling
Priority scheduling– Run highest priority process until it blocks or exits
– Alternatively, decrease the priority at each clock tick
Multiple queues– Divide priorities into classes
– Dynamically adjust a process’s priority class
225225
Scheduling in Interactive Systems
Shortest process next– Estimate the running time of processes
Guaranteed scheduling– Each process runs 1/n fraction of time
Lottery scheduling– Issue lottery tickets to process & schedule accordingly
Fair share scheduling per user
226226
Scheduling in Interactive Systems
Round Robin Scheduling– list of runnable processes
– list of runnable processes after B uses up its quantum
227227
Scheduling in Interactive Systems
A scheduling algorithm with four priority classes
228228
Policy versus Mechanism
Separate what is allowed to be done – from how it is done
Important in thread scheduling– a process knows which of its child threads are important and need priority
229229
Policy versus Mechanism
Scheduling algorithm parameterized– mechanism in the kernel
Parameters filled in by user processes– policy set by user process
230230
Thread Scheduling
Possible scheduling of user-level threads
231231
Thread Scheduling
Possible scheduling of kernel-level threads
Thoughts Change Life 意念改变生活