cpu scheduling. cpu scheduling 101 the cpu scheduler makes a sequence of “moves” that determines...

CPU SchedulingCPU Scheduling

CPU Scheduling 101CPU Scheduling 101

The CPU scheduler makes a sequence of “moves” that determines the interleaving of threads.

• Programs use synchronization to prevent “bad moves”.

• …but otherwise scheduling choices appear (to the program) to be nondeterministic.

These moves are dictated by a scheduling policy.

Schedulerready pool

Wakeup orReadyToRun GetNextToRun()

SWITCH()

Thread States and TransitionsThread States and Transitions

running

readyblocked

Scheduler::Run

Thread::Wakeup

Thread::Sleep(voluntary)

Thread::Yield(voluntary or involuntary)

CPU Sharing: The BasicsCPU Sharing: The Basics

• Preemptive vs. cooperative multitasking

• Preemptive multitasking employs a scheduler policy to assign “tasks” to CPUs/processors/cores.

• Exclusive resource multiplexed by time

• quantum == timeslice

• Hyperthreaded CPUs: simultaneous tasks per core

• Multiprocessing: multiple cores to schedule

context switch

Processor AllocationProcessor Allocation

The key issue is: how should an OS allocate its CPU resources among contending demands?

• Some flavor of “virtual processor” abstraction

• We are concerned with policy: how the OS uses underlying mechanisms to meet design goals.

• Focus on OS kernel : user code can decide how to use the processor time it is given.

• Kernel policy: what to allocate a free CPU to, and for how long?

Allocating processors to what?Allocating processors to what?

What are the schedulable entities? (contexts)

• Kernel-supported threads

• Mach/OS-X, Windows, Solaris

• Processes or jobs

• Classic Unix, Linux, and predecessors

• Scheduler activations [Anderson91]: state of the art

Any of these can host user-level thread libraries.

We can consider kernel scheduling policy (processor allocation) independent of the abstraction.

CPU Scheduling: Policy vs. MechanismCPU Scheduling: Policy vs. Mechanism

What are the underlying mechanisms?

• Timer interrupts

• Regular CPU clock tick interrupts (e.g., 10ms)

• On some CPUs, kernel-settable registers that decrement regularly and interrupt at time 0.

• I/O interrupts may change state of a thread or task based on external events.

• Thread context switch by register save/restore

• Scheduler/dispatch code invoked at key events.

• Wakeup, timer expire, terminate, yield, block, IPC, change to attributes such as priority or affinity, etc.

Scheduler Policy GoalsScheduler Policy Goals

• Response time or latency, responsiveness

How long does it take to do what I asked? (R)

• Throughput

How many operations complete per unit of time? (X)

Utilization: what percentage of time does the CPU (and each device) spend doing useful work? (U)

• Fairness

What does this mean? Divide the pie evenly? Guarantee low variance in response times? freedom from starvation?

• Meet deadlines and guarantee jitter-free periodic tasks

PriorityPriority

Most modern OS schedulers use priority scheduling.

• Each thread in the ready pool has a priority value.

• The scheduler favors higher-priority threads.

• Threads inherit a base priority from the associated application/process/task.

• User-settable relative importance within task

• Internal priority adjustments as an implementation technique within the scheduler.

improve fairness, use idle resources, stop starvation

How many priority levels? 32 (Windows) to 128 (OS X)

Internal Priority AdjustmentInternal Priority Adjustment

Continuous, dynamic, priority adjustment in response to observed conditions and events.

• Adjust priority according to recent usage.

• Decay with usage, rise with time

• Boost threads that already hold resources that are in demand.

e.g., internal sleep primitive in Unix kernels

• Boost threads that have starved in the recent past.

• May be visible/controllable to other parts of the kernel

PreemptionPreemption

Scheduling policies may be preemptive or non-preemptive.

Preemptive: scheduler may unilaterally force a task to relinquish the processor before the task blocks, yields, or completes.

• timeslicing prevents jobs from monopolizing the CPU

Scheduler chooses a job and runs it for a quantum of CPU time.

A job executing longer than its quantum is forced to yield by scheduler code running from the clock interrupt handler.

• use preemption to honor priorities

Preempt a job if a higher priority job enters the ready state.

A Simple Policy: FCFSA Simple Policy: FCFS

The most basic scheduling policy is first-come-first-served, also called first-in-first-out (FIFO).

• FCFS is just like the checkout line at the QuickiMart.

Maintain a queue ordered by time of arrival.

GetNextToRun selects from the front of the queue.

• FCFS with preemptive timeslicing is called round robin.

Wakeup orReadyToRun GetNextToRun()

ready list

List::Append

RemoveFromHead

CPU

Evaluating FCFSEvaluating FCFS

How well does FCFS achieve the goals of a scheduler?

• throughput. FCFS is as good as any non-preemptive policy.

….if the CPU is the only schedulable resource in the system.

• fairness. FCFS is intuitively fair…sort of.“The early bird gets the worm”…and everyone else is fed eventually.

• response time. Long jobs keep everyone else waiting.

3 5 6

D=3 D=2 D=1

Time

GanttChart

R = (3 + 5 + 6)/3 = 4.67

Preemptive FCFS: Round RobinPreemptive FCFS: Round Robin

Preemptive timeslicing is one way to improve fairness of FCFS.

If job does not block or exit, force an involuntary context switch after each quantum Q of CPU time.

Preempted job goes back to the tail of the ready list.

With infinitesimal Q round robin is called processor sharing.

D=3 D=2 D=1

3+ε 5 6

R = (3 + 5 + 6 + ε)/3 = 4.67 + ε

In this case, R is unchanged by timeslicing.Is this always true?

quantum Q=1

preemptionoverhead = ε

FCFS-RTC

round robin

Evaluating Round RobinEvaluating Round Robin

• Response time. RR reduces response time for short jobs.

For a given load, wait time is proportional to the job’s total service demand D.

• Fairness. RR reduces variance in wait times.

But: RR forces jobs to wait for other jobs that arrived later.

• Throughput. RR imposes extra context switch overhead.CPU is only Q/(Q+ε) as fast as it was before.

Degrades to FCFS-RTC with large Q.

D=5 D=1R = (5+6)/2 = 5.5

R = (2+6 + ε)/2 = 4 + ε

Two Schedules for CPU/DiskTwo Schedules for CPU/Disk

CPU busy 25/25: U = 100%Disk busy 15/25: U = 60%

5 5 1 1

4CPU busy 25/37: U = 67%Disk busy 15/37: U = 40%

33% performance improvement

1. Naive Round Robin

2. Round Robin with internal boost for I/O completion

Estimating Time-to-YieldEstimating Time-to-Yield

How to predict which job/task/thread will have the shortest demand on the CPU?

• If you don’t know, then guess.

Weather report strategy: predict future D from the recent past.

Don’t have to guess exactly: we can do well by using adaptive internal priority.

• Common technique: multi-level feedback queue.

• Set N priority levels, with a timeslice quantum for each.

• If quantum expires, drop priority down one level.

• If a job yields or blocks, bump priority up one level.

Multilevel Feedback QueueMultilevel Feedback Queue

Many systems (e.g., Unix variants) implement priority and incorporate SJF by using a multilevel feedback queue.

• multilevel. Separate queue for each of N priority levels.Use RR on each queue; look at queue i-1 only if queue i is empty.

• feedback. Factor previous behavior into new job priority.

high

low

I/O bound jobs waiting for CPU

CPU-bound jobs

jobs holding resoucesjobs with high external priority

ready queuesindexed by priority

GetNextToRun selects jobat the head of the highestpriority queue. constant time, no sorting

Priority of CPU-boundjobs decays with systemload and service received.

University of Toronto Department of Computer Science

CSC444 Lec02 19

Mars PathfinderMission

Demonstrate new landing techniques

parachute and airbagsTake picturesAnalyze soil samplesDemonstrate mobile robot

technologySojourner

Major success on all fronts

Returned 2.3 billion bits of information

16,500 images from the Lander550 images from the Rover15 chemical analyses of rocks &

soilLots of weather dataBoth Lander and Rover outlived

their design lifeBroke all records for number of

hits on a website!!!

© 2001, Steve Easterbrook


CSC444 Lec02 20

Remember these pictures?



CSC444 Lec02 21

Pathfinder had Software ErrorsSymptoms: software did total systems resets and some data was lost

each timeSymptoms noticed soon after Pathfinder started collecting meteorological data

Cause3 Process threads, with bus access via mutual exclusion locks (mutexes):

High priority: Information Bus ManagerMedium priority: Communications Task Low priority: Meteorological Data Gathering Task

Priority Inversion:Low priority task gets mutex to transfer data to the busHigh priority task blocked until mutex is releasedMedium priority task pre-empts low priority taskEventually a watchdog timer notices Bus Manager hasn’t run for some

time…

FactorsVery hard to diagnose and hard to reproduce

Need full tracing switched on to analyze what happenedWas experienced a couple of times in pre-flight testing

Never reproduced or explained, hence testers assumed it was a hardware glitch


Real Time/MediaReal Time/Media

Real-time schedulers must support regular, periodic execution of tasks (e.g., continuous media).

E.g., OS X has four user-settable parameters per thread:

• Period (y)

• Computation (x)

• Preemptible (boolean)

• Constraint (<y)

• Can the application adapt if the scheduler cannot meet its requirements?

• Admission control and reflection

Provided for completeness

Behavior of FCFS QueuesBehavior of FCFS Queues

Assume: stream of normal task arrivals with mean arrival rate λ.Tasks have normally distributed service demands with mean D.

Then: Utilization U = λD (Note: 0 <= U <= 1) Probability that service center is idle is 1-U. “Intuitively”, R = D/(1-U)

R

U 1(100%)

Service center saturates as 1/ λ approaches D: small increases in λ cause large increases in the expected response time R.

servicecenter


Little’s LawLittle’s Law

For an unsaturated queue in steady state, queue length N and responsetime R are governed by:

Little’s Law: N = λR.

While task T is in the system for R time units, λR new tasks arrive.During that time, N tasks depart (all tasks ahead of T).But in steady state, the flow in must balance the flow out. (Note: this means that throughput X = λ).

Little’s Law gives response time R = D/(1 - U).

Intuitively, each task T’s response time R = D + DN.Substituting λR for N: R = D + D λR Substituting U for λD: R = D + URR - UR = D --> R(1 - U) = D --> R = D/(1 - U)

(W/T = C/T * W/C)


Why Little’s Law Is ImportantWhy Little’s Law Is Important

1. Intuitive understanding of FCFS queue behavior.Compute response time from demand parameters (λ, D).

Compute N: tells you how much storage is needed for the queue.

2. Notion of a saturated service center. If D=1: R = 1/(1- λ)

Response times rise rapidly with load and are unbounded.

At 50% utilization, a 10% increase in load increases R by 10%.

At 90% utilization, a 10% increase in load increases R by 10x.

3. Basis for predicting performance of queuing networks.

Cheap and easy “back of napkin” estimates of system performance based on observed behavior and proposed changes, e.g., capacity planning, “what if” questions.


Considering I/OConsidering I/O

In real systems, overall system performance is determined by the interactions of multiple service centers.

CPU

I/O device

I/O requestI/O completion

start (arrival rate λ)

exit (throughput λ until some

center saturates)

A queue network has K service centers.Each job makes Vk visits to center k demanding service Sk.

Each job’s total demand at center k is Dk = Vk*Sk

Forced Flow Law: Uk = λk Sk = λ Dk

(Arrivals/throughputs λk at different centers are proportional.)

Easy to predict Xk, Uk, λk, Rk and Nk

at each center: use Forced Flow Lawto predict arrival rate λk at eachcenter k, then apply Little’s Law to k.

Then:

R = Σ Vk*Rk


Digression: BottlenecksDigression: Bottlenecks

It is easy to see that the maximum throughput X of a system is reached as 1/λ approaches Dk for service center k with the highest demand Dk.

k is called the bottleneck center

Overall system throughput is limited by λk when Uk approaches 1.

This job is I/O bound. How much will performance improve if we double the speed of the CPU?Is it worth it?

To improve performance, always attack the bottleneck center!

CPU

I/O

S0 = 1Example 1:

S1 = 4

CPU

I/O

S0 = 4Example 2:

S1 = 4

Demands are evenly balanced. Will multiprogramming improve system throughput in this case?


Minimizing Response Time: SJF (STCF)Minimizing Response Time: SJF (STCF)

Shortest Job First (SJF) is provably optimal if the goal is to minimize average-case R.

Also called Shortest Time to Completion First (STCF) or Shortest Remaining Processing Time (SRPT).

Example: express lanes at the MegaMart

Idea: get short jobs out of the way quickly to minimize the number of jobs waiting while a long job runs.

Intuition: longest jobs do the least possible damage to the wait times of their competitors.

1 3 6

D=3D=2D=1

R = (1 + 3 + 6)/3 = 3.33

Behavior of SJF SchedulingBehavior of SJF Scheduling

Little’s Law does not hold if the scheduler considers a priori knowledge of service demands, as in SJF.

• With SJF, best-case R is not affected by the number of tasks in the system.

Shortest jobs budge to the front of the line.

• Worst-case R is unbounded, just like FCFS.

The queue is not “fair”: it is subject to starvation: the longest jobs are repeatedly denied the CPU resource while other more recent jobs continue to be fed.

• SJF sacrifices fairness to lower average response time.

SJF in PracticeSJF in PracticePure SJF is impractical when the scheduler cannot predict

D values, which is usually the case.

However, SJF has value in real systems:

• Many applications execute a sequence of short CPU bursts with I/O in between.

• E.g., interactive jobs block quickly for user input.

Goal: deliver the best response time to the user.

• E.g., jobs may go through periods of I/O-intensive activity.

Goal: request next I/O operation ASAP to keep devices busy and deliver the best overall throughput.

• Web servers know the size of returned documents: use SRPT to schedule the output link.

cpu scheduling. cpu scheduling 101 the cpu scheduler makes a sequence of “moves” that determines...

Documents

cpu scheduling slide

cpu scheduler

priority scheduling

kernel policy

base priority

priority value

priority levels

wakeup thread