
INDEX

Department : Computer Science And Engineering

Batch Year : 2016

Branch : BCA

Semester : 3rd semester

Subject : Introduction to Operating System

Faculty Name : Ms. Ritu Kadyan

Sr. No. Particular

1 Syllabus

2 Scope and objective

3 Main lesson plan

4 Detailed lesson plan

5 Notes as per lesson plan

6 Last year papers

7 Last year papers with answers

8 Assignments

9 References

BCA-201 : Introduction to Operating System


External Marks: 80
Internal Marks: 20

Time: 3 hours

Note: The examiner will be required to set NINE questions in all. Question Number 1 will consist of a total of 8 parts (short-answer type questions) covering the entire syllabus and will carry 16 marks. In addition to the compulsory question there will be four units, i.e. Unit-I to Unit-IV. The examiner will set two questions from each Unit of the syllabus, and each question will carry 16 marks. The student will be required to attempt FIVE questions in all. Question Number 1 will be compulsory. In addition to the compulsory question, the student will have to attempt four more questions, selecting one question from each Unit.

UNIT-I
Fundamentals of Operating System: Introduction to Operating System, its need and operating system services; Early systems; Structures - Simple Batch, Multiprogrammed, Time-shared, Personal Computer, Parallel, Distributed Systems, Real-Time Systems.
Process Management: Process concept, Operation on processes, Cooperating processes, Threads, and Inter-process communication.

UNIT-II
CPU Scheduling: Basic concepts, Scheduling criteria, Scheduling algorithms: FCFS, SJF, Round Robin & Queue algorithms.
Deadlocks: Deadlock characterization, Methods for handling deadlocks, Banker's Algorithm.

UNIT-III
Memory Management: Logical versus physical address space, Swapping, Contiguous allocation, Paging, Segmentation.
Virtual Memory: Demand paging, Performance of demand paging, Page replacement, Page replacement algorithms, Thrashing.

UNIT-IV
File Management: File system structure; Allocation methods: Contiguous allocation, Linked allocation, Indexed allocation; Free space management: Bit vector, Linked list, Grouping, Counting.
Device Management: Disk structure; Disk scheduling: FCFS, SSTF, SCAN, C-SCAN, LOOK, C-LOOK.

Suggested Readings
1. Abraham Silberschatz, Peter B. Galvin, "Operating System Concepts", Addison-Wesley Publishing Co., 7th Ed., 2004.
2. Gary Nutt, "Operating Systems", Addison-Wesley, 2000.
3. Andrew S. Tanenbaum, "Modern Operating Systems", Pearson Education Asia, 2nd Ed., 2001.
4. William Stallings, "Operating Systems: Internals and Design Principles", 4th Ed., Prentice Hall, 2001.
5. Ekta Walia, "Operating Systems Concepts", Khanna Publishers, New Delhi, 2002.

Note: Latest and additional good books may be suggested and added from time to time.


SCOPE AND OBJECTIVE

Learning Objectives of Introduction to Operating System

1. Teaches program development: program execution steps and instructions can be learned from this subject.
2. Teaches methods of I/O device access and system access.
3. Various scheduling schemes are learnt to manage time and improve CPU efficiency.


GANGA TECHNICAL CAMPUS, SOLDHA, BAHADURGARH (JHAJJAR), HARYANA.

LESSON PLAN

Department: Computer Science And Engineering Semester: 3rd

Subject: Introduction to Operating System

Semester Due Date: 24-11-2017

Commencement Date: 18-08-2017
Faculty Name: Ms. Ritu Kadyan

Batch Code: 16-BCA Batch Year: 2016

Outline of Syllabus:

S.No.  Unit       Hours Required to Complete Unit  Periods Required
1      Section A  9                                9
2      Section B  9                                10
3      Section C  11                               11
4      Section D  11                               11
       Total      40                               41


GANGA TECHNICAL CAMPUS, SOLDHA, BAHADURGARH (JHAJJAR), HARYANA.

DETAILED LESSON SHEET

Department: Computer Science And Engineering    Semester: 3rd

Sr. No.   Topic   Date   Signature   Remark

1. Fundamentals of Operating System: Introduction to Operating System, its need
2. Operating system services
3. Early systems, Structures - Simple Batch, Multiprogrammed
4. Time-shared, Personal Computer, Parallel
5. Distributed Systems, Real-Time Systems
6. Process Management: Process concept
7. Operation on processes
8. Threads, Cooperating processes
9. Inter-process communication
10. CPU Scheduling: Basic concepts
11. Scheduling criteria
12. Scheduling algorithms: FCFS
13. SJF
14. Round Robin
15. Queue algorithms
16. Deadlocks: Deadlock characterization
17. Methods for handling deadlocks
18. Banker's Algorithm
19. Memory management basics
20. Logical versus physical address space
21. Swapping
22. Contiguous allocation
23. Paging
24. Segmentation
25. Virtual memory
26. Demand paging
27. Performance of demand paging
28. Page replacement
29. Page replacement algorithms
30. Thrashing
31. File management: File system structure
32. Allocation methods: Contiguous allocation
33. Linked allocation
34. Indexed allocation
35. Free space management: Bit vector
36. Linked list
37. Grouping, Counting
38. Device Management: Disk structure
39. Disk scheduling: FCFS, SSTF
40. SCAN, C-SCAN
41. LOOK, C-LOOK


Unit-1

Introduction to operating system.

An operating system acts as an intermediary between the user of a computer and the computer hardware. The purpose of an operating system is to provide an environment in which a user can execute programs in a convenient and efficient manner. An operating system is software that manages the computer hardware. The hardware must provide appropriate mechanisms to ensure the correct operation of the computer system and to prevent user programs from interfering with the proper operation of the system.

Definition of Operating System:

An Operating system is a program that controls the execution of application programs and acts as an interface between the user of a computer and the computer hardware.

A more common definition is that the operating system is the one program running at all times on the computer (usually called the kernel), with all else being applications programs.

An Operating system is concerned with the allocation of resources and services, such as memory, processors, devices and information. The Operating System correspondingly includes programs to manage these resources, such as a traffic controller, a scheduler, memory management module, I/O programs, and a file system.

Functions of Operating System

An operating system performs three functions:
1. Convenience: An OS makes a computer more convenient to use.
2. Efficiency: An OS allows the computer system resources to be used in an efficient manner.
3. Ability to evolve: An OS should be constructed in such a way as to permit the effective development, testing and introduction of new system functions without interfering with service.

Operating System as User Interface

Every general purpose computer consists of hardware, an operating system, system programs and application programs. The hardware consists of memory, CPU, ALU, I/O devices, peripheral devices and storage devices. System programs include compilers, loaders, editors, the OS etc. Application programs include business programs and database programs. Fig. 1.1 shows the conceptual view of a computer system.


Fig 1.1 Conceptual view of a computer system

Every computer must have an operating system to run other programs. The operating system coordinates the use of the hardware among the various system programs and application programs for the various users. It simply provides an environment within which other programs can do useful work.

The operating system is a set of special programs that run on a computer system and allow it to work properly. It performs basic tasks such as recognizing input from the keyboard, keeping track of files and directories on the disk, sending output to the display screen and controlling peripheral devices.

History of Operating System

Operating systems have been evolving through the years. The following table shows the history of OS.

Generation   Year         Electronic devices used    Types of OS and devices
First        1945-55      Vacuum tubes               Plug boards
Second       1955-65      Transistors                Batch systems
Third        1965-80      Integrated circuits (IC)   Multiprogramming
Fourth       Since 1980   Large scale integration    PC


Operating System Services

An operating system provides services to programs and to the users of those programs. It provides an environment for the execution of programs. The services provided differ from one operating system to another, but in every case the operating system makes the programming task easier. The common services provided by operating systems are listed below.

1. Program execution
2. I/O operation
3. File system manipulation
4. Communications
5. Error detection

1. Program execution: Operating system loads a program into memory and executes the program.

The program must be able to end its execution, either normally or abnormally.

2. I/O operation: I/O may involve a file or a specific I/O device. A program may require I/O devices while running, so the operating system must provide the required I/O.

A) Resource Allocation :

When multiple jobs are running at the same time, resources must be allocated to each of them, including allocation of the CPU. CPU scheduling routines consider the speed of the CPU, the number of available registers

and other required factors.

B) Accounting :

Logs of each user must be kept. It is also necessary to keep a record of which user uses how much, and what kinds of, computer resources. This log is used for accounting purposes. The accounting data may be used for statistics or for billing. It is also used to improve system efficiency.

C) Protection :

Protection involves ensuring that all access to system resources is controlled. Security starts with

each user having to authenticate to the system, usually by means of a password. External I/O devices must also be protected from invalid access attempts.

In protection, all access to the resources is controlled. In a multiprocess environment, it is possible for one process to interfere with another, or with the operating system, so protection is required.

Batch System

Early computer systems did only one thing at a time, working through a list of jobs one after another. The resources of the computer system may be dedicated to a single program until its completion, or they may be dynamically reassigned among a collection of active programs in different stages of execution.

A batch operating system is one where programs and data are collected together in a batch before processing starts. A predefined sequence of commands, programs and data combined into a single unit is called a job.

Fig. 2.1 shows the memory layout for a simple batch system. Memory management in a batch system is very simple: memory is usually divided into two areas, the operating system (the resident portion) and the user program area (which holds the transient program).

Fig 2.1 Memory layout for a simple batch system


Scheduling is also simple in a batch system: jobs are processed in the order of submission, i.e. in first-come, first-served fashion. When a job completes execution, its memory is released and the output for the job is copied into an output spool for later printing.

Batch systems often provide simple forms of file management; access to files is serial. Batch systems do not require any time-critical device management. Batch systems are inconvenient for users because users cannot interact with their jobs to fix problems, and there may also be long turnaround times. An example of this kind of system is generating monthly bank statements.

Advantages of Batch System

1. Moves much of the work of the operator to the computer.
2. Increased performance, since it was possible for a job to start as soon as the previous job finished.

Disadvantages of Batch System

1. Turnaround time can be large from the user's standpoint.
2. Programs are difficult to debug.
3. A job could enter an infinite loop.
4. A job could corrupt the monitor, thus affecting pending jobs.
5. Due to the lack of a protection scheme, one batch job can affect pending jobs.

Time Sharing Systems

Multi-programmed batched systems provide an environment where the

various system resources (for example, CPU, memory, peripheral devices) are utilized

effectively.

Time sharing, or multitasking, is a logical extension of multiprogramming. Multiple jobs are

executed by the CPU switching between them, but the


switches occur so frequently that the users may interact with each program while it is running.

An interactive, or hands-on, computer system provides on-line

communication between the user and the system. The user gives instructions to the operating

system or to a program directly, and receives an immediate response. Usually, a keyboard is

used to provide input, and a display screen (such as a cathode-ray tube (CRT) or monitor) is used

to provide output.

If users are to be able to access both data and code conveniently, an on-line file system must

be available. A file is a collection of related information defined by its creator. Batch systems

are appropriate for executing large jobs that need little interaction.

Time-sharing systems were developed to provide interactive use of a computer system

at a reasonable cost. A time-shared operating system uses CPU scheduling and

multiprogramming to provide each user with a small portion of a time-shared computer.

Each user has at least one separate program in memory. A program that is loaded into

memory and is executing is commonly referred to as a process. When a process executes, it

typically executes for only a short time before it either finishes or needs to perform I/O. I/O

may be interactive; that is, output is to a display for the user and input is from a user

keyboard. Since interactive I/O typically runs at people speeds, it may take a long time to complete.

A time-shared operating system allows many users to share the computer simultaneously.

Since each action or command in a time-shared system tends to be short, only a little

CPU time is needed for each user. As the system switches rapidly from one user to the

next, each user is given the impression that she has her own computer, whereas actually one

computer is being shared among many users.


Time-sharing operating systems are even more complex than multiprogrammed operating systems. As in multiprogramming, several jobs must be kept

simultaneously in memory, which requires some form of memory management and

protection.

Multiprogramming

When two or more programs are in memory at the same time, sharing the processor, the system is referred to as a multiprogramming operating system. Multiprogramming assumes a single processor that is being shared. It increases CPU utilization by organizing jobs so that the CPU always has one to execute. Fig. 2.2 shows the memory layout for a multiprogramming system.

The operating system keeps several jobs in memory at a time. This set of jobs is a subset of the jobs kept in the job pool. The operating system picks and begins to execute one of the job in the memory.

Multiprogrammed system provide an environment in which the various system resources are utilized effectively, but they do not provide for user interaction with the computer system.

Jobs entering the system are kept in memory. The operating system picks one of the jobs in memory and begins to execute it. Having several programs in memory at the same time requires some form of memory management.

Multiprogramming operating system monitors the state of all active programs and system resources. This ensures that the CPU is never idle unless there are no jobs.

Advantages
1. High CPU utilization.
2. It appears that many programs are allotted the CPU almost simultaneously.

Disadvantages
1. CPU scheduling is required.
2. To accommodate many jobs in memory, memory management is required.

Spooling

Spooling is an acronym for simultaneous peripheral operations on-line. Spooling refers to putting jobs in a buffer, a special area in memory or on a disk, where a device can access them when it is ready. Spooling is useful because devices access data at different rates. The buffer provides a waiting station where data can rest while the slower device catches up. Fig 2.3 shows spooling.

Fig 2.3 Spooling: the disk buffers data between the card reader, the computer and the printer.

Since the computer can perform I/O in parallel with computation, it becomes possible to have the computer read a deck of cards to a tape, drum or disk, and to write output to a printer while it is computing. This process is called spooling.


The most common spooling application is print spooling. In print spooling, documents are loaded into a buffer and then the printer pulls them off the buffer at its own rate. Spooling is also used for processing data at remote sites: the CPU sends the data via a communications path to a remote printer. Spooling overlaps the I/O of one job with the computation of other jobs. One difficulty with simple batch systems is that the computer still needs to read the deck of cards before it can begin to execute the job, which means that the CPU is idle during these relatively slow operations. Spooling batch systems were the first, and are the simplest, of the multiprogramming systems.

Advantages of Spooling
1. The spooling operation uses a disk as a very large buffer.
2. Spooling is capable of overlapping the I/O operations of one job with the processor operations of another job.

Essential Properties of Operating Systems

1. Batch : Jobs with similar needs are batched together and run through the computer as a group by an operator or automatic job sequencer. Performance is increased by attempting to keep the CPU and I/O devices busy at all times through buffering, off-line operation, spooling and multiprogramming. A batch system is good for executing large jobs that need little interaction; jobs can be submitted and picked up later.

2. Time sharing : Uses CPU scheduling and multiprogramming to provide economical interactive use of a system. The CPU switches rapidly from one user to another, i.e. the CPU is shared between a number of interactive users. Instead of having a job defined by spooled card images, each program reads its next control instructions from the terminal, and output is normally printed immediately on the screen.

3. Interactive : The user is on-line with the computer system and interacts with it via an interface. Use is typically composed of many short transactions, where the result of the next transaction may be unpredictable. Response time needs to be short, since the user submits a request and waits for the result.


4. Real time : Real time systems are usually dedicated, embedded systems. They typically read from and react to sensor data. The system must guarantee response to events within fixed periods of time to ensure correct performance.

5. Distributed : Distributes computation among several physical processors. The processors do not share memory or a clock; instead, each processor has its own local memory. They communicate with each other through various communication lines.

Concept of Process

A process is a sequential program in execution. A process defines the fundamental unit of computation for the computer. The components of a process are:

1. Object program
2. Data
3. Resources
4. Status of the process execution

The object program is the code to be executed; the data is used in executing the program; while executing, the program may require some resources; and the status records the state of the process execution. A process can run to completion only when all requested resources have been allocated to it. Two or more processes may execute the same program, each using its own data and resources.

Processes and Programs

A process is a dynamic entity: a program in execution. A process is a sequence of instruction executions, and it exists for a limited span of time. Two or more processes could be executing the same program, each using its own data and resources.

A program is a static entity made up of program statements. A program contains instructions, exists at a single place in space, and continues to exist over time. A program does not perform any action by itself.

Process State


When a process executes, it changes state. The process state is defined as the current activity of the process. Fig. 3.1 shows the general form of the process state transition diagram. There are five states, and each process is in exactly one of them. The states are listed below.

1. New : A process that has just been created.
2. Ready : Ready processes are waiting to have the processor allocated to them by the operating system so that they can run.
3. Running : The process that is currently being executed. A running process possesses all the resources needed for its execution, including the processor.
4. Waiting : A process that cannot execute until some event occurs, such as the completion of an I/O operation. The running process becomes suspended by invoking an I/O module.
5. Terminated (exit) : A process that has been released from the pool of executable processes by the operating system.

Fig - Diagram for Process State

Whenever a process changes state, the operating system reacts by placing the process's PCB in the list that corresponds to its new state. Only one process can be running on any processor at any instant, while many processes may be in the ready and waiting states.
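A hedged sketch of these five states in code; the type and function names are illustrative, not taken from any real kernel:

    #include <stdio.h>

    /* The five states from the process state transition diagram above. */
    typedef enum {
        NEW, READY, RUNNING, WAITING, TERMINATED
    } process_state_t;

    /* When a process changes state, the OS would move its PCB to the list
       for the new state; this stub only records the transition. */
    static void change_state(process_state_t *state, process_state_t next) {
        *state = next;
    }

    int main(void) {
        process_state_t p = NEW;
        change_state(&p, READY);      /* admitted to the ready queue */
        change_state(&p, RUNNING);    /* dispatched: CPU allocated */
        change_state(&p, WAITING);    /* issues an I/O request */
        change_state(&p, READY);      /* I/O completes */
        change_state(&p, RUNNING);    /* dispatched again */
        change_state(&p, TERMINATED); /* finishes execution */
        printf("final state: %d\n", (int)p);
        return 0;
    }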

Suspended Processes

Characteristics of a suspended process:
1. A suspended process is not immediately available for execution.
2. The process may or may not be waiting on an event.
3. To prevent its execution, a process may be suspended by the OS, the parent process, the process itself or an agent.
4. The process may not be removed from the suspended state until the agent orders the removal.

Swapping is used to move all of a process from main memory to disk. The OS suspends a process by putting it in the suspended state and transferring it to disk.

Reasons for process suspension:
1. Swapping
2. Timing
3. Interactive user request
4. Parent process request

Swapping : OS needs to release required main memory to bring in a process that is ready to execute.

Timing : Process may be suspended while waiting for the next time interval.

Interactive user request : Process may be suspended for debugging purpose by user.

Parent process request : To modify the suspended process or to coordinate the activity of various descendants.

Process Control Block (PCB)

Each process has a process control block (PCB). The PCB is a data structure used by the operating system to group all the information it needs about a particular process. Fig. 3.2 shows the process control block.

Fig. 3.2 Process Control Block (fields: pointer, process state, process number, program counter, CPU registers, memory allocation, event information, list of open files)

1. Pointer : Points to another process control block; the pointer is used for maintaining the scheduling list.
2. Process State : The state may be new, ready, running, waiting and so on.
3. Program Counter : Indicates the address of the next instruction to be executed for this process.
4. Event Information : For a process in the blocked state, this field contains information concerning the event for which the process is waiting.
5. CPU Registers : General purpose registers, stack pointers, index registers, accumulators etc. The number and type of registers depend entirely on the computer architecture.
6. Memory Management Information : May include the values of the base and limit registers. This information is useful for deallocating memory when the process terminates.
7. Accounting Information : Includes the amount of CPU and real time used, time limits, job or process numbers, account numbers etc.

The process control block also includes information about CPU scheduling, I/O resource management, file management, priority and so on.
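The fields above can be pictured as a C structure. This is a simplified, hypothetical layout for illustration, not the PCB of any particular operating system:

    #include <stdint.h>

    #define MAX_OPEN_FILES 16

    /* Simplified process control block mirroring the fields listed above. */
    struct pcb {
        struct pcb *next;             /* pointer for the scheduling list     */
        int        state;             /* new, ready, running, waiting, ...   */
        int        pid;               /* process number                      */
        uintptr_t  program_counter;   /* address of the next instruction     */
        uintptr_t  registers[16];     /* saved CPU registers                 */
        uintptr_t  base, limit;       /* memory management information       */
        int        event_id;          /* event a blocked process is awaiting */
        int        open_files[MAX_OPEN_FILES]; /* list of open files         */
        long       cpu_time_used;     /* accounting information              */
    };

    int main(void) {
        struct pcb p = {0};           /* a freshly allocated, zeroed PCB */
        p.pid = 1;
        return p.pid - 1;             /* exit status 0 */
    }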


The PCB simply serves as the repository for any information that may vary from process to process. When a process is created, hardware registers and flags are set to the values provided by the loader or linker. Whenever that process is suspended, the contents of the processor registers are usually saved on the stack, and the pointer to the related stack frame is stored in the PCB. In this way, the hardware state can be restored when the process is scheduled to run again.

Process Management / Process Scheduling

A multiprogramming operating system allows more than one process to be loaded into executable memory at a time, and the loaded processes share the CPU using time multiplexing. The scheduling mechanism is the part of the process manager that handles the removal of the running process from the CPU and the selection of another process on the basis of a particular strategy.

Scheduling Queues

When processes enter the system, they are put into a job queue, which consists of all processes in the system. The operating system also has other queues. A device queue is the list of processes waiting for a particular I/O device; each device has its own device queue. Fig. 3.3 shows the queueing diagram of process scheduling. In the figure, each queue is represented by a rectangular box, the circles represent the resources that serve the queues, and the arrows indicate the flow of processes in the system.

Fig.- Queuing Diagram

Queues are of two types: the ready queue and a set of device queues. A newly arrived process is put in the ready queue, where processes wait for the CPU to be allocated to them. Once the CPU is assigned to a process, the process executes, and while executing one of several events could occur:
1. The process could issue an I/O request and then be placed in an I/O queue.
2. The process could create a new subprocess and wait for its termination.
3. The process could be removed forcibly from the CPU as a result of an interrupt and put back in the ready queue.
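A hedged sketch of a ready queue as a FIFO linked list of PCBs, as the queueing diagram suggests; all names are hypothetical:

    #include <stddef.h>

    /* Minimal PCB node for queueing; a real PCB holds far more state. */
    struct pcb_node {
        int pid;
        struct pcb_node *next;
    };

    /* FIFO queue: processes wait here for the CPU (or for a device). */
    struct queue {
        struct pcb_node *head, *tail;
    };

    static void enqueue(struct queue *q, struct pcb_node *p) {
        p->next = NULL;
        if (q->tail) q->tail->next = p; else q->head = p;
        q->tail = p;
    }

    /* The dispatcher removes the next ready process for the CPU. */
    static struct pcb_node *dequeue(struct queue *q) {
        struct pcb_node *p = q->head;
        if (p) {
            q->head = p->next;
            if (!q->head) q->tail = NULL;
        }
        return p;
    }

    int main(void) {
        struct queue ready = {NULL, NULL};
        struct pcb_node a = {1, NULL}, b = {2, NULL};
        enqueue(&ready, &a);
        enqueue(&ready, &b);
        return dequeue(&ready)->pid == 1 ? 0 : 1;  /* FIFO order */
    }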

Two State Process Model

A process may be in one of two states:
a) Running
b) Not Running

When a new process is created by the OS, it enters the system in the not-running state.


Processes that are not running are kept in a queue, waiting their turn to execute. Each entry in the queue is a pointer to a particular process, and the queue is implemented using a linked list. The dispatcher works as follows: when a process is interrupted, it is transferred to the waiting queue; if the process has completed or aborted, it is discarded. In either case, the dispatcher then selects a process from the queue to execute.

Schedulers

Schedulers are of three types:
1. Long term scheduler
2. Short term scheduler
3. Medium term scheduler

Long Term Scheduler

It is also called the job scheduler. The long term scheduler determines which programs are admitted to the system for processing: it selects processes from the queue and loads them into memory for execution, where they await the CPU scheduler. The primary objective of the job scheduler is to provide a balanced mix of jobs, such as I/O bound and processor bound jobs. It also controls the degree of multiprogramming: if the degree of multiprogramming is stable, the average rate of process creation must equal the average departure rate of processes leaving the system. On some systems, the long term scheduler may be absent or minimal; time-sharing operating systems have no long term scheduler. The change of a process's state from new to ready is performed by the long term scheduler.

Short Term Scheduler

It is also called the CPU scheduler. Its main objective is to increase system performance in accordance with the chosen set of criteria. It performs the change of process state from ready to running: the CPU scheduler selects from among the processes that are ready to execute and allocates the CPU to one of them.


The short term scheduler, also known as the dispatcher, executes most frequently and makes the fine grained decision of which process to execute next. The short term scheduler is faster than the long term scheduler.

Medium Term Scheduler

Medium term scheduling is part of the swapping function. It removes processes from memory, reducing the degree of multiprogramming. The medium term scheduler is in charge of handling the swapped-out processes.

The medium term scheduler is shown in Fig. 3.4.

Fig 3.4 Queueing diagram with medium term scheduler

A running process may become suspended when it makes an I/O request. A suspended process cannot make any progress towards completion, so the process is removed from memory, making space for other processes. Moving a suspended process to secondary storage is called swapping, and the process is said to be swapped out or rolled out. Swapping may be necessary to improve the process mix.


Comparison between Schedulers

Sr.No.  Long Term Scheduler                Short Term Scheduler              Medium Term Scheduler
1       It is the job scheduler.           It is the CPU scheduler.          It performs swapping.
2       Speed is less than the short       Speed is very fast.               Speed is in between the
        term scheduler.                                                      other two.
3       Controls the degree of             Less control over the degree      Reduces the degree of
        multiprogramming.                  of multiprogramming.              multiprogramming.
4       Absent or minimal in time          Minimal in time sharing           Time sharing systems use a
        sharing systems.                   systems.                          medium term scheduler.
5       Selects processes from the pool    Selects from among the            A process can be reintroduced
        and loads them into memory for     processes that are ready          into memory and its execution
        execution.                         to execute.                       continued.
6       Process state: new to ready.       Process state: ready to           -
                                           running.
7       Selects a good mix of I/O bound    Selects a new process for the     -
        and CPU bound processes.           CPU quite frequently.


Context Switch

When the scheduler switches the CPU from executing one process to executing another, the context switcher saves the contents of all processor registers for the process being removed from the CPU in its process descriptor. The context of a process is represented in its process control block.

Context switch time is pure overhead. Context switching can significantly affect performance, since modern computers have many general and status registers to be saved. Context switch times are highly dependent on hardware support. A context switch requires (n + m) * b * K time units to save the state of a processor with n general registers and m status registers, assuming b store operations are required to save each register and each store instruction requires K time units. Some hardware systems employ two or more sets of processor registers to reduce the amount of context switching time.

When the process is switched, the information stored is:
1. Program counter
2. Scheduling information
3. Base and limit register values
4. Currently used registers
5. Changed state
6. I/O state
7. Accounting information
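As a rough illustration of the (n + m) * b * K formula above, the following sketch plugs in assumed values; the numbers are hypothetical, not real hardware figures:

    #include <stdio.h>

    int main(void) {
        int n = 32;  /* assumed general-purpose registers        */
        int m = 16;  /* assumed status registers                 */
        int b = 1;   /* store operations needed per register     */
        int k = 2;   /* assumed time units per store instruction */

        /* Time units spent saving one process's register state:
           (32 + 16) * 1 * 2 = 96 time units of pure overhead.   */
        printf("register save cost: %d time units\n", (n + m) * b * k);
        return 0;
    }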

Operations on Processes

Several operations are possible on processes. Processes must be created and deleted dynamically, and the operating system must provide the environment for these operations. The two main operations on processes are:

1. Create a process
2. Terminate a process

1) Create Process


The operating system creates a new process with the specified or default attributes and identifier. A process may create several new subprocesses. The syntax for creating a new process is:

CREATE (processid, attributes)

Two names are used: the parent process and the child process. The parent process is the creating process, and the child process is created by the parent. A child process may itself create another subprocess, so processes form a tree. When the operating system receives a CREATE system call, it obtains a new process control block from the pool of free memory, fills the fields with the provided and default parameters, and inserts the PCB into the ready list, making the specified process eligible to run.

When a process is created, it requires some parameters: priority, level of privilege, memory requirement, access rights, memory protection information etc. The process will need certain resources, such as CPU time, memory, files and I/O devices, to complete its operation. When a process creates a subprocess, the subprocess may obtain its resources directly from the operating system; otherwise it uses the resources of the parent process. When a process creates a new process, two possibilities exist in terms of execution:

1. The parent continues to execute concurrently with its children.
2. The parent waits until some or all of its children have terminated.

For the address space, two possibilities occur:
1. The child process is a duplicate of the parent process.
2. The child process has a program loaded into it.
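On UNIX-like systems, the generic CREATE call corresponds to fork(), which creates a child that is a duplicate of the parent; a minimal sketch in which the parent waits for its child (the printed messages are illustrative):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();          /* create a child process */

        if (pid < 0) {               /* creation failed */
            perror("fork");
            exit(1);
        } else if (pid == 0) {       /* child: a duplicate of the parent */
            printf("child running, pid=%d\n", (int)getpid());
            exit(0);                 /* orderly termination */
        } else {                     /* parent waits for the child */
            wait(NULL);
            printf("child %d finished\n", (int)pid);
        }
        return 0;
    }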

2) Terminate a Process

The DELETE system call is used for terminating a process. A process may be deleted by itself or by another process: a process can cause the termination of another process via an appropriate system call. The operating system reacts by reclaiming all resources allocated to the specified process and closing files opened by or for the process. The PCB is also removed from its place of residence in the list and is returned to the free pool. The DELETE service is normally invoked as a part of orderly program termination.


The following are the reasons for termination of a child process by its parent:
1. The task given to the child is no longer required.
2. The child has exceeded its usage of some of the resources it has been allocated.
3. The operating system does not allow a child to continue if its parent terminates.

Co-operating Processes

A co-operating process is a process that can affect or be affected by other processes while executing. Any process that shares data with other processes is a co-operating process. The benefits of co-operating processes are:

1. Sharing of information
2. Increased computation speed
3. Modularity
4. Convenience

Co-operating processes share information such as files and memory, and the system must provide an environment that allows concurrent access to these types of resources. Computation speed will increase if the computer has multiple processing elements connected together. The system is constructed in a modular fashion, with system functions divided into a number of modules.

Process 1: printf("abc")        Process 2: printf("CBA")

Possible interleaved outputs: CBAabc, abCcBA, abcCBA


The behavior of co-operating processes is nondeterministic: it depends on the relative execution sequence and cannot be predicted a priori. Co-operating processes are also not reproducible. For example, if one process writes "ABC" and another writes "CBA", different runs can produce different interleavings, and one cannot tell what came from which process: which process output the first "C" in "ABCCBA"? The subtle state sharing here occurs via the terminal. Not just anything can happen, though; for example, "AABBCC" cannot occur.
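A minimal sketch of this nondeterminism on a UNIX-like system: two processes write to the same terminal, and the interleaving of their output can differ from run to run (the strings are illustrative):

    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        if (fork() == 0) {                     /* child process */
            for (int i = 0; i < 3; i++)
                write(STDOUT_FILENO, "abc\n", 4);
            _exit(0);
        }
        for (int i = 0; i < 3; i++)            /* parent process */
            write(STDOUT_FILENO, "CBA\n", 4);
        wait(NULL);                            /* reap the child */
        return 0;
    }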

Introduction to Threads

A thread is a flow of execution through the process code, with its own

program counter, system registers and stack. Threads are a popular way to improve

application performance through parallelism. A thread is sometimes called a light weight

process.

Threads represent a software approach to improving operating system performance by reducing overhead; in many respects a thread is equivalent to a classical process. Each thread belongs to exactly one process and no thread can exist outside a process. Each thread represents a separate flow of control.

Fig. 4.1 shows single-threaded and multithreaded processes.

Threads have been successfully used in implementing network servers. They also provide a

suitable foundation for parallel execution of applications on shared memory multiprocessors.
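A minimal sketch using POSIX threads (pthreads), one widely available thread library; compile with -pthread. The function and variable names are illustrative:

    #include <pthread.h>
    #include <stdio.h>

    /* Each thread runs this function as its own flow of control, with
       its own program counter, registers and stack. */
    static void *worker(void *arg) {
        int id = *(int *)arg;
        printf("thread %d running\n", id);
        return NULL;
    }

    int main(void) {
        pthread_t threads[2];
        int ids[2] = {1, 2};

        for (int i = 0; i < 2; i++)
            pthread_create(&threads[i], NULL, worker, &ids[i]);
        for (int i = 0; i < 2; i++)
            pthread_join(threads[i], NULL);  /* wait for each thread */
        return 0;
    }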

Types of Thread

Threads are implemented in two ways:
1. User level threads
2. Kernel level threads

User Level Thread


In a user thread, all of the work of thread management is done by the

application and the kernel is not aware of the existence of threads. The thread library

contains code for creating and destroying threads, for passing message and data between

threads, for scheduling thread execution and for saving and restoring thread contexts. The

application begins with a single thread and begins running in that thread.

Fig. 4.2 User level threads: the thread library runs in user space above the kernel space; the kernel sees only the process.

User level threads are generally fast to create and manage.

Advantages of user level threads over kernel level threads:

1. Thread switching does not require kernel mode privileges.
2. User level threads can run on any operating system.
3. Scheduling can be application specific.
4. User level threads are fast to create and manage.


Disadvantages of user level thread :

1. In a typical operating system, most system calls are blocking.

2. Multithreaded applications cannot take advantage of multiprocessing.

Kernel Level Threads

In kernel level threads, thread management is done by the kernel. There is no thread management code in the application area. Kernel threads are

supported directly by the operating system. Any application can be programmed to

be multithreaded. All of the threads within an application are supported within a single process.

The Kernel maintains context information for the process as a whole and for individuals threads

within the process.

Scheduling by the Kernel is done on a thread basis. The Kernel performs

thread creation, scheduling and management in Kernel space. Kernel threads are generally slower

to create and manage than the user threads.

Advantages of Kernel level thread:

1. The kernel can simultaneously schedule multiple threads from the same process on multiple processors.
2. If one thread in a process is blocked, the kernel can schedule another thread of the same process.
3. Kernel routines themselves can be multithreaded.

Disadvantages:

1. Kernel threads are generally slower to create and manage than user threads.
2. Transfer of control from one thread to another within the same process requires a mode switch to the kernel.

Advantages of Thread

1. Threads minimize context switching time.
2. Use of threads provides concurrency within a process.
3. Efficient communication.
4. Economy: it is more economical to create and context switch threads.
5. Utilization of multiprocessor architectures: the benefits of multithreading can be greatly increased in a multiprocessor architecture.

Multithreading Models

Some operating systems provide a combined user level thread and kernel level thread facility; Solaris is a good example of this combined approach. In a combined system, multiple threads within the same application can run in parallel on multiple processors, and a blocking system call need not block the entire process.

There are three multithreading models:

1. Many to many relationship.

2. Many to one relationship.

3. One to one relationship.

Many to Many Model


In this model, many user level threads are multiplexed onto a smaller or equal number of kernel threads. The number of kernel threads may be specific to either a particular application or a particular machine.

Fig. shows the many to many model. In this model, developers can create as many user threads as necessary, and the corresponding kernel threads can run in parallel on a multiprocessor.

Fig: Many to many model - several user level threads multiplexed onto kernel threads (K).

Many to One Model

The many to one model maps many user level threads to one kernel level thread. Thread management is done in user space. When a thread makes a blocking system call, the entire process blocks. Only one thread can access the kernel at a time, so multiple threads are unable to run in parallel on multiprocessors.

Fig below shows the many to one model.

Fig: Many to one model - many user level threads mapped to a single kernel thread (K).

If the user level thread library is implemented on an operating system that does not support kernel threads, the many to one model is used.

One to One Model

There is a one to one relationship of user level threads to kernel level threads. The figure below shows the one to one relationship model. This model provides more concurrency than the many to one model.

Fig: One to one model - each user level thread maps to its own kernel thread (K).


This model also allows another thread to run when a thread makes a blocking system call, and it supports multiple threads executing in parallel on multiprocessors. The disadvantage of this model is that creating a user thread requires creating the corresponding kernel thread. OS/2, Windows NT and Windows 2000 use the one to one relationship model.

Difference between User Level & Kernel Level Thread

Sr.No.  User Level Threads                        Kernel Level Threads
1       Faster to create and manage.              Slower to create and manage.
2       Implemented by a thread library at        Supported directly by the operating
        the user level.                           system.
3       Can run on any operating system.          Specific to the operating system.
4       Support provided at the user level is     Support provided by the kernel is
        called user level threads.                called kernel level threads.
5       Multithreaded applications cannot take    Kernel routines themselves can be
        advantage of multiprocessing.             multithreaded.


Difference between Process and Thread

Sr.No.  Process                                   Thread
1       Called a heavy weight process.            Called a light weight process.
2       Process switching needs interaction       Thread switching does not need to call
        with the operating system.                the operating system or cause an
                                                  interrupt to the kernel.
3       In a multiple-process implementation,     All threads can share the same set of
        each process executes the same code       open files and child processes.
        but has its own memory and file
        resources.
4       If one server process is blocked, no      While one server thread is blocked and
        other server process can execute until    waiting, a second thread in the same
        the first is unblocked.                   task can run.
5       Multiple redundant processes use more     A multithreaded process uses fewer
        resources than multiple threads.          resources than multiple redundant
                                                  processes.
6       Each process operates independently       One thread can read, write or even
        of the others.                            completely wipe out another thread's
                                                  stack.

Threading Issues


The fork and exec system calls are discussed here. In a multithreaded program, the semantics of the fork and exec system calls change. Some UNIX systems have two versions of fork: one that duplicates all threads, and another that duplicates only the thread that invoked the fork system call. Which version to use depends entirely upon the application. Duplicating all threads is unnecessary if exec is called immediately after fork.

Thread cancellation is the task of terminating a thread before it has completed. For example, if multiple threads are concurrently searching through a database and one thread returns the result, the remaining threads might be cancelled.

Thread cancellation is of two types:

1. Asynchronous cancellation

2. Deferred cancellation

In asynchronous cancellation, one thread immediately terminates the target thread. In deferred cancellation, the target thread periodically checks whether it should terminate, which also allows the target thread to terminate itself in an orderly fashion. The problem with asynchronous cancellation arises when resources have been allocated to the cancelled thread, or when the thread is cancelled while in the middle of updating data it shares with other threads: system-wide resources may not be freed if threads are cancelled asynchronously. Nevertheless, most operating systems allow a process or thread to be cancelled asynchronously.
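A hedged sketch of the database-search example using pthreads, which by default uses deferred cancellation (the target is cancelled only when it reaches a cancellation point such as sleep()):

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static void *searcher(void *arg) {
        (void)arg;
        for (;;) {
            /* ... search a portion of the database ... */
            sleep(1);  /* cancellation point: a deferred cancel lands here */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t;
        pthread_create(&t, NULL, searcher, NULL);
        sleep(2);              /* pretend another thread found the result */
        pthread_cancel(t);     /* request cancellation of the target */
        pthread_join(t, NULL); /* wait until the cancellation completes */
        puts("searcher cancelled");
        return 0;
    }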


Unit-II

CPU scheduling

Basic concepts-

Almost all programs have some alternating cycle of CPU number crunching and waiting for I/O of some kind. ( Even a simple fetch from memory takes a long time relative to CPU speeds. )

In a simple system running a single process, the time spent waiting for I/O is wasted, and those CPU cycles are lost forever.

A scheduling system allows one process to use the CPU while another is waiting for I/O, thereby making full use of otherwise lost CPU cycles.

The challenge is to make the overall system as "efficient" and "fair" as possible, subject to varying and often dynamic conditions, and where "efficient" and "fair" are somewhat subjective terms, often subject to shifting priority policies.

Fig: Alternating sequence of CPU and I/O bursts

CPU Scheduler

Selects from among the processes in memory that are ready to execute, and allocates the CPU to one of them. CPU scheduling decisions may take place when a process:
1. Switches from running to waiting state
2. Switches from running to ready state
3. Switches from waiting to ready state
4. Terminates

Scheduling under 1 and 4 is nonpreemptive; all other scheduling is preemptive.

Dispatcher

The dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves:
o Switching context
o Switching to user mode
o Jumping to the proper location in the user program to restart that program

Dispatch latency - the time it takes for the dispatcher to stop one process and start another running.

Scheduling Criteria

CPU utilization – keep the CPU as busy as possible.
Throughput – number of processes that complete their execution per time unit.
Turnaround time – amount of time to execute a particular process.
Waiting time – amount of time a process has been waiting in the ready queue.
Response time – amount of time from when a request was submitted until the first response is produced, not output (for time-sharing environments).

Preemptive Scheduling

CPU scheduling decisions take place under one of four conditions:
1. When a process switches from the running state to the waiting state, such as for an I/O request or invocation of the wait( ) system call.
2. When a process switches from the running state to the ready state, for example in response to an interrupt.
3. When a process switches from the waiting state to the ready state, say at completion of I/O or a return from wait( ).
4. When a process terminates.

For conditions 1 and 4 there is no choice - a new process must be selected. For conditions 2 and 3 there is a choice - either continue running the current process, or select a different one.

If scheduling takes place only under conditions 1 and 4, the system is said to be non-preemptive, or cooperative. Under these conditions, once a process starts running it keeps running, until it either voluntarily blocks or until it finishes. Otherwise the system is said to be preemptive.

Windows used non-preemptive scheduling up to Windows 3.x, and started using pre-emptive scheduling with Win95. Macs used non-preemptive prior to OSX, and pre-emptive since then. Note that pre-emptive scheduling is only possible on hardware that supports a timer interrupt.

Note that pre-emptive scheduling can cause problems when two processes share data, because one process may get interrupted in the middle of updating shared data structures. Chapter 6 will examine this issue in greater detail.

Preemption can also be a problem if the kernel is busy implementing a system call (e.g. updating critical kernel data structures) when the preemption occurs. Most modern UNIXes deal with this problem by making the process wait until the system call has either completed or blocked before allowing the preemption. Unfortunately, this solution is problematic for real-time systems, as real-time response can no longer be guaranteed.

Some critical sections of code protect themselves from concurrency problems by disabling interrupts before entering the critical section and re-enabling interrupts on exiting the section. Needless to say, this should only be done in rare situations, and only on very short pieces of code that will finish quickly, ( usually just a few machine instructions. )

Dispatcher

The dispatcher is the module that gives control of the CPU to the process selected by the scheduler. This function involves:

o Switching context.
o Switching to user mode.
o Jumping to the proper location in the newly loaded program.

The dispatcher needs to be as fast as possible, as it is run on every context switch. The time consumed by the dispatcher is known as dispatch latency.

Scheduling Criteria

There are several different criteria to consider when trying to select the "best" scheduling algorithm for a particular situation and environment, including:

o CPU utilization - Ideally the CPU would be busy 100% of the time, so as to waste 0 CPU cycles. On a real system CPU usage should range from 40% ( lightly loaded ) to 90% ( heavily loaded. )

o Throughput - Number of processes completed per unit time. May range from 10 / second to 1 / hour depending on the specific processes.

o Turnaround time - Time required for a particular process to complete, from submission time to completion. ( Wall clock time. )

o Waiting time - How much time processes spend in the ready queue waiting their turn to get on the CPU.

( Load average - The average number of processes sitting in the ready queue waiting their turn to get into the CPU. Reported in 1-minute, 5-minute, and 15-minute averages by "uptime" and "who". )


o Response time - The time taken in an interactive program from the issuance of a command to the beginning of a response to that command.

In general one wants to optimize the average value of a criterion ( maximize CPU utilization and throughput, and minimize all the others ). However, sometimes one wants to do something different, such as to minimize the maximum response time.

Sometimes it is more desirable to minimize the variance of a criterion than its actual value, i.e. users are more accepting of a consistent, predictable system than an inconsistent one, even if it is a little bit slower.

Scheduling Algorithms

The following subsections will explain several common scheduling strategies, looking at only a single CPU burst each for a small number of processes. Obviously real systems have to deal with a lot more simultaneous processes executing their CPU-I/O burst cycles.

First-Come First-Serve Scheduling, FCFS

FCFS is very simple - Just a FIFO queue, like customers waiting in line at the bank or the post office or at a copying machine.

Unfortunately, however, FCFS can yield some very long average wait times, particularly if the first process to get there takes a long time. For example, consider the following three processes:

Process   Burst Time
P1        24
P2        3
P3        3

In the first Gantt chart below, process P1 arrives first. The average waiting time for the three processes is ( 0 + 24 + 27 ) / 3 = 17.0 ms.

In the second Gantt chart below, the same three processes have an average wait time of ( 0 + 3 + 6 ) / 3 = 3.0 ms. The total run time for the three bursts is the same, but in the second case two of the three finish much quicker, and the other process is only delayed by a short amount.
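The waiting-time arithmetic above is easy to check mechanically. The following C sketch ( the function name and structure are ours, not from the text ) computes the FCFS average waiting time for a given arrival order, assuming all processes arrive at time 0:

    #include <stdio.h>

    /* Average waiting time under FCFS, assuming all processes
       arrive at time 0 and run in array order. */
    double fcfs_avg_wait(const int burst[], int n) {
        int clock = 0, total_wait = 0;
        for (int i = 0; i < n; i++) {
            total_wait += clock;    /* each process waits for all earlier bursts */
            clock += burst[i];
        }
        return (double)total_wait / n;
    }

    int main(void) {
        int order1[] = { 24, 3, 3 };    /* P1 first: prints 17.0 */
        int order2[] = { 3, 3, 24 };    /* P1 last:  prints  3.0 */
        printf("%.1f ms\n", fcfs_avg_wait(order1, 3));
        printf("%.1f ms\n", fcfs_avg_wait(order2, 3));
        return 0;
    }

Reordering the burst array reproduces both Gantt chart results above.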


In a busy dynamic system FCFS can also hold up the entire system in another way, known as the convoy effect. When one CPU intensive process blocks the CPU, a number of I/O intensive processes can get backed up behind it, leaving the I/O devices idle. When the CPU hog finally relinquishes the CPU, then the I/O processes pass through the CPU quickly, leaving the CPU idle while everyone queues up for I/O, and then the cycle repeats itself when the CPU intensive process gets back to the ready queue.

Shortest-Job-First Scheduling, SJF

The idea behind the SJF algorithm is to pick the smallest, quickest job that needs to be done, get it out of the way first, and then pick the next smallest, quickest job to do next.

( Technically this algorithm picks a process based on the next shortest CPU burst, not the overall process time. )

For example, the Gantt chart below is based upon the following CPU burst times, (  and the assumption that all jobs arrive at the same time. )

Process   Burst Time
P1        6
P2        8
P3        7
P4        3

In the case above the average wait time is ( 0 + 3 + 9 + 16 ) / 4 = 7.0 ms, ( as opposed to 10.25 ms for FCFS for the same processes. )

SJF can be proven to give the minimum average waiting time of any scheduling algorithm, but it suffers from one important problem: How do you know how long the next CPU burst is going to be?

o For long-term batch jobs this can be done based upon the limits that users set for their jobs when they submit them, which encourages them to set low limits, but risks their having to re-submit the job if they set the limit too low. However that does not work for short-term CPU scheduling on an interactive system.

o Another option would be to statistically measure the run time characteristics of jobs, particularly if the same tasks are run repeatedly and predictably. But once again that really isn't a viable option for short term CPU scheduling in the real world.

o A more practical approach is to predict the length of the next burst, based on some historical measurement of recent burst times for this process. One simple, fast, and relatively accurate method is the exponential average, which can be defined as follows. ( The book uses tau and t for their variables, but those are hard to distinguish from one another and don't work well in HTML. )

estimate[ i + 1 ] = alpha * burst[ i ] + ( 1.0 - alpha ) * estimate[ i ]


o In this scheme the previous estimate contains the history of all previous times, and alpha serves as a weighting factor for the relative importance of recent data versus past history. If alpha is 1.0, then past history is ignored, and we assume the next burst will be the same length as the last burst. If alpha is 0.0, then all measured burst times are ignored, and we just assume a constant burst time. Most commonly alpha is set at 0.5, which weights the most recent burst and the past history equally.

Fig- prediction of length of next cpu burst
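The exponential average is equally simple in code. The sketch below ( C; the initial estimate of 10 ms and the burst history are illustrative values chosen to resemble the figure above ) prints the prediction before each burst:

    #include <stdio.h>

    /* Exponential averaging of CPU burst lengths:
       estimate[ i + 1 ] = alpha * burst[ i ] + ( 1 - alpha ) * estimate[ i ] */
    int main(void) {
        double alpha    = 0.5;     /* typical weighting factor          */
        double estimate = 10.0;    /* initial guess for the first burst */
        double burst[]  = { 6, 4, 6, 4, 13, 13, 13 };   /* illustrative history */

        for (int i = 0; i < 7; i++) {
            printf("predicted %5.2f   actual %4.1f\n", estimate, burst[i]);
            estimate = alpha * burst[i] + (1.0 - alpha) * estimate;
        }
        return 0;
    }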

SJF can be either preemptive or non-preemptive. Preemption occurs when a new process arrives in the ready queue that has a predicted burst time shorter than the time remaining in the process whose burst is currently on the CPU. Preemptive SJF is sometimes referred to as shortest remaining time first scheduling.

For example, the following Gantt chart is based upon the following data:

Process   Arrival Time   Burst Time
P1        0              8
P2        1              4
P3        2              9
P4        3              5

The average wait time in this case is ( ( 5 - 3 ) + ( 10 - 1 ) + ( 17 - 2 ) + 0 ) / 4 = 26 / 4 = 6.5 ms. ( P2 never waits, since its burst begins as soon as it arrives. This compares to 7.75 ms for non-preemptive SJF or 8.75 ms for FCFS. )
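A small brute-force simulation ( a C sketch of ours, not the text's code ) reproduces this result by running shortest-remaining-time-first one millisecond at a time:

    #include <stdio.h>

    #define N 4

    /* Shortest-remaining-time-first, simulated 1 ms at a time. */
    int main(void) {
        int arrival[N] = { 0, 1, 2, 3 };
        int burst[N]   = { 8, 4, 9, 5 };
        int rem[N], finish[N];
        int t = 0, done = 0;

        for (int i = 0; i < N; i++) rem[i] = burst[i];

        while (done < N) {
            int best = -1;    /* arrived, unfinished, shortest remaining time */
            for (int i = 0; i < N; i++)
                if (arrival[i] <= t && rem[i] > 0 &&
                    (best < 0 || rem[i] < rem[best]))
                    best = i;
            if (best < 0) { t++; continue; }    /* CPU idle */
            rem[best]--;                        /* run 1 ms */
            t++;
            if (rem[best] == 0) { finish[best] = t; done++; }
        }

        int total_wait = 0;
        for (int i = 0; i < N; i++)
            total_wait += finish[i] - arrival[i] - burst[i];
        printf("average wait = %.2f ms\n", (double)total_wait / N);  /* 6.50 */
        return 0;
    }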

Priority Scheduling

Priority scheduling is a more general case of SJF, in which each job is assigned a priority and the job with the highest priority gets scheduled first. ( SJF uses the inverse of the next expected burst time as its priority - The smaller the expected burst, the higher the priority. )

Note that in practice, priorities are implemented using integers within a fixed range, but there is no agreed-upon convention as to whether "high" priorities use large numbers or small numbers. The text uses low numbers for high priorities, with 0 being the highest possible priority.

For example, the following Gantt chart is based upon these process burst times and priorities, and yields an average waiting time of 8.2 ms:

Process   Burst Time   Priority
P1        10           3
P2        1            1
P3        2            4
P4        1            5
P5        5            2

Priorities can be assigned either internally or externally. Internal priorities are assigned by the OS using criteria such as average burst time, ratio of CPU to I/O activity, system resource use, and other factors available to the kernel. External priorities are assigned by users, based on the importance of the job, fees paid, politics, etc.

Priority scheduling can be either preemptive or non-preemptive.


Priority scheduling can suffer from a major problem known as indefinite blocking, or starvation, in which a low-priority task can wait forever because there are always some other jobs around that have higher priority.

o If this problem is allowed to occur, then processes will either run eventually when the system load lightens ( at say 2:00 a.m. ), or will eventually get lost when the system is shut down or crashes. ( There are rumors of jobs that have been stuck for years. )

o One common solution to this problem is aging, in which priorities of jobs increase the longer they wait. Under this scheme a low-priority job will eventually get its priority raised high enough that it gets run.
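A minimal sketch of aging in C ( the field names and the 1000 ms threshold are illustrative assumptions ): each time the scheduler scans the ready queue, it bumps up the priority of any job that has waited too long.

    /* Aging sketch: periodically raise the priority of long-waiting jobs.
       Lower number = higher priority; the threshold is illustrative. */
    #define AGING_THRESHOLD_MS 1000

    struct job { int priority; int wait_ms; };

    void age_ready_queue(struct job q[], int n) {
        for (int i = 0; i < n; i++)
            if (q[i].wait_ms >= AGING_THRESHOLD_MS && q[i].priority > 0) {
                q[i].priority--;     /* one step closer to the CPU */
                q[i].wait_ms = 0;    /* restart the aging clock    */
            }
    }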

Round Robin Scheduling

Round robin scheduling is similar to FCFS scheduling, except that each CPU burst is limited to a fixed length of time known as a time quantum.

When a process is given the CPU, a timer is set for whatever value has been set for a time quantum.

o If the process finishes its burst before the time quantum timer expires, then it is swapped out of the CPU just like the normal FCFS algorithm.

o If the timer goes off first, then the process is swapped out of the CPU and moved to the back end of the ready queue.

The ready queue is maintained as a circular queue, so when all processes have had a turn, then the scheduler gives the first process another turn, and so on.

RR scheduling can give the effect of all processes sharing the CPU equally, although the average wait time can be longer than with other scheduling algorithms. In the following example, with a time quantum of 4 ms, the average wait time is 5.66 ms.

Process   Burst Time
P1        24
P2        3
P3        3

The performance of RR is sensitive to the time quantum selected. If the quantum is large enough, then RR reduces to the FCFS algorithm; if it is very small, then each process gets 1/nth of the processor time and they share the CPU equally.

BUT, a real system incurs overhead for every context switch, and the smaller the time quantum the more context switches there are. Most modern systems use time quanta between 10 and 100 milliseconds, and context switch times on the order of 10 microseconds, so the overhead is small relative to the time quantum.


Turnaround time also varies with quantum time, in a non-obvious manner. Consider, for example, the processes shown in the figure below:

Fig- turnaround varies for each quantum.

In general, turnaround time is minimized if most processes finish their next CPU burst within one time quantum. For example, with three processes of 10 ms bursts each, the average turnaround time for a 1 ms quantum is 29 ms, and for a 10 ms quantum it reduces to 20 ms. However, if the quantum is made too large, then RR just degenerates to FCFS. A rule of thumb is that 80% of CPU bursts should be smaller than the time quantum.
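The 29 ms and 20 ms figures can be verified with a short simulation ( a C sketch; all three processes are assumed to arrive at time 0, and context-switch overhead is ignored ):

    #include <stdio.h>

    #define N 3

    /* Average turnaround time under RR for processes all arriving
       at time 0, ignoring context-switch overhead. */
    double rr_avg_turnaround(const int burst[], int n, int quantum) {
        int rem[N], t = 0, total = 0, done = 0;    /* requires n <= N */
        for (int i = 0; i < n; i++) rem[i] = burst[i];
        while (done < n)
            for (int i = 0; i < n; i++) {
                if (rem[i] == 0) continue;
                int slice = rem[i] < quantum ? rem[i] : quantum;
                t += slice;
                rem[i] -= slice;
                if (rem[i] == 0) { total += t; done++; }
            }
        return (double)total / n;
    }

    int main(void) {
        int burst[N] = { 10, 10, 10 };
        for (int q = 1; q <= 10; q++)    /* prints 29.0 for q = 1, 20.0 for q = 10 */
            printf("quantum %2d -> avg turnaround %.1f ms\n",
                   q, rr_avg_turnaround(burst, N, q));
        return 0;
    }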

Multilevel Queue Scheduling

When processes can be readily categorized, then multiple separate queues can be established, each implementing whatever scheduling algorithm is most appropriate for that type of job, and/or with different parametric adjustments.


Scheduling must also be done between queues, that is scheduling one queue to get time relative to other queues. Two common options are strict priority ( no job in a lower priority queue runs until all higher priority queues are empty ) and round-robin ( each queue gets a time slice in turn, possibly of different sizes. )

Note that under this algorithm jobs cannot switch from queue to queue - Once they are assigned a queue, that is their queue until they finish.

Fig- multilevel queue scheduling

Multilevel Feedback-Queue Scheduling

Multilevel feedback queue scheduling is similar to the ordinary multilevel queue scheduling described above, except jobs may be moved from one queue to another for a variety of reasons:

o If the characteristics of a job change between CPU-intensive and I/O intensive, then it may be appropriate to switch a job from one queue to another.

o Aging can also be incorporated, so that a job that has waited for a long time can get bumped up into a higher priority queue for a while.

Multilevel feedback queue scheduling is the most flexible, because it can be tuned for any situation. But it is also the most complex to implement because of all the adjustable parameters. Some of the parameters which define one of these systems include:

o The number of queues.

o The scheduling algorithm for each queue.


o The methods used to upgrade or demote processes from one queue to another. ( Which may be different. )

o The method used to determine which queue a process enters initially.

Fig-multilevel feedback queue scheduling

Deadlocks

Deadlock Characterization

Deadlock can arise if the following four conditions hold simultaneously:

1. Mutual exclusion: only one process at a time can use a resource.

2. Hold and wait: a process holding at least one resource is waiting to acquire additional resources held by other processes.

3. No preemption: a resource can be released only voluntarily by the process holding it, after that process has completed its task.

4. Circular wait: there exists a set { P0, P1, ..., Pn } of waiting processes such that P0 is waiting for a resource that is held by P1, P1 is waiting for a resource that is held by P2, ..., Pn-1 is waiting for a resource that is held by Pn, and Pn is waiting for a resource that is held by P0.

System Model

For the purposes of deadlock discussion, a system can be modeled as a collection of limited resources, which can be partitioned into different categories, to be allocated to a number of processes, each having different needs.

Resource categories may include memory, printers, CPUs, open files, tape drives, CD-ROMS, etc.

By definition, all the resources within a category are equivalent, and a request of this category can be equally satisfied by any one of the resources in that category. If this is not the case (  i.e. if there is some difference between the resources within a category ), then that category needs to be further divided into separate categories. For example, "printers" may need to be separated into "laser printers" and "color inkjet printers".

Some categories may have a single resource.

In normal operation a process must request a resource before using it, and release it when it is done, in the following sequence:

1. Request - If the request cannot be immediately granted, then the process must wait until the resource(s) it needs become available. Examples include the system calls open( ), malloc( ), new( ), and request( ).

2. Use - The process uses the resource, e.g. prints to the printer or reads from the file.

3. Release - The process relinquishes the resource, so that it becomes available for other processes. For example, close( ), free( ), delete( ), and release( ).

For all kernel-managed resources, the kernel keeps track of which resources are free and which are allocated, to which process they are allocated, and a queue of processes waiting for each resource to become available. Application-managed resources can be controlled using mutexes or wait( ) and signal( ) calls, ( i.e. binary or counting semaphores. )

A set of processes is deadlocked when every process in the set is waiting for a resource that is currently allocated to another process in the set ( and which can only be released when that other waiting process makes progress. )

Necessary Conditions

There are four conditions that are necessary to achieve deadlock:

1. Mutual Exclusion - At least one resource must be held in a non-sharable mode; if any other process requests this resource, then that process must wait for the resource to be released.

2. Hold and Wait - A process must be simultaneously holding at least one resource and waiting for at least one resource that is currently being held by some other process.

3. No preemption - Once a process is holding a resource ( i.e. once its request has been granted ), then that resource cannot be taken away from that process until the process voluntarily releases it.

4. Circular Wait - A set of processes { P0, P1, P2, . . ., PN } must exist such that every P[ i ] is waiting for P[ ( i + 1 ) % ( N + 1 ) ]. ( Note that this condition implies the hold-and-wait condition, but it is easier to deal with the conditions if the four are considered separately. )


Resource-Allocation Graph

In some cases deadlocks can be understood more clearly through the use of Resource-Allocation Graphs, having the following properties:

o A set of resource categories, { R1, R2, R3, . . ., RN }, which appear as square nodes on the graph. Dots inside the resource nodes indicate specific instances of the resource. ( E.g. two dots might represent two laser printers. )

o A set of processes, { P1, P2, P3, . . ., PN }.

o Request Edges - A set of directed arcs from Pi to Rj, indicating that process Pi has requested Rj, and is currently waiting for that resource to become available.

o Assignment Edges - A set of directed arcs from Rj to Pi indicating that resource Rj has been allocated to process Pi, and that Pi is currently holding resource Rj.

o Note that a request edge can be converted into an assignment edge by reversing the direction of the arc when the request is granted. ( However note also that request edges point to the category box, whereas assignment edges emanate from a particular instance dot within the box. )

o For example:

Figure - Resource allocation graph

If a resource-allocation graph contains no cycles, then the system is not deadlocked. ( When looking for cycles, remember that these are directed graphs. ) See the example in the figure above.

If a resource-allocation graph does contain cycles AND each resource category contains only a single instance, then a deadlock exists.

If a resource category contains more than one instance, then the presence of a cycle in the resource-allocation graph indicates the possibility of a deadlock, but does not guarantee one. Consider, for example, the two figures below:


Figure - Resource allocation graph with a deadlock

Figure - Resource allocation graph with a cycle but no deadlock

Methods for Handling Deadlocks

Generally speaking there are three ways of handling deadlocks:

1. Deadlock prevention or avoidance - Do not allow the system to get into a deadlocked state.

2. Deadlock detection and recovery - Abort a process or preempt some resources when deadlocks are detected.

3. Ignore the problem altogether - If deadlocks only occur once a year or so, it may be better to simply let them happen and reboot as necessary than to incur the constant overhead and system performance penalties associated with deadlock prevention or detection. This is the approach that both Windows and UNIX take.

In order to avoid deadlocks, the system must have additional information about all processes. In particular, the system must know what resources a process will or may request in the future. ( Ranging from a simple worst-case maximum to a complete resource request and release plan for each process, depending on the particular algorithm. )

Deadlock detection is fairly straightforward, but deadlock recovery requires either aborting processes or preempting resources, neither of which is an attractive alternative.


If deadlocks are neither prevented nor detected, then when a deadlock occurs the system will gradually slow down, as more and more processes become stuck waiting for resources currently held by the deadlocked processes and by other waiting processes. Unfortunately this slowdown can be indistinguishable from a general system slowdown, such as when a real-time process has heavy computing needs.

Deadlock Prevention

Deadlocks can be prevented by preventing at least one of the four required conditions:

Mutual Exclusion

Shared resources such as read-only files do not lead to deadlocks.

Unfortunately some resources, such as printers and tape drives, require exclusive access by a single process.

Hold and Wait

To prevent this condition processes must be prevented from holding one or more resources while simultaneously waiting for one or more others. There are several possibilities for this:

o Require that all processes request all resources at one time. This can be wasteful of system resources if a process needs one resource early in its execution and doesn't need some other resource until much later.

o Require that processes holding resources must release them before requesting new resources, and then re-acquire the released resources along with the new ones in a single new request. This can be a problem if a process has partially completed an operation using a resource and then fails to get it re-allocated after releasing it.

o Either of the methods described above can lead to starvation if a process requires one or more popular resources.

No Preemption

Preemption of process resource allocations can prevent this condition of deadlocks, when it is possible.

o One approach is that if a process is forced to wait when requesting a new resource, then all other resources previously held by this process are implicitly released, ( preempted ), forcing this process to re-acquire the old resources along with the new resources in a single request, similar to the previous discussion.

o Another approach is that when a resource is requested and not available, then the system looks to see what other processes currently have those resources and are themselves blocked waiting for some other resource. If such a process is found, then some of their resources may get preempted and added to the list of resources for which the process is waiting.

o Either of these approaches may be applicable for resources whose states are easily saved and restored, such as registers and memory, but are generally not applicable to other devices such as printers and tape drives.


Circular Wait

One way to avoid circular wait is to number all resources, and to require that processes request resources only in strictly increasing ( or decreasing ) order.

In other words, in order to request resource Rj, a process must first release all Ri such that i >= j.

One big challenge in this scheme is determining the relative ordering of the different resources.
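In application code the same idea appears as a fixed global lock ordering. A minimal sketch with POSIX mutexes ( the ordering-by-index convention here is our assumption, not a pthreads feature ):

    #include <pthread.h>

    /* Resources are numbered, and locks must always be taken in
       increasing index order, which rules out circular wait. */
    pthread_mutex_t lock[2] = { PTHREAD_MUTEX_INITIALIZER,
                                PTHREAD_MUTEX_INITIALIZER };

    void lock_pair(int a, int b) {
        int lo = a < b ? a : b;
        int hi = a < b ? b : a;
        pthread_mutex_lock(&lock[lo]);    /* lower-numbered resource first */
        pthread_mutex_lock(&lock[hi]);
    }

    void unlock_pair(int a, int b) {
        pthread_mutex_unlock(&lock[a]);
        pthread_mutex_unlock(&lock[b]);
    }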

Deadlock Avoidance

The general idea behind deadlock avoidance is to never allow the system to enter a state from which deadlock could become unavoidable, by deciding at run time whether each resource request can be granted safely.

This requires more information about each process, AND tends to lead to low device utilization. ( I.e. it is a conservative approach. )

In some algorithms the scheduler only needs to know the maximum number of each resource that a process might potentially use. In more complex algorithms the scheduler can also take advantage of the schedule of exactly what resources may be needed in what order.

When a scheduler sees that starting a process or granting resource requests may lead to future deadlocks, then that process is just not started or the request is not granted.

A resource allocation state is defined by the number of available and allocated resources, and the maximum requirements of all processes in the system.

Safe State

A state is safe if the system can allocate all resources requested by all processes ( up to their stated maximums ) without entering a deadlock state.

More formally, a state is safe if there exists a safe sequence of processes { P0, P1, P2, ..., PN } such that all of the resource requests for Pi can be granted using the currently available resources plus the resources allocated to all processes Pj where j < i. ( I.e. if all the processes prior to Pi finish and free up their resources, then Pi will be able to finish also, using the resources that they have freed up. )

If a safe sequence does not exist, then the system is in an unsafe state, which MAY lead to deadlock. ( All safe states are deadlock free, but not all unsafe states lead to deadlocks. )


Figure - Safe, unsafe, and deadlocked state spaces.

For example, consider a system with 12 tape drives, allocated as follows. Is this a safe state? What is the safe sequence?

       Maximum Needs   Current Allocation
P0     10              5
P1     4               2
P2     9               2

What happens to the above table if process P2 requests and is granted one more tape drive?

Key to the safe state approach is that when a request is made for resources, the request is granted only if the resulting allocation state is a safe one.

Resource-Allocation Graph Algorithm

If resource categories have only single instances of their resources, then deadlock states can be detected by cycles in the resource-allocation graphs.

In this case, unsafe states can be recognized and avoided by augmenting the resource-allocation graph with claim edges, noted by dashed lines, which point from a process to a resource that it may request in the future.

In order for this technique to work, all claim edges must be added to the graph for any particular process before that process is allowed to request any resources. (  Alternatively, processes may only make requests for resources for which they have already established claim edges, and claim edges cannot be added to any process that is currently holding resources. )

When a process makes a request, the claim edge Pi->Rj is converted to a request edge. Similarly when a resource is released, the assignment edge reverts back to a claim edge.

This approach works by denying requests that would produce cycles in the resource-allocation graph, taking claim edges into account.


Consider for example what happens when process P2 requests resource R2:

Figure - Resource allocation graph for deadlock avoidance

The resulting resource-allocation graph would have a cycle in it, and so the request cannot be granted.

Figure - An unsafe state in a resource allocation graph

Banker's Algorithm

For resource categories that contain more than one instance the resource-allocation graph method does not work, and more complex ( and less efficient ) methods must be chosen.

The Banker's Algorithm gets its name because it is a method that bankers could use to assure that when they lend out resources they will still be able to satisfy all their clients. ( A banker won't loan out a little money to start building a house unless they are assured that they will later be able to loan out the rest of the money to finish the house. )

When a process starts up, it must state in advance the maximum allocation of resources it may request, up to the amount available on the system.

When a request is made, the scheduler determines whether granting the request would leave the system in a safe state. If not, then the process must wait until the request can be granted safely.

The banker's algorithm relies on several key data structures: ( where n is the number of processes and m is the number of resource categories. )

o Available[ m ] indicates how many resources are currently available of each type.

o Max[ n ][ m ] indicates the maximum demand of each process of each resource.

o Allocation[ n ][ m ] indicates the number of each resource category allocated to each process.


o Need[ n ][ m ] indicates the remaining resources needed of each type for each process. ( Note that Need[ i ][ j ] = Max[ i ][ j ] - Allocation[ i ][ j ] for all i, j. )

For simplification of discussions, we make the following notations / observations:

o One row of the Need vector, Need[ i ], can be treated as a vector corresponding to the needs of process i, and similarly for Allocation and Max.

o A vector X is considered to be <= a vector Y if X[ i ] <= Y[ i ] for all i.

Safety Algorithm

In order to apply the Banker's algorithm, we first need an algorithm for determining whether or not a particular state is safe.

This algorithm determines if the current state of a system is safe, according to the following steps:

1. Let Work and Finish be vectors of length m and n respectively.

Work is a working copy of the available resources, which will be modified during the analysis.

Finish is a vector of booleans indicating whether a particular process can finish ( or has finished so far in the analysis ).

Initialize Work to Available, and Finish to false for all elements.

2. Find an i such that both (A) Finish[ i ] == false, and (B) Need[ i ] <= Work. This process has not finished, but could with the given available working set. If no such i exists, go to step 4.

3. Set Work = Work + Allocation[ i ], and set Finish[ i ] to true. This corresponds to process i finishing up and releasing its resources back into the work pool. Then loop back to step 2.

4. If Finish[ i ] == true for all i, then the state is a safe state, because a safe sequence has been found.

( JTB's Modification:

1. In step 1, instead of making Finish an array of booleans initialized to false, make it an array of ints initialized to 0. Also initialize an int s = 0 as a step counter.

2. In step 2, look for Finish[ i ] == 0.

3. In step 3, set Finish[ i ] to ++s. s is counting the number of finished processes.

4. For step 4, the test can be either Finish[ i ] > 0 for all i, or s >= n. The benefit of this method is that if a safe state exists, then Finish[ ] indicates one safe sequence ( of possibly many ). )
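As a concrete illustration, the following C sketch implements the safety test using JTB's integer variant, so that finish[ ] doubles as a record of one safe sequence. ( Array sizes and names are ours; this is a sketch, not production code. )

    #include <stdbool.h>

    #define NPROC 5    /* n processes           */
    #define NRES  3    /* m resource categories */

    /* Returns true if the state is safe. On success, finish[ i ] holds the
       1-based step at which Pi finishes in one safe sequence. */
    bool is_safe(int available[NRES],
                 int allocation[NPROC][NRES],
                 int need[NPROC][NRES],
                 int finish[NPROC]) {
        int work[NRES], s = 0;
        for (int j = 0; j < NRES; j++) work[j] = available[j];   /* step 1 */
        for (int i = 0; i < NPROC; i++) finish[i] = 0;

        bool progress = true;
        while (progress) {
            progress = false;
            for (int i = 0; i < NPROC; i++) {      /* step 2 */
                if (finish[i]) continue;
                bool can_finish = true;            /* Need[ i ] <= Work ? */
                for (int j = 0; j < NRES; j++)
                    if (need[i][j] > work[j]) { can_finish = false; break; }
                if (can_finish) {                  /* step 3 */
                    for (int j = 0; j < NRES; j++)
                        work[j] += allocation[i][j];
                    finish[i] = ++s;
                    progress = true;
                }
            }
        }
        return s == NPROC;                         /* step 4 */
    }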

Resource-Request Algorithm ( The Banker's Algorithm )

Now that we have a tool for determining if a particular state is safe or not, we are now ready to look at the Banker's algorithm itself.

This algorithm determines if a new request is safe, and grants it only if it is safe to do so.

When a request is made ( that does not exceed currently available resources ), pretend it has been granted, and then see if the resulting state is a safe one. If so, grant the request, and if not, deny the request, as follows:

1. Let Request[ n ][ m ] indicate the number of resources of each type currently requested by processes. If Request[ i ] > Need[ i ] for any process i, raise an error condition.


2. If Request[ i ] > Available for any process i, then that process must wait for resources to become available. Otherwise the process can continue to step 3.

3. Check to see if the request can be granted safely, by pretending it has been granted and then seeing if the resulting state is safe. If so, grant the request, and if not, then the process must wait until its request can be granted safely. The procedure for granting a request ( or pretending to, for testing purposes ) is:

Available = Available - Request
Allocation = Allocation + Request
Need = Need - Request
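Continuing the C sketch from the safety algorithm above ( same NPROC / NRES sizes and the same is_safe( ) helper ), the request step becomes a trial allocation that is rolled back if the trial state turns out to be unsafe:

    /* Tries to grant Request for process i; returns true if granted. */
    bool request_resources(int i, int request[NRES],
                           int available[NRES],
                           int allocation[NPROC][NRES],
                           int need[NPROC][NRES]) {
        int finish[NPROC];

        for (int j = 0; j < NRES; j++) {
            if (request[j] > need[i][j])   return false;   /* error: exceeds claim */
            if (request[j] > available[j]) return false;   /* must wait */
        }
        for (int j = 0; j < NRES; j++) {     /* pretend to grant the request */
            available[j]     -= request[j];
            allocation[i][j] += request[j];
            need[i][j]       -= request[j];
        }
        if (is_safe(available, allocation, need, finish))
            return true;                     /* safe: grant for real */
        for (int j = 0; j < NRES; j++) {     /* unsafe: roll back */
            available[j]     += request[j];
            allocation[i][j] -= request[j];
            need[i][j]       += request[j];
        }
        return false;                        /* process must wait */
    }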

An Illustrative Example

Consider the following situation ( the classic example from the text, with five processes and three resource categories A, B, and C, having 10, 5, and 7 total instances respectively ):

            Allocation     Max         Available
            A  B  C        A  B  C     A  B  C
      P0    0  1  0        7  5  3     3  3  2
      P1    2  0  0        3  2  2
      P2    3  0  2        9  0  2
      P3    2  1  1        2  2  2
      P4    0  0  2        4  3  3

And now consider what happens if process P1 requests 1 instance of A and 2 instances of C. ( Request[ 1 ] = ( 1, 0, 2 ) )

What about requests of ( 3, 3, 0 ) by P4? Or ( 0, 2, 0 ) by P0? Can these be safely granted? Why or why not?

Deadlock Detection


If deadlocks are not avoided, then another approach is to detect when they have occurred and recover somehow.

In addition to the performance hit of constantly checking for deadlocks, a policy / algorithm must be in place for recovering from deadlocks, and there is potential for lost work when processes must be aborted or have their resources preempted.

Single Instance of Each Resource Type

If each resource category has a single instance, then we can use a variation of the resource-allocation graph known as a wait-for graph.

A wait-for graph can be constructed from a resource-allocation graph by eliminating the resources and collapsing the associated edges, as shown in the figure below.

An arc from Pi to Pj in a wait-for graph indicates that process Pi is waiting for a resource that process Pj is currently holding.

Figure - (a) Resource allocation graph. (b) Corresponding wait-for graph

As before, cycles in the wait-for graph indicate deadlocks. This algorithm must maintain the wait-for graph, and periodically search it for cycles.
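Searching a wait-for graph for cycles is a standard depth-first search. A minimal C sketch over an adjacency matrix ( fixed size; the names are ours ):

    #include <stdbool.h>

    #define NP 8    /* number of processes */

    /* waits_for[ i ][ j ] is true if Pi waits on a resource held by Pj. */
    bool waits_for[NP][NP];

    static bool dfs(int i, bool visited[], bool on_path[]) {
        visited[i] = on_path[i] = true;
        for (int j = 0; j < NP; j++)
            if (waits_for[i][j]) {
                if (on_path[j]) return true;    /* back edge: a cycle */
                if (!visited[j] && dfs(j, visited, on_path)) return true;
            }
        on_path[i] = false;
        return false;
    }

    /* True if the wait-for graph contains a cycle, i.e. a deadlock. */
    bool deadlocked(void) {
        bool visited[NP] = { false }, on_path[NP] = { false };
        for (int i = 0; i < NP; i++)
            if (!visited[i] && dfs(i, visited, on_path)) return true;
        return false;
    }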

Several Instances of a Resource Type

The detection algorithm outlined here is essentially the same as the Banker's algorithm, with two subtle differences:

o In step 1, the Banker's Algorithm sets Finish[ i ] to false for all i. The algorithm presented here sets Finish[ i ] to false only if Allocation[ i ] is not zero. If the currently allocated resources for this process are zero, the algorithm sets Finish[ i ] to true. This is essentially assuming that IF all of the other processes can finish, then this process can finish also. Furthermore, this algorithm is specifically looking for which processes are involved in a deadlock situation, and a process that does not have any resources allocated cannot be involved in a deadlock, and so can be removed from any further consideration.

o Steps 2 and 3 are unchanged.

o In step 4, the basic Banker's Algorithm says that if Finish[ i ] == true for all i, then there is no deadlock. This algorithm is more specific, by stating that if Finish[ i ] == false for any process Pi, then that process is specifically involved in the deadlock which has been detected.

( Note: An alternative method was presented above, in which Finish held integers instead of booleans. This vector would be initialized to all zeros, and then filled with increasing integers as processes are detected which can finish. If any processes are left at zero when the algorithm completes, then there is a deadlock, and if not, then the integers in Finish describe a safe sequence. To modify this algorithm to match this section of the text, processes with allocation = zero could be filled in with N, N - 1, N - 2, etc. in step 1, and any processes left with Finish = 0 in step 4 are the deadlocked processes. )

Consider, for example, the following state, and determine if it is currently deadlocked ( five processes; resource types A, B, and C with 7, 2, and 6 total instances; all instances are currently allocated, so Available = ( 0, 0, 0 ) ):

            Allocation     Request      Available
            A  B  C        A  B  C      A  B  C
      P0    0  1  0        0  0  0      0  0  0
      P1    2  0  0        2  0  2
      P2    3  0  3        0  0  0
      P3    2  1  1        1  0  0
      P4    0  0  2        0  0  2

Now suppose that process P2 makes a request for an additional instance of type C ( so P2's row of the Request matrix becomes ( 0, 0, 1 ) ). Is the system now deadlocked?

Detection-Algorithm Usage

When should the deadlock detection be done? Frequently, or infrequently?

The answer may depend on how frequently deadlocks are expected to occur, as well as the possible consequences of not catching them immediately. ( If deadlocks are not removed immediately when they occur, then more and more processes can "back up" behind the deadlock, making the eventual task of unblocking the system more difficult and possibly damaging to more processes. )

There are two obvious approaches, each with trade-offs:

1. Do deadlock detection after every resource allocation which cannot be immediately granted. This has the advantage of detecting the deadlock right away, while the minimum number of processes are involved in the deadlock. ( One might consider that the process whose request triggered the deadlock condition is the "cause" of the deadlock, but realistically all of the processes in the cycle are equally responsible for the resulting deadlock. ) The down side of this approach is the extensive overhead and performance hit caused by checking for deadlocks so frequently.

2. Do deadlock detection only when there is some clue that a deadlock may have occurred, such as when CPU utilization reduces to 40% or some other magic number. The advantage is that deadlock detection is done much less frequently, but the down side is that it becomes impossible to detect the processes involved in the original deadlock, and so deadlock recovery can be more complicated and damaging to more processes.

3. ( As I write this, a third alternative comes to mind: Keep a historical log of resource allocations, since the last known time of no deadlocks. Do deadlock checks periodically ( once an hour, or when CPU usage is low? ), and then use the historical log to trace through and determine when the deadlock occurred and what processes caused the initial deadlock. Unfortunately I'm not certain that breaking the original deadlock would then free up the resulting log jam. )

Recovery From Deadlock

There are three basic approaches to recovery from deadlock:

1. Inform the system operator, and allow him/her to intervene manually.
2. Terminate one or more processes involved in the deadlock.
3. Preempt resources.

Process Termination

Two basic approaches, both of which recover resources allocated to terminated processes:

o Terminate all processes involved in the deadlock. This definitely solves the deadlock, but at the expense of terminating more processes than would be absolutely necessary.

o Terminate processes one by one until the deadlock is broken. This is more conservative, but requires doing deadlock detection after each step.

In the latter case there are many factors that can go into deciding which processes to terminate next:

1. Process priorities.
2. How long the process has been running, and how close it is to finishing.
3. How many and what type of resources the process is holding. ( Are they easy to preempt and restore? )
4. How many more resources the process needs in order to complete.
5. How many processes will need to be terminated.
6. Whether the process is interactive or batch.
7. ( Whether or not the process has made non-restorable changes to any resource. )

Resource Preemption

When preempting resources to relieve deadlock, there are three important issues to be addressed:

1. Selecting a victim - Deciding which resources to preempt from which processes involves many of the same decision criteria outlined above.

2. Rollback - Ideally one would like to roll back a preempted process to a safe state prior to the point at which that resource was originally allocated to the process. Unfortunately it can be difficult or impossible to determine what such a safe state is, and so the only safe rollback is to roll back all the way back to the beginning. ( I.e. abort the process and make it start over. )


3. Starvation - How do you guarantee that a process won't starve because its resources are constantly being preempted? One option would be to use a priority system, and increase the priority of a process every time its resources get preempted. Eventually it should get a high enough priority that it won't get preempted any more.

Unit-III


Memory management

Basic Hardware

It should be noted that from the memory chips' point of view, all memory accesses are equivalent. The memory hardware doesn't know what a particular part of memory is being used for, nor does it care. This is almost true of the OS as well, although not entirely.

The CPU can only access its registers and main memory. It cannot, for example, make direct access to the hard drive, so any data stored there must first be transferred into the main memory chips before the CPU can work with it. ( Device drivers communicate with their hardware via interrupts and "memory" accesses, sending short instructions for example to transfer data from the hard drive to a specified location in main memory. The disk controller monitors the bus for such instructions, transfers the data, and then notifies the CPU that the data is there with another interrupt, but the CPU never gets direct access to the disk. )

Memory accesses to registers are very fast, generally one clock tick, and a CPU may be able to execute more than one machine instruction per clock tick.

Memory accesses to main memory are comparatively slow, and may take a number of clock ticks to complete. This would require intolerable waiting by the CPU if it were not for an intermediary fast memory cache built into most modern CPUs. The basic idea of the cache is to transfer chunks of memory at a time from the main memory to the cache, and then to access individual memory locations one at a time from the cache.

User processes must be restricted so that they only access memory locations that "belong" to that particular process. This is usually implemented using a base register and a limit register for each process, as shown in the two figures below. Every memory access made by a user process is checked against these two registers, and if a memory access is attempted outside the valid range, then a fatal error is generated. The OS obviously has access to all existing memory locations, as this is necessary to swap users' code and data in and out of memory. It should also be obvious that changing the contents of the base and limit registers is a privileged activity, allowed only to the OS kernel.

Figure - A base and a limit register define a logical address space


Figure - Hardware address protection with base and limit registers

Address Binding

User programs typically refer to memory addresses with symbolic names such as "i", "count", and "averageTemperature". These symbolic names must be mapped or bound to physical memory addresses, which typically occurs in several stages:

o Compile Time - If it is known at compile time where a program will reside in physical memory, then absolute code can be generated by the compiler, containing actual physical addresses. However if the load address changes at some later time, then the program will have to be recompiled. DOS .COM programs use compile time binding.

o Load Time - If the location at which a program will be loaded is not known at compile time, then the compiler must generate relocatable code, which references addresses relative to the start of the program. If that starting address changes, then the program must be reloaded but not recompiled.

o Execution Time - If a program can be moved around in memory during the course of its execution, then binding must be delayed until execution time. This requires special hardware, and is the method implemented by most modern OSes.

The figure below shows the various stages of the binding process and the units involved in each stage:


Figure - Multistep processing of a user program

Logical Versus Physical Address Space

The address generated by the CPU is a logical address, whereas the address actually seen by the memory hardware is a physical address.

Addresses bound at compile time or load time have identical logical and physical addresses.

Addresses created at execution time, however, have different logical and physical addresses.

o In this case the logical address is also known as a virtual address, and the two terms are used interchangeably by our text.

o The set of all logical addresses used by a program composes the logical address space, and the set of all corresponding physical addresses composes the physical address space.

The run time mapping of logical to physical addresses is handled by the memory-management unit, MMU.

o The MMU can take on many forms. One of the simplest is a modification of the base-register scheme described earlier.

o The base register is now termed a relocation register, whose value is added to every memory request at the hardware level.

Note that user programs never see physical addresses. User programs work entirely in logical address space, and any memory references or manipulations are done using purely logical addresses. Only when the address gets sent to the physical memory chips is the physical memory address generated.


Figure - Dynamic relocation using a relocation register
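In effect the MMU performs one comparison and one addition per memory reference. A C sketch of the check ( the register values are illustrative, and a real MMU does this in hardware, not in a function call ):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Loaded by the kernel on every context switch; values illustrative. */
    uint32_t relocation = 14000;    /* where the process starts in RAM   */
    uint32_t limit      = 12000;    /* size of its logical address space */

    /* What the hardware does on every memory reference. */
    uint32_t translate(uint32_t logical) {
        if (logical >= limit) {               /* outside the process's space */
            fprintf(stderr, "trap: addressing error\n");
            exit(EXIT_FAILURE);               /* the kernel would kill the process */
        }
        return logical + relocation;          /* physical address */
    }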

Dynamic Loading

Rather than loading an entire program into memory at once, dynamic loading loads up each routine as it is called. The advantage is that unused routines need never be loaded, reducing total memory usage and generating faster program startup times. The downside is the added complexity and overhead of checking to see if a routine is loaded every time it is called, and then loading it if it is not already loaded.

Dynamic Linking and Shared Libraries

With static linking, library modules are fully included in executable modules, wasting both disk space and main memory, because every program that includes a certain routine from the library must have its own copy of that routine linked into its executable code.

With dynamic linking, however, only a stub is linked into the executable module, containing references to the actual library module linked in at run time.

o This method saves disk space, because the library routines do not need to be fully included in the executable modules, only the stubs.

o We will also learn that if the code section of the library routines is reentrant, ( meaning it does not modify the code while it runs, making it safe to re-enter it ), then main memory can be saved by loading only one copy of dynamically linked routines into memory and sharing the code amongst all processes that are concurrently using it. ( Each process would have their own copy of the data section of the routines, but that may be small relative to the code segments. ) Obviously the OS must manage shared routines in memory.

o An added benefit of dynamically linked libraries ( DLLs, also known as shared libraries or shared objects on UNIX systems ) involves easy upgrades and updates. When a program uses a routine from a standard library and the routine changes, then the program must be re-built ( re-linked ) in order to incorporate the changes. However if DLLs are used, then as long as the stub doesn't change, the program can be updated merely by loading new versions of the DLLs onto the system. Version information is maintained in both the program and the DLLs, so that a program can specify a particular version of the DLL if necessary.

o In practice, the first time a program calls a DLL routine, the stub will recognize the fact and will replace itself with the actual routine from the DLL library. Further calls to the same routine will access the routine directly and not incur the overhead of the stub access.


Swapping

A process must be loaded into memory in order to execute.

If there is not enough memory available to keep all running processes in memory at the same time, then some processes that are not currently using the CPU may have their memory swapped out to a fast local disk called the backing store.

Standard Swapping

If compile-time or load-time address binding is used, then processes must be swapped back into the same memory location from which they were swapped out. If execution time binding is used, then the processes can be swapped back into any available location.

Swapping is a very slow process compared to other operations. For example, if a user process occupied 10 MB and the transfer rate for the backing store were 40 MB per second, then it would take 1/4 second ( 250 milliseconds ) just to do the data transfer. Adding in a latency lag of 8 milliseconds, ignoring head seek time for the moment, and further recognizing that swapping involves moving old data out as well as new data in, the overall transfer time required for this swap is 2 x ( 250 + 8 ) = 516 milliseconds, or over half a second. For efficient processor scheduling the CPU time slice should be significantly longer than this lost transfer time.

To reduce swapping transfer overhead, it is desired to transfer as little information as possible, which requires that the system know how much memory a process is using, as opposed to how much it might use. Programmers can help with this by freeing up dynamic memory that they are no longer using.

It is important to swap processes out of memory only when they are idle, or more to the point, only when there are no pending I/O operations. ( Otherwise the pending I/O operation could write into the wrong process's memory space. ) The solution is to either swap only totally idle processes, or do I/O operations only into and out of OS buffers, which are then transferred to or from process's main memory as a second step.

Most modern OSes no longer use swapping, because it is too slow and there are faster alternatives available ( e.g. paging ). However some UNIX systems will still invoke swapping if the system gets extremely full, and then discontinue swapping when the load reduces again. Windows 3.1 would use a modified version of swapping that was somewhat controlled by the user, swapping processes out if necessary and then only swapping them back in when the user focused on that particular window.


Figure - Swapping of two processes using a disk as a backing store

Contiguous Memory Allocation

One approach to memory management is to load each process into a contiguous space. The operating system is allocated space first, usually at either low or high memory locations, and then the remaining available memory is allocated to processes as needed. ( The OS is usually loaded low, because that is where the interrupt vectors are located, but on older systems part of the OS was loaded high to make more room in low memory ( within the 640K barrier ) for user processes. )

Memory Protection ( was Memory Mapping and Protection )

The system shown in the figure below allows protection against user programs accessing areas that they should not, allows programs to be relocated to different memory starting addresses as needed, and allows the memory space devoted to the OS to grow or shrink dynamically as needs change.


Figure - Hardware support for relocation and limit registers

Memory Allocation

One method of allocating contiguous memory is to divide all available memory into equal sized partitions, and to assign each process to its own partition. This restricts both the number of simultaneous processes and the maximum size of each process, and is no longer used.

An alternate approach is to keep a list of unused ( free ) memory blocks ( holes ), and to find a hole of a suitable size whenever a process needs to be loaded into memory. There are many different strategies for finding the "best" allocation of memory to processes, including the three most commonly discussed:

1. First fit - Search the list of holes until one is found that is big enough to satisfy the request, and assign a portion of that hole to that process. Whatever fraction of the hole not needed by the request is left on the free list as a smaller hole. Subsequent requests may start looking either from the beginning of the list or from the point at which this search ended.

2. Best fit - Allocate the smallest hole that is big enough to satisfy the request. This saves large holes for other process requests that may need them later, but the resulting unused portions of holes may be too small to be of any use, and will therefore be wasted. Keeping the free list sorted can speed up the process of finding the right hole.

3. Worst fit - Allocate the largest hole available, thereby increasing the likelihood that the remaining portion will be usable for satisfying future requests.

Simulations show that either first or best fit are better than worst fit in terms of both time and storage utilization. First and best fits are about equal in terms of storage utilization, but first fit is faster.
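A first-fit allocator over a linked free list can be sketched in a few lines of C ( the structure names are ours, and hole-splitting is simplified; zero-size holes are left on the list ):

    #include <stddef.h>

    struct hole {
        size_t start, size;
        struct hole *next;
    };

    /* First fit: take the first hole big enough and shrink it in place.
       Returns the start address, or (size_t)-1 if nothing fits. */
    size_t first_fit(struct hole *free_list, size_t request) {
        for (struct hole *h = free_list; h != NULL; h = h->next)
            if (h->size >= request) {
                size_t addr = h->start;
                h->start += request;    /* remainder stays on the free list */
                h->size  -= request;
                return addr;
            }
        return (size_t)-1;    /* no hole big enough: external fragmentation */
    }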

Fragmentation

All the memory allocation strategies suffer from external fragmentation, though first and best fits experience the problems more so than worst fit. External fragmentation means that the available memory is broken up into lots of little pieces, none of which is big enough to satisfy the next memory requirement, although the sum total could.

The amount of memory lost to fragmentation may vary with algorithm, usage patterns, and some design decisions such as which end of a hole to allocate and which end to save on the free list.


Statistical analysis of first fit, for example, shows that for N blocks of allocated memory, another 0.5 N will be lost to fragmentation.

Internal fragmentation also occurs, with all memory allocation strategies. This is caused by the fact that memory is allocated in blocks of a fixed size, whereas the actual memory needed will rarely be that exact size. For a random distribution of memory requests, on the average 1/2 block will be wasted per memory request, because on the average the last allocated block will be only half full.

o Note that the same effect happens with hard drives, and that modern hardware gives us increasingly larger drives and memory at the expense of ever larger block sizes, which translates to more memory lost to internal fragmentation.

o Some systems use variable size blocks to minimize losses due to internal fragmentation.

If the programs in memory are relocatable, ( using execution-time address binding ), then the external fragmentation problem can be reduced via compaction, i.e. moving all processes down to one end of physical memory. This only involves updating the relocation register for each process, as all internal work is done using logical addresses.

Another solution as we will see in upcoming sections is to allow processes to use non-contiguous blocks of physical memory, with a separate relocation register for each block.

Segmentation

Basic Method

Most users ( programmers ) do not think of their programs as existing in one continuous linear address space.

Rather they tend to think of their memory in multiple segments, each dedicated to a particular use, such as code, data, the stack, the heap, etc.

Memory segmentation supports this view by providing addresses with a segment number ( mapped to a segment base address ) and an offset from the beginning of that segment.

For example, a C compiler might generate 5 segments for the user code, library code, global ( static ) variables, the stack, and the heap, as shown in the figure below:

Figure - Programmer's view of a program.


Segmentation Hardware

A segment table maps segment-offset addresses to physical addresses, and simultaneously checks for invalid addresses, using a system similar to the page tables and relocation base registers discussed previously. ( Note that at this point in the discussion of segmentation, each segment is kept in contiguous memory and may be of different sizes, but that segmentation can also be combined with paging as we shall see shortly. )

Figure Segmentation hardware

Figure Example of segmentation

Paging

Paging is a memory management scheme that allows a process's physical memory to be non-contiguous, and which eliminates external fragmentation by allocating memory in equal-sized blocks known as pages.


Paging eliminates most of the problems of the other methods discussed previously, and is the predominant memory management technique used today.

Basic Method

The basic idea behind paging is to divide physical memory into a number of equal-sized blocks called frames, and to divide a program's logical memory space into blocks of the same size called pages.

Any page ( from any process ) can be placed into any available frame. The page table is used to look up what frame a particular page is stored in at the moment. In the following example, for instance, page 2 of the program's logical memory is currently stored in frame 3 of physical memory:

Figure - Paging hardware

Figure - Paging model of logical and physical memory

A logical address consists of two parts: A page number in which the address resides, and an offset from the beginning of that page. ( The number of bits in the page number limits how many pages a single process can address. The number of bits in the offset determines the maximum size of each page, and should correspond to the system frame size. )


The page table maps the page number to a frame number, to yield a physical address which also has two parts: The frame number and the offset within that frame. The number of bits in the frame number determines how many frames the system can address, and the number of bits in the offset determines the size of each frame.

Page numbers, frame numbers, and frame sizes are determined by the architecture, but are typically powers of two, allowing addresses to be split at a certain number of bits. For example, if the logical address space is 2^m and the page size is 2^n, then the high-order m - n bits of a logical address designate the page number and the remaining n bits represent the offset.

Note also that the number of bits in the page number and the number of bits in the frame number do not have to be identical. The former determines the address range of the logical address space, and the latter relates to the physical address space.

( DOS used to use an addressing scheme with 16-bit frame numbers and 16-bit offsets, on hardware that only supported 20-bit physical addresses. The result was a resolution of starting frame addresses finer than the size of a single frame, and multiple frame-offset combinations that mapped to the same physical hardware address. )

Consider the following micro example, in which a process has 16 bytes of logical memory, mapped in 4 byte pages into 32 bytes of physical memory. ( Presumably some other processes would be consuming the remaining 16 bytes of physical memory. )

Figure - Paging example for a 32-byte memory with 4-byte pages

Note that paging is like having a table of relocation registers, one for each page of the logical memory.
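To make the translation concrete, here is a sketch using the micro example above; the page-table contents ( pages 0-3 mapped to frames 5, 6, 1, 2 ) are one plausible assignment, chosen for illustration:

# Translate logical addresses for a 32-byte memory with 4-byte pages.
# The page-table contents below are an assumed example mapping.

PAGE_SIZE = 4                            # 2^2 bytes, so the offset is the low 2 bits
page_table = {0: 5, 1: 6, 2: 1, 3: 2}    # page number -> frame number

def translate(logical):
    page = logical // PAGE_SIZE          # high-order bits select the page
    offset = logical % PAGE_SIZE         # low-order bits are the offset
    return page_table[page] * PAGE_SIZE + offset

print(translate(9))    # page 2, offset 1 -> frame 1 -> physical address 5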

There is no external fragmentation with paging. All blocks of physical memory are used, and there are no gaps in between and no problems with finding the right sized hole for a particular chunk of memory.


There is, however, internal fragmentation. Memory is allocated in chunks the size of a page, and on the average, the last page will only be half full, wasting on the average half a page of memory per process. ( Possibly more, if processes keep their code and data in separate pages. )

Larger page sizes waste more memory, but are more efficient in terms of overhead. Modern trends have been toward larger page sizes, and some systems even support multiple page sizes to try to get the best of both worlds.

Page table entries ( frame numbers ) are typically 32 bit numbers, allowing access to 2^32 physical page frames. If those frames are 4 KB in size each, that translates to 16 TB of addressable physical memory. ( 32 + 12 = 44 bits of physical address space. )

When a process requests memory ( e.g. when its code is loaded in from disk ), free frames are allocated from a free-frame list, and inserted into that process's page table.

Processes are blocked from accessing any other process's memory because all of their memory requests are mapped through their page table. There is no way for a process to generate an address that maps into another process's memory space.

The operating system must keep track of each individual process's page table, updating it whenever the process's pages get moved in and out of memory, and applying the correct page table when processing system calls for a particular process. This all increases the overhead involved when swapping processes in and out of the CPU. ( The currently active page table must be updated to reflect the process that is currently running. )

Figure - Free frames (a) before allocation and (b) after allocation


Figure - Valid (v) or invalid (i) bit in page table


Shared Pages

Paging systems can make it very easy to share blocks of memory, by simply mapping entries in multiple processes' page tables to the same physical frames. This may be done with either code or data.

If code is reentrant, that means it does not write to or change itself in any way ( it is not self-modifying ), and it is therefore safe to re-enter. More importantly, it means the code can be shared by multiple processes, so long as each has its own copy of the data and registers, including the instruction register.

In the example given below, three different users are running the editor simultaneously, but the code is only loaded into memory ( in the page frames ) one time.

Some systems also implement shared memory in this fashion.

Figure - Sharing of code in a paging environment

Structure of the Page Table

Hierarchical Paging

Most modern computer systems support logical address spaces of 2^32 to 2^64. With a 2^32 address space and 4K ( 2^12 ) pages, this leaves 2^20 entries in the page table. At 4 bytes per entry, this amounts to a 4 MB page table, which is too large to reasonably keep in contiguous memory ( and to swap in and out of memory with each process switch ). Note that with 4K pages, it would take 1024 pages just to hold the page table!

One option is to use a two-tier paging system, i.e. to page the page table. For example, the 20 bits described above could be broken down into two 10-bit page numbers. The first identifies an entry in the outer page table, which identifies where in memory to find one page of an inner page table. The second 10 bits finds a specific entry in that inner page table, which in turn identifies a particular frame in physical memory. ( The remaining 12 bits of the 32-bit logical address are the offset within the 4K frame. )
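A sketch of how a 32-bit address splits under this two-level scheme; this is pure bit arithmetic, with no actual page tables involved:

# Split a 32-bit logical address into a 10-bit outer index, a 10-bit inner
# index, and a 12-bit offset, as described above.

def split_two_level(addr):
    offset = addr & 0xFFF          # low 12 bits: offset within the 4K frame
    inner = (addr >> 12) & 0x3FF   # next 10 bits: entry in the inner page table
    outer = (addr >> 22) & 0x3FF   # high 10 bits: entry in the outer page table
    return outer, inner, offset

print(split_two_level(0x12345678))   # (72, 837, 1656)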


Figure - A two-level page-table scheme

Figure - Address translation for a two-level 32-bit paging architecture

The VAX architecture divides its 32-bit addresses into 4 equal-sized sections, and each page is 512 bytes, yielding an address form of a 2-bit section number, a 21-bit page number, and a 9-bit offset.

With a 64-bit logical address space and 4K pages, there are 52 bits worth of page numbers, which is still too many even for two-level paging. One could keep increasing the number of paging levels, but with 10-bit page tables it could take up to seven levels of indirection, which would make memory access prohibitively slow. So some other approach must be used.

With 64-bit addresses, a two-tiered scheme leaves 42 bits for the outer page table, and even going to a fourth level still leaves 32 bits in the outermost table.

Hashed Page Tables


One common data structure for accessing data that is sparsely distributed over a broad range of possible values is the hash table. Figure 8.16 below illustrates a hashed page table using chain-and-bucket hashing:

Figure - Hashed page table
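A minimal sketch of a hashed page table with chained buckets; the bucket count, hash function, and page numbers are all invented for illustration:

# Hashed page table: each bucket chains (page, frame) pairs, so sparse page
# numbers don't require one giant linear table. Toy values throughout.

NUM_BUCKETS = 16
buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(page, frame):
    buckets[page % NUM_BUCKETS].append((page, frame))

def lookup(page):
    for p, f in buckets[page % NUM_BUCKETS]:   # walk the chain in this bucket
        if p == page:
            return f
    return None                                # not mapped: page fault

insert(0x7C3, 42)
insert(0x3A3, 7)       # hashes to the same bucket as 0x7C3, so it is chained
print(lookup(0x7C3))   # 42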

Inverted Page Tables

Another approach is to use an inverted page table. Instead of a table listing all of the pages for a particular process, an inverted page table lists all of the pages currently loaded in memory, for all processes. ( I.e. there is one entry per frame instead of one entry per page. )

Access to an inverted page table can be slow, as it may be necessary to search the entire table in order to find the desired page ( or to discover that it is not there ). Hashing the table can help speed up the search process.

Inverted page tables prohibit the normal method of implementing shared memory, which is to map multiple logical pages to a common physical frame. ( Because each frame is now mapped to one and only one process. )

Figure - Inverted page table

Virtual memory

A cache stores a subset of the address space of RAM. An address space is the set of valid addresses. Thus, for each address in cache, there is a corresponding address in RAM. This subset of addresses (and the corresponding copy of data) changes over time, based on the behavior of your program.


The cache is used to keep the most commonly used sections of RAM close to the CPU, where they can be accessed quickly. This is necessary because CPU speeds have increased much faster than memory-access speeds. If we could access RAM at 3 GHz, there wouldn't be any need for cache, because RAM could keep up. Because it can't keep up, we use cache.

What if we wanted more RAM than we had available? For example, we might have 1 MB of RAM; what if we wanted 10 MB? How could we manage?

One way to extend the amount of memory accessible by a program is to use disk. Thus, we can use 10 MB of disk space while, at any time, only 1 MB resides in RAM.

In effect, RAM acts like cache for disk.

This idea of extending memory is called virtual memory. It's called "virtual" only because it's not RAM. It doesn't mean it's fake.

The real problem with disk is that it is really, really slow to access. If registers can be accessed in 1 nanosecond, cache in 5 ns, and RAM in about 100 ns, then disk access takes milliseconds. It can be a million times slower to access disk than a register.

The advantage of disk is it's easy to get lots of disk space for a small cost.

Still, because disk is so slow to access, we want to avoid accessing it unnecessarily.

Uses of Virtual Memory

Virtual memory is an old concept. Before computers had cache, they had virtual memory. For a long time, virtual memory only appeared on mainframes; personal computers in the 1980s did not use it. In fact, many good ideas that were in common use in UNIX operating systems, such as pre-emptive multitasking and virtual memory, didn't appear in personal computer operating systems until the mid-1990s.

Initially, virtual memory meant the idea of using disk to extend RAM. Programs wouldn't have to care whether the memory was "real" memory (i.e., RAM) or disk. The operating system and hardware would figure that out.

Later on, virtual memory was used as a means of memory protection. Every program uses a range of addresses called its address space.

Operating system developers assume that no user program can be trusted: user programs will try to destroy themselves, other user programs, and the operating system itself. That seems like a very negative view, but it is how operating systems are designed. Programs need not be deliberately malicious; they can be accidentally malicious ( for example, writing through a pointer that points at garbage memory ).

Virtual memory can help there too. It can help prevent programs from interfering with other programs.

Occasionally, you want programs to cooperate, and share memory. Virtual memory can also help in that respect.


How Virtual Memory Works

When a computer is running, many programs are simultaneously sharing the CPU. Each running program, plus the data structures needed to manage it, is called a process.

Each process is allocated an address space. This is a set of valid addresses that can be used. This address space can be changed dynamically. For example, the program might request additional memory (from dynamic memory allocation) from the operating system.

If a process tries to access an address that is not part of its address space, an error occurs, and the operating system takes over, usually killing the process (core dumps, etc).

How does virtual memory play a role? As you run a program, it generates addresses. Addresses are generated (for RISC machines) in one of three ways:

o A load instruction
o A store instruction
o Fetching an instruction

Load/store create data addresses, while fetching an instruction creates instruction addresses. Of course, RAM doesn't distinguish between the two kinds of addresses. It just sees it as an address.

Each address generated by a program is considered virtual. It must be translated to a real physical address. Thus, address translation is occurring all the time. As you might imagine, this must be handled in hardware if it's to be done efficiently.

You might think translating each address from virtual to physical is a crazy idea, because of how slow it is. However, you get memory protection from address translation, so it's worth the hardware needed to get memory protection.

Demand paging:-

In computer operating systems, demand paging ( as opposed to anticipatory paging ) is a method of virtual memory management. In a system that uses demand paging, the operating system copies a disk page into physical memory only if an attempt is made to access it and that page is not already in memory ( i.e., if a page fault occurs ). It follows that a process begins execution with none of its pages in physical memory, and many page faults will occur until most of the process's working set of pages is located in physical memory. This is an example of a lazy loading technique.

Demand paging means that pages should only be brought into memory if the executing process demands them. This is often referred to as lazy evaluation, as only those pages demanded by the process are swapped from secondary storage to main memory. Contrast this with pure swapping, where all memory for a process is swapped from secondary storage to main memory during process startup.

Commonly, a page table implementation is used to achieve this. The page table maps logical memory to physical memory, using a bit to mark each page as valid or invalid. A valid page is one that currently resides in main memory; an invalid page is one that currently resides in secondary memory. When a process tries to access a page, the following steps are generally followed ( a sketch of this flow appears after the list ):


1. Attempt to access the page.
2. If the page is valid ( in memory ), continue processing the instruction as normal.
3. If the page is invalid, a page-fault trap occurs.
4. Check if the memory reference is a valid reference to a location on secondary memory. If not, the process is terminated ( illegal memory access ). Otherwise, the required page must be paged in.
5. Schedule a disk operation to read the desired page into main memory.
6. Restart the instruction that was interrupted by the operating system trap.
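A high-level sketch of this flow; the page table, backing store, and frame allocation here are simplified stand-ins for what the OS and MMU actually maintain:

# Simplified demand-paging access path: valid pages are used directly,
# invalid ones are paged in from backing store first. All structures are toy.

backing_store = {0: "code", 1: "data", 2: "stack"}   # pages residing on disk
page_table = {}                                      # page -> frame (valid pages only)
memory = {}                                          # frame -> contents
next_frame = 0

def access(page):
    global next_frame
    if page in page_table:                  # valid: already in main memory
        return memory[page_table[page]]
    if page not in backing_store:           # not a legal page at all
        raise MemoryError("illegal memory access: terminating process")
    frame = next_frame                      # page fault: grab a free frame
    next_frame += 1
    memory[frame] = backing_store[page]     # schedule the "disk read"
    page_table[page] = frame                # mark the page valid
    return access(page)                     # restart the interrupted access

print(access(1))   # faults once, pages "data" in, then returns it
print(access(1))   # now a plain memory access: no fault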

Advantages - Demand paging, as opposed to loading all pages immediately:

Only loads pages that are demanded by the executing process.

As more space is left free in main memory, more processes can be loaded, reducing context switching, which consumes large amounts of resources.

Less loading latency occurs at program startup, as less information is accessed from secondary storage and less information is brought into main memory.

As main memory is expensive compared to secondary memory, this technique can significantly reduce the bill of materials ( BOM ) cost, for example in smart phones. Symbian OS had this feature.

Disadvantages - Individual programs face extra latency when they access a page for the first time.

Programs running on low-cost, low-power embedded systems may not have a memory management unit that supports page replacement.

Memory management with page replacement algorithms becomes slightly more complex.

Possible security risks, including vulnerability to timing attacks; see Percival, 2005, "Cache Missing for Fun and Profit".

Thrashing may occur, due to repeated page faults.

Performance of demand paging:-

Demand paging can significantly affect the performance of a computer system. Let's compute the effective access time for a demand-paged memory.

o For most computer systems, the memory-access time, denoted ma, ranges from 10 to 200 nanoseconds.

o As long as we have no page faults, the effective access time is equal to the memory access time.


o If, however, a page fault occurs, we must first read the relevant page from disk and then access the desired word.

o Let p be the probability of a page fault ( 0 <= p <= 1 ). We would expect p to be close to zero; that is, we would expect to have only a few page faults.

o The effective access time is then: effective access time = (1 - p) * ma + p * page-fault time

We are faced with three major components of the page-fault service time:

1. Service the page-fault interrupt.
2. Read in the page.
3. Restart the process.

The first and third tasks can be reduced, with careful coding, to several hundred instructions; these tasks may take from 1 to 100 microseconds each. The page-switch time, however, will probably be close to 8 milliseconds.

o A typical hard disk has an average latency of 3 milliseconds, a seek of 5 milliseconds, and a transfer time of 0.05 milliseconds.

Thus, the total paging time is about 8 milliseconds, including hardware and software time. If we take an average page-fault service time of 8 milliseconds and a memory-access time of 200 nanoseconds, then the effective access time in nanoseconds is:

effective access time = (1 - p) * 200 + p * 8,000,000 = 200 + 7,999,800 * p

We see, then, that the effective access time is directly proportional to the page-fault rate. If one access out of 1,000 causes a page fault, the effective access time is 8.2 microseconds: the computer is slowed down by a factor of 40 because of demand paging! It is therefore important to keep the page-fault rate low in a demand-paging system; otherwise, the effective access time increases, slowing process execution dramatically. ( A numeric check of these figures appears after the notes on swap space below. )

An additional aspect of demand paging is the handling and overall use of swap space:

o Disk I/O to swap space is generally faster than that to the file system. It is faster because swap space is allocated in much larger blocks, and file lookups and indirect allocation methods are not used.

o The system can therefore gain better paging throughput by copying an entire file image into the swap space at process startup and then performing demand paging from the swap space.

o Another option is to demand pages from the file system initially but to write the pages to swap space as they are replaced.
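As that quick check, a short sketch that evaluates the effective-access-time formula with the numbers used in the text ( ma = 200 ns, 8 ms fault service time ):

# Effective access time = (1 - p) * ma + p * fault_time, per the formula above.

MA_NS = 200              # memory-access time, in nanoseconds
FAULT_NS = 8_000_000     # page-fault service time: 8 ms expressed in nanoseconds

def effective_access_ns(p):
    return (1 - p) * MA_NS + p * FAULT_NS

print(effective_access_ns(0.001))           # ~8199.8 ns, i.e. about 8.2 microseconds
print(effective_access_ns(0.001) / MA_NS)   # ~41: the factor-of-40 slowdown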

Page replacement algorithms:-

Basic Page Replacement

1. find the location of the desired page on disk


2. find a free frame:
a. if there is a free frame, use it
b. if there is no free frame, use a page-replacement algorithm to select a victim frame
c. write the victim frame to disk; change the page and frame tables accordingly

3. read the desired page into the newly freed frame; change the page and frame tables

4. restart the user process

If no frames are free, two page transfers ( one out and one in ) are required, which doubles the page-fault service time.

This is a lot of overhead, without even mentioning the time spent waiting on the paging device.

The overhead can be reduced by using a "dirty" ( modify ) bit: if the victim page has not been modified ( which is always the case for read-only pages, such as program code ), it does not need to be written back to disk. This scheme can reduce the I/O time by half when the page has not been modified.

Two major issues with demand paging must be addressed:

o page-replacement algorithm: how to select a page to be replaced
o frame-allocation algorithm: how many frames to allocate to each process

These are important problems because disk I/O is so expensive.

Page-Replacement Algorithms

o first in, first out (FIFO)
o optimal (OPT)
o least recently used (LRU)

Reference string: 7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1

FIFO Page Replacement

o run the prior reference string with 3 frames of memory (Fig. 9.12): 15 page faults
o easy to understand and implement
o poor performance; the oldest page may still be needed, e.g.:
o an initialization module
o a variable initialized early and used often
o everything will still work; a bad replacement choice increases the page-fault rate and slows process execution, but it does not result in incorrect execution


Optimal Page Replacement

o replace the page which will not be used for the longest period of time ( akin to A* search )
o run the prior reference string with 3 frames of memory (Fig. 9.14): 9 page faults
o 9 (OPT) vs. 15 (FIFO), or, ignoring the three faults that fill the initially empty frames, 6 (OPT) vs. 12 (FIFO), twice as good!
o has the lowest page-fault rate, and never suffers from Belady's anomaly
o the problem? it requires future knowledge ( recall SJF )
o mainly used for comparison purposes

LRU Page Replacement

o the idea is to try to approximate the optimum
o the key difference between FIFO and OPT ( other than looking backward vs. forward in time, respectively ):
o FIFO uses the time when a page was brought into memory
o OPT uses the time when a page is to be used
o if we use the recent past as an approximation of the near future, then we can replace the page which has not been used for the longest period of time

o run the prior reference string with 3 frames of memory (Fig. 9.15): 12 page faults
o associate with each page the time of its last use
o does not suffer from Belady's anomaly
o the major issue is implementation ( hardware support is needed ); two approaches:
o counters: every time a reference is made, copy the value of the clock register into the page-table entry; this requires a write to memory ( the time of use ) for every memory access, plus a search of the page table at replacement time
o stack: when a reference is made, remove the page number from the stack and push it on top, so the most recently used page is always on top and the least recently used page is always on the bottom; implement with a doubly linked list with head and tail pointers, so no search is needed to find the replacement page
o ( a simulation of FIFO, LRU, and OPT on this reference string follows )
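The fault counts quoted in this section can be reproduced with a small simulation of all three algorithms on the reference string above, using 3 frames:

# Count page faults for FIFO, LRU, and OPT on the chapter's reference string.

refs = [7,0,1,2,0,3,0,4,2,3,0,3,2,1,2,0,1,7,0,1]
N = 3   # frames

def fifo(refs):
    frames, faults = [], 0
    for p in refs:
        if p not in frames:
            faults += 1
            if len(frames) == N:
                frames.pop(0)        # evict the oldest arrival
            frames.append(p)
    return faults

def lru(refs):
    frames, faults = [], 0
    for p in refs:
        if p in frames:
            frames.remove(p)         # re-appended below: most recent goes last
        else:
            faults += 1
            if len(frames) == N:
                frames.pop(0)        # front of the list is least recently used
        frames.append(p)
    return faults

def opt(refs):
    frames, faults = [], 0
    for i, p in enumerate(refs):
        if p in frames:
            continue
        faults += 1
        if len(frames) < N:
            frames.append(p)
            continue
        def next_use(q):             # index of q's next use, or "never"
            return refs.index(q, i + 1) if q in refs[i + 1:] else len(refs)
        frames.remove(max(frames, key=next_use))   # evict farthest-future page
        frames.append(p)
    return faults

print(fifo(refs), lru(refs), opt(refs))   # 15 12 9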

LRU-Approximation Page Replacement

Often we have a reference bit associated with each entry in the page table. Can determine which pages have been used and not used by examining the reference bits, although we do not know the "order" of use.

Can we approximate order?

Additional-reference-bits algorithm: keep an 8-bit history byte for each page. At regular intervals, a timer interrupt shifts each byte right by one bit and copies that page's reference bit into the high-order bit. A page whose byte is 00000000 has not been used in eight periods, while one with 11111111 has been used in every period; a page with 11000100 has been used more recently than one with 01110111.


If we interpret these bytes as unsigned integers, the page with the lowest value is the LRU page.

Problem: these byte values are not guaranteed to be unique.

Solution: swap out all pages with the smallest value, or use FIFO to break ties.
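A sketch of the aging bytes; the tick interval and page names are invented, and a real system would keep these bits alongside the page-table entries:

# Additional-reference-bits aging: each timer tick shifts the history byte
# right and inserts the current reference bit at the high-order end.

history = {"A": 0b00000000, "B": 0b00000000}   # 8-bit history per page

def tick(referenced):
    for page in history:
        bit = 0x80 if page in referenced else 0
        history[page] = (history[page] >> 1) | bit   # shift, insert newest bit

tick({"A"})          # A: 10000000   B: 00000000
tick({"A", "B"})     # A: 11000000   B: 10000000
tick({"B"})          # A: 01100000   B: 11000000
victim = min(history, key=history.get)           # lowest value = LRU-ish page
print(victim, format(history[victim], "08b"))    # A 01100000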

Clock Algorithms

(or second-chance page-replacement algorithms)

o essentially a FIFO replacement algorithm, with the number of history bits reduced to zero, leaving only the reference bit
o if the reference bit is 0, replace the page
o if it is 1, give the page a second chance and move on to the next FIFO page; reset its reference bit to 0 and reset its arrival time to the current time
o a page given a second chance will not be replaced until all other pages have been replaced ( FIFO ) or given second chances
o if a page is used often enough to keep its reference bit set, it will never be replaced
o implementation: the clock algorithm, using a circular queue ( see the sketch after this list )
o a pointer ( the hand on a clock ) indicates which page is to be replaced next
o when a frame is needed, the pointer advances until it finds a page with a 0 reference bit
o as it advances, it clears the reference bits
o once a victim page is found, the page is replaced, and the new page is inserted into the circular queue in that position
o degenerates to FIFO replacement if all bits are set
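That sketch of the victim scan, with an invented frame list; in a real kernel the reference bit lives in the page-table entry and is set by the hardware on every access:

# Second-chance (clock) victim selection over a circular queue of frames.

frames = [["A", 1], ["B", 0], ["C", 1], ["D", 1]]   # [page, reference bit]
hand = 0                                            # the clock hand

def select_victim():
    global hand
    while True:
        page, ref = frames[hand]
        if ref == 0:
            victim = hand
            hand = (hand + 1) % len(frames)   # hand moves past the victim
            return victim
        frames[hand][1] = 0                   # clear the bit: second chance
        hand = (hand + 1) % len(frames)

i = select_victim()
print(frames[i][0])   # "B": A's bit was cleared in passing; B's was already 0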

Enhanced Second-chance Algorithm

o by considering the reference bit and dirty bit as an ordered pair, we now have four classes:
o (0,0): neither recently used nor modified --- the best page to replace
o (0,1): not recently used but modified --- not quite as good, because the page will need to be written out before replacement
o (1,0): recently used, but clean --- probably will be used again soon
o (1,1): recently used and modified --- probably will be used again soon, and the page will need to be written out to disk before it can be replaced
o use the same scheme as above, but rather than just checking whether the reference bit is set to 1, examine the class, and replace the first page encountered in the lowest non-empty class
o may have to search the circular queue several times before finding a page to replace
o the difference from the algorithm above is that preference is given to keeping pages which have been modified, in order to reduce the number of I/O operations required

Thrashing:-

Causes

In virtual memory systems, thrashing may be caused by programs or workloads that present insufficient locality of reference: if the working set of a program or a workload cannot be effectively held within physical memory, then constant data swapping, i.e., thrashing, may occur. The term was first used in the days of tape operating systems, to describe the sound the tapes made when data was being rapidly written to and read from them. Many older low-end computers have insufficient RAM for modern usage patterns, and increasing the amount of memory can often cause such a computer to run noticeably faster; this speed increase is due to the reduced amount of paging necessary.

An example of this sort of situation occurred on the IBM System/370 series mainframe computer, in which a particular instruction could consist of an execute instruction (which crosses a page boundary) that points to a move instruction (which itself also crosses a page boundary), targeting a move of data from a source that crosses a page boundary, to a target of data that also crosses a page boundary. The total number of pages thus being used by this particular instruction is eight, and all eight pages must be present in memory at the same time. If the operating system allocates fewer than eight pages of actual memory, when it attempts to swap out some part of the instruction or data to bring in the remainder, the instruction will again page fault, and it will thrash on every attempt to restart the failing instruction.


Assignment I

1) Explain process management.
2) Explain all scheduling algorithms with examples.


Assignment II

1) What is memory management? Explain with some techniques.
2) Explain demand paging and thrashing.


Assignment III

1) Explain disk scheduling methods.
2) Explain free space management techniques.


References

www.wikipedia.org

www.studytonight.com

Books:- Silberschatz, Galvin, Gagne, "Operating System Concepts", Sixth Edition.