Distributed Process Management

Chapter 14 (28 pages)

Page 1: Distributed Process Management

Distributed Process Management

Chapter 14

Page 2: Distributed Process Management

Process Migration

• Move an active process from one machine to another

• The process migrates to a target machine

• Transferring a sufficient amount of the state of a process from one machine to another

Page 3: Distributed Process Management

Why Migrate?

• Load sharing
  • move processes from heavily loaded to lightly loaded systems
  • load can be balanced to improve overall performance

• Communications performance
  • processes that interact intensively can be moved to the same node to reduce communications cost
  • it may be better to move a process to where the data reside when the data are large

Page 4: Distributed Process Management

Why Migrate?

• Availability
  • a long-running process may need to move because the machine it is running on will be down
• Utilizing special capabilities
  • a process can take advantage of unique hardware or software capabilities

Page 5: Distributed Process Management

Initiation of Migration

• Operating system
  • when the goal is load balancing
• Process
  • when the goal is to reach a particular resource

Page 6: Distributed Process Management

What is Migrated?

• Must destroy the process on the source system

• Process control block and any links must be moved

Page 7: Distributed Process Management

What is Migrated?

• Eager (all): Transfer the entire address space
  • no trace of the process is left behind
  • if the address space is large and the process does not need most of it, this approach may be unnecessarily expensive

Page 8: Distributed Process Management

What is Migrated?

• Precopy: The process continues to execute on the source node while the address space is copied
  • pages modified on the source during the precopy operation have to be copied a second time
  • reduces the time that a process is frozen and cannot execute during migration

Page 9: Distributed Process Management

What is Migrated?

• Eager (dirty): Transfer only the portion of the address space that is in main memory
  • additional blocks of the virtual address space are transferred on demand
  • the source machine is involved throughout the life of the process
  • good if the process is temporarily going to another machine
  • good for a thread, since the threads left behind need the same address space

Page 10: Distributed Process Management

What is Migrated?

• Copy-on-reference: Pages are brought over only when referenced
  • a variation of eager (dirty)
  • has the lowest initial cost of process migration

Page 11: Distributed Process Management

What is Migrated?

• Flushing: Pages are cleared from main memory by flushing dirty pages to disk
  • pages are later retrieved using a copy-on-reference strategy
  • relieves the source of holding any pages of the migrated process in main memory
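The tradeoffs among these strategies can be illustrated with a minimal Python sketch. The function and all page counts below are hypothetical illustrations, not part of the slides; the model simply tallies how many pages each strategy copies at migration time versus eventually, ignoring disk traffic and assuming on-demand pages were not already resident.

```python
# Hypothetical sketch: pages copied under three migration strategies.
# "resident" pages are those in main memory on the source machine;
# "referenced_after" are pages the process touches after migrating.

def transfer_pages(total_pages, resident_pages, referenced_after):
    """Return (pages copied at migration time, pages copied eventually)."""
    return {
        # Eager (all): the whole address space moves up front.
        "eager_all": (total_pages, total_pages),
        # Eager (dirty): resident pages move now, the rest on demand.
        "eager_dirty": (resident_pages, resident_pages + referenced_after),
        # Copy-on-reference: nothing moves until a page is referenced.
        "copy_on_reference": (0, referenced_after),
    }

costs = transfer_pages(total_pages=1000, resident_pages=120, referenced_after=50)
```

For a large, sparsely used address space, eager (all) copies far more than the process will ever need, which is exactly the expense noted for that strategy above, while copy-on-reference has the lowest initial cost at the price of involving the source machine afterward.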

Page 12: Distributed Process Management

Negotiation of Process Migration

[Figure: five machines (0-4), each with a kernel (K) and jobs (J); process P migrates from source machine S to destination machine D under the control of Starter processes. Messages:]

1: Will you take P?
2: Yes, migrate to machine 3
3: MigrateOut P
4: Offer P
5: Offer P
6: MigrateIn P
7: Accept offer

Page 13: Distributed Process Management

Distributed Global States

• Operating system cannot know the current state of all processes in the distributed system

• A process can only know the current state of all processes on the local system

• Remote processes know only state information that is received by messages
  • these messages represent the state in the past

Page 14: Distributed Process Management

Example

• Bank account is distributed over two branches

• The total amount in the account is the sum at each branch

• At 3 PM the account balance is determined

• Messages are sent to request the information

Page 15: Distributed Process Management

Example

[Figure: at 3:00, SA = $100.00 at Branch A and SB = $0.00 at Branch B, so Total = $100.00]

Page 16: Distributed Process Management

Example

• If, at the time of balance determination, the balance from branch A is in transit to branch B, the result is a false reading

Page 17: Distributed Process Management

Example

[Figure: the message "Transfer $100 to Branch B" leaves Branch A at 2:59 and arrives at Branch B at 3:01; at 3:00, SA = $0.00 and SB = $0.00, so Total = $0.00 and the $100 in transit is missed]

Page 18: Distributed Process Management

Example

• All messages in transit must be examined at time of observation

• Total consists of balance at both branches and amount in message
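This correction can be sketched in a few lines of Python (the function name is our own illustration): a consistent total adds the recorded branch balances and every transfer message still in transit at observation time.

```python
# Sketch: a consistent snapshot total includes messages in transit.

def consistent_total(balance_a, balance_b, in_transit):
    """Sum both branch balances plus amounts carried by in-transit messages."""
    return balance_a + balance_b + sum(in_transit)

# The 2:59 transfer is mid-flight at 3:00: both branches record $0,
# but the $100 message is part of the account's state.
total = consistent_total(0.00, 0.00, in_transit=[100.00])
```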

Page 19: Distributed Process Management

Example

• If clocks at the two branches are not perfectly synchronized:
  • transfer the amount at 3:01 from branch A
  • the amount arrives at branch B at 2:59
  • at 3:00 the amount is counted twice

Page 20: Distributed Process Management

Example

[Figure: the message "Transfer $100 to Branch B" leaves Branch A at 3:01 (Branch A's clock) and arrives at Branch B at 2:59 (Branch B's clock); at 3:00, SA = $100.00 and SB = $100.00, so Total = $200.00 and the $100 is counted twice]

Page 21: Distributed Process Management

Example of a Snapshot

Process 1
  Outgoing channels: to 2, sent 1,2,3,4,5,6; to 3, sent 1,2,3,4,5,6
  Incoming channels: none

Process 2
  Outgoing channels: to 3, sent 1,2,3,4; to 4, sent 1,2,3,4
  Incoming channels: from 1, received 1,2,3,4, stored 5,6; from 3, received 1,2,3,4,5,6,7,8

Process 3
  Outgoing channels: to 2, sent 1,2,3,4,5,6,7,8
  Incoming channels: from 1, received 1,2,3, stored 4,5,6; from 2, received 1,2,3, stored 4; from 4, received 1,2,3

Process 4
  Outgoing channels: to 3, sent 1,2,3
  Incoming channels: from 2, received 1,2, stored 3,4
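A small Python sketch (our own helper, not part of the snapshot algorithm itself) shows how channel state falls out of the sent/received lists above: messages the sender recorded as sent that the receiver had not yet received at its snapshot are exactly the channel's in-transit state.

```python
# Sketch: recover a channel's state from the snapshot records above.

def channel_state(sent, received):
    """Messages sent but not yet received when the receiver took its snapshot."""
    return [m for m in sent if m not in received]

# Channel from process 1 to process 2: process 1 sent messages 1-6, while
# process 2 had received 1-4 at snapshot time, so 5 and 6 are in the channel.
in_channel = channel_state(sent=[1, 2, 3, 4, 5, 6], received=[1, 2, 3, 4])
```

This matches the "stored 5,6" entry that process 2 records for its incoming channel from process 1.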

Page 22: Distributed Process Management

Ordering of Events

• Events must be ordered to ensure mutual exclusion and avoid deadlock
• Clocks are not synchronized
• Communication delays
• State information for a process is not up to date

Page 23: Distributed Process Management

Ordering of Events

• Need to consistently say that one event occurs before another event

• Messages are sent when a process wants to enter its critical section and when it leaves the critical section
• Time-stamping
  • orders events on a distributed system
  • the system clock is not used

Page 24: Distributed Process Management

Time-Stamping

• Each system on the network maintains a counter which functions as a clock

• Each site has a numerical identifier
• When a message is received, the receiving system sets its counter to one more than the maximum of its current value and the incoming time-stamp (counter)
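The counter rule can be sketched as a small Python class (a minimal Lamport-style clock; the class itself is our illustration, not from the slides):

```python
# Sketch of the time-stamping rule described above.

class LamportClock:
    def __init__(self):
        self.counter = 0

    def tick(self):
        """Advance the counter for a local event or a message send."""
        self.counter += 1
        return self.counter

    def receive(self, timestamp):
        """On receipt, jump to one more than max(own counter, incoming stamp)."""
        self.counter = max(self.counter, timestamp) + 1
        return self.counter

clock = LamportClock()
clock.tick()        # a local send: counter becomes 1
clock.receive(5)    # incoming stamp 5: counter becomes max(1, 5) + 1 = 6
```

Ties between equal time-stamps are broken by the site identifier, so events can be totally ordered as (counter, site_id) pairs.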

Page 25: Distributed Process Management

Time-Stamping

• If two messages have the same time-stamp, they are ordered by the number of their sites
• For this method to work, each message is sent from one process to all other processes
  • ensures all sites have the same ordering of messages
  • for mutual exclusion and deadlock, all processes must be aware of the situation

Page 26: Distributed Process Management

Token-Passing Algorithm

(Prelude)
if not token_present
then begin
    clock := clock + 1;
    broadcast(request, clock, i);
    wait(access, token);
    token_present := true
end;
token_held := true;

<critical section>

(Postlude)
token(i) := clock;
token_held := false;
for j := i + 1 to n, 1 to i - 1 do
    if request(j) > token(j) ∧ token_present
    then begin
        token_present := false;
        send(j, access, token)
    end;

when received (request, k, j) do
    request(j) := max(request(j), k);
    if token_present ∧ not token_held
    then <text of postlude>
enddo;

Notation:
    send(j, access, token): send message of type access, with token, by process j
    broadcast(request, clock, i): send message from process i of type request, with timestamp clock, to all other processes
    received(request, t, j): receive message from process j of type request, with timestamp t
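The postlude's token-forwarding scan can be sketched in Python (sites numbered 0..n-1 here rather than 1..n, and the function name is our own): after leaving the critical section, process i scans the other sites in the order i+1, ..., n-1, 0, ..., i-1 and hands the token to the first site whose latest request time-stamp is newer than the token's record for that site.

```python
# Sketch of the postlude scan, with 0-based site numbers.

def pass_token(i, n, request, token):
    """Return the site that should receive the token, or None to keep it.

    request[j]: timestamp of the last request heard from site j.
    token[j]:   timestamp of site j's last completed critical section,
                carried inside the token.
    """
    order = list(range(i + 1, n)) + list(range(0, i))
    for j in order:
        if request[j] > token[j]:   # j has a request the token hasn't served
            return j
    return None                     # no outstanding requests; keep the token

# Four sites; site 1 leaves its critical section. Site 3's request (stamp 7)
# postdates the token's record for site 3 (stamp 4), so site 3 gets the token.
nxt = pass_token(1, 4, request=[0, 2, 0, 7], token=[0, 2, 0, 4])
```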

Page 27: Distributed Process Management

Distributed Deadlock Detection Algorithm

{Data object Dj receiving a lock_request(Ti)}
begin
    if Locked_by(Dj) = nil
    then send granted
    else begin
        send not granted to Ti;
        send Locked_by(Dj) to Ti
    end
end.

{Transaction Ti makes a lock request for data object Dj}
begin
    send lock_request(Ti) to Dj;
    wait for granted/not granted;
    if granted
    then begin
        Locked_by(Dj) := Ti;
        Held_by(Ti) := nil
    end
    else begin {suppose Dj is being used by transaction Tj}
        Held_by(Ti) := Tj;
        Enqueue(Ti, Request_Q(Tj));
        if Wait_for(Tj) = nil
        then Wait_for(Ti) := Tj
        else Wait_for(Ti) := Wait_for(Tj);
        update(Wait_for(Ti), Request_Q(Ti))
    end
end.

{Transaction Tj receiving an update message}
begin
    if Wait_for(Tj) ≠ Wait_for(Ti)
    then Wait_for(Tj) := Wait_for(Ti);
    if Wait_for(Tj) ∩ Request_Q(Tj) = nil
    then update(Wait_for(Ti), Request_Q(Tj))
    else begin
        DECLARE DEADLOCK;
        {initiate deadlock resolution as follows}
        {Tj is chosen as the transaction to be aborted}
        {Tj releases all the data objects it holds}
        send clear(Tj, Held_by(Tj));
        allocate each data object Di held by Tj to the first requester Tk in Request_Q(Tj);
        for every transaction Tn in Request_Q(Tj) requesting data object Di held by Tj do
            Enqueue(Tn, Request_Q(Tk))
    end
end.

{Transaction Tk receiving a clear(Tj, Tk) message}
begin
    purge the tuple having Tj as the requesting transaction from Request_Q(Tk)
end.

Page 28: Distributed Process Management

Example of Distributed Deadlock Algorithm

[Figure: wait-for graphs among transactions T0 through T6 corresponding to the two tables below]

Transaction Wait_for Held_by Request_Q

T0 T0 T3 T1

T1 T0 T0 T2

T2 T0 T1 T3

T3 T0 T2 T4, T6, T0

T4 T0 T3 T5

T5 T0 T4 nil

T6 T0 T3 nil

Transaction Wait_for Held_by Request_Q

T0 nil nil T1

T1 T0 T0 T2

T2 T0 T1 T3

T3 T0 T2 T4, T6

T4 T0 T3 T5

T5 T0 T4 nil

T6 T0 T3 nil
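The deadlock in the first table can be checked mechanically. The sketch below (our own helper, not the algorithm from the slides) follows Held_by edges from a transaction; arriving back at the start means a cycle, i.e., deadlock.

```python
# Sketch: follow Held_by edges from the first table; a cycle is a deadlock.

def in_cycle(held_by, start):
    """True if following Held_by links from start leads back to start."""
    seen = set()
    t = held_by.get(start)
    while t is not None and t not in seen:
        if t == start:
            return True
        seen.add(t)
        t = held_by.get(t)
    return t == start

# Held_by column of the first table (our reading: after T0's request,
# the transaction each Ti waits on is held_by[Ti]).
held_by = {"T0": "T3", "T1": "T0", "T2": "T1", "T3": "T2",
           "T4": "T3", "T5": "T4", "T6": "T3"}
in_cycle(held_by, "T0")   # T0 -> T3 -> T2 -> T1 -> T0: deadlock
```

In the second table T0 holds nothing and waits for nothing (Held_by(T0) = nil), so the same walk from T0 terminates and no deadlock is declared.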