User-Level Interprocess Communication for Shared Memory Multiprocessors, by Bershad, B.N., Anderson, T.E., Lazowska, E.D., and Levy, H.M.


TRANSCRIPT

Page 1: User-Level Interprocess Communication for Shared Memory Multiprocessors

by Bershad, B.N., Anderson, T.E., Lazowska, E.D., and Levy, H.M.

Page 2: Introduction

RPC helps in implementing distributed applications by eliminating the need to implement a communication mechanism by hand.

A decomposed system provides the advantages of failure isolation, extensibility, and modularity, so RPC is used even when the call stays on the same machine.

Page 3: Introduction

RPC costs:
- Stub overhead
- Message buffer overhead (4 copies)
- Access validation
- Message transfer
- Scheduling
- Context switch
- Dispatch

Page 4: Introduction

LRPC costs:
- Stub overhead
- Message buffer overhead (1 copy)
- Only the necessary access validation
- Message transfer
- Only the necessary scheduling
- Context switch, minimized by domain caching

Page 5: Introduction

IPC main components (all done in the kernel):
- Processor reallocation (process context switch)
- Data transfer
- Thread management

Problems:
- Processor reallocation is expensive
- Parallel applications need user-level thread management

Page 6: URPC

URPC: User-Level Remote Procedure Call for shared memory multiprocessors.
- Processor reallocation: minimized
- Data transfer: at user level (in a package called URPC)
- Thread management: at user level (in a package called FastThreads)

Page 7: User-level components

Page 8: Processor Reallocation

Limit the frequency of processor reallocation. Why:
- A process context switch is more expensive than a thread context switch
- The kernel must be invoked on every call

Kernel-mediated call sequence:
- Client makes a procedure call into the server address space
- Kernel is invoked
- Kernel reallocates a processor to the server address space
- Server finishes the job
- Kernel is invoked again
- Kernel reallocates the processor back to the client address space
- Client resumes its work

Page 9: Processor Reallocation

Limit the frequency of processor reallocation. How:
- Optimistic reallocation policy: assume the client has other work to do, and that the server has, or will soon have, a processor available to do the job
- A uniprocessor can delay processor reallocation

Optimistic call sequence (a minimal code sketch follows below):
- Client makes a procedure call into the server address space
- Client does something else
- Server finishes the job
- Client resumes its work
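The optimistic path can be illustrated with a minimal C sketch. The single-slot channel, its field names, and the use of two pthreads to stand in for processors running in the client and server address spaces are assumptions made for illustration, not the paper's actual URPC interface.

/* Minimal sketch of the optimistic call path. All names here are
 * illustrative; in URPC the channel would live in memory shared pairwise
 * between the client and server address spaces. */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>

typedef struct {
    _Atomic int state;   /* 0 = empty, 1 = request posted, 2 = reply posted */
    int arg, result;     /* message body, written directly into shared memory */
} urpc_channel;

static urpc_channel chan;   /* stands in for a pairwise shared segment */

/* Client side: post the request, then do other work until the reply appears.
 * No kernel call is needed on this path. */
static void *client(void *unused) {
    (void)unused;
    chan.arg = 21;
    atomic_store(&chan.state, 1);             /* post the request */
    while (atomic_load(&chan.state) != 2) {
        /* "Client does something else": here we just yield; a real
         * user-level scheduler would run another ready thread. */
        sched_yield();
    }
    printf("client got reply: %d\n", chan.result);
    return NULL;
}

/* Server side: a server processor notices the pending request and runs it. */
static void *server(void *unused) {
    (void)unused;
    while (atomic_load(&chan.state) != 1)     /* wait for a request */
        sched_yield();
    chan.result = chan.arg * 2;               /* "server finishes the job" */
    atomic_store(&chan.state, 2);             /* post the reply */
    return NULL;
}

int main(void) {
    pthread_t c, s;
    pthread_create(&s, NULL, server, NULL);
    pthread_create(&c, NULL, client, NULL);
    pthread_join(c, NULL);
    pthread_join(s, NULL);
    return 0;
}

Compiled with -pthread, the program prints the reply computed by the server thread; in URPC itself the two sides would be separate address spaces sharing only the channel memory.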

Page 10: Processor Reallocation

Problems:
- Inappropriate situations: single-threaded clients, real-time applications, and high-latency I/O applications.
  Solution: allow the client to force processor reallocation.
- Underpowered address spaces: no processor is available to handle the pending request from the client.
  Solution: donation, where an idle processor donates itself to the underpowered address space (see the sketch below).
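The donation idea can be sketched as the idle loop of the user-level scheduler. The paper's kernel primitive for this is Processor.Donate; the C signature used below and the helpers has_runnable_thread(), has_pending_messages(), and find_underpowered_space() are assumptions made for illustration, not a real interface.

/* Sketch of the idle-time donation decision. processor_donate() stands in
 * for the kernel primitive; everything else is stubbed out. */
#include <stdbool.h>
#include <stdio.h>

typedef int address_space_t;                 /* opaque handle, illustrative */

static bool has_runnable_thread(void)  { return false; }   /* stub */
static bool has_pending_messages(void) { return false; }   /* stub */
static address_space_t find_underpowered_space(void) { return 42; } /* stub */

/* Stand-in for the kernel call that hands this processor to another space. */
static void processor_donate(address_space_t target) {
    printf("donating this processor to address space %d\n", target);
}

/* Idle loop of the user-level scheduler: only when there is nothing to run
 * locally and nothing pending does it give the processor away, and that
 * donation is the only point where the kernel is involved. */
void scheduler_idle(void) {
    while (!has_runnable_thread() && !has_pending_messages()) {
        address_space_t target = find_underpowered_space();
        if (target >= 0) {
            processor_donate(target);
            return;
        }
    }
}

int main(void) {
    scheduler_idle();
    return 0;
}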

Page 11: Processor Reallocation

Problems:
- Voluntary return of the processor: a processor working in the server may never return to the client because it is too busy working on the requests of other clients.
  Solution: enforce processor reallocation when necessary, for example when a high-priority job is waiting while a low-priority job runs or a processor idles.

Page 12: Processor Reallocation

LRPC vs. URPC:
- LRPC's domain caching looks for an idle processor in the server context
- URPC's optimistic reallocation assumes there will be an available processor in the server context and queues the request to be handled later
- URPC needs two levels of scheduling decisions, looking both for an idle processor and for an underpowered address space, while LRPC does not

Page 13: Data Transfer

- Uses pairwise shared memory to avoid the need for copying in the kernel (a sketch of such a channel follows below)
- Both approaches give the same level of security, since data must be passed through the stubs before it can be used
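A pairwise shared-memory channel of the kind the slide describes can be sketched as a small bounded queue. The layout and the names (shm_queue, chan_put, chan_peek) are assumptions for illustration; the point is that the client stub's copy into the shared slot is the only data copy, and the server stub can read the arguments in place.

/* Sketch of a pairwise shared-memory message channel. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define SLOTS 8u                 /* power of two, so wrap-around stays correct */

typedef struct { int proc_id; char args[64]; } msg_slot;

typedef struct {
    msg_slot slot[SLOTS];
    _Atomic unsigned head;       /* advanced by the sending address space   */
    _Atomic unsigned tail;       /* advanced by the receiving address space */
} shm_queue;                     /* would live in pairwise shared memory */

/* Client-stub side: the memcpy below is the single data copy. */
static bool chan_put(shm_queue *q, int proc_id, const void *args, size_t len) {
    unsigned h = atomic_load(&q->head);
    if (h - atomic_load(&q->tail) == SLOTS)
        return false;                              /* queue is full */
    msg_slot *s = &q->slot[h % SLOTS];
    s->proc_id = proc_id;
    memcpy(s->args, args, len);                    /* single copy */
    atomic_store(&q->head, h + 1);                 /* publish the message */
    return true;
}

/* Server-stub side: look at the next message in place; chan_release frees
 * the slot once the server procedure has consumed the arguments. */
static msg_slot *chan_peek(shm_queue *q) {
    unsigned t = atomic_load(&q->tail);
    if (t == atomic_load(&q->head))
        return NULL;                               /* queue is empty */
    return &q->slot[t % SLOTS];
}

static void chan_release(shm_queue *q) {
    atomic_fetch_add(&q->tail, 1);
}

int main(void) {
    static shm_queue q;          /* zero-initialized, both indexes at 0 */
    int x = 7;
    chan_put(&q, 3, &x, sizeof x);
    msg_slot *m = chan_peek(&q);
    if (m) {
        int first;
        memcpy(&first, m->args, sizeof first);
        printf("procedure %d called with argument %d\n", m->proc_id, first);
        chan_release(&q);
    }
    return 0;
}

Because each queue is used by exactly one sending and one receiving address space, simple head and tail counters are enough; no kernel lock sits on the data path.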

Page 14: Thread Management

Arguments:
- Fine-grained parallel applications need high-performance thread management, which can only be achieved by implementing it at user level
- Communication and thread management achieve very good performance when both are implemented at user level

Page 15: Thread Management

- Kernel features such as time slicing degrade the performance of applications
- Invoking a thread management operation implemented in the kernel requires a kernel trap
- A thread management policy implemented in the kernel is unlikely to be efficient for all parallel applications

Page 16: Thread Management

Threads block in order to:
- Synchronize their activities within the same address space
- Wait for external events from a different address space

Communication implemented at kernel level results in synchronization at both user level and kernel level (a user-level blocking sketch follows below).
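The claim that blocking need not involve the kernel can be illustrated with user-level context switching. The two-slot round-robin scheduler and the names below are only a sketch built on POSIX ucontext, not FastThreads' actual interface. (The portable ucontext calls may still touch the kernel for signal masks; a real package such as FastThreads switches contexts with a few user-mode instructions.)

/* Sketch of user-level blocking and resumption: a thread that must wait,
 * e.g. for a URPC reply, is switched out by the thread package itself and
 * another ready thread runs, without a kernel trap for the block/unblock. */
#include <stdio.h>
#include <ucontext.h>

static ucontext_t sched_ctx, worker_ctx[2];
static char stacks[2][64 * 1024];
static int current;                      /* index of the running thread */

/* "Block": save this thread's context and return to the scheduler. */
static void thread_yield(void) {
    swapcontext(&worker_ctx[current], &sched_ctx);
}

static void worker(int id) {
    for (int step = 0; step < 3; step++) {
        printf("thread %d, step %d\n", id, step);
        thread_yield();                  /* e.g. waiting for a URPC reply */
    }
}

int main(void) {
    for (int i = 0; i < 2; i++) {
        getcontext(&worker_ctx[i]);
        worker_ctx[i].uc_stack.ss_sp = stacks[i];
        worker_ctx[i].uc_stack.ss_size = sizeof stacks[i];
        worker_ctx[i].uc_link = &sched_ctx;          /* resume here on exit */
        makecontext(&worker_ctx[i], (void (*)(void))worker, 1, i);
    }
    /* Trivial round-robin scheduler: alternate between the two threads. */
    for (int sw = 0; sw < 6; sw++) {
        current = sw % 2;
        swapcontext(&sched_ctx, &worker_ctx[current]);
    }
    return 0;
}

Running it shows the two threads interleaving; the switch inside thread_yield is what a user-level package does when a thread blocks on a reply or a lock, instead of trapping into the kernel.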

Page 17: URPC

Page 18: Performance

- Thread management is faster at user level
- Component breakdown

Page 19: Performance

Call latency and throughput are at their worst when S = 0.

Page 20: Conclusion

- Move as much functionality as possible from the kernel to user level to improve performance
- To achieve good performance on multiprocessors, the system needs to be designed specifically to support them