Douglas Thain and Miron Livny Computer Sciences Department University of Wisconsin-Madison {thain|miron}@cs.wisc.edu http://www.cs.wisc.edu/condor/ bypass Bypass: A tool for building distributed systems

Post on 20-Dec-2015

Page 1: Douglas Thain and Miron Livny Computer Sciences Department University of Wisconsin-Madison {thain|miron}@cs.wisc.edu

Bypass: A tool for building distributed systems

Douglas Thain and Miron Livny
Computer Sciences Department
University of Wisconsin-Madison
{thain|miron}@cs.wisc.edu
http://www.cs.wisc.edu/condor/bypass

Page 2

Building distributed systems is hard.

Page 3

Bypass makes building split execution systems easy.

Bypass is to split execution systems as yacc is to compilers.

Page 4

Problem: Unfriendly Machines

› Many systems can distribute your jobs to available machines scattered around the world (rsh, Condor, Globus, etc.).

› But the machines you have access to may not be properly equipped to run your job.

Page 5

Problem: Unfriendly Machines

› An unfriendly machine…
  • allows you to log in under some identity.
  • allows you to execute your program.
  • might not have your files or a shared file system!
  • might not have space for your output!
  • might be a different architecture or OS!
    (If you want to use a lot of machines, you can’t be picky!)

Page 6

[Figure: home machine, foreign machine. Caption: "HELP! core dumped"]

Page 7

Solution: Split Execution

› General strategy:
  • An agent process traps some of the application's standard library calls.
  • Some of the calls can be executed at the foreign machine.
  • Some of the calls are sent via RPC back to the home machine.
  • A shadow process at the home machine executes the RPCs and sends the results back to the agent.
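The strategy above can be sketched in plain C++ as a single process that stands in for both machines. Everything here is a hypothetical illustration: `Request`, `Shadow`, `Agent`, and the routing rule are invented names, and the "RPC" is an ordinary function call rather than a network message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical request message: the trapped call's arguments, as they
// might travel from the agent to the shadow.
struct Request {
    int fd;
    std::string payload;
};

// Stands in for the shadow process on the home machine.
struct Shadow {
    std::string home_output;   // bytes the home machine has received

    long handle_write(const Request &r) {
        home_output += r.payload;                 // execute the call at home
        return static_cast<long>(r.payload.size());
    }
};

// Stands in for the agent on the foreign machine.
struct Agent {
    Shadow &shadow;
    std::string foreign_output;   // bytes handled locally

    // Trap point: decide whether the call runs locally or goes home.
    long trapped_write(int fd, const char *data, std::size_t length) {
        assert(data != nullptr);
        if (fd == 1) {   // assumed policy: standard output goes home
            return shadow.handle_write(Request{fd, std::string(data, length)});
        }
        foreign_output.append(data, length);      // local execution
        return static_cast<long>(length);
    }
};
```

In a real split execution system the call to `handle_write` would be a network round trip, which is why the choice of which calls to send home matters for performance.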

Page 8

Solution: Split Execution

[Figure: On the foreign machine, the Application's trapped system calls go to the Agent, which issues local system calls to the foreign Kernel and remote system calls to the Shadow on the home machine. The Shadow issues local system calls to the home Kernel.]

Page 9

Split Execution is an Open Research Topic

› We want to explore many possibilities:
  • Foreign machine could be partially friendly: it has some needed resources, but not all.
  • Data may be buffered and cached at both the agent and the shadow.
  • What procedure calls to trap depends on the application and the services needed.
  • Some procedure calls could be routed to third parties such as file servers.
  • …

Page 10

Problem: Split Execution is Hard

› One example of many: trapping stat()
  • Different data types:
    • struct stat, struct stat64
    • Depending on the system, integer elements are 2 to 8 bytes
  • Multiple entry points:
    • stat, _stat, __libc_stat
  • Surprises:
    • #define stat(a,b) _fxstat(VERSION,a,b)

Page 11

Solution: Bypass

› Bypass takes a specification of a split execution system and produces a matched shadow and agent.

› Bypass hides all of the ugly details of trapping, type conversion, and RPCs.

› Bypass lets you:
  • split any dynamically-linked application.
  • transparently use heterogeneous systems.
  • trap calls with minimal overhead.
  • control execution paths with plain C++.

Page 12

[Figure: home machine, foreign machine. Caption: "Just like home!"]

Page 13

Bypass Language

› Declare what procedures to trap in C++.

› Annotate pointer types with data flow.
  • Direction: in, out, or in out
  • Binary data: give an expression yielding the number of bytes to send/receive.

› Give two function bodies:
  • agent_action
  • shadow_action
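As a rough illustration of what an `in "length"` annotation implies, the sketch below packs the arguments of a write() call into a flat message: scalars by value, then exactly `length` bytes from behind the pointer. The wire layout and the names `put_u64` and `pack_write_request` are assumptions made for this example; the slides do not describe the message format Bypass actually generates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append a 64-bit value in big-endian order so both ends agree on layout.
static void put_u64(std::vector<std::uint8_t> &msg, std::uint64_t v) {
    for (int shift = 56; shift >= 0; shift -= 8) {
        msg.push_back(static_cast<std::uint8_t>((v >> shift) & 0xff));
    }
}

// Hypothetical marshaller for:
//   write( int fd, in "length" const void *data, size_t length )
std::vector<std::uint8_t> pack_write_request(int fd, const void *data,
                                             std::size_t length) {
    assert(data != nullptr || length == 0);
    std::vector<std::uint8_t> msg;
    put_u64(msg, static_cast<std::uint64_t>(static_cast<std::int64_t>(fd)));
    put_u64(msg, static_cast<std::uint64_t>(length));
    // "in": the caller's bytes travel from the agent to the shadow.
    const std::uint8_t *bytes = static_cast<const std::uint8_t *>(data);
    msg.insert(msg.end(), bytes, bytes + length);
    return msg;
}
```

An `out` parameter would reverse the direction: the request would carry only the scalars, and the reply would carry the bytes to copy back into the caller's buffer.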

Page 14

ssize_t write( int fd, in "length" const void *data, size_t length )
agent_action
{{
    // Standard output goes home via RPC; everything else stays local.
    if( fd==1 ) {
        return bypass_shadow_write(fd,data,length);
    } else {
        return write(fd,data,length);
    }
}}
shadow_action
{{
    // data is not NUL-terminated, so print exactly length bytes.
    printf("remote data: %.*s", (int)length, (const char *)data );
    return length;
}};

Page 15

Agent Action

› Any arbitrary C++ code.

› When the program invokes write(), the agent_action is executed at the foreign machine.

› Within the agent_action:
  • write() - Invoke the original write() at the foreign machine.
  • bypass_shadow_write() - Invoke the shadow_action via RPC.

Page 16

Shadow Action

› Any arbitrary C++ code.

› If the agent decides to invoke the RPC to the shadow, the shadow_action is executed at the home machine.

› Within the shadow_action:
  • write() - Invoke write() at the home machine.

Page 17

Using Bypass

› Run "bypass" to read the specification and produce C++ source code:
  • % bypass -agent -shadow simple.bypass

› The shadow is compiled into a plain executable.

› The agent is compiled into a shared library.

Page 18

Using Bypass

› The dynamic linker is used to force the agent into an executable at run-time:
  • setenv LD_PRELOAD simple_agent.so

› Procedure calls are “trapped” merely by putting the agent first in the link order.

› This method can be used on any dynamically-linked program: tcsh, netscape, emacs…
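A minimal sketch of the interposition technique, assuming a Linux-style dynamic linker that supports `dlsym` with `RTLD_NEXT`. This is not the code Bypass generates; the counter `trapped_calls` and the alias `write_fn` are hypothetical names added so the effect can be observed. The library defines its own `write`, which the linker resolves first, and looks up the next definition in link order to reach the original.

```cpp
#include <cassert>
#include <cstddef>
#include <dlfcn.h>       // dlsym, RTLD_NEXT (g++ defines _GNU_SOURCE by default)
#include <sys/types.h>
#include <unistd.h>

// Hypothetical counter so the trap is observable.
static long trapped_calls = 0;

using write_fn = ssize_t (*)(int, const void *, size_t);

// Same name and signature as the C library's write(). Placed first in the
// link order, this definition is the one the application reaches.
extern "C" ssize_t write(int fd, const void *data, size_t length) {
    // Look up the next definition of write() after this one: the original.
    static write_fn original =
        reinterpret_cast<write_fn>(dlsym(RTLD_NEXT, "write"));
    assert(original != nullptr);
    ++trapped_calls;
    return original(fd, data, length);   // pass the call through unchanged
}
```

One way to try it, assuming g++ on Linux and a hypothetical source file name: build with `g++ -shared -fPIC agent.cpp -o simple_agent.so -ldl`, then set LD_PRELOAD as in the slide before starting the target program. An agent that wanted to redirect a call would replace the pass-through line with its own logic.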

Page 19

Example Application: Complete Remote I/O

› Trap all the standard I/O calls, and send them to the home machine unmodified:

    int open( in string char *path, int flags, int mode );
    int close( int fd );
    int read( int fd, out "length" void *data, int length );
    int write( int fd, in "length" void *data, int length );
    int lseek( int fd, off_t offset, int whence );

Page 20

Complete Remote I/O

[Figure: On the foreign machine, the Application's trapped system calls go to the Agent. The Agent forwards open, close, read, write, and lseek to the Shadow on the home machine, which performs them against the home Kernel. All other calls go to the foreign Kernel.]

Page 21

Example Application: Remote Console

› Trap only read and write, and send operations on standard files back to a single shadow process.

    int read( int fd, out "length" void *data, int length )
    agent_action
    {{
        if( fd<3 ) {
            return bypass_shadow_read( fd, data, length );
        } else {
            return read(fd,data,length);
        }
    }};

Page 22

Remote Console

[Figure: Three foreign machines each run an Application with an Agent. On each, trapped system calls go to the Agent and all other calls go to the local Kernel. Standard I/O reads and writes from every Agent are sent to a single Shadow on the home machine, which issues read and write to the home Kernel.]

Page 23

Example Application: Attach New Filesystem

› Trap standard I/O calls and replace them with calls to a user-level filesystem library, such as Globus GASS.

    int open( in string const char *path, int flags, int mode )
    agent_action
    {{
        return globus_gass_open( path, flags, mode );
    }};

    int close( int fd )
    agent_action
    {{
        return globus_gass_close( fd );
    }};

Page 24

Attach New Filesystem

[Figure: On the foreign machine, the Application's trapped system calls go to the Agent. The Agent passes open and close to the Globus Library, which makes more system calls and communicates with "THE GRID". All other calls go to the Kernel.]

Page 25

Bypass can be used by Real Users!

› Bypass works on unmodified executables. (Real Users are not willing/able to rewrite/recompile their programs.)

› Bypass requires no special privileges. (Real Users do not have the root password.)

› Thus, Bypass allows a Real User to make good use of a remote cluster without begging the administrator to configure it to his/her needs.

Page 26

Performance

› Overhead of trapping a system call is very small: 1-4 µs
  • The "trapping mechanism" simply interposes a few extra function calls.
  • Small compared to the expense of a real system call (about 10-70 µs).

› Remote procedure calls are, as expected, much slower: about 1 ms under the best conditions.
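Taking the quoted figures at face value, the relative costs work out roughly as follows. This is simple arithmetic on the stated ranges, not an additional measurement, and pairing the extremes of each range gives only outer bounds.

```latex
\frac{t_{\text{trap}}}{t_{\text{syscall}}} \in
  \left[\frac{1\,\mu\text{s}}{70\,\mu\text{s}},\ \frac{4\,\mu\text{s}}{10\,\mu\text{s}}\right]
  \approx [1.4\%,\ 40\%],
\qquad
\frac{t_{\text{RPC}}}{t_{\text{syscall}}} \in
  \left[\frac{1000\,\mu\text{s}}{70\,\mu\text{s}},\ \frac{1000\,\mu\text{s}}{10\,\mu\text{s}}\right]
  \approx [14,\ 100]
```

So a call that is sent home costs on the order of tens of local system calls even under the best conditions, while a call that is trapped but executed locally pays only a fraction of one.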

Page 27

Related Work

› “Classic” RPC and XDR:
  • Define standard integer sizes, endianness, etc.
  • Start by defining external protocol, then produce programming interface, which is not always convenient:
    • struct read_results * read_1( int fd, int length );

Page 28

Related Work

› Bypass:
  • We are stuck with existing interfaces, so annotate them to produce a protocol:
    • int read( int fd, out "length" void *data, int length );
  • Do “best effort” conversion to/from external data format:
    • off_t is 4 bytes on some platforms, 8 bytes on others.
    • A conversion might fail!
  • Define canonical values for source-level symbols:
    • O_CREAT has different values on Linux and Solaris!
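The two conversion problems above can be sketched as follows. The canonical flag values and the names `CANON_*`, `to_canonical_flags`, `from_canonical_flags`, and `narrow_offset` are invented for illustration; the slides do not give the encoding Bypass actually uses.

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <limits>

// Hypothetical canonical (wire) values, independent of any platform's headers.
enum : std::uint32_t {
    CANON_WRONLY = 1u << 0,
    CANON_RDWR   = 1u << 1,
    CANON_CREAT  = 1u << 2,
    CANON_TRUNC  = 1u << 3,
};

// Sender side: translate this platform's open() flags into canonical values.
std::uint32_t to_canonical_flags(int flags) {
    std::uint32_t out = 0;
    if ((flags & O_ACCMODE) == O_WRONLY) out |= CANON_WRONLY;
    if ((flags & O_ACCMODE) == O_RDWR)   out |= CANON_RDWR;
    if (flags & O_CREAT) out |= CANON_CREAT;
    if (flags & O_TRUNC) out |= CANON_TRUNC;
    return out;
}

// Receiver side: translate canonical values into this platform's flags.
int from_canonical_flags(std::uint32_t canon) {
    int out = O_RDONLY;
    if (canon & CANON_WRONLY) out = O_WRONLY;
    if (canon & CANON_RDWR)   out = O_RDWR;
    if (canon & CANON_CREAT)  out |= O_CREAT;
    if (canon & CANON_TRUNC)  out |= O_TRUNC;
    return out;
}

// Best-effort narrowing: an 8-byte wire offset may not fit a 4-byte off_t.
// Reports failure instead of silently truncating.
bool narrow_offset(std::int64_t wire, std::int32_t *local) {
    assert(local != nullptr);
    if (wire < std::numeric_limits<std::int32_t>::min() ||
        wire > std::numeric_limits<std::int32_t>::max()) {
        return false;   // the conversion fails
    }
    *local = static_cast<std::int32_t>(wire);
    return true;
}
```

Because each side translates through the canonical values using its own headers, the sender and receiver never need to know each other's numeric value for a symbol such as O_CREAT.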

Page 29

Related Work

› Hunt and Brubacher, “Detours”
  • Trap library calls on NT using binary rewriting; can be applied to any executable.
  • Make original procedure available through special “trampoline” call.
  • Bypass leaves the original entry point intact, so subroutines need not be re-written to use the trampoline.

Page 30

Related Work

› Alexandrov, et al., “UFO”
  • Use a kernel-level facility to trap all of a process’ system calls and translate some of them into WWW operations.
  • The kernel mechanism is secure and can be applied to any process.
  • But it has a high (7x) trapping overhead and cannot be applied to procedures that are not true system calls.

Page 31

Related Work

› Bypass:
  • Trapping overhead is very small, and trapping can be performed on procedures that are not necessarily system calls.
  • But it can only be applied to dynamically-linked executables, and is not suitable as a security mechanism.

Page 32

Related/Future Work

› A complete remote execution system needs both methods:
  • The program owner provides a lightweight mechanism for creating a correct split execution environment.
  • The machine owner provides a heavyweight mechanism to defend itself from a (possibly) malicious program.

Page 33

Complete System

[Figure: On the foreign machine, the Application and Agent run inside a Sandbox. The Agent forwards open, close, read, write, and lseek to the Shadow on the home machine, which performs them against the home Kernel. All other calls go to the foreign Kernel.]

Page 34

Future Work

› Multiple agents applied to one application
  • How to select and invoke the correct agent action?

› Signal handling
  • Flow of control is backwards.

› Other implementations
  • Binary rewriting.
  • Build specialized linker that understands multiple definitions of symbols.

Page 35

Further Questions?

› Douglas Thain thain@cs.wisc.edu

› Miron Livny miron@cs.wisc.edu

› Bypass Web Page http://www.cs.wisc.edu/condor/bypass

› Questions now?