Infinit filesystem: Reactor reloaded

mefyl <[email protected]>
Version 1.2-5-g4a755e6
Posted on 08-Jan-2017


Infinit filesystem

Distributed filesystem in a byzantine environment: aggregate multiple computers into one storage.

• Filesystem in the POSIX sense of the term.
  ◦ Works with any client, unmodified.
  ◦ Fine-grained semantics (e.g. video streaming).
  ◦ Well featured (e.g. file locking).
• Distributed: no computer has any specific authority or role.
  ◦ Availability: no SPOF, resilient to network failures.
  ◦ No admin. As in, no janitor and no tyrant.
  ◦ Scalability and flexibility.
• Byzantine: you do not need to trust other computers in any way.
  ◦ No admins. As in, no omniscient god.
  ◦ Supports untrusted networks, with both faulty and malicious peers.

Infinit architecture

Coroutines in a nutshell

                       Intelligible  Race conditions  Scale  Multi-core  Stack memory
System threads         OK            KO               KO     OK          KO
Event-based            KO            OK               OK     KO          OK
System coroutines      OK            OK               OK     KO          KO
Stackless coroutines   Meh           OK               OK     KO          OK

Offloading CPU-bound work to background threads (covered later) lifts the multi-core limitation:

                       Intelligible  Race conditions  Scale  Multi-core  Stack memory
System threads         OK            KO               KO     OK          KO
Event-based            KO            OK               OK     OK          OK
System coroutines      OK            OK               OK     OK          KO
Stackless coroutines   Meh           OK               OK     OK          OK


Reactor in a nutshell

Reactor is a C++ library providing coroutine support, enabling simple and safe imperative-style concurrency.

A concurrent echo server:

while (true)
{
  auto socket = tcp_server.accept();
  new Thread([socket] {
    try
    {
      while (true)
        socket->write(socket->read_until("\n"));
    }
    catch (reactor::network::Error const&)
    {}
  });
}


Sugar for common patterns

Coroutines are mostly used in three patterns:

• The entirely autonomous detached thread.
• The background thread tied to an object.
• The parallel flow threads tied to the stack.


Basic API for threads

Three core calls:

• Create a thread that will run callable concurrently:

    reactor::Thread("name", callable)

• Wait until a thread finishes:

    reactor::wait(thread)

• Terminate a thread:

    thread->terminate_now()

Nota bene:

• Don't destroy an unfinished thread. Wait for it or terminate it.
• Exceptions escaping a thread terminate the whole scheduler.


Basic API for reactor

reactor::Scheduler sched;
reactor::Thread main(
  sched, "main",
  []
  {
    reactor::Thread t("t", [] { print("world"); });
    print("hello");
    reactor::wait(t);
  });
// Will execute until all threads are done.
sched.run();


The detached thread

The detached thread is a global background operation whose lifetime is tied to the program only.

E.g. uploading crash reports on startup:

new reactor::Thread(
  []
  {
    for (auto p: list_directory(reports_dir))
      reactor::http::put("infinit.sh/reports",
                         ifstream(p));
  },
  reactor::Thread::auto_dispose = true);

The background thread tied to an object

The background thread performs a concurrent operation for an object and is tied to its lifetime.

class Async
{
  reactor::Channel<Block> _blocks;

  void store(Block b)
  {
    this->_blocks.put(b);
  }

  void _flush()
  {
    while (true)
      this->_store(this->_blocks.get());
  }
};

The flush thread starts with the object:

class Async
{
  reactor::Channel<Block> _blocks;
  void store(Block b);
  void _flush();
  std::unique_ptr<reactor::Thread> _flush_thread;

  Async()
    : _flush_thread(new reactor::Thread(
        [this] { this->_flush(); }))
  {}
};

...and must be terminated in the destructor:

~Async()
{
  this->_flush_thread->terminate_now();
}

A thread-owning smart pointer removes the manual destructor:

class Async
{
  reactor::Channel<Block> _blocks;
  void store(Block b);
  void _flush();
  reactor::Thread::unique_ptr _flush_thread;
  Async();
};

Behind the scenes, a custom deleter terminates the thread before disposing of it:

struct Terminator
  : public std::default_delete<reactor::Thread>
{
  void operator ()(reactor::Thread* t)
  {
    if (t)
    {
      bool disposed = t->_auto_dispose;
      t->terminate_now();
      if (!disposed)
        std::default_delete<reactor::Thread>::operator()(t);
    }
  }
};

typedef std::unique_ptr<reactor::Thread, Terminator> unique_ptr;

The parallel flow threads

Parallel flow threads are used to make the local flow concurrent, like parallel control statements.

Serial version:

auto data = reactor::http::get("...");
reactor::http::put("http://infinit.sh/...", data);
reactor::network::TCPSocket s("hostname");
s.write(data);
print("sent!");

Timeline: GET → HTTP PUT → TCP PUT → SENT

With explicit threads, the two puts run in parallel:

auto data = reactor::http::get("...");
reactor::Thread http_put(
  [&]
  {
    reactor::http::put("http://infinit.sh/...", data);
  });
reactor::Thread tcp_put(
  [&]
  {
    reactor::network::TCPSocket s("hostname");
    s.write(data);
  });
reactor::wait({http_put, tcp_put});
print("sent!");

Timeline: GET → (HTTP PUT ∥ TCP PUT) → SENT

Exceptions must be forwarded by hand:

auto data = reactor::http::get("...");
std::exception_ptr exn;
reactor::Thread http_put(
  [&]
  {
    try { reactor::http::put("http://...", data); }
    catch (...) { exn = std::current_exception(); }
  });
reactor::Thread tcp_put(
  [&]
  {
    try
    {
      reactor::network::TCPSocket s("hostname");
      s.write(data);
    }
    catch (...) { exn = std::current_exception(); }
  });
reactor::wait({http_put, tcp_put});
if (exn)
  std::rethrow_exception(exn);
print("sent!");

And reactor::Terminate must not be swallowed:

auto data = reactor::http::get("...");
std::exception_ptr exn;
reactor::Thread http_put(
  [&]
  {
    try { reactor::http::put("http://...", data); }
    catch (reactor::Terminate const&) { throw; }
    catch (...) { exn = std::current_exception(); }
  });
reactor::Thread tcp_put(
  [&]
  {
    try
    {
      reactor::network::TCPSocket s("hostname");
      s.write(data);
    }
    catch (reactor::Terminate const&) { throw; }
    catch (...) { exn = std::current_exception(); }
  });
reactor::wait({http_put, tcp_put});
if (exn) std::rethrow_exception(exn);
print("sent!");

reactor::Scope packages the whole pattern:

auto data = reactor::http::get("...");
reactor::Scope scope;
scope.run([&]
  {
    reactor::http::put("http://infinit.sh/...", data);
  });
scope.run([&]
  {
    reactor::network::TCPSocket s("hostname");
    s.write(data);
  });
reactor::wait(scope);
print("sent!");

Timeline: GET → (HTTP PUT ∥ TCP PUT) → SENT

Futures

auto data = reactor::http::get("...");
reactor::Scope scope;
scope.run([&]
  {
    reactor::http::Request r("http://infinit.sh/...");
    r.write(data);
  });
scope.run([&]
  {
    reactor::network::TCPSocket s("hostname");
    s.write(data);
  });
reactor::wait(scope);
print("sent!");

To overlap the GET as well, move it to its own thread and wait for it only where the data is needed:

elle::Buffer data;
reactor::Thread fetcher(
  [&] { data = reactor::http::get("..."); });
reactor::Scope scope;
scope.run([&]
  {
    reactor::http::Request r("http://infinit.sh/...");
    reactor::wait(fetcher);
    r.write(data);
  });
scope.run([&]
  {
    reactor::network::TCPSocket s("hostname");
    reactor::wait(fetcher);
    s.write(data);
  });
reactor::wait(scope);
print("sent!");

reactor::Future packages the fetch-then-wait pattern:

reactor::Future<elle::Buffer> data(
  [] { return reactor::http::get("..."); });
reactor::Scope scope;
scope.run([&]
  {
    reactor::http::Request r("http://infinit.sh/...");
    r.write(data.get());
  });
scope.run([&]
  {
    reactor::network::TCPSocket s("hostname");
    s.write(data.get());
  });
reactor::wait(scope);
print("sent!");

Timelines compared:

• Scope + futures: GET runs concurrently with both puts; SENT as soon as everything completes.
• Scope: GET first, then HTTP PUT and TCP PUT in parallel, then SENT.
• Serial: GET, HTTP PUT, TCP PUT, SENT, one after another.


Concurrent iteration

The final touch.

std::vector<Host> hosts;
reactor::Scope scope;
for (auto const& host: hosts)
  scope.run([&, host] { host->send(block); });
reactor::wait(scope);

Concurrent iteration:

std::vector<Host> hosts;
reactor::concurrent_for(
  hosts, [&] (Host h) { h->send(block); });

• Also available: reactor::concurrent_break().
• As for a concurrent continue, that's just return.


CPU bound operations

When we are CPU bound, how do we exploit multiple cores, since coroutines are concurrent but not parallel?

• Idea 1: schedule coroutines in parallel.
  ◦ Pro: absolute best parallelism.
  ◦ Con: race conditions.
• Idea 2: isolate CPU-bound code in "monads" and run it in the background.
  ◦ Pro: localized race conditions.
  ◦ Con: only manually parallelized code exploits multiple cores.


reactor::background

reactor::background(callable): run callable in a system thread and return its result.

Behind the scenes:

• A boost::asio::io_service whose run method is called in as many threads as there are cores.
• reactor::background posts the action to Asio and waits for completion.

Block::seal(elle::Buffer data)
{
  this->_data = reactor::background(
    [data = std::move(data)] { return encrypt(data); });
}


No-fiasco rules of thumb: be pure.

• No side effects.
• Take all bindings by value.
• Return any result by value.

std::vector<int> ints;
int i = 0;

// Do not:
reactor::background([&] { ints.push_back(i + 1); });
reactor::background([&] { ints.push_back(i + 2); });

// Do:
ints.push_back(reactor::background([i] { return i + 1; }));
ints.push_back(reactor::background([i] { return i + 2; }));


Background pools

Infinit generates a lot of symmetric keys.

Let other syscalls go through during generation with reactor::background:

Block::Block()
  : _block_keys(reactor::background(
      [] { return generate_keys(); }))
{}

Problem: because of how on-disk filesystems are typically used, operations are often sequential.

$ time for i in $(seq 128); do touch $i; done

The whole operation will still be delayed by 128 * generation_time.

Background pools

Solution: pregenerate keys in a background pool.

reactor::Channel<KeyPair> keys;
keys.max_size(64);

reactor::Thread keys_generator(
  [&]
  {
    reactor::Scope scope;
    for (int i = 0; i < NCORES; ++i)
      scope.run(
        [&]
        {
          while (true)
            keys.put(reactor::background(
              [] { return generate_keys(); }));
        });
    reactor::wait(scope);
  });

Block::Block()
  : _block_keys(keys.get())
{}

Generators

Fetching a block with Paxos:

• Ask the overlay for the replication_factor owners of the block.
• Fetch at least replication_factor / 2 + 1 versions.
• Run Paxos on the results.

In code:

DHT::fetch(Address addr)
{
  auto hosts = this->_overlay->lookup(addr, 3);
  std::vector<Block> blocks;
  reactor::concurrent_for(
    hosts,
    [&] (Host host)
    {
      blocks.push_back(host->fetch(addr));
      if (blocks.size() >= 2)
        reactor::concurrent_break();
    });
  return some_paxos_magic(blocks);
}

The overlay returns hosts lazily, as a generator:

reactor::Generator<Host>
Kelips::lookup(Address addr, factor f)
{
  return reactor::Generator<Host>(
    [=] (reactor::Generator<Host>::Yield const& yield)
    {
      some_udp_socket.write(...);
      while (f--)
        yield(kelips_stuff(some_udp_socket.read()));
    });
}

Behind the scenes:

template <typename T>
class Generator
{
  reactor::Channel<T> _values;
  reactor::Thread _thread;

  Generator(Callable call)
    : _thread([this]
      {
        call([this] (T e) { this->_values.put(e); });
      })
  {}

  // Iterator interface calling _values.get()
};

• Similar to Python generators.
• Except generation starts right away and is not tied to iteration.

Questions?