Concurrency, Parallelism, Events, Asynchronicity, Oh My
threads, processes, queues & workers, multiprocessing, gevent, twisted, yo momma
SoCal Piggies 2011-11-10
Tommi Virtanen / @tv / DreamHost
One thing at a time
per core
hyperthreading is a filthy lie
SMP/NUMA = think of multiple computers in one box
Concurrency vs Parallelism
flip-flops vs thongs
Parallelism
Speed-oriented
Physically parallel
Concurrency
Multiplexing operations
Reasoning about program structure
Von Neumann architecture
meet Von Neumann bottleneck
Multiple processes

    wget -q -O- http://www.dreamhost.com/ |
      perl -ne 'for (/\b[[:upper:]]\w*\b/g) {print "$_\n"}' |
      tr '[:upper:]' '[:lower:]' |
      sort |
      uniq -c |
      sort -nr |
      head -5

         30 dreamhost
         11 hosting
         10 wordpress
          9 web
          9 we

waits for network
waits for disk
waits for disk
OS can schedule to avoid IO wait
(HT can schedule to avoid RAM wait)
Pipelines are lovely, but two-way communication is harder
IO waiting deadlock

    # parent
    data = child.stdout.read(8192)

    # child
    data = sys.stdin.read(8192)
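
The deadlock above happens because both sides sit in a blocking read, each waiting for the other to write first. A minimal sketch of one way out, written for Python 3 (the slides use Python 2): the standard library's `subprocess.Popen.communicate` sends the input, closes the child's stdin and drains its stdout. The helper name `upper_via_child` and the inline child program are made up for illustration.

```python
import subprocess
import sys

# Hypothetical child program: read all of stdin, write it back upper-cased.
CHILD_SOURCE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def upper_via_child(text):
    """Send text to a child process and return what the child writes back."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # communicate() writes the input, closes the child's stdin and reads its
    # stdout until EOF, so neither side is left waiting on a read forever.
    out, _ = child.communicate(text.encode("utf-8"))
    return out.decode("utf-8")
```

Calling `upper_via_child("hello")` should hand back the upper-cased text once the child exits.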
"Blocking" on IO
Like refusing to participate in a dinner conversation
because your glass is empty.
Nonblocking
"if I asked you to do X, would you do it immediately?"
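
The nonblocking question can be sketched with a socket pair, assuming Python 3 on a Unix-like system. With blocking switched off, `recv` either returns what is already buffered or raises `BlockingIOError` instead of waiting. The helper `try_recv` is a made-up name for this sketch.

```python
import socket


def try_recv(sock, size=8192):
    """Return the bytes available right now, or None if reading would block."""
    sock.setblocking(False)
    try:
        return sock.recv(size)
    except BlockingIOError:
        # Nothing to read yet: the call refused to wait.
        return None


a, b = socket.socketpair()
first = try_recv(a)    # no data has been sent yet
b.sendall(b"hello")
second = try_recv(a)   # data is already sitting in the socket buffer
a.close()
b.close()
```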
Asynchronous
abstraction on top of nonblocking;
queue outgoing data until writable,
notify of incoming data just read
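
A minimal sketch of that layering, assuming Python 3 and the standard `selectors` module. `AsyncConnection`, `run_once` and `demo` are toy names invented here, not part of any library: writes are only queued, the bytes go out when the socket reports writable, and incoming bytes are handed to a callback.

```python
import selectors
import socket


class AsyncConnection:
    """Toy asynchronous wrapper around a nonblocking socket."""

    def __init__(self, selector, sock, on_data):
        self.selector = selector
        self.sock = sock
        self.on_data = on_data
        self.outgoing = b""
        sock.setblocking(False)
        selector.register(sock, selectors.EVENT_READ, self)

    def write(self, data):
        # Never blocks: queue the bytes and ask to be told when writable.
        self.outgoing += data
        self.selector.modify(
            self.sock, selectors.EVENT_READ | selectors.EVENT_WRITE, self)

    def handle(self, events):
        if events & selectors.EVENT_WRITE and self.outgoing:
            sent = self.sock.send(self.outgoing)
            self.outgoing = self.outgoing[sent:]
            if not self.outgoing:
                # Queue drained: stop asking about writability.
                self.selector.modify(self.sock, selectors.EVENT_READ, self)
        if events & selectors.EVENT_READ:
            data = self.sock.recv(8192)
            if data:
                self.on_data(data)  # notify of incoming data just read


def run_once(selector, timeout=1.0):
    """Wait for readiness and dispatch to the connections that are ready."""
    for key, events in selector.select(timeout):
        key.data.handle(events)


def demo():
    """Send b"ping" across a socket pair and return what arrives."""
    selector = selectors.DefaultSelector()
    left, right = socket.socketpair()
    received = []
    sender = AsyncConnection(selector, left, on_data=lambda data: None)
    AsyncConnection(selector, right, on_data=received.append)
    sender.write(b"ping")
    while not received:
        run_once(selector)
    selector.close()
    left.close()
    right.close()
    return b"".join(received)
```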
Event loop
GUIs: "user did X, please react"
services: lots of async operations, concurrently
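
Stripped of IO, an event loop is a queue of events and a table of handlers. A toy sketch in Python 3, with the `EventLoop` class and the event names invented for illustration:

```python
import collections


class EventLoop:
    """Toy event loop: a queue of events plus handlers registered per name."""

    def __init__(self):
        self.handlers = collections.defaultdict(list)
        self.pending = collections.deque()

    def on(self, name, handler):
        self.handlers[name].append(handler)

    def emit(self, name, payload=None):
        # Events are only queued here; handlers run later, from run().
        self.pending.append((name, payload))

    def run(self):
        # One event at a time, until nothing is left to react to.
        while self.pending:
            name, payload = self.pending.popleft()
            for handler in self.handlers[name]:
                handler(payload)


log = []
loop = EventLoop()
loop.on("click", lambda button: log.append("clicked " + button))
loop.on("click", lambda button: loop.emit("redraw"))
loop.on("redraw", lambda _: log.append("redraw"))
loop.emit("click", "OK")
loop.emit("click", "Cancel")
loop.run()
```

Handlers that emit further events only append to the queue, so each handler runs to completion before the next event is dispatched.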
Callback-oriented programming
services: "received bytes B on connection C"
Deferred
Twisted API for delayed-response function calls
Deferred example

    import something

    def get_or_create(container, name):
        d = container.lookup(name)
        def or_create(failure):
            failure.trap(something.NotFoundError)
            return container.create(name)
        d.addErrback(or_create)
        return d

    d = get_or_create(cont, 'foo')
    def cb(foo):
        print 'Got foo:', foo
    d.addCallback(cb)

Twisted
Use to poke fun at Node.js fanboys
concurrent.futures (PEP 3148)
Python fails to learn from Twisted
More like Java every day
Twisted dev not taken by surprise
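
For readers who have not met the API the slide refers to, a minimal Python 3 sketch of `concurrent.futures`. The `slow_square` function is a made-up stand-in for real work. A `Future` can be waited on with `result()` or given a completion callback, which receives the `Future` itself rather than the bare result.

```python
import concurrent.futures


def slow_square(n):
    """Stand-in for an operation whose result arrives later."""
    return n * n


results = []

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(slow_square, 7)
    # Callback style: the callback is handed the finished Future.
    future.add_done_callback(lambda f: results.append(f.result()))
    # Blocking style: wait here until the result is ready.
    direct = future.result()
```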
Co-operative scheduling
like the nice guy at the ATM,
letting you go in between
because he still has 20 checks to deposit
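
The ATM picture can be sketched with Python 3 generators: each task does a little work and then yields, and a round-robin loop hands the turn to whoever is next. The `depositor` and `run_round_robin` names are invented for this sketch.

```python
import collections


def depositor(name, checks, log):
    """Deposit checks one at a time, giving up the ATM after each one."""
    for number in range(1, checks + 1):
        log.append("{} deposits check {}".format(name, number))
        yield  # voluntarily let the next task run


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them finish."""
    queue = collections.deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; do not reschedule it
        queue.append(task)


log = []
run_round_robin([depositor("nice guy", 3, log), depositor("you", 1, log)])
```

Nothing forces a task to yield here; a task that never does so keeps the ATM for itself, which is the weakness of co-operative scheduling.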
Preemptive scheduling
the CPU gets taken away from you
(plus you usually get punished for hogging)
So.. back to processes
Communication overhead
(most of the time, just premature optimization)
Creation cost
(5000 per second on a 900MHz Pentium 3,
don't know if anyone's cared since..)
Idea: Communicate by sharing memory
Threads
... now you have two problems
Unix has mmap you know
(aka share memory by sharing memory)
<rant>Voluntarily giving up that precious MMU counts as pretty stupid in my book...</rant>
<rant>Threads originate from operating systems with bad schedulers and VM</rant>
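
The mmap remark can be sketched as follows, assuming Python 3 on a Unix-like system where `os.fork` is available. An anonymous mapping created before the fork stays shared between parent and child, while everything else remains private to each process. The function name `add_in_child` is made up for this sketch.

```python
import mmap
import os
import struct


def add_in_child(a, b):
    """Have a forked child compute a + b and hand it back via shared memory."""
    # Anonymous mapping; on Unix the default flags are MAP_SHARED, so the
    # page stays shared between parent and child across fork().
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        # Child: write the result into the shared page and exit immediately.
        shared[:8] = struct.pack("q", a + b)
        os._exit(0)
    # Parent: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    (result,) = struct.unpack("q", shared[:8])
    shared.close()
    return result
```

Only the mapped page is shared; the rest of each address space keeps the MMU protection the slide is fond of.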
GIL
Python threads are only good
for concurrency, not for parallelism
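
A small Python 3 sketch of the concurrency half of that claim, assuming CPython. Threads that spend their time waiting (here `time.sleep` stands in for blocking IO, which releases the GIL) overlap their waits. The function names are invented for illustration.

```python
import threading
import time


def wait_for_io(delay):
    """Stand-in for a blocking IO call; sleeping releases the GIL."""
    time.sleep(delay)


def run_in_threads(count, delay):
    """Run count waits concurrently and return the elapsed wall-clock time."""
    threads = [
        threading.Thread(target=wait_for_io, args=(delay,))
        for _ in range(count)
    ]
    start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.monotonic() - start


# Five waits of 0.2 s each overlap instead of adding up to a full second.
elapsed = run_in_threads(count=5, delay=0.2)
```

Swapping the sleep for a pure-Python CPU-bound loop would not be expected to give a comparable speedup under the GIL, which is the parallelism half of the slide.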
Multiprocessing
Ingenious hack to emulate thread API with multiple processes.
Eww, pickles.
Do not trust the pickle
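
One reason for the distrust, sketched in Python 3: a pickle can name any importable callable to be invoked at load time. The `Innocent` class is invented for this sketch, and the payload is deliberately harmless arithmetic.

```python
import pickle


class Innocent:
    """Looks harmless, but tells pickle what to call when it is loaded."""

    def __reduce__(self):
        # On load, pickle calls eval("6 * 7"); an attacker would pick
        # something far less friendly than arithmetic.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Innocent())
loaded = pickle.loads(payload)  # runs eval() while unpickling
```

What comes back is the result of the call, not an `Innocent` instance, so unpickling data from an untrusted source amounts to running its code.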

    import html5lib
    import multiprocessing
    import requests
    import urlparse

    def process(work):
        while True:
            item = work.get()
            try:
                r = requests.get(item['url'])
                if r.status_code != 200:
                    print 'Broken link {source} -> {url}'.format(**item)
                    continue
                level = item['level'] - 1
                if level > 0:
                    if r.headers['content-type'].split(';', 1)[0] != 'text/html':
                        print 'Not html: {url}'.format(**item)
                        continue
                    doc = html5lib.parse(r.content, treebuilder='lxml')
                    links = doc.xpath(
                        '//html:a/@href | //html:img/@src',
                        namespaces=dict(html='http://www.w3.org/1999/xhtml'),
                        )
                    for link in links:
                        if link.startswith('#'):
                            continue
                        url = urlparse.urljoin(item['url'], link, allow_fragments=False)
                        parsed = urlparse.urlsplit(url)
                        if parsed.scheme not in ['http', 'https']:
                            print 'Not http: {url}'.format(url=url)
                            continue
                        work.put(
                            dict(
                                level=level,
                                url=url,
                                source=item['url'],
                                ),
                            )
            finally:
                work.task_done()

    def main():
        work = multiprocessing.JoinableQueue(100)
        work.put(
            dict(
                level=2,
                url='http://www.dreamhost.com/',
                source=None,
                ),
            )
        processes = [
            multiprocessing.Process(
                target=process,
                kwargs=dict(
                    work=work,
                    ),
                )
            for _ in xrange(10)
            ]
        for proc in processes:
            proc.start()

        work.join()

        for proc in processes:
            proc.terminate()
        for proc in processes:
            proc.join()

    if __name__ == '__main__':
        main()

Communicating Sequential Processes
more about the thought process than the implementation
"Share memory by communicating"
Radically less locking;
radically fewer locking bugs.
Prime number sieve with CSP
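
Only the title of this slide appears in the transcript. As a stand-in, here is a minimal Python 3 sketch of the pipeline-of-filters sieve. It uses standard-library threads as the processes and bounded `queue.Queue` objects as the channels. The function names follow the gevent version in this deck, while `spawn` and `first_primes` are made up here.

```python
import itertools
import queue
import threading


def generate(out):
    """Send the sequence 2, 3, 4, ... to out."""
    for i in itertools.count(2):
        out.put(i)


def filter_primes(in_, out, prime):
    """Copy values from in_ to out, dropping those divisible by prime."""
    while True:
        i = in_.get()
        if i % prime != 0:
            out.put(i)


def spawn(target, *args):
    # Daemon threads, so stages blocked on a full queue do not keep
    # the interpreter alive once the caller is done.
    threading.Thread(target=target, args=args, daemon=True).start()


def first_primes(count):
    """Return the first count primes using a chain of filter stages."""
    primes = []
    ch = queue.Queue(maxsize=10)
    spawn(generate, ch)
    for _ in range(count):
        prime = ch.get()  # the head of the current channel is always prime
        primes.append(prime)
        ch1 = queue.Queue(maxsize=10)
        spawn(filter_primes, ch, ch1, prime)
        ch = ch1
    return primes
```

Each stage owns its state and touches shared data only through the queues, so the sketch needs no explicit locks.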
Prime number sieve with Gevent

    from gevent import monkey; monkey.patch_all()
    import gevent
    import gevent.queue
    import itertools

    def generate(queue):
        """Send the sequence 2, 3, 4, ... to queue."""
        for i in itertools.count(2):
            queue.put(i)

    def filter_primes(in_, out, prime):
        """Copy the values from in_ to out, removing those divisible by prime."""
        for i in in_:
            if i % prime != 0:
                out.put(i)

    def main():
        ch = gevent.queue.Queue(maxsize=10)
        gevent.spawn(generate, ch)
        for i in xrange(100):
            # Print the first hundred primes.
            prime = ch.get()
            print prime
            ch1 = gevent.queue.Queue(maxsize=10)
            gevent.spawn(filter_primes, ch, ch1, prime)
            ch = ch1

Look ma, no threads!
monkey.patch_all()
Coroutines

    def foo():
        a = yield
        b = yield
        yield a+b

    f = foo()
    next(f)
    f.send(1)
    print f.send(2)

Coroutines #2

    def ticktock():
        messages = ['tick', 'tock']
        while True:
            cur = messages.pop(0)
            new = yield cur
            if new is not None:
                print '# Okay, {new} not {cur}.' \
                    .format(new=new, cur=cur)
                cur = new
            messages.append(cur)

    t = ticktock()
    print next(t) # tick
    print next(t) # tock
    print next(t) # tick
    t.send('bip')
    # Okay, bip not tick.
    print next(t) # bip
    print next(t) # tock
    print next(t) # bip
    print next(t) # tock
    t.send('bop')
    # Okay, bop not tock.
    print next(t) # bop
    print next(t) # bip
    print next(t) # bop
    print next(t) # bip
    print next(t) # bop
    print next(t) # bip

Laziness without hubris
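
    import itertools

    # path is assumed to name a text file
    with open(path) as f:
        words_on_lines = (line.split() for line in f)
        words = itertools.chain.from_iterable(words_on_lines)
        lengths = (len(word) for word in words)
        (a, b) = itertools.tee(lengths)
        count = sum(1 for _ in a)
        summed = sum(b)
        # float() avoids Python 2 integer division truncating the average
        avg = float(summed) / count
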
Go watch Jesse Noller's PyCon talks
Thank you!
Questions?
Image credits:
self-portrait of anonymous macaque http://www.techdirt.com/articles/20110706/00200314983/monkey-business-can-monkey-license-its-copyrights-to-news-agency.shtml
fair use from http://en.wikipedia.org/wiki/Gil_Grissom
<3 CC http://www.flickr.com/photos/bitzcelt/2841983698/
<3 CC http://www.flickr.com/photos/pedromourapinheiro/2122754745/
fair use from https://plus.google.com/115662513673837016240