go concurrency

Writing Concurrent Programs robustly and productively with Go and zeromq
18 November 2014
Loh Siu Yin, Technology Consultant, Beyond Broadcast LLP

Upload: siuyin

Post on 02-Jul-2015

DESCRIPTION

Writing concurrent programs robustly with Go and zeromq.

TRANSCRIPT

Page 1: Go concurrency

Writing Concurrent Programs
robustly and productively with Go and zeromq
18 November 2014

Loh Siu Yin
Technology Consultant, Beyond Broadcast LLP

Page 2: Go concurrency

Traditional v. Concurrent

sequential (executed one after the other)

concurrent (executed at the same time)

concurrency -- reflects the way we interact with the real world

Page 3: Go concurrency

Which languages can be used to write concurrent programs?

Page 4: Go concurrency

Concurrent languages

C with pthreads lib

Java with concurrent lib

Scala with actors lib

Scala with Akka lib

Go

Any difference between Go and the other languages?

Page 5: Go concurrency

Concurrent programs are hard to write

Page 6: Go concurrency

Why Go for concurrent programs?

Go does not need an external concurrency library

Avoids locks, semaphores, critical sections...

Instead uses goroutines and channels

Page 7: Go concurrency

Software I use to write concurrent programs:

Go -- Go has concurrency baked into the language.

zeromq -- zeromq is a networking library with multitasking functions.

For more info:

golang.org (http://golang.org)

zeromq.org (http://zeromq.org)

Page 8: Go concurrency

Go Hello World program

Why capital P in Println?

Println is not a class (like in Java).

It is a regular function exported from the fmt package: in Go, a name that starts with a capital letter is visible outside its package.

The "fmt" package provides formatting functions like Println and Printf.

package main

import "fmt"

func main() {
	fmt.Println("Hello Go!")
}

Page 9: Go concurrency

Package naming in Go

If you import "a/b/c/d" instead of "fmt", and package "a/b/c/d" exports Println, that package's Println is called as d.Println and not a.b.c.d.Println.

If you import "abcd", what is the package qualifier name?

package main

import "fmt"

func main() {
	fmt.Println("Hello Go!")
}
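
The naming rule can be tried with a standard library package whose import path has more than one element. A minimal sketch, assuming "unicode/utf8" as a stand-in for "a/b/c/d"; the helper name runeCount is hypothetical and not from the talk. By convention the qualifier is the last element of the import path.

```go
package main

import (
	"fmt"
	"unicode/utf8" // two path elements; the qualifier is the last one, utf8
)

// runeCount (illustrative helper) calls the exported function through
// the qualifier "utf8", not "unicode.utf8".
func runeCount(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	fmt.Println(runeCount("Hello Go!"))
}
```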

Page 10: Go concurrency

Go Concurrent Hello World

package main

import (
	"fmt"
	"time"
)

func a() {
	for {
		fmt.Print(".")
		time.Sleep(time.Second)
	}
}

func b() {
	for {
		fmt.Print("+")
		time.Sleep(2 * time.Second)
	}
}

func main() {
	go b() // Change my order
	go a()
	//time.Sleep(4 * time.Second) // uncomment me!
}

Page 11: Go concurrency

goroutine lifecycle

A goroutine's resources are reclaimed when it ends: the Go runtime frees its stack, and anything only it referenced becomes eligible for garbage collection.

All goroutines are stopped when func main() returns, because the program exits at that point.

goroutines can be started from functions other than main. They are not stopped when that function returns. They are stopped only when main() returns.

The GOMAXPROCS variable limits the number of operating system threads that can execute user-level Go code simultaneously. [default = 1 at the time of this talk; since Go 1.5 the default is the number of available CPUs]

At 1 there is no parallel execution,

increase to 2 or higher for parallel execution if you have 2 or more cores.

golang.org/pkg/runtime (http://golang.org/pkg/runtime/)
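
The lifecycle rules can be seen in a small program. This is a sketch only: the function name startWorker and the "done" channel are assumptions made for illustration. The goroutine outlives the function that started it, and main stays alive by receiving from the channel.

```go
package main

import (
	"fmt"
	"runtime"
)

// startWorker (hypothetical name) launches a goroutine and returns at once.
// The goroutine keeps running after startWorker has returned.
func startWorker(n int) <-chan int {
	done := make(chan int)
	go func() {
		sum := 0
		for i := 1; i <= n; i++ {
			sum += i
		}
		done <- sum // blocks until someone receives
	}()
	return done
}

func main() {
	// GOMAXPROCS(0) reports the current setting without changing it.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

	done := startWorker(10)
	// Receiving here keeps main alive until the goroutine has finished.
	fmt.Println("sum:", <-done)
}
```

Without the receive in main, the program could exit before the goroutine gets to run, which is what the commented-out Sleep in the Concurrent Hello World slide demonstrates.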

Page 12: Go concurrency

Go channels

Go channels provide a type-safe means of communication between:

the main function and a goroutine, or

two goroutines

What is:

a goroutine?

the main function?

Page 13: Go concurrency

func x()

func x() returns a channel of integers that can only be read from.

Internally it runs a goroutine that emits an integer every 500ms.

Demo type safety. Use MyInt for i.

type MyInt int

func x() <-chan int {
	ch := make(chan int)
	go func(ch chan<- int) {
		// var i MyInt // Make i a MyInt
		var i int
		for i = 0; ; i++ {
			ch <- i // Send int into ch
			time.Sleep(500 * time.Millisecond)
		}
	}(ch)
	return ch
}

Page 14: Go concurrency

func y()

func y() returns a channel of integers that can only be written to.

All it does is run a goroutine to print out the integers it receives.

func y() chan<- int {
	ch := make(chan int)
	go func(ch <-chan int) {
		for {
			i := <-ch
			fmt.Print(i, " ")
		}
	}(ch)
	return ch
}
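
x() and y() run forever. A variation that terminates may make the channel directions easier to experiment with. This is a sketch, not the talk's code: the names count and sum are hypothetical, and closing the channel is an addition the slides do not use.

```go
package main

import "fmt"

// count returns a receive-only channel that yields 0..n-1 and is then closed.
func count(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch) // closing lets the receiver's range loop end
		for i := 0; i < n; i++ {
			ch <- i
		}
	}()
	return ch
}

// sum drains a receive-only channel and returns the total.
// Sending on "in" inside this function would not compile.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sum(count(5))) // 0+1+2+3+4
}
```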

Page 15: Go concurrency

Go channels 1

1 of 2 ...

package main

import (
	"fmt"
	"time"
)

type MyInt int

func x() <-chan int {
	ch := make(chan int)
	go func(ch chan<- int) {
		// var i MyInt // Make i a MyInt
		var i int
		for i = 0; ; i++ {
			ch <- i // Send int into ch
			time.Sleep(500 * time.Millisecond)
		}
	}(ch)
	return ch
}

Page 16: Go concurrency

Go channels 2

2 of 2

func y() chan<- int {
	ch := make(chan int)
	go func(ch <-chan int) {
		for {
			i := <-ch
			fmt.Print(i, " ")
		}
	}(ch)
	return ch
}

func main() {
	xch := x() // emit int every 500 ms
	ych := y() // print the int
	for {
		select {
		case n := <-xch:
			ych <- n // send it to ych for display
		case <-time.After(501 * time.Millisecond): // Change me
			fmt.Print("x")
		}
	}
}

Page 17: Go concurrency

Synchronizing goroutines

n gets an integer from xch and pushes it to ych to be displayed.

What if the source of data, xch, is on a different machine?

Go can't help here. Channels work only within a single process, and the experimental network channel package (netchan) is no longer part of Go.

Rob Pike (one of the Go authors) said that he didn't quite know what he was doing...

func main() {
	xch := x() // emit int every 500 ms
	ych := y() // print the int
	for {
		select {
		case n := <-xch:
			ych <- n // send it to ych for display
		case <-time.After(501 * time.Millisecond): // Change me
			fmt.Print("x")
		}
	}
}
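
The select with time.After in main() is a timeout pattern. A bounded sketch of the same idea follows, assuming a source that sends three values and then goes quiet. The names collect and demo and the 200 ms timeout are illustrative choices, not from the talk.

```go
package main

import (
	"fmt"
	"time"
)

// collect reads from src until no value arrives within timeout,
// then returns everything received so far.
func collect(src <-chan int, timeout time.Duration) []int {
	var got []int
	for {
		select {
		case n := <-src:
			got = append(got, n)
		case <-time.After(timeout):
			return got // the source went quiet
		}
	}
}

// demo wires a short-lived sender to collect.
func demo() []int {
	src := make(chan int)
	go func() {
		for i := 0; i < 3; i++ {
			src <- i
		}
		// The sender stops here without closing src.
	}()
	return collect(src, 200*time.Millisecond)
}

func main() {
	fmt.Println(demo())
}
```

A fresh timer is created on every pass through the loop, so the timeout measures the gap between messages rather than the total running time.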

Page 18: Go concurrency

zeromq Networking Patterns

Pub/Sub: Many programs can pub to a network endpoint. Many other programs can sub from that endpoint. All subscribers get messages from multiple publishers.

Req/Rep: Many clients can req services from a server endpoint, which reps with replies to the client.

Push/Pull: Many programs can push to a network endpoint. Many other programs can pull from that endpoint. Messages are round-robin routed to an available puller.
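
The Push/Pull idea, where each message goes to exactly one puller, has an in-process analogue in plain Go channels. The sketch below is only that analogy and does not use zeromq. The function name distribute is hypothetical, and Go hands each job to whichever worker is ready rather than in strict round-robin order.

```go
package main

import (
	"fmt"
	"sync"
)

// distribute sends n jobs on one channel that several worker goroutines
// pull from, and returns how many jobs each worker handled.
func distribute(n, workers int) []int {
	jobs := make(chan int)
	handled := make([]int, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for range jobs { // each job is received by exactly one worker
				handled[id]++
			}
		}(w)
	}
	for i := 0; i < n; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return handled
}

func main() {
	counts := distribute(9, 3)
	total := 0
	for _, c := range counts {
		total += c
	}
	fmt.Println("per worker:", counts, "total:", total)
}
```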

Page 19: Go concurrency

Using zeromq in a Go program

Import a package that implements zeromq

Page 20: Go concurrency

zeromq Pusher

package main

import (
	"fmt"

	zmq "github.com/pebbe/zmq2"
)

func main() {
	fmt.Println("Starting pusher.")

	ctx, _ := zmq.NewContext(1)
	defer ctx.Term()

	push, _ := ctx.NewSocket(zmq.PUSH)
	defer push.Close()
	push.Connect("ipc://pushpull.ipc")
	// push.Connect("tcp://12.34.56.78:5555")

	for i := 0; i < 3; i++ {
		msg := fmt.Sprintf("Hello zeromq %d", i)
		push.Send(msg, 0)
		fmt.Println(msg)
	}
	// Watch for Program Exit
}

Page 21: Go concurrency

zeromq Puller

This puller may be a data-mover moving gigabytes of data around. It has to be rock-solid, with the program running as a daemon (service) that never shuts down.

Go fits this description perfectly! Why?

In addition, Go has a built-in defer statement, which helps to avoid resource leaks such as sockets that are never closed.

package main

import (
	"fmt"
	"time"

	zmq "github.com/pebbe/zmq2"
)

func main() {
	fmt.Println("Starting puller")

	ctx, _ := zmq.NewContext(1)
	defer ctx.Term()

	pull, _ := ctx.NewSocket(zmq.PULL)
	defer pull.Close()
	pull.Bind("ipc://pushpull.ipc")

	for {
		msg, _ := pull.Recv(0)
		time.Sleep(2 * time.Second) // Doing time consuming work
		fmt.Println(msg)            // work all done
	}
}
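
How defer orders the cleanup can be checked without zeromq. In this sketch the strings stand in for the context and socket calls above, and the function name run is hypothetical. Deferred calls run when the surrounding function returns, last registered first.

```go
package main

import "fmt"

// run records the order in which its steps and deferred calls execute.
func run() []string {
	var steps []string
	func() {
		steps = append(steps, "open context")
		defer func() { steps = append(steps, "close context") }()

		steps = append(steps, "open socket")
		defer func() { steps = append(steps, "close socket") }()

		steps = append(steps, "work")
	}() // deferred calls run here, last registered first
	return steps
}

func main() {
	for _, s := range run() {
		fmt.Println(s)
	}
}
```

The socket is therefore closed before the context is terminated, which matches the order in which pull.Close() and ctx.Term() would run in the puller.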

Page 22: Go concurrency

msg, _ := pull.Recv(0)

functions in Go can return multiple values.

the _ above takes the place of an err variable, e.g. msg, err := pull.Recv(0)

a nil err value means no error

if you write msg := pull.Recv(0) [note: no _ or err var], the compiler will fail the compile with an error message (not a warning).

typing _ forces the programmer to think about error handling

msg, err := pull.Recv(0)
if err != nil {
	fmt.Println("zmq pull:", err)
}
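
The same value-plus-error convention appears throughout the standard library. A sketch using strconv.Atoi follows. The function parsePort and its range check are assumptions made for the example and are not part of the talk.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort converts s to a port number, returning an error for bad input.
func parsePort(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, err
	}
	if n < 1 || n > 65535 {
		return 0, fmt.Errorf("port %d out of range", n)
	}
	return n, nil
}

func main() {
	for _, s := range []string{"5555", "abc"} {
		port, err := parsePort(s)
		if err != nil {
			fmt.Println("parse port:", err)
			continue
		}
		fmt.Println("port:", port)
	}
}
```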

Page 23: Go concurrency

zeromq Controller/Pusher in ruby

The pusher may be a controller that is in active development -- requiring frequent code updates and restarts. With zeromq, we can decouple the stable long-running process from the unstable code being developed. What is the advantage of this?

#!/usr/bin/env ruby
require 'ffi-rzmq'

puts "Starting ruby pusher"
ctx = ZMQ::Context.new(1)
push = ctx.socket(ZMQ::PUSH)
push.connect("ipc://pushpull.ipc")
# push.connect("tcp://12.34.56.78:5555")

(0..2).each do |i|
  msg = "Hello %d" % i
  push.send_string(msg)
  puts(msg)
end

push.close()
ctx.terminate()

Page 24: Go concurrency

Putting it all together

email_mover (puller) has two slow tasks: email and move big data.

Page 25: Go concurrency

email_mover

func main() {
	fmt.Println("Starting email_mover")
	z := zmqRecv()
	e := emailer()
	m := mover()
	for {
		s := <-z
		e <- s
		m <- s
		// report when done to 0.1ms resolution
		fmt.Println(time.Now().Format("05.0000"), "done:", s)
	}
}

Page 26: Go concurrency

zmqRecv goroutine

func zmqRecv() <-chan string {
	ch := make(chan string)

	go func(ch chan<- string) {
		ctx, _ := zmq.NewContext(1)
		defer ctx.Term()

		pull, _ := ctx.NewSocket(zmq.PULL)
		defer pull.Close()
		pull.Bind("ipc://pushpull.ipc")
		for {
			msg, _ := pull.Recv(0)
			ch <- msg
		}
	}(ch)
	return ch
}

Page 27: Go concurrency

emailer goroutine

func emailer() chan<- string {
	ch := make(chan string, 100) // buffered chan
	go func(ch <-chan string) {
		for {
			s := <-ch
			time.Sleep(1 * time.Second)
			fmt.Println("email:", s)
		}
	}(ch)
	return ch
}

Page 28: Go concurrency

mover goroutine

func mover() chan<- string {
	ch := make(chan string, 100) // buffered chan
	go func(ch <-chan string) { // parameter named, as in emailer()
		for {
			s := <-ch
			time.Sleep(3 * time.Second)
			fmt.Println("move:", s)
		}
	}(ch)
	return ch
}

Page 29: Go concurrency

email_mover main()

Why do we need buffered channels in emailer() and mover() and not for zmqRecv()?

email_mover is a stable service in production written in Go.

This service should never stop, leak memory or lose data.

func main() {
	fmt.Println("Starting email_mover")
	z := zmqRecv()
	e := emailer()
	m := mover()
	for {
		s := <-z
		e <- s
		m <- s
		// report when done to 0.1ms resolution
		fmt.Println(time.Now().Format("05.0000"), "done:", s)
	}
}
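
One way to explore the buffered-channel question is to compare how sends behave with and without a buffer. The sketch below uses a non-blocking send; the helper name trySend is hypothetical, and the buffer size of 2 is chosen only to keep the output short.

```go
package main

import "fmt"

// trySend attempts a send without blocking and reports whether it succeeded.
func trySend(ch chan<- string, s string) bool {
	select {
	case ch <- s:
		return true
	default:
		return false // no receiver ready and no buffer space left
	}
}

func main() {
	unbuffered := make(chan string)
	buffered := make(chan string, 2)

	fmt.Println(trySend(unbuffered, "a")) // false: nobody is receiving
	fmt.Println(trySend(buffered, "a"))   // true: stored in the buffer
	fmt.Println(trySend(buffered, "b"))   // true: buffer now full
	fmt.Println(trySend(buffered, "c"))   // false: buffer is full
}
```

In main() above, e <- s and m <- s are ordinary blocking sends. With a buffer of 100, the loop can hand a message to the slow emailer and mover and go straight back to receiving from z, as long as the buffers have room.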

Page 30: Go concurrency

Go Pusher (not changed)

package main

import (
	"fmt"

	zmq "github.com/pebbe/zmq2"
)

func main() {
	fmt.Println("Starting pusher.")

	ctx, _ := zmq.NewContext(1)
	defer ctx.Term()

	push, _ := ctx.NewSocket(zmq.PUSH)
	defer push.Close()
	push.Connect("ipc://pushpull.ipc")
	// push.Connect("tcp://12.34.56.78:5555")

	for i := 0; i < 3; i++ {
		msg := fmt.Sprintf("Hello zeromq %d", i)
		push.Send(msg, 0)
		fmt.Println(msg)
	}
	// Watch for Program Exit
}

Page 31: Go concurrency

ruby Pusher (not changed)

#!/usr/bin/env ruby
require 'ffi-rzmq'

puts "Starting ruby pusher"
ctx = ZMQ::Context.new(1)
push = ctx.socket(ZMQ::PUSH)
push.connect("ipc://pushpull.ipc")
# push.connect("tcp://12.34.56.78:5555")

(0..2).each do |i|
  msg = "Hello %d" % i
  push.send_string(msg)
  puts(msg)
end

push.close()
ctx.terminate()

Page 32: Go concurrency

email_mover maintenance

email_mover sends emails and moves files. These are well understood, stable functions. This service should never be shut down.

However, hardware needs maintenance. How do we swap in a new email_mover without losing data?

zeromq to the rescue!

"ØMQ ensures atomic delivery of messages; peers shall receive either all message parts of a message or none at all."

Page 33: Go concurrency

Maintenance Procedure:

Disconnect the network cable from the first email_mover host. zeromq messages will begin to queue up at the pushers. Because zeromq message delivery is atomic, no data is lost.

Connect the network cable to the new email_mover host. zeromq messages begin to flow again.

Job done!

Page 34: Go concurrency

Why is data not lost?

Receive

The half-received message / data packet received by the old host was deemed as not received by zeromq and is discarded.

Send

The half-sent message / data packet that was interrupted when the connection was broken was detected by zeromq as not delivered and will be sent again. This time to the new host.

Page 35: Go concurrency

Thank you

Loh Siu Yin
Technology Consultant, Beyond Broadcast LLP
