functional programming for computing clouds

Functional Programming for computing clouds Joerg Fritsch, NATO CI Agency School of Computer Science & Informatics Cardiff University, 24 October 2012

Upload: joerg-fritsch

Post on 12-May-2015


DESCRIPTION

With the advent of multicore CPUs, cloud computing and Big Data, we are currently observing changes that will eventually lead information technology into a whole new era, and we need to search for programming language paradigms that match it. Will Functional Programming languages (FPLs) be the game changer?

TRANSCRIPT

Page 1: Functional Programming for Computing Clouds

Functional Programming for computing clouds

Joerg Fritsch, NATO CI Agency

School of Computer Science & Informatics, Cardiff University, 24 October 2012

Page 2: Functional Programming for Computing Clouds


Agenda

• Essentials of Functional Programming Languages

• Vision and requirements of computing clouds

• Haskell: a Pure Functional Programming Language

• Some innocent code: working with immutable data

• Gaps between the cloud computing vision and FPLs

Page 3: Functional Programming for Computing Clouds

Functional Programming Languages

• Based on the λ-calculus

• Declarative

• Functions are declared; they describe the relation between input and output

• Functions always evaluate to the same value for a given argument (“free of side effects”).

• Variables are assigned once.

• Functional PLs that by default exclude destructive modifications (to data structures) are called “pure.”

Page 4: Functional Programming for Computing Clouds

λ-calculus

• Alonzo Church, 1930s

• Small grammar

• Parts of the grammar reappear in LISP and Haskell syntax

• Can express everything that is computable

• No state!

Page 5: Functional Programming for Computing Clouds

Pure FPLs

• Functions can be composed, curried, etc.

• Pure functions that do not depend on each other's results can be executed in parallel

• Compiler can make it fit for multicore: e.g. re-arrange order of function execution or inline.

• Runtime can cache function evaluation.

• IO is a beast that disturbs these concepts & needs to be tamed (for example with a monad).

• Every Haskell program is an action in the IO monad (main :: IO ()).
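Composition and currying from the list above can be sketched in a few lines. This is a minimal illustration; the names add, increment, incrementThenDouble and results are made up for this example and do not come from the talk.

```haskell
-- Hypothetical helper names, chosen for illustration only.

-- A curried function: supplying one argument yields a new function.
add :: Int -> Int -> Int
add x y = x + y

-- Partial application of the curried function.
increment :: Int -> Int
increment = add 1

-- Composition: (f . g) x == f (g x), so increment runs first here.
incrementThenDouble :: Int -> Int
incrementThenDouble = (* 2) . increment

-- Purity: the same argument always gives the same result, so the
-- list elements could be processed in any order.
results :: [Int]
results = map incrementThenDouble [1, 2, 3]
```

Loaded into GHCi, results evaluates to [4,6,8]. Because neither function touches shared state, the three applications are independent of each other.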

Page 6: Functional Programming for Computing Clouds

Pure FPLs (continued)

• Haskell

• Clean

• Go

• F#

• ML / OCaml

• Lisp / Scheme

• Scala

• Clojure

• XSLT

• Erlang

• SQL

• Mathematica

Page 7: Functional Programming for Computing Clouds


Vision & requirements of cloud computing

• Clouds will need to support scalable programs.

• “Any” application scaled through distribution over parallel (multicore) hardware.

• Applications with high concurrency are good candidates for parallelization.

Page 8: Functional Programming for Computing Clouds

Elasticity in Computing Clouds (now)

• Duplication!

• IaaS

– Duplicate VMs including OS.

• PaaS

– Duplicate language app servers (e.g. JVM, Rails) or RTS and guest code.

– Duplicate app execution engine (that is, a component of the PaaS platform).

• (Virtualized) load balancers are the glue. “Clustering”

• Concurrency is the enabler for parallelization.

• MapReduce sold as separate capability.

• Multi-tenancy always supported.

Page 9: Functional Programming for Computing Clouds


Elasticity in Computing Clouds (continued)

Page 10: Functional Programming for Computing Clouds

Elasticity in Computing Clouds (in the future?)

Legacy/IaaS

• Currently prevailing

• Unit of scale = OS, VM, runtime

• Duplication of units

Future/PaaS

• Borders of building blocks are dissolved

• Unit of scale = (green) thread?

• Requires new software, new programming languages, new designs.
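To make the "(green) thread as unit of scale" idea concrete, here is a small sketch using GHC's lightweight threads. It assumes GHC and only the base library; the function name concurrentMap is hypothetical. It starts one green thread per work item and collects the results in input order.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical helper: run one lightweight (green) thread per input
-- and collect the results in the order of the inputs.
concurrentMap :: (a -> b) -> [a] -> IO [b]
concurrentMap f xs = do
  boxes <- mapM spawn xs   -- start all threads first
  mapM takeMVar boxes      -- then wait for each result
  where
    spawn x = do
      box <- newEmptyMVar
      -- ($!) evaluates the result (to weak head normal form) inside
      -- the worker thread before handing it over.
      _ <- forkIO (putMVar box $! f x)
      return box
```

Whether these threads actually run on several cores depends on how the program is built and started (with GHC, the threaded runtime and the -N runtime option). The sketch only shows that the unit being multiplied is a thread, not a VM.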

Page 11: Functional Programming for Computing Clouds

Haskell

• Named after Haskell Brooks Curry (1900–1982). Combinatory logic (1930s).

• Born as the Haskell 1.0 standard in 1990 (at approximately the same time as Erlang)

• Haskell 98 is the most prominent definition so far

Page 12: Functional Programming for Computing Clouds

Haskell (continued)

• Is a pure functional PL

• Has a static type system

• Is lazy

• Function composition and currying mimic mathematical functions

• Has monads (related to category theory)

• Is sometimes mind-boggling, sometimes mind-blowing

Page 13: Functional Programming for Computing Clouds

What does Haskell bring to the table?

Haskell:

• Functional

• Strong types

• Inherently parallel

• Immutable by default

• Lazy evaluation

• Code maintainability

Page 14: Functional Programming for Computing Clouds

Functions

• Functions are data as well

• Functions consist of way less code than objects

• Higher-order functions

• return is a function name

• Function signatures declare constraints (types) and computational strategies.

    adder :: [Int] -> Int         -- type signature: fun_name :: input_type -> output_type

    adder [] = 0                  -- define output for the empty list
    adder (x:xs) = x + adder xs   -- use some fancy recursion
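The bullets on functions as data, higher-order functions and return can be illustrated as follows. This is a sketch with invented names (applyTwice, steps, runSteps, wrapped), not code from the talk.

```haskell
-- Illustrative names; not taken from the slides.

-- A higher-order function: it takes another function as an argument.
applyTwice :: (a -> a) -> a -> a
applyTwice f = f . f

-- Functions are data: they can be stored in a list like any value.
steps :: [Int -> Int]
steps = [(+ 1), (* 2), subtract 3]

-- Apply the stored functions from left to right.
runSteps :: Int -> Int
runSteps start = foldl (\acc f -> f acc) start steps

-- return is an ordinary function that wraps a value in a monad;
-- it does not exit the enclosing function.
wrapped :: Maybe Int
wrapped = return 3
```

In GHCi, applyTwice (+ 3) 10 gives 16, runSteps 5 gives 9, and wrapped is Just 3.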

Page 15: Functional Programming for Computing Clouds

Immutability of Data

• The consequences are huge. There is more mutable data than you think. For example a counter: c = c + 1;

• Haskell implementation of “counters” depends on what you need to achieve.

• Common to use map and fold (aka reduce)

• Ultimately counters represent some sort of state. Use the State monad: Control.Monad.State

• Haskell is by default pure. Mutable data structures can be used (Data.IORef, Data.Judy) but are seen as “not idiomatic”.
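Two of the counter styles named above can be put side by side. The sketch below counts even numbers once with a fold and once with Data.IORef; the function names are hypothetical, and the State monad variant is left out because Control.Monad.State lives outside the base library.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)
import Data.List (foldl')

-- Counting with a fold: the "counter" is the accumulator that is
-- passed from step to step; nothing is mutated.
countEvens :: [Int] -> Int
countEvens = foldl' step 0
  where
    step c x
      | even x    = c + 1
      | otherwise = c

-- The same count with a mutable reference. It works, but the result
-- now lives in IO.
countEvensIORef :: [Int] -> IO Int
countEvensIORef xs = do
  ref <- newIORef 0
  mapM_ (\x -> if even x then modifyIORef' ref (+ 1) else return ()) xs
  readIORef ref
```

Both versions return 5 for the list [1 .. 10]. The fold version can be called from any pure code, while the IORef version can only be used from within IO.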

Page 16: Functional Programming for Computing Clouds

Immutability of Data (continued)

• Data.IORef is part of the base package.

• The function unsafePerformIO can “subvert” the type system and allows any kind of mutable state.

• A large number of Haskell modules make use of it!

– Randomness & encryption, GUIs, …

• Is immutability over-emphasized?

Page 17: Functional Programming for Computing Clouds

Immutability of Data: There is no list

Lists are built on top of cons cells. Cons cells contain pairs of values: an element and a pointer to the rest of the list.

Example: cons (:) and append (++) on a “list”.

[1,2,3,4] = 1:2:3:4:[] = 1 : (2 : (3 : (4 : [] ) ) )

cons (:)

0:1:2:3:4:[] = 0 : (1 : (2 : (3 : (4 : [] ) ) ) )

Result is one new cons cell holding 0 plus a pointer to the previous list. Runs in O(1) time. This is also called “sharing”.

append (++)

1:2:3:4:5:[] = 1 : (2 : (3 : (4 : (5 : []) ) ) )

Expensive operation: the left-hand list is taken apart recursively and rebuilt. The result is a new data structure; the original list stays unchanged. Runs in O(n) time.
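The difference between cons and append shows up directly in a hand-written append. The sketch below mirrors the textbook definition of (++); the names append, original, consed and appended are chosen for this example only.

```haskell
-- A hand-written copy of list append, to make the cost visible:
-- every cell of the left list is rebuilt, the right list is reused.
append :: [a] -> [a] -> [a]
append []       ys = ys
append (x : xs) ys = x : append xs ys

original :: [Int]
original = [1, 2, 3, 4]

-- O(1): one new cons cell whose tail is `original` itself (sharing).
consed :: [Int]
consed = 0 : original

-- O(n): the four cells of `original` are copied in front of [5].
appended :: [Int]
appended = append original [5]
```

After both operations original is still [1,2,3,4], which is why sharing is safe: nobody can change the cells that consed points to.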

Page 18: Functional Programming for Computing Clouds

Data.Map.Fold (Map Reduce)

• Fold

    adder :: [Int] -> Int
    adder xs = foldr (+) 0 xs   -- reduce a list using +

• Folds can run from the left or from the right of a data structure: foldl, foldr; foldM is the monadic variant
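The direction of a fold matters as soon as the operator is not associative. The following sketch uses subtraction to show the difference and adds a small foldM example in the Maybe monad; safeDivide and divideAll are hypothetical names.

```haskell
import Control.Monad (foldM)

-- foldr nests to the right: 1 - (2 - (3 - 0))
rightFold :: Int
rightFold = foldr (-) 0 [1, 2, 3]

-- foldl nests to the left: ((0 - 1) - 2) - 3
leftFold :: Int
leftFold = foldl (-) 0 [1, 2, 3]

-- A step that can fail: division by zero yields Nothing.
safeDivide :: Int -> Int -> Maybe Int
safeDivide _   0 = Nothing
safeDivide acc d = Just (acc `div` d)

-- foldM threads the accumulator through the Maybe monad and stops
-- at the first Nothing.
divideAll :: Int -> [Int] -> Maybe Int
divideAll = foldM safeDivide
```

Here rightFold is 2 and leftFold is -6. divideAll 100 [2, 5] gives Just 10, while divideAll 100 [2, 0, 5] gives Nothing.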

Page 19: Functional Programming for Computing Clouds

Strong Type System

• All monomorphic types are part of the category of Haskell types, “Hask”. Maps between types are the functions in Haskell.

• Data types can be tainted. E.g. IO Int, Maybe Int

• Type system supports safety and correctness. Haskell code is reasonably easy to test.

• At the beginning the type system frequently gets in your way.

• Maintainability: I am often positively surprised how many changes to my existing code work at the first compilation (once I get the types right).

• Defining your own types and type classes etc. lays the foundation for great flexibility.
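A "tainted" type and a user-defined type can be shown together. In this sketch the Celsius type and the functions parseCelsius and describe are invented for illustration: parsing may fail, so the result carries Maybe, and the caller cannot reach the number without handling the failure case.

```haskell
-- A hypothetical domain type defined for this sketch.
newtype Celsius = Celsius Double
  deriving (Eq, Show)

-- Parsing may fail, so the result type is "tainted" with Maybe.
parseCelsius :: String -> Maybe Celsius
parseCelsius s = case reads s of
  [(value, "")] -> Just (Celsius value)
  _             -> Nothing

-- The compiler forces the caller to deal with both cases before the
-- number inside can be used.
describe :: Maybe Celsius -> String
describe Nothing            = "no reading"
describe (Just (Celsius t)) = "reading: " ++ show t
```

describe (parseCelsius "21.5") yields "reading: 21.5", and describe (parseCelsius "warm") yields "no reading". Passing a plain Double where a Celsius is expected is rejected at compile time.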

Page 20: Functional Programming for Computing Clouds

(Parametric) Polymorphism: Type Variables

    adder :: Num a => [a] -> a
    adder xs = foldr (+) 0 xs   -- reduce a list using +

• More powerful types of polymorphism: type classes, kinds, … .

• With GHC extensions (e.g. UndecidableInstances) the type system is Turing complete & allows manipulations exceeding those of most mainstream PLs

• Type classes & type level programming
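The polymorphic adder works unchanged for any numeric type, and a type class adds per-type behaviour on top. In the sketch below the class Describable and its instances are hypothetical and serve only as an example.

```haskell
-- The polymorphic adder from the slide: one definition for every
-- type that has a Num instance.
adder :: Num a => [a] -> a
adder = foldr (+) 0

-- A hypothetical type class: each instance supplies its own
-- implementation of describe.
class Describable a where
  describe :: a -> String

instance Describable Bool where
  describe True  = "yes"
  describe False = "no"

instance Describable Int where
  describe n = "the number " ++ show n
```

adder [1, 2, 3 :: Int] gives 6 and adder [0.5, 1.5 :: Double] gives 2.0. describe True gives "yes", and describe (7 :: Int) gives "the number 7".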

Page 21: Functional Programming for Computing Clouds

Lazy Evaluation

• Lazy evaluation, “call-by-need”.

• Partially the paradigm that makes immutable data structures workable (see also “sharing”).

• Risk of space leaks

• Opens up a door to “infinity”: infinite lists [1, …], Fibonacci numbers, primes, e, … & to new strategies in AI (Hughes, 1990).
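The "door to infinity" can be sketched with three infinite lists. These are standard textbook definitions; the prime generator is a simple trial-division filter, picked for brevity rather than speed.

```haskell
-- All natural numbers from 1; only the demanded prefix is computed.
naturals :: [Integer]
naturals = [1 ..]

-- Fibonacci numbers defined in terms of themselves.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- Primes by repeated filtering (trial division, not optimized).
primes :: [Integer]
primes = sieve [2 ..]
  where
    sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]
    sieve []       = []
```

take 10 fibs evaluates to [0,1,1,2,3,5,8,13,21,34] and take 5 primes to [2,3,5,7,11]. Asking for the whole list, for example with length fibs, never terminates.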

Page 22: Functional Programming for Computing Clouds

Do Cloud Computing and FPLs match?

Asynchronous operations: Immutable data. Shared nothing. Message passing (e.g. actors) available to re-synchronize processes. STM better manageable than locks.

Parallel, multi- & many-core support: FPLs are inherently parallel. Functions, closures, currying. Declarative. Compiler has freedom to re-arrange “everything”.

Elasticity & large-scale operations: Elasticity is left to the developer or to the “app engine”. Code easily testable & maintainable.

Secure, multi-tenancy, confidentiality: No. “Safe Haskell” may be a good start.
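The remark that STM is easier to manage than locks can be illustrated with a tiny transaction. The sketch imports the STM primitives from GHC.Conc so that it needs only the base library; application code more commonly imports Control.Concurrent.STM from the stm package. The names transfer and demo are hypothetical.

```haskell
import GHC.Conc (STM, TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)

-- Move an amount between two shared counters. Inside one atomically
-- block the reads and writes take effect together or not at all, so
-- no lock ordering has to be designed.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  a <- readTVar from
  b <- readTVar to
  writeTVar from (a - amount)
  writeTVar to (b + amount)

-- Single-threaded demonstration of the transaction.
demo :: IO (Int, Int)
demo = do
  from <- newTVarIO 100
  to <- newTVarIO 0
  atomically (transfer from to 30)
  a <- readTVarIO from
  b <- readTVarIO to
  return (a, b)
```

demo returns (70, 30). With several threads calling transfer concurrently, no thread can observe a state in which the amount has left one counter but not yet arrived in the other.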

Page 23: Functional Programming for Computing Clouds

Multi Tenancy: Safe Haskell

• Released to the public in early 2012.

• Vision: tenants upload code (e.g. a worker) that gets compiled and executed as a plugin by a Haskell app engine.

• Plugin concept based on library System.Eval.Haskell

• New language extensions to allow secure code only: -XSafe, -XTrustworthy, -XUnsafe

• Ultimately based on type safety.
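What a tenant-supplied module could look like under -XSafe is sketched below. The worker function wordCount is hypothetical; the only Safe Haskell feature used is the Safe language pragma, which is the in-file equivalent of the -XSafe flag.

```haskell
{-# LANGUAGE Safe #-}

-- Hypothetical tenant-supplied worker. Under the Safe pragma the
-- compiler rejects imports of unsafe modules, so adding
--   import System.IO.Unsafe (unsafePerformIO)
-- to this file would be a compile-time error.
wordCount :: String -> Int
wordCount = length . words
```

Because the function is pure and the module compiles in Safe mode, its type Int can be trusted: it has no way of performing hidden IO.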

Page 24: Functional Programming for Computing Clouds

Safe Haskell (continued)

• Two routes decide what is trusted:

• -XSafe = trust inferred by the compiler, limiting Haskell to a (small) subset.

• In PaaS subsets and restrictions are “normal”. Think Java on the Google App Engine.

• -XTrustworthy = trust decided by a person. Not a powerful security concept?

Page 25: Functional Programming for Computing Clouds

Issues

• There is no obvious way to match functions to threads.

• Threads are more related to sequential programming (with shared memory) than to FP. Think CSP.

• Many programs have to parallelize relatively small computations with high inter-dependency.

• Message passing & actors are also no fit for distributing small computations.

• Function composition is … sequential execution!

Page 26: Functional Programming for Computing Clouds

Issues (continued)

• When a computation is moved to a remote node, little is known about cost of transport and state (e.g. load of the remote node). Multi-tenancy!

• Cost model required.

• (Network) protocols are the most prominent cost center.

• It is extremely unlikely that commercial clouds will use “niche” hardware or proprietary protocols.

• Protocol design will need to be simple and lightweight.

• Protocols in distributed environments will orchestrate and coordinate. Basis for a DS coordination language?

Page 27: Functional Programming for Computing Clouds

Amdahl’s Law

• Possible to calculate the speed improvement when n% of the code is parallel.

• Unknown under what conditions the law holds.

• Relatively small influences have huge adverse effects:

– code that has a parallel portion of 95% results in a speed improvement of factor six on an 8 core CPU.

– code that has a parallel portion of 75% results in a speed improvement of factor three on an 8 core CPU.
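The two figures above follow from the usual statement of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction and n the number of cores. A minimal sketch, with amdahlSpeedup as a hypothetical function name:

```haskell
-- Amdahl's law: speedup = 1 / ((1 - p) + p / n)
--   p: fraction of the run time that can be parallelized (0 to 1)
--   n: number of cores
amdahlSpeedup :: Double -> Int -> Double
amdahlSpeedup p n = 1 / ((1 - p) + p / fromIntegral n)
```

amdahlSpeedup 0.95 8 is roughly 5.93 and amdahlSpeedup 0.75 8 is roughly 2.91, which round to the factors six and three quoted on the slide. Even with p = 0.95, no number of cores pushes the speedup above 1 / 0.05 = 20.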

Page 28: Functional Programming for Computing Clouds


Amdahl’s Law (continued)

Page 29: Functional Programming for Computing Clouds

If Amdahl’s law holds, then …

We had better go on & develop sequential code

Because

Inefficiencies and overhead add up:

• Compiler

• Runtime

• Competition for the CPU cores & resources

• By the way: (OS) threads are costly to provision; here elastic may become plastic!

Page 30: Functional Programming for Computing Clouds


Thank You.

Page 31: Functional Programming for Computing Clouds


Spare Slides

Page 32: Functional Programming for Computing Clouds

Mythical Walk-Through

Quantum Field Theory

Jones Polynomial

Knot Theory

Category of Tangles

Category Theory

“Hask”, Category of Haskell Types (and maps)

Haskell

Page 33: Functional Programming for Computing Clouds

33

OOP

• “OOP is eliminated entirely from the introductory curriculum [of Carnegie Mellon University], because it is both anti-modular and anti-parallel by its very nature, and hence unsuitable for a modern CS curriculum.” (Harper, 2011)

Page 34: Functional Programming for Computing Clouds

Common Claims & Expectations

• FPLs may let us get away with less duplication.

• FPLs are inherently parallel

• FPLs are inherently thread safe

• FPLs are inherently modular

• FPLs are easily testable and maintainable.