L Fu - Dao: a novel programming language for bioinformatics

DESCRIPTION

Presentation at BOSC2012 by L Fu - Dao: a novel programming language for bioinformatics

TRANSCRIPT

Page 1: L Fu - Dao: a novel programming language for bioinformatics

Motivation Example Features Concurrent Programming JIT ClangDao Future

Dao: a novel programming language for bioinformatics

Limin Fu

UC San Diego

BOSC 2012

July 14, 2012

Page 2: L Fu - Dao: a novel programming language for bioinformatics

Why a new language for bioinformatics?

A simple fact:

Programming languages commonly used in bioinformatics were designed before:

multi-core machines became common;

some important programming paradigms became widely accepted.

Once a language has a lot of backward compatibility to maintain, it becomes very hard to add new features or support new programming paradigms without introducing inconsistency!

Page 3: L Fu - Dao: a novel programming language for bioinformatics

Dao programming language (http://daovm.net)

Key Features

Optional typing with type inference and static type checking;

Native support for concurrent programming;

LLVM-based Just-In-Time (JIT) compiling;

Simple C interfaces for easy embedding and extending;

Clang-based tool for automatic wrapping of C/C++ libraries.

Page 4: L Fu - Dao: a novel programming language for bioinformatics

A simple example

Use a thread task to compute the sum of squares:

    # Start a thread task and return a future value:
    fut = mt.start( $now )::{
        sum2 = 0
        for( i = 1 : 1000 ) sum2 += i * i
        return sum2
    }

Due to type inference, variable fut will be a future value with type,

future<int>

Writing the type explicitly is optional, as follows:

fut : future<int> = mt.start( $now )::{ ...

mt: the built-in module for multi-threading;

mt.start()::{}: a code section method that starts a thread task;

The thread task will be executed in a different thread.

$now: an enum symbol that requests an immediate start of the thread task.

(More or less a combination of C++ enum and Ruby symbol.)
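
For readers who do not know Dao, the same pattern can be sketched in Python. This is only an analogue under stated assumptions, not Dao code: `ThreadPoolExecutor.submit` stands in for `mt.start( $now )`, the helper name `sum_of_squares` is made up for illustration, and the loop is assumed to include the upper bound 1000.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def sum_of_squares(limit: int) -> int:
    """Sum i*i for i in 1..limit (inclusive upper bound is an assumption)."""
    total = 0
    for i in range(1, limit + 1):
        total += i * i
    return total


# One worker thread stands in for the Dao thread task.
with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() schedules the task immediately and returns a future,
    # loosely like mt.start( $now ) on the slide.
    fut: "Future[int]" = pool.submit(sum_of_squares, 1000)
    result = fut.result()  # blocks until the task has finished

print(result)
```

As on the slide, the annotation on `fut` is optional; the code behaves the same without it.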

Page 8: L Fu - Dao: a novel programming language for bioinformatics

Concurrent programming: parallelized code section methods

Parallelized code section methods

The multi-threading module mt provides a number of parallelized code section methods:

mt.iterate(): iterate over an array, list, or map, or for a given number of iterations;

mt.map(): map the items of an array, list, or map to produce a new array or list;

mt.apply(): apply new values to the items of an array, list, or map;

mt.find(): find the first item that satisfies a condition.

Example:

    ls = {1,2,3,4,5,6}
    # Parallel iteration:
    mt.iterate( times => 10 )::{ [index] io.writeln( index ) }
    mt.iterate( ls, threads => 2 )::{ [item] io.writeln( item ) }
    # Parallel mapping and value application:
    ls2 = mt.map( ls, 2 )::{ [it] it*it }  # ls2 = {1,4,9,16,25,36}
    mt.apply( ls, 2 )::{ [it] it*it }      # ls = {1,4,9,16,25,36}
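
A rough Python analogue of the mapping and value-application calls, again a sketch and not Dao: a two-worker `ThreadPoolExecutor` plays the role of the thread count `2`, and the helper `square` is a name introduced here for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

ls = [1, 2, 3, 4, 5, 6]


def square(x: int) -> int:
    return x * x


with ThreadPoolExecutor(max_workers=2) as pool:
    # Parallel mapping: build a new list; map() preserves input order.
    ls2 = list(pool.map(square, ls))

    # Parallel "apply": compute new values, then overwrite the items in place.
    ls[:] = pool.map(square, ls)

    # Parallel iteration over a count: run the body for indices 0..9.
    indices = list(pool.map(lambda index: index, range(10)))

print(ls2)
print(ls)
```

The difference between the two calls is the same as on the slide: mapping returns a new list and leaves the input alone, while apply replaces the items of the existing list.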

Page 9: L Fu - Dao: a novel programming language for bioinformatics

Concurrent programming: asynchronous classes

Asynchronous classes

Asynchronous classes produce asynchronous instances;

Calling a method will automatically create a thread task;

Thread tasks on the same instance are queued for execution.

Example:

    class @Clustering {
        routine Run(){ DoKmeansClustering() }
    }
    cls = Clustering()
    job = cls.Run()
    while( 1 ){
        DoSomethingElse();
        if( job.wait( 0.1 ) ) break;  # wait for 0.1 second
    }
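
The queuing behaviour can be approximated in Python by giving each instance its own single-worker executor. This is a hypothetical sketch, not how Dao implements asynchronous classes: the class layout, the `close` method, and the `time.sleep` placeholder for the clustering work are all assumptions made for illustration.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor, TimeoutError


class Clustering:
    """Hypothetical stand-in for the slide's @Clustering class."""

    def __init__(self) -> None:
        # A single worker means calls on one instance run one at a time,
        # in the order they were made.
        self._queue = ThreadPoolExecutor(max_workers=1)

    def run(self) -> Future:
        # Calling the method only schedules the work and returns a future.
        return self._queue.submit(self._do_kmeans_clustering)

    def _do_kmeans_clustering(self) -> str:
        time.sleep(0.3)  # placeholder for the real computation
        return "done"

    def close(self) -> None:
        self._queue.shutdown(wait=True)


cls = Clustering()
job = cls.run()
polls = 0
while True:
    polls += 1  # stands in for DoSomethingElse()
    try:
        outcome = job.result(timeout=0.1)  # wait for at most 0.1 second
        break
    except TimeoutError:
        continue
cls.close()
print(outcome, polls)
```

The polling loop mirrors the slide: the caller keeps doing other work and checks on the job with a short timeout instead of blocking until it finishes.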

Page 10: L Fu - Dao: a novel programming language for bioinformatics

LLVM-based Just-In-Time (JIT) compiler

Based on the Low Level Virtual Machine (LLVM);

Emphasis on numeric computation;

Compiles a subset of Dao virtual machine instructions.

JIT Performance Test (time in seconds)

    Program        Argument   Dao    Dao+JIT  Speedup  Python  C (-O2)
    fannkuch       11         100.5  22.8     4.4X     339.3   4.8
    mandelbrot     4000       40.0   5.1      7.8X     158.8   3.8
    nbody          10000000   63.5   19.2     3.4X     333.4   2.4
    spectral-norm  5000       98.0   7.6      12.8X    985.5   7.7
    binary-trees   16         64.4   64.4     1.0X     22.3    5.5

Note: benchmark programs are taken from the Computer Language Benchmarks Game, http://shootout.alioth.debian.org
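
The Speedup column reads as the ratio of the Dao time to the Dao+JIT time. The short Python check below recomputes it from the rounded times in the table; because the inputs are rounded, the recomputed values can differ from the listed ones by about 0.1. The Dao+JIT to C ratio is an extra figure derived here, not one reported on the slide.

```python
# Times in seconds copied from the table:
# (Dao, Dao+JIT, reported speedup, C -O2).
results = {
    "fannkuch": (100.5, 22.8, 4.4, 4.8),
    "mandelbrot": (40.0, 5.1, 7.8, 3.8),
    "nbody": (63.5, 19.2, 3.4, 2.4),
    "spectral-norm": (98.0, 7.6, 12.8, 7.7),
    "binary-trees": (64.4, 64.4, 1.0, 5.5),
}


def speedup(dao_seconds: float, jit_seconds: float) -> float:
    """How many times faster the JIT run is than the plain interpreter run."""
    return dao_seconds / jit_seconds


recomputed = {}
for name, (dao, jit, reported, c_time) in results.items():
    recomputed[name] = speedup(dao, jit)
    print(
        f"{name:14s} speedup {recomputed[name]:5.2f}X "
        f"(slide: {reported}X)  Dao+JIT / C = {jit / c_time:5.2f}"
    )
```

On these numbers the JIT helps the numeric benchmarks most and leaves binary-trees unchanged, which is consistent with the stated emphasis on numeric computation.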

Page 12: L Fu - Dao: a novel programming language for bioinformatics

ClangDao: bringing C/C++ libraries to your fingertips

Based on Clang (C Language Family Frontend for LLVM);

Generates bindings directly from C/C++ header files;

Supports C/C++ functions, C structs, C callbacks, C++ classes and inheritance, C++ virtual functions, C++ templates (to some extent), etc.

Page 13: L Fu - Dao: a novel programming language for bioinformatics

ClangDao: bringing C/C++ libraries to your fingertips

List of bindings generated by ClangDao

    Scientific:     DaoGSL             GNU Scientific Library (GSL)
                    DaoBamTools        BamTools
                    DaoGenomeTools     GenomeTools
                    DaoSVM             LibSVM (Support Vector Machine)
    Visualization:  DaoVTK             Visualization Toolkit
                    DaoMathGL          MathGL
    2D Graphics:    DaoGraphicsMagick  GraphicsMagick
    3D Graphics:    DaoOpenGL          OpenGL
                    DaoHorde3D         Horde3D Engine
                    DaoIrrlicht        Irrlicht 3D Engine
    Multimedia:     DaoSDL             Simple DirectMedia Layer (SDL)
                    DaoSFML            Simple and Fast Multimedia Library
    GUI:            DaoFLTK            Fast Light Toolkit (FLTK)
    Miscellaneous:  DaoXML             libxml2
                    DaoBullet          Bullet Physics Engine
                    DaoGameKit         GameKit Game Engine
                    DaoGamePlay        GamePlay Game Engine

Page 14: L Fu - Dao: a novel programming language for bioinformatics

Future development

The development of Dao in the context of bioinformatics

    BioDao : future<Dao::Bioinformatics> = mt.start::{
        DevelopDaoPackages( field => Bioinformatics )
    }

It should lead to an open source project named BioDao;

This project will provide a large number of modules and packages for bioinformatics;

It may start from screening candidate C/C++ bioinformatics libraries for automatic or semi-automatic binding;

Suggestions for such libraries are highly welcome.

Page 15: L Fu - Dao: a novel programming language for bioinformatics

Thank you!

Links and Contacts

http://daovm.net

https://github.com/[email protected]@ucsd.edu