Julia Language: Inside the Corporation

Post on 02-Dec-2014

Category:

Data & Analytics

DESCRIPTION

Outline of my experience introducing and championing the use of the Julia language inside a medium-sized financial organization. Covers my reasons for using it, why I thought it was a good fit, and some advice on how to improve others' first experience.

TRANSCRIPT

Julia |> inside the corporation

Andre Pemmelaar @QuantixResearch

About Me

Andre Pemmelaar
• 5 yrs Matsushita Financial System Solutions (Panasonic)
• 12 yrs buy-side finance
  • 7 yrs Japanese Gov’t Bond options market maker (HNL)
  • 5 yrs statistical arbitrage (global equities)
• Low latency & quantitative algorithms
• Primarily use a mixture of basic statistics and machine learning
• R, Python, Java, F# … and of course JULIA!
• Prefer a functional programming approach (F#, Scala, Haskell)

My road to using Julia

John Myles White, 3.20.2013 at 9:38 am:

"Hi Andre,
In the abstract, I think Julia is the ideal language for doing both prototype modeling and transition to production.
But Julia is still very immature as a language, so I would not recommend it being used in production for another year or so. In addition, if you’re looking for an existing toolbox of models, R is the way to go. Even Python has still not caught up with R in this regard."

• Started reading about it in late 2012 ~ early 2013
• Wrote to John Myles White in spring 2013
• Decided it was too early -> kept following, but didn’t use it

My road to using Julia

• Revisited ~ early 2014
• Began trying some simple projects
  • Reinforcement learning using tictactoe.jl
  • Found the code very easy to follow
• Started using DataFrames.jl
  • Found it to be very stable and close enough to pandas (Python)
• Started writing my first serious attempt at something important in May 2014
  • Order book simulation framework
• Joined a new company 3 months ago
  • Have been using Julia almost exclusively for 3 months on real-world problems in finance

Realized I could…

• Remain mostly functional in my approach to programming (but not 100%)
• Use fast for loops wherever appropriate (used in a lot of time series simulations)
• Easily code linear algebra, matrix calculations for machine learning, etc. (native in Julia)
• Do it all in parallel (note: Julia’s parallel support is not 100% there yet)
• All of the above can be done in Python (scikit-learn, NumPy, etc.), but it is often faster and takes slightly less code in Julia

My Moment

carefully insert here

Some background on my company

• One of Japan’s largest financial front-office system solution providers
• Started off in derivative valuation and derivative OMS systems
• Now offers an entire suite of products aimed at Japanese mega banks and 2nd-tier financial organizations
• About 600 employees (about 60~70% are technical)
• Primary production language in the company is Java, with some work done in C++ or C
• Quantitative analysis is done in Java (heavy-duty, large data set analysis) or R (for smaller datasets), with a few Python users
• Most quants are focused on risk or valuation, but a smaller team (mine) makes use of predictive analytics, statistics, and ML to enhance various algorithms

Nothing sells like success

• It helps to have a successful example to sell it internally
• In my case, during my first week I found some R code that was run every night (lots of loops = ripe for porting to Julia)
• Re-wrote it in Julia:
  • R took about 15:46 min
  • Java took about 20 s
  • Julia took about 4.3 s
• Note: a better Java programmer recently bested the Julia version (3.9 s)
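
To put the timings above on a common scale, the short Python sketch below converts them to seconds and computes each speedup relative to the R script. It assumes "15:46" means 15 minutes 46 seconds, and the labels "first version" and "tuned version" are only names chosen here for the two Java timings. This is arithmetic on the quoted numbers, not a re-run of the benchmark.

```python
# Timings quoted on the slide, converted to seconds.
# Assumption: "15:46" is read as 15 minutes 46 seconds.
timings_s = {
    "R": 15 * 60 + 46,
    "Java (first version)": 20.0,   # label is illustrative
    "Julia": 4.3,
    "Java (tuned version)": 3.9,    # label is illustrative
}


def speedup_vs_r(timings):
    """Return how many times faster each implementation is than the R version."""
    baseline = timings["R"]
    return {name: baseline / seconds for name, seconds in timings.items()}


speedups = speedup_vs_r(timings_s)
for name, factor in speedups.items():
    print(f"{name:22s} {timings_s[name]:7.1f} s   {factor:6.1f}x vs R")
```

On these figures the Julia port is roughly 220 times faster than the nightly R script, and the two Java timings sit on either side of it.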

Onboarding new users

Making the first experience easier

• Set the expectation correctly
  • Documentation is sparse
  • The material that is out there may not be current
  • Julia is fast, but can lose a lot of speed if coded improperly

Poor Performance

Better Performance

Roadblocks

to initial adoption

I asked my Julia-using colleagues, “What are/were the 3 biggest hurdles?”

#3 Package breaking/incompatibility on update

#2 Lack of current documentation

#1 Lack of documentation

No one said bugs in the base code, or the lack of some critical feature. Everyone wants correct examples of “here’s how you do this”.

Roadblocks

to initial adoption

Really just two problems

1. Documentation

2. Update chaos

DIY Documentation

• Julia base documentation is good
• The packages’ docs vary greatly
• The one great example is Gadfly
  • Code, output, & explanation
• A not-so-great example: DataFrames
  • No longer current
  • Many common tasks missing
• Create your own documentation
  • The single most difficult part of learning Julia is the lack of current, correct examples
  • IJulia is fantastic for creating these!
• My advice
  • Initially target early users’ cases
  • DIY-document anything people are struggling with

Decide on the environment/tools

IJulia

Light Table + Jewel

Decide on the environment & tools

• Julia is still new enough that small upgrades can break critical packages

• As the initial “Julia person” in your organization you will often be called on to solve various problems

• Solving new users’ problems is much easier if they are using the same tools and packages. Don’t underestimate this!

• At the beginning sharing exactly the same environment will make things smoother

• Recommend one person download the installers

• Create a thorough install README file

Our stack:
• Julia 0.3.1
• IJulia
• Light Table

How did we do?

• 6 people set out to learn Julia
• 4 of them are now using it every day
• 1 is using it occasionally, along with Perl
• 1 gave up
• Why did that one give up?
  • He has serious Java skills and good R
  • Started with Julia Studio (bad 1st experience)
  • Didn’t know about Light Table
  • Is physically separated from the rest of us, and thus didn’t get the support needed to get through the initial low-productivity period

Julia: Real example

Rejection Order Algorithm

• The model:
  • Determine whether an order to lift a quote (execute against someone else's quote) in an OTC market will be rejected
  • Background: OTC markets are “over the counter” and, depending on the rules, the quoter can reject your order if it suits them
• Julia tools used:
  • DataFrames.jl, StatsBase.jl, DecisionTree.jl, SVM.jl
• Classification problem: 0 = not rejected, 1 = rejected
• Still an ongoing project: the current best Kappa is about 0.54
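
For readers unfamiliar with the Kappa score quoted above, here is a minimal Python sketch of Cohen's kappa for a binary classifier, which compares observed agreement with the agreement expected by chance. The function name and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not the project's data, and the talk does not say how its Kappa was computed.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a binary confusion matrix given as raw counts."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total  # observed agreement (plain accuracy)

    # Agreement expected by chance, from the row and column totals.
    expected_pos = ((tp + fp) / total) * ((tp + fn) / total)
    expected_neg = ((tn + fn) / total) * ((tn + fp) / total)
    expected = expected_pos + expected_neg

    if expected == 1.0:
        # Degenerate case: only one class is present and predicted.
        # Returning 0.0 here is a convention assumed for this sketch.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts, for illustration only.
example = dict(tp=40, fp=10, fn=20, tn=930)
kappa = cohens_kappa(**example)
print(f"kappa = {kappa:.3f}")
```

A kappa of 0 means the classifier does no better than chance agreement, and 1 means perfect agreement, which is why it is a more informative score than accuracy when one class is rare.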

Julia: Real example

Rejection order algorithm, cont’d

• Very unbalanced classes (0.1% are rejected)
• Regime shifts mean it needs to be somewhat adaptive
• Required us to change some of the libraries
• One of Julia’s great strengths is that you can easily change the libraries to suit your needs
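
The class imbalance mentioned above is worth making concrete. The Python sketch below scores a trivial classifier that always predicts "not rejected", using the 0.1% rejection rate from the slide. The order count of 100,000 and the function name are assumptions made for illustration.

```python
def majority_baseline_scores(n_orders, rejection_rate):
    """Score a classifier that always predicts 'not rejected' (class 0)."""
    n_rejected = round(n_orders * rejection_rate)
    n_accepted = n_orders - n_rejected

    # Every accepted order is classified correctly, every rejected one is missed.
    accuracy = n_accepted / n_orders
    # Recall on the rejected class: no rejection is ever caught.
    # It is undefined (NaN) if there are no rejected orders at all.
    recall = 0.0 if n_rejected > 0 else float("nan")
    return accuracy, recall


# n_orders is a hypothetical sample size; 0.001 is the 0.1% rate from the slide.
accuracy, recall = majority_baseline_scores(n_orders=100_000, rejection_rate=0.001)
print(f"accuracy = {accuracy:.3%}, recall on rejected orders = {recall:.0%}")
```

A model that never predicts a rejection is 99.9% accurate yet useless for the task, so a chance-corrected score such as Kappa is a more honest yardstick here.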

What makes Julia great?

• Speed? Julia is quite good, but Java can be as fast or faster. C++ and C are faster.
• Time to get a model out? Largely dependent on your knowledge of the tools you are using.
• Parallelization? Not really. Still kinda raw. Memory usage can be a bit of an issue.
• Safer code via a functional approach? No. One can code functionally, but the language doesn’t enforce it.
• Easy to code and to access/read/understand others’ code? Yes.

What makes Julia great?

• Clear, concise code that can easily be changed
  ∆ Java (not concise)   √ Python   ∆ R (only R code, not C or C++)
• When coded well, it is very fast
  √ Java   ∆ Python (Cython, etc.)   ∆ R (vectorized)
• Great ability to mix loop-based & matrix/vector operations
  ∆ Java (not really)   √ Python   ∆ R (only vectorized)

Thank You!
