accelerating ruby with llvmllvm.org/devmtg/2009-10/phoenix_acceleratingruby.pdf · every context is...
TRANSCRIPT
Accelerating Rubywith LLVM
Oct 2, 2009
Evan Phoenix
Tuesday, October 6, 2009
RUBY
Tuesday, October 6, 2009
Strongly, dynamically typed
RUBY
Tuesday, October 6, 2009
Unified Model
RUBY
Tuesday, October 6, 2009
Everything is an object
RUBY
Tuesday, October 6, 2009
3.class # => FixnumRUBY
Tuesday, October 6, 2009
Every code context is equal
RUBY
Tuesday, October 6, 2009
Every context is a method
RUBY
Tuesday, October 6, 2009
Garbage Collected
RUBY
Tuesday, October 6, 2009
A lot of syntax
RUBY
Tuesday, October 6, 2009
Every code context is equalEvery context is a method
Garbage collectedA lot of syntax
RUBYStrongly, dynamically typed
Unified modelEverything is an object
3.class
Tuesday, October 6, 2009
Rubinius
Tuesday, October 6, 2009
Started in 2006
Rubinius
Tuesday, October 6, 2009
Build a ruby environment for fun
Rubinius
Tuesday, October 6, 2009
Unlike most “scripting” languages,write as much in ruby as possible
Rubinius
Tuesday, October 6, 2009
Core functionality of perl/python/ruby in C, NOT in their respective language.
Rubinius
Tuesday, October 6, 2009
C => ruby => C => rubyRubinius
Tuesday, October 6, 2009
Language boundaries suck
Rubinius
Tuesday, October 6, 2009
Started in 2006Built for fun
Turtles all the way down
Rubinius
Tuesday, October 6, 2009
Evolution
Tuesday, October 6, 2009
100% ruby prototype running on 1.8
Evolution
Tuesday, October 6, 2009
Hand translated VM to C
Evolution
Tuesday, October 6, 2009
Rewrote VM in C++
Evolution
Tuesday, October 6, 2009
Switch away from stackless
Evolution
Tuesday, October 6, 2009
Experimented with handwrittenassembler for x86
Evolution
Tuesday, October 6, 2009
Switch to LLVM for JIT
Evolution
Tuesday, October 6, 2009
Evolution100% ruby prototype
Hand translated VM to CRewrote VM in C++
Switch away from stacklessExperiment with assembler
Switch to LLVM for JIT
Tuesday, October 6, 2009
Features
Tuesday, October 6, 2009
Bytecode VM
Features
Tuesday, October 6, 2009
Simple interface to native code
Features
Tuesday, October 6, 2009
Accurate, generational garbage collector
Features
Tuesday, October 6, 2009
Integrated FFI API
Features
Tuesday, October 6, 2009
FeaturesBytecode VM
Interface to native codeGenerational GC
Integrated FFI
Tuesday, October 6, 2009
Benchmarks
Tuesday, October 6, 2009
def foo() ary = [] 100.times { |i| ary << i }end
300,000 timesTuesday, October 6, 2009
0
2.25
4.5
6.75
9
1.8 1.9 rbx rbx jit rbx jit +blocks
2.59
3.60
5.90
5.30
8.02
Seconds
Tuesday, October 6, 2009
def foo() hsh = {} 100.times { |i| hsh[i] = 0 }end
100,000 timesTuesday, October 6, 2009
0
7.5
15
22.5
30
1.8 1.9 rbx rbx jit rbx jit +blocks
12.0112.54
25.36
5.264.85
Seconds
Tuesday, October 6, 2009
def foo() hsh = { 47 => true } 100.times { |i| hsh[i] }end
100,000 timesTuesday, October 6, 2009
0
1.75
3.5
5.25
7
1.8 1.9 rbx rbx jit rbx jit +blocks
2.662.68
6.26
2.09
3.64
Seconds
Tuesday, October 6, 2009
Early LLVM Usage
Tuesday, October 6, 2009
Compiled all methods up front
Early LLVM Usage
Tuesday, October 6, 2009
Simple opcode-to-function translationwith inlining
Early LLVM Usage
Tuesday, October 6, 2009
Startup went from 0.3s to 80s
Early LLVM Usage
Tuesday, October 6, 2009
Early LLVM UsageCompiled all methods upfront
Simple opcode-to-function translationStartup from 0.3s to 80s
Tuesday, October 6, 2009
True JIT
Tuesday, October 6, 2009
JIT Goals
True JIT
Tuesday, October 6, 2009
JIT Goals
Choose methods that benefit the most
True JIT
Tuesday, October 6, 2009
JIT Goals
Compiling has minimum impact on performance
True JIT
Tuesday, October 6, 2009
JIT Goals
Ability to make intelligent frontend decisions
True JIT
Tuesday, October 6, 2009
Choosing Methods
Tuesday, October 6, 2009
Simple call counters
Choosing Methods
Tuesday, October 6, 2009
When counter trips, the fun starts
Choosing Methods
Tuesday, October 6, 2009
Room for improvement
Choosing Methods
Tuesday, October 6, 2009
Room for improvement
Increment counters in loops
Choosing Methods
Tuesday, October 6, 2009
Room for improvement
Weigh different invocations differently
Choosing Methods
Tuesday, October 6, 2009
Simple countersTrip the counters, do it
Choosing Methods
Room for improvementIncrement in loopsWeigh invocations
Tuesday, October 6, 2009
Which Method?
Tuesday, October 6, 2009
Leaf methods trip quickly
Which Method?
Tuesday, October 6, 2009
Leaf methods trip quickly
Consider the whole callstack
Which Methods?
Tuesday, October 6, 2009
Leaf methods trip quickly
Pick a parent expecting inlining
Which Methods?
Tuesday, October 6, 2009
Leaf methods tripConsider the callstack
Find a parent
Which Method?
Tuesday, October 6, 2009
Minimal Impact
Tuesday, October 6, 2009
After the counters trip
Minimal Impact
Tuesday, October 6, 2009
Queue the method
Minimal Impact
Tuesday, October 6, 2009
Background thread drains queue
Minimal Impact
Tuesday, October 6, 2009
Frontend, passes, codegen in background
Minimal Impact
Tuesday, October 6, 2009
Install JIT’d function
Minimal Impact
Tuesday, October 6, 2009
Install JIT’d function
Requires GC interaction
Minimal Impact
Tuesday, October 6, 2009
Trip the countersQueue the method
Minimal ImpactCompile in backgroundInstall function pointer
Tuesday, October 6, 2009
Good Decisions
Tuesday, October 6, 2009
Naive translation yields fixed improvement
Good Decisions
Tuesday, October 6, 2009
Performance shifts to method dispatch
Good Decisions
Tuesday, October 6, 2009
Improve optimization horizon
Good Decisions
Tuesday, October 6, 2009
Inline using type feedback
Good Decisions
Tuesday, October 6, 2009
Naive translation sucksInline using type feedback
Good Decisions
Performance in dispatchImprove optimizations
Tuesday, October 6, 2009
Type Feedback
Tuesday, October 6, 2009
Frontend translates to IR
Type Feedback
Tuesday, October 6, 2009
Read InlineCache information
Type Feedback
Tuesday, October 6, 2009
InlineCaches contain profiling info
Type Feedback
Tuesday, October 6, 2009
Use profiling to drive inlining!
Type Feedback
Tuesday, October 6, 2009
Frontend generates IRReads InlineCaches
Type FeedbackInlineCaches have profiling
Use profiling to drive inlining!
Tuesday, October 6, 2009
Inlining
Tuesday, October 6, 2009
Profiling info shows a dominant class
Inlining
Tuesday, October 6, 2009
21%
1 class98%
Tuesday, October 6, 2009
Lookup method in compiler
Inlining
Tuesday, October 6, 2009
For native functions, emit direct call
Inlining
Tuesday, October 6, 2009
For FFI, inline conversions and call
Inlining
Tuesday, October 6, 2009
Emit direct calls if possible
InliningFind dominant class
Lookup method
Tuesday, October 6, 2009
Inlining Ruby
Tuesday, October 6, 2009
Policy decides on inlining
Inlining Ruby
Tuesday, October 6, 2009
Drive sub-frontend at call site
Inlining Ruby
Tuesday, October 6, 2009
All inlining occurs in the frontend
Inlining Ruby
Tuesday, October 6, 2009
Generated IR preserves runtime data
Inlining Ruby
Tuesday, October 6, 2009
Generated IR preserves runtime data
GC roots, backtraces, etc
Inlining Ruby
Tuesday, October 6, 2009
No AST between bytecode and IR
Inlining Ruby
Tuesday, October 6, 2009
No AST between bytecode and IR
Fast, but limits the ability to generate better IR
Inlining Ruby
Tuesday, October 6, 2009
Policy decidesDrive sub-frontend
Inlining RubyPreserve runtime dataGenerates fast, ugly IR
Tuesday, October 6, 2009
LLVM
Tuesday, October 6, 2009
IR uses operand stack
LLVM
Tuesday, October 6, 2009
IR uses operand stack
Highlevel data flow not in SSA
LLVM
Tuesday, October 6, 2009
IR uses operand stack
Passes eliminate redundencies
LLVM
Tuesday, October 6, 2009
IR uses operand stack
Makes GC stack marking easy
LLVM
Tuesday, October 6, 2009
IR uses operand stack
nocapture improves propagation
LLVM
Tuesday, October 6, 2009
Exceptions via sentinal value
LLVM
Tuesday, October 6, 2009
Exceptions via sentinal value
Nested handlers use branches for control
LLVM
Tuesday, October 6, 2009
Exceptions via sentinal value
Inlining exposes redundant checks
LLVM
Tuesday, October 6, 2009
Inline guards
LLVM
Tuesday, October 6, 2009
Inline guards
Simple type guards
LLVM
Tuesday, October 6, 2009
if(obj->class->class_id == <integer constant>) {
Tuesday, October 6, 2009
Inline guards
Custom AA pass for guard elimination
LLVM
Tuesday, October 6, 2009
Inline guards
Teach pointsToConstantMemory about...
LLVM
Tuesday, October 6, 2009
if(obj->class->class_id == <integer constant>) {
Tuesday, October 6, 2009
if(obj->class->class_id == <integer constant>) {
Tuesday, October 6, 2009
Maximizing constant propagation
LLVM
Tuesday, October 6, 2009
Maximizing constant propagation
Type failures shouldn’t contribute values
LLVM
Tuesday, October 6, 2009
if(obj->class->class_id == 0x33) { val = 0x7;} else { val = send_msg(state, obj, ...);}
Tuesday, October 6, 2009
if(obj->class->class_id == 0x33) { val = 0x7;} else { return uncommon(state);}
Tuesday, October 6, 2009
Maximizing constant propagation
Makes JIT similar to tracing
LLVM
Tuesday, October 6, 2009
Use overflow intrinsics
LLVM
Tuesday, October 6, 2009
Use overflow intrinsics
Custom pass to fold constants arguments
LLVM
Tuesday, October 6, 2009
AA knowledge for tagged pointers
LLVM
Tuesday, October 6, 2009
AA knowledge of tagged pointers
0x5 is 2 as a tagged pointer
LLVM
Tuesday, October 6, 2009
Not in SSA formSimplistic exceptions
Inlining guards
LLVMMaximize constants
Use overflowTagged pointer AA
Tuesday, October 6, 2009
Issues
Tuesday, October 6, 2009
How to link with LLVM?
Issues
Tuesday, October 6, 2009
How to link with LLVM?
An important SCM issue
Issues
Tuesday, October 6, 2009
Ugly, confusing IR from frontend
Issues
Tuesday, October 6, 2009
instcombine confuses basicaa
Issues
Tuesday, October 6, 2009
Operand stack confuses AA
Issues
Tuesday, October 6, 2009
Inability to communicate semantics
Issues
Tuesday, October 6, 2009
Object* new_object(state)
Tuesday, October 6, 2009
Returned pointer aliases nothing
Only modifies state
If return value is unused, remove the call
Semi-pure?
Tuesday, October 6, 2009
Ugly IRLinking with LLVM
IssuesAA confusion
Highlevel semantics
Tuesday, October 6, 2009