Experiments in Sharing Java VM Technology with CRuby
a.k.a. IBM’s Ruby JIT Talk
RubyKaigi 2015
Matthew Gaudet
• Compiler Developer @ IBM Canada
• Compilation Technology since 2008 – JIT since last year.
• First time off the North American Tectonic Plate!
No Pressure!
Why does IBM have a Ruby JIT?
IBM + Cloud
It’s a Polyglot World Out There
• Many languages! Great reasons to use each and every one of them!
• At IBM, we want to support these languages, helping them to grow capabilities.
• Let developers choose the right language for the job, rather than selecting on capabilities
How can we minimize duplicated effort?
The Plan:
OMR
OMR
An Open Source Toolkit for Language Runtime Technologies.
OMR
Heritage: Built out of IBM’s Java Runtime Technology.
• Announced by Mark Stoodley at JVMLS 2015
• See the complete talk on YouTube.
OMR Components
• Garbage Collector
• JIT Compiler
• Monitoring
• Porting Library
• … More!
OMR
Goal: Compatibility!
Integrate vs. Replace
OMR
Philosophy: Assemble the right solution for your language.
• Language-Free Monitoring Components
• Your Language Monitoring Components
• Language-Free GC Components
• Your Language GC Components
• Language-Free JIT Components
• Your Language JIT Components
Your Language: Ruby
Why so quiet? a.k.a Why haven’t you seen us posting on ruby-core?
“The future is already here, it’s just not very evenly distributed” – William Gibson
• IBM is getting better at Open Source
• We are coming from a traditionally closed-source part of IBM
• We are learning – part of why I’m here is to listen and learn.
“Open Source is a huge part of IBM, but isn’t evenly distributed” – Me
Proving our Proof Of Concept
We wanted to ensure we were happy with the concept.
We needed to know it would work before coming to communities
This is real technology
Early parts of this already shipping in IBM Products:
• Automatic Binary Optimizer for COBOL
• IBM JDK 8
• … and more!
“I don’t always re-engineer my runtime technologies… but when I do, I do it in production” -- The IBM Runtimes Team
This is real technology
Working on Open Source as fast as we can, within the constraints of shipping software!
I’m a JIT person…. So let’s talk about JIT Compiler tech.
OMR’s Compiler Technology: Testarossa
• Started in 1999 as a dynamic language JIT for Java
• Written in C++
• 100+ Optimizations
[Diagram labels: Testarossa; System Z, POWER, X86, ARM; C/C++, COBOL; Static Compilers; COBOL binaries]
[Diagram labels: MRI (YARV Interpreter, Garbage Collector); Testarossa (Ruby IL Generator, Optimizer, Code Generator, Profiler, Runtime); Code Cache]
JIT integration
Our effort to date has emphasized Functional Correctness.
• No big changes to how MRI works (to ease adoption)
• No restrictions on native code used by extension modules
• No benchmark tuning
• Very simple compilation control
We Can Run Rails Applications
Integrating Testarossa into MRI
1. Initialization / Termination
void Init_BareVM(void) {
    …
    globals.ruby_vm_global_constant_state_ptr = &ruby_vm_global_constant_state;
    globals.ruby_rb_mRubyVMFrozenCore_ptr = &rb_mRubyVMFrozenCore;
    globals.ruby_vm_event_flags_ptr = &ruby_vm_event_flags;
    vm_jit_init(vm, globals);
    …
}
int ruby_vm_destruct(rb_vm_t *vm)
{
    …
    vm_jit_destroy(vm);
    …
}
Integrating Testarossa into MRI
static VALUE
vm_jitted_p(rb_thread_t *th, rb_iseq_t *iseq)
{
    ...
    /* Already compiled: dispatch to the JIT body. */
    if (iseq->jit.state == ISEQ_JIT_STATE_JITTED)
        return Qtrue;
    ...
    /* Count down on each call; compile once the counter goes negative. */
    --iseq->jit.u.count;
    if (iseq->jit.u.count < 0) {
        return vm_jit(th, iseq);
    }
    return Qfalse;
}
1. Initialization / Termination
2. Compilation Control
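The countdown in vm_jitted_p can be modelled in a few lines of Ruby. This is only a sketch: MethodBody, its fields, and compile! are hypothetical stand-ins for rb_iseq_t, iseq->jit, and vm_jit, and the sketch assumes compilation always succeeds.

```ruby
# Hypothetical model of counter-based compilation control.
# None of these names are MRI or OMR APIs.
class MethodBody
  attr_reader :state, :count

  def initialize(threshold)
    @state = :interpreted # stands in for iseq->jit.state
    @count = threshold    # stands in for iseq->jit.u.count
  end

  # Returns true when compiled code is available for this call.
  def jitted?
    return true if @state == :jitted

    @count -= 1
    return compile! if @count < 0

    false
  end

  private

  # Stand-in for vm_jit(); assumed to always succeed in this sketch.
  def compile!
    @state = :jitted
    true
  end
end

body = MethodBody.new(2)
results = Array.new(4) { body.jitted? }
p results
```

With a threshold of 2, the first two calls stay in the interpreter, the third triggers compilation, and later calls take the early return.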
Integrating Testarossa into MRI
vm_exec(rb_thread_t *th)
{
    ...
  vm_loop_start:
    result = vm_exec_core(th, initial);
    ...
1. Initialization / Termination
2. Compilation Control
3. Code Dispatch
Integrating Testarossa into MRI
vm_exec(rb_thread_t *th)
{
    ...
  vm_loop_start:
    result = vm_exec2(th, initial);
    ...
static inline VALUE
vm_exec2(rb_thread_t *th, VALUE initial)
{
    VALUE result;
    if (VM_FRAME_TYPE(th->cfp) != VM_FRAME_MAGIC_RESCUE &&
        VM_FRAME_TYPE_FINISH_P(th->cfp) &&
        vm_jitted_p(th, th->cfp->iseq) == Qtrue) {
        return vm_exec_jitted(th);
    } else {
        return vm_exec_core(th, initial);
    }
}
1. Initialization / Termination
2. Compilation Control
3. Code Dispatch
Integrating Testarossa into MRI
require 'test/unit'
class JITModuleTest < Test::Unit::TestCase
  def addone(x)
    return x + 1
  end
  def test_jit_control
    am = method(:addone)
    # No testing occurs unless the JIT exists.
    if RubyVM::JIT::exists?
      assert_equal(false, RubyVM::JIT::compiled?(am))
      assert_equal(true, RubyVM::JIT::compile(am))
      assert_equal(true, RubyVM::JIT::compiled?(am))
    end
  end
end
1. Initialization / Termination
2. Compilation Control
3. Code Dispatch
4. Expose to Ruby
A brief YARV interlude
YARV: Yet Another Ruby VM
def product(a)
  puts a * a
end
0000 trace 8
0002 trace 1
0004 putself
0005 getlocal a
0007 getlocal a
0009 opt_mult <ic:2>
0011 send :puts, 1, nil, 8, <ic:1>
0017 trace 16
0019 leave
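A listing like this can be produced on a stock MRI build with RubyVM::InstructionSequence, which is part of CRuby itself. The sketch below assumes it runs on MRI. The exact instructions and operands differ between Ruby releases (for example, newer versions no longer emit trace instructions), so the output will not match the slide line for line.

```ruby
# Compile a snippet to YARV bytecode and print its disassembly.
source = <<~RUBY
  def product(a)
    puts a * a
  end
RUBY

iseq = RubyVM::InstructionSequence.compile(source)
listing = iseq.disasm
puts listing
```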
YARV
DEFINE_INSN
instruction_name
(instruction_operands, ...)
(pop values, ...)
(return values, ...)
{
    ... // insn body
}
• Instructions are stored in a definition file
• Processed by Ruby code into C, then #included in YARV core
Complex Op-Codes
DEFINE_INSN
getlocal
(lindex_t idx, rb_num_t level) // operand
() // pop
(VALUE val) // push
{
    int i, lev = (int)level;
    VALUE *ep = GET_EP();
    for (i = 0; i < lev; i++) {
        ep = GET_PREV_EP(ep);
    }
    val = *(ep - idx);
}
Many Ruby op-codes have complex semantics
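The getlocal body walks level links up the chain of environments and then reads one slot. A minimal Ruby model of that walk follows. Env, slots, and prev are hypothetical names, and the model uses plain array indexing instead of the negative pointer offset *(ep - idx) used by the real instruction.

```ruby
# Hypothetical model: each Env holds its local slots and a link to the
# enclosing scope's Env (the analogue of GET_PREV_EP).
Env = Struct.new(:slots, :prev)

# Walk `level` links outward, then read slot `idx`.
def getlocal(env, idx, level)
  level.times { env = env.prev }
  env.slots.fetch(idx)
end

method_env = Env.new([10, 20], nil)
block_env = Env.new([30], method_env)

inner = getlocal(block_env, 0, 0) # a local of the block itself
outer = getlocal(block_env, 1, 1) # a local of the enclosing method
p [inner, outer]
```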
Complex Op-Codes
DEFINE_INSN
defined
(rb_num_t op_type, VALUE obj, VALUE needstr)
(VALUE v)
(VALUE val)
{
    VALUE klass;
    enum defined_type expr_type = 0;
    enum defined_type type = (enum defined_type)op_type;
    val = Qnil;
    switch (type) {
      case DEFINED_IVAR:
        if (rb_ivar_defined(GET_SELF(), SYM2ID(obj))) {
            expr_type = DEFINED_IVAR;
        }
        break;
      case DEFINED_IVAR2:
        klass = vm_get_cbase(GET_ISEQ(), GET_EP());
        break;
      case DEFINED_GVAR:
        if (rb_gvar_defined(rb_global_entry(SYM2ID(obj)))) {
            expr_type = DEFINED_GVAR;
        }
        break;
      case DEFINED_CVAR: {
        NODE *cref = rb_vm_get_cref(GET_ISEQ(), GET_EP());
        klass = vm_get_cvar_base(cref, GET_CFP());
        if (rb_cvar_defined(klass, SYM2ID(obj))) {
            expr_type = DEFINED_CVAR;
        }
        break;
      }
      case DEFINED_CONST:
        klass = v;
        if (vm_get_ev_const(th, GET_ISEQ(), klass, SYM2ID(obj), 1)) {
            expr_type = DEFINED_CONST;
        }
        break;
      case DEFINED_FUNC:
        klass = CLASS_OF(v);
        if (rb_method_boundp(klass, SYM2ID(obj), 0)) {
            expr_type = DEFINED_METHOD;
        }
        break;
      case DEFINED_METHOD:{
        VALUE klass = CLASS_OF(v);
        const rb_method_entry_t *me = rb_method_entry(klass, SYM2ID(obj), 0);
        if (me) {
            if (!(me->flag & NOEX_PRIVATE)) {
                if (!((me->flag & NOEX_PROTECTED) &&
                      !rb_obj_is_kind_of(GET_SELF(),
                                         rb_class_real(klass)))) {
                    expr_type = DEFINED_METHOD;
                }
            }
        }
        {
            VALUE args[2];
            VALUE r;
            args[0] = obj; args[1] = Qfalse;
            r = rb_check_funcall(v, idRespond_to_missing, 2, args);
            if (r != Qundef && RTEST(r))
                expr_type = DEFINED_METHOD;
        }
        break;
      }
      case DEFINED_YIELD:
        if (GET_BLOCK_PTR()) {
            expr_type = DEFINED_YIELD;
        }
        break;
      case DEFINED_ZSUPER:{
        rb_call_info_t cit;
        if (vm_search_superclass(GET_CFP(), GET_ISEQ(), Qnil, &cit) == 0) {
            VALUE klass = cit.klass;
            ID id = cit.mid;
            if (rb_method_boundp(klass, id, 0)) {
                expr_type = DEFINED_ZSUPER;
            }
        }
        break;
      }
      case DEFINED_REF:{
        val = vm_getspecial(th, GET_LEP(), Qfalse, FIX2INT(obj));
        if (val != Qnil) {
            expr_type = DEFINED_GVAR;
        }
        break;
      }
      default:
        rb_bug("unimplemented defined? type (VM)");
        break;
    }
    if (expr_type != 0) {
        if (needstr != Qfalse) {
            val = rb_iseq_defined_string(expr_type);
        }
        else {
            val = Qtrue;
        }
    }
}
Many Ruby op-codes have very complex semantics.
Testarossa Intermediate Language
Java Bytecode
    iload a
    iload b
    isub
    bipush 2
    imul
    istore a
Testarossa IL
    istore a    (treetop)
      imul
        isub
          iload a
          iload b
        iconst 2
    treetop …
• Testarossa uses a tree-based intermediate representation.
• A tree represents a single expression.
• A treetop represents a statement and program order.
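The Java bytecode on this slide computes a = (a - b) * 2. One common way to turn stack bytecode into expression trees is to simulate the operand stack with tree nodes in place of values. The sketch below does this for the handful of opcodes on the slide. Node, render, and build_treetops are hypothetical names, and this is not a description of how Testarossa's IL generator is actually implemented.

```ruby
# A tree node: an opcode, an optional argument, and child nodes.
Node = Struct.new(:op, :arg, :children) do
  # Render the tree as an indented outline, one node per line.
  def render(depth = 0)
    label = [op, arg].compact.join(" ")
    lines = ["  " * depth + label]
    lines.concat(children.map { |child| child.render(depth + 1) })
    lines.join("\n")
  end
end

# Simulate the operand stack, pushing nodes instead of values.
# A store consumes the stack top and becomes a treetop (a statement).
def build_treetops(bytecode)
  stack = []
  treetops = []
  bytecode.each do |op, arg|
    case op
    when "iload"
      stack.push(Node.new("iload", arg, []))
    when "bipush"
      stack.push(Node.new("iconst", arg, []))
    when "isub", "imul"
      right = stack.pop
      left = stack.pop
      stack.push(Node.new(op, nil, [left, right]))
    when "istore"
      treetops << Node.new("istore", arg, [stack.pop])
    else
      raise ArgumentError, "unsupported bytecode: #{op}"
    end
  end
  treetops
end

bytecode = [["iload", "a"], ["iload", "b"], ["isub"],
            ["bipush", 2], ["imul"], ["istore", "a"]]
treetops = build_treetops(bytecode)
puts treetops.first.render
```

The printed outline has istore a at the root, imul beneath it, and the isub subtree and iconst 2 as the operands of imul.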
Testarossa Intermediate Language
Ruby Bytecode
    getlocal 0
Testarossa IL, equivalent to th->cfp->ep[slot0]
    treetop
      lloadi (slot0)
        aloadi (ep)
          aloadi (cfp)
            aload (rb_thread_t)
• Ruby IL Generation creates trees for each bytecode.
• Complicated bytecode behavior leads to size expansion
Our Strategy:
• Mimic interpreter for maximum compatibility.
• Implement simple opcodes directly in IL
• Build callbacks to VM for complex opcodes.
• Automatically generate callbacks from the instruction definition file.
• Fast-path particular patterns
  • e.g. trace
• Don’t try to handle everything – let interpreter handle hard cases!
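The strategy amounts to a three-way decision per opcode: emit IL directly, emit a callback into the VM, or give up and leave the method to the interpreter. A rough Ruby sketch of that decision follows. The SIMPLE and COMPLEX sets and the plan_for name are illustrative assumptions, not the project's actual classification.

```ruby
# Hypothetical opcode classification; these sets are illustrative only.
SIMPLE = %i[putself getlocal leave].freeze # emitted directly as IL
COMPLEX = %i[defined send].freeze          # emitted as a callback into the VM

# Returns a per-opcode plan, or :interpret when any opcode is unsupported.
def plan_for(opcodes)
  opcodes.map do |op|
    if SIMPLE.include?(op)
      [:il, op]
    elsif COMPLEX.include?(op)
      [:callback, op]
    else
      # Hard case: leave the whole method to the interpreter.
      return :interpret
    end
  end
end

compiled = plan_for(%i[putself getlocal defined leave])
fallback = plan_for(%i[putself some_unsupported_op])
p compiled
p fallback
```

Bailing out of the whole method keeps the sketch simple and matches the compatibility-first goal: an unsupported opcode costs performance, never correctness.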
JIT Status:
Based on Ruby 2.2.3
Almost all opcodes supported
Running test-all, running real applications (like Spree)
Performance
a.k.a. We’ve still got room to grow!
[Chart: Micro Benchmarks. Y axis: Speedup Relative to Interpreter (1x, 2x, 3x).]
[Chart: ‘Production’ Benchmarks. Y axis: Speedup Relative to Interpreter. Series: Interpreter, Interpreter + JIT. Annotations: 17.3% Improvement (Geomean); 22.5% Geomean Improvement.]
Onwards and Upwards
a.k.a. The Future
Challenges
MRI is challenging for JIT compilers!
• Highly dynamic
• Unmanaged direct use of internal data structures by extensions
• Complicated runtime, with many subtle details: setjmp/longjmp intertwined with control flow and exception handling.
Opportunities to bring to Ruby
• Speculative optimizations powered by decompilation and code-patching
• Class Hierarchy Based Optimization
• Guarded inlining
• Type Specialization
• Recompilation
• Interpreter and JIT Profiling
• Asynchronous Compilation
• More optimization!
• OMR’s Ruby JIT uses only 10 of the 100+ optimizations.
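Guarded inlining and type specialization share one shape: a cheap check protects a fast path that is only valid under an assumption, with a general path as fallback. The Ruby sketch below illustrates that shape for x + 1. The name add_one_guarded is hypothetical, and a real JIT would also have to guard against Integer#+ being redefined, which this sketch omits.

```ruby
# Hypothetical sketch of a guarded fast path for `x + 1`.
# Guard: x is an Integer. A real JIT would also need to guard against
# redefinition of Integer#+, which this sketch does not model.
def add_one_guarded(x)
  if x.is_a?(Integer)
    # Fast path: stands in for inlined integer addition.
    [:fast, x + 1]
  else
    # Slow path: stands in for a full dynamic method dispatch.
    [:slow, x.public_send(:+, 1)]
  end
end

int_result = add_one_guarded(41)
float_result = add_one_guarded(1.5)
p int_result
p float_result
```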
Opportunities
Type Specialization
Inlining
Value & Type propagation
Opportunities
Recompilation
Profiling
3x3
IBM Wants to Help
Share our experience, help make Ruby faster!
Contribute VM improvements.
Collaborate on designs: e.g. a JIT Interface, event notification hooks, etc.
Ruby+OMR Technology Preview
Download @ goo.gl/P3yXuy
• Releasing a preview so you can start test-driving
• Send us feedback!
Thank You! ありがとう
Keep in Touch!
Matthew Gaudet, OMR JIT Developer
[email protected], @MattStudies
Mark Stoodley, OMR Project Lead
[email protected], @mstoodle
John Duimovich, CTO IBM Runtimes
[email protected], @jduimovich
goo.gl/P3yXuy
Trademarks, Copyrights, Disclaimers
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of other IBM trademarks is available on the web at "Copyright and trademark information" at http://www.ibm.com/legal/copytrade.shtml
Other company, product, or service names may be trademarks or service marks of others.
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS OR SOFTWARE.
© Copyright International Business Machines Corporation 2015. All rights reserved.
Additional Important Disclaimers
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.
WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS
PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.
ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST
RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES.
ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE.
IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY,
WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE.
IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE
RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
- CREATING ANY WARRANT OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR
LICENSORS
Attributions
gopher.{ai,svg,png} was created by Takuya Ueda (http://u.hinoichi.net). Licensed under the
Creative Commons 3.0 Attributions license.
Rails Photo: Arne Hückelheim ,
https://en.wikipedia.org/wiki/Railroad_switch#/media/File:SunsetTracksCrop.JPG