JRuby: The Hard Parts

Upload: charles-nutter

Post on 08-Sep-2014


DESCRIPTION

A survey of all the hard problems JRuby developers have had to solve, whether the JVM likes it or not. Topics include parsing, interpreting, compiling, optimization, native libraries, posix, startup time, console features, and much more.

TRANSCRIPT

Page 1: JRuby: The Hard Parts

The Hard Parts

Page 2: JRuby: The Hard Parts

Subverting the JVM

All the tricks, hacks, and kludges we’ve used to make JRuby the best off-JVM language impl around.

Page 3: JRuby: The Hard Parts

Intro

• Charles Oliver Nutter

• Principal Software Engineer

• Red Hat, JBoss Polyglot Group

• @headius


Page 4: JRuby: The Hard Parts

Welcome!

• My favorite event of the year

• I’ve only missed one!

• I will quickly talk through JRuby challenges

• Not a comprehensive list. Buy me a beer.

• Rest of you can help solve them

Page 5: JRuby: The Hard Parts

Ruby

• Dynamic, object-oriented language

• Created in 90s by Yukihiro Matsumoto

• “matz”

• Matz’s Ruby Interpreter (MRI)

• Inspired by Python, Perl, Lisp, Smalltalk

• Memes: TMTOWTDI, MINASWAN, CoC,

Page 6: JRuby: The Hard Parts

    # Output "I love Ruby"
    say = "I love Ruby"
    puts say

    # Output "I *LOVE* RUBY"
    say['love'] = "*love*"
    puts say.upcase

    # Output "I *love* Ruby"
    # five times
    5.times { puts say }

Page 7: JRuby: The Hard Parts

JRuby

• Ruby for the JVM and JVM for the Ruby

• Started in 2001, dozens of contribs

• Usually the fastest Ruby

• At least 20 paid full-time man years in it

• Sun Microsystems, Engine Yard, Red Hat

Page 8: JRuby: The Hard Parts

Ruby is Hard to Implement!

Page 9: JRuby: The Hard Parts

Making It Go (Fast)

• Parser-generator hacks

• Multiple interpreters

• Multiple compilers

• JVM-specific tricks

Page 10: JRuby: The Hard Parts

Parsing Ruby

• Yacc/Bison-based parse.y, almost 12kloc

• Very complex, not context-free

• No known 100% correct parser that is not YACC-based

Page 11: JRuby: The Hard Parts
Page 12: JRuby: The Hard Parts
Page 13: JRuby: The Hard Parts
Page 14: JRuby: The Hard Parts

JRuby’s Parser

• Jay parser generator

• Maybe 5 projects in the world use it

• Our version of parse.y = 4kloc

• Two pieces, one is for offline parsing

• Works ok, but…

Page 15: JRuby: The Hard Parts

Parser Problems!

• Array initialization > 65k bytecode

• Giant switch won’t JIT

• Outlining the case bodies: better

• Case bodies as runnables in machine: best

• org/jruby/parser/RubyParser$445.class

• Slow at startup (most important time!)
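The "case bodies as runnables" idea can be sketched in Ruby terms. This is a toy, not JRuby's generated parser: `ACTIONS`, the state numbers, and `reduce` are hypothetical names. Instead of one giant `case`, each action body is its own small callable stored in a table, which is roughly what a class name like `RubyParser$445.class` suggests on the Java side.

```ruby
# Toy illustration only: ACTIONS, the state numbers and reduce are made up.
# Each "case body" lives in its own small callable instead of one huge case.
ACTIONS = []
ACTIONS[1] = ->(stack) { stack.push(stack.pop + stack.pop) }   # e.g. expr '+' expr
ACTIONS[2] = ->(stack) { stack.push(-stack.pop) }              # e.g. '-' expr
ACTIONS[3] = ->(stack) { stack.push(stack.pop * stack.pop) }   # e.g. expr '*' expr

# Run the action registered for a state; states without an action are no-ops.
def reduce(state, stack)
  action = ACTIONS[state]
  action ? action.call(stack) : stack
end

stack = [2, 3]
reduce(1, stack)   # stack is now [5]
reduce(2, stack)   # stack is now [-5]
p stack
```

Because every body is a separate unit, none of them can push a single method past the size at which the JIT gives up, which is the problem the giant switch runs into.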

Page 16: JRuby: The Hard Parts

Interpreter

• At least four interpreters we’ve tried

• Original: visitor-based

• Modified: big switch rather than visitor

• Experimental: stackless instr-based

• Current: direct execution of AST

• Execution state on artificial stack
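A minimal sketch of "direct execution of AST", assuming made-up node classes (JRuby's real nodes are Java classes; the trace on a later slide shows frames such as `CallNoArgNode.interpret`). Each node knows how to interpret itself, and local state lives in an explicit frame object rather than on the host stack.

```ruby
# Hypothetical node classes for illustration only.
class FixnumNode
  def initialize(value)
    @value = value
  end

  def interpret(_frame)
    @value
  end
end

class LocalVarNode
  def initialize(name)
    @name = name
  end

  def interpret(frame)
    frame.fetch(@name)   # variables come from the artificial frame
  end
end

class CallNode
  def initialize(receiver, name, args)
    @receiver, @name, @args = receiver, name, args
  end

  def interpret(frame)
    recv = @receiver.interpret(frame)
    recv.public_send(@name, *@args.map { |arg| arg.interpret(frame) })
  end
end

# The expression x * 2 + 1 as a tree.
TREE = CallNode.new(
  CallNode.new(LocalVarNode.new(:x), :*, [FixnumNode.new(2)]),
  :+, [FixnumNode.new(1)]
)

puts TREE.interpret({ x: 20 })   # prints 41
```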

Page 17: JRuby: The Hard Parts

The New Way

• JRuby 9000 introduces a new IR

• Traditional-style compiler IR

• Register-based

• CFG, semantic analysis, type and constant propagation, all that jazz

• Interpreter has proven it out…JIT next
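To make "register-based" concrete, here is a small sketch of a register-machine interpreter. The instruction names and tuple layout are assumptions for illustration and are not JRuby's actual IR; the point is only that operands are named registers rather than tree children, which is what makes control-flow graphs and propagation passes straightforward to build.

```ruby
# Hypothetical three-address instructions: [opcode, destination, operands...]
IR_PROGRAM = [
  [:load_const, :r0, 0],          # sum = 0
  [:load_const, :r1, 1],          # i = 1
  [:load_const, :r2, 5],          # limit = 5
  [:add,        :r0, :r0, :r1],   # index 3: sum += i
  [:add_const,  :r1, :r1, 1],     # i += 1
  [:branch_le,  3,   :r1, :r2],   # jump back to index 3 while i <= limit
  [:ret,        :r0]
]

def run_ir(program)
  regs = {}
  pc = 0
  loop do
    op, a, b, c = program[pc]
    pc += 1
    case op
    when :load_const then regs[a] = b
    when :add        then regs[a] = regs[b] + regs[c]
    when :add_const  then regs[a] = regs[b] + c
    when :branch_le  then pc = a if regs[b] <= regs[c]
    when :ret        then return regs[a]
    end
  end
end

puts run_ir(IR_PROGRAM)   # prints 15 (1 + 2 + 3 + 4 + 5)
```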

Page 18: JRuby: The Hard Parts

Mixed-Mode

• JRuby has both interpreter and JIT

• Cost of generating JVM bytecode is high

• Our interpreter runs faster than JVM’s

• A jitted interpreter is (much) faster than unjitted bytecode
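The mixed-mode trade-off can be sketched as a call counter: interpret until a method proves hot, then pay the compilation cost once. `MixedModeMethod` and its `THRESHOLD` value are hypothetical and chosen only for the example; a real runtime would also swap in different code once compiled.

```ruby
# Hypothetical sketch: MixedModeMethod and THRESHOLD are illustrative names.
class MixedModeMethod
  THRESHOLD = 3
  attr_reader :mode

  def initialize(&body)
    @body = body
    @calls = 0
    @mode = :interpreted
  end

  def call(*args)
    @calls += 1
    # Pay the compile cost only once the method has proven hot.
    @mode = :compiled if @mode == :interpreted && @calls > THRESHOLD
    @body.call(*args)   # a real runtime would run different code per mode
  end
end

SQUARE = MixedModeMethod.new { |n| n * n }
modes = (1..5).map do |i|
  SQUARE.call(i)
  SQUARE.mode
end
p modes   # three :interpreted entries, then two :compiled
```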

Page 19: JRuby: The Hard Parts

Native Execution

• Early JIT compiler just translated AST

• Bare-minimum semantic analysis

• Eliminate artificial frame use

• One-off opto for frequent patterns

• Too unwieldy to evolve much

Page 20: JRuby: The Hard Parts

New IR JIT

• Builds off IR runtime

• Per-instruction bytecode gen is simple

• JVM frame is like infinite register machine

• Potential to massively improve perf

• Early unboxing numbers…

Page 21: JRuby: The Hard Parts

Numeric loop performance

[Chart: times faster than MRI 2.1 (axis 0 to 5). Series: JRuby 1.7, Rubinius.]

Page 22: JRuby: The Hard Parts

Numeric loop performance

[Chart: times faster than MRI 2.1 (axis 0 to 60). Series: JRuby 1.7, Rubinius, Truffle, Topaz, 9k+unbox.]

Page 23: JRuby: The Hard Parts

mandelbrot(500)

[Chart: times faster than MRI 2.1 (axis 0 to 40). Series: JRuby 9k + indy, JRuby 9k + unboxing, JRuby 9k + Truffle.]

Page 24: JRuby: The Hard Parts

Whither Truffle?

• RubyTruffle merged into JRuby

• Same licenses as rest of JRuby

• Chris Seaton continues to work on it

• Very impressive peak numbers

• Startup, steady-state…needs work

• Considering initial use for targeted opto

Page 25: JRuby: The Hard Parts

JVM Tricks

• Lack of class hierarchy analysis in JIT

• Manually split methods to beat limits

• Everything is an expression, so exception-handling has to maintain current stack

• Tweaking JIT flags will just make you sad

• Unsafe

Page 26: JRuby: The Hard Parts

IRubyObject
    public RubyClass getMetaClass();

RubyBasicObject
    private RubyClass metaClass;
    public RubyClass getMetaClass() { return metaClass; }

RubyString    RubyArray    RubyObject

obj.getMetaClass()

Page 27: JRuby: The Hard Parts

    public static RubyClass metaclass(IRubyObject object) {
        return object instanceof RubyBasicObject
            ? ((RubyBasicObject) object).getMetaClass()
            : object.getMetaClass();
    }

Page 28: JRuby: The Hard Parts

Compatibility

• Strings and Encodings

• IO

• Fibers

• Difficult choices

Page 29: JRuby: The Hard Parts

Strings

• All arbitrary-width byte data is String

• Binary data and encoded text alike

• Many supported encodings

• j.l.String, char[] poor options

• Size, data integrity, behavioral differences
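A short Ruby illustration of why a String here is really bytes plus an encoding label. The sample word is an arbitrary choice; the methods used are all core `String` API.

```ruby
TEXT = "r\u00E9sum\u00E9"          # encoded text: UTF-8, each accented "e" is two bytes
puts TEXT.encoding                 # UTF-8
puts TEXT.length                   # 6 characters
puts TEXT.bytesize                 # 8 bytes

# The same bytes viewed as raw binary data: no characters, just bytes.
RAW = TEXT.dup.force_encoding(Encoding::BINARY)
puts RAW.length                    # 8
puts RAW.bytes.first(3).inspect    # [114, 195, 169]
```

A representation built on 16-bit chars has no natural way to hold the second view, which is the size and data-integrity problem the bullets point at.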

Page 30: JRuby: The Hard Parts

The First Big Decision

• We realized we needed a byte[] String

• Had been StringBuilder-based until then

• That meant a lot of porting…

• Regex engine (joni)

• Encoding subsystem (jcodings)

• Low-level IO + transcoding (in JRuby)

Page 31: JRuby: The Hard Parts

JOni

• Port of Oniguruma regex library

• Pluggable grammars + arbitrary encodings

• Bytecode engine (shallow call stack)

• Interruptible

• Re-forked as char[] engine for Nashorn

• https://github.com/jruby/joni

Page 32: JRuby: The Hard Parts

Data: ‘a’-‘z’ in byte[], match /.*tuv(..)yz$/

[Chart: time in seconds (axis 0s to 6s). Series: j.u.regex, JOni.]

Page 33: JRuby: The Hard Parts

Data: ‘a’-‘z’ from IO, match /.*tuv(..)yz$/

[Chart: time in seconds (axis 0s to 2.8s). Series: j.u.regex, JOni.]
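For readers who want to poke at the benchmark pattern, here is a minimal Ruby sketch of the match itself. The iteration count is arbitrary, since the slides do not state one, and this measures whatever regex engine the running Ruby uses rather than reproducing the j.u.regex comparison.

```ruby
require 'benchmark'

DATA = ('a'..'z').to_a.join          # "abcdefghijklmnopqrstuvwxyz"
PATTERN = /.*tuv(..)yz$/

match = PATTERN.match(DATA)
puts match[1]                        # the two characters between "tuv" and "yz"

# Iteration count is an assumption made for this sketch.
elapsed = Benchmark.realtime { 100_000.times { PATTERN.match(DATA) } }
puts format('%.3fs for 100k matches', elapsed)
```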

Page 34: JRuby: The Hard Parts

Jcodings

• Character tables

• Used heavily by JOni and JRuby

• Transcoding tables and logic

• Replaces Charset logic from JRuby 1.7

• https://github.com/jruby/jcodings
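Seen from the Ruby side, transcoding is what `String#encode` does, as opposed to `force_encoding`, which only relabels bytes. A small sketch with an arbitrary sample word:

```ruby
UTF8 = "r\u00E9sum\u00E9"
LATIN1 = UTF8.encode(Encoding::ISO_8859_1)     # transcode: the bytes change

puts UTF8.bytesize                             # 8
puts LATIN1.bytesize                           # 6
puts LATIN1.encode(Encoding::UTF_8) == UTF8    # true, the round trip is lossless here

# force_encoding keeps the bytes and changes only the label.
RELABELED = UTF8.dup.force_encoding(Encoding::ISO_8859_1)
puts RELABELED.length                          # 8, each byte now counts as one character
```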

Page 35: JRuby: The Hard Parts

NO GRAPH NEEDED

Page 36: JRuby: The Hard Parts

JRuby 9000

• Finished porting, connecting transcoders

• New port of IO operations

• Transcoding works directly against IO buffers; hard to simulate other ways

• Lots of fun native (C) calls to emulate…

Page 37: JRuby: The Hard Parts

Fibers

• Coroutines, goroutines, continuations

• MRI uses stack-swapping

• And limits Fiber stack size as a result

• Useless as a concurrency model

• Useful for multiplexing operations

• Try read, no data, go to next fiber
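The multiplexing pattern in the last bullets can be sketched with core `Fiber`. The helper names `make_reader` and `run_all` are made up, and arrays stand in for real IO sources: each fiber handles one chunk, then yields so the next fiber gets a turn.

```ruby
# Arrays stand in for IO sources; make_reader and run_all are hypothetical helpers.
def make_reader(name, chunks, log)
  Fiber.new do
    until chunks.empty?
      log << "#{name} read #{chunks.shift}"
      Fiber.yield :waiting          # nothing more right now: hand control back
    end
    :done
  end
end

# Round-robin over the fibers until every one has finished.
def run_all(fibers)
  fibers.reject! { |fiber| fiber.resume == :done } until fibers.empty?
end

LOG = []
run_all([make_reader('a', [1, 2], LOG), make_reader('b', [3], LOG)])
puts LOG
```

The reads interleave ("a read 1", "b read 3", "a read 2") even though only one fiber runs at a time, which is why fibers help with multiplexing but not with parallelism.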

Page 38: JRuby: The Hard Parts

Fibers on JRuby

• Yep, they’re just native threads

• Transfer perf with j.u.c utils is pretty close

• Resource load is very bad

• Spin-up time is bad without thread pool

• So early or occasional fibers cost a lot

• Where are you, coro?!
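A rough sketch of what "fibers are just native threads" means, written in Ruby with `Queue` standing in for the java.util.concurrent handoff. `ThreadFiber` and its method names are hypothetical, not JRuby's implementation. It shows where the cost comes from: every fiber owns a whole thread that sits parked between resumes.

```ruby
require 'thread'

# Hypothetical fiber look-alike backed by a real thread and two queues.
class ThreadFiber
  def initialize(&block)
    @to_fiber = Queue.new     # values handed in by resume
    @to_caller = Queue.new    # values handed out by pause or the final return
    @done = false
    @thread = Thread.new do
      first = @to_fiber.pop             # park until the first resume
      result = block.call(self, first)
      @done = true
      @to_caller.push(result)
    end
  end

  # Called by the owner: wake the fiber thread, then block until it answers.
  def resume(value = nil)
    raise 'dead fiber' if @done
    @to_fiber.push(value)
    @to_caller.pop
  end

  # Called inside the block: hand a value back, then park until resumed.
  def pause(value = nil)
    @to_caller.push(value)
    @to_fiber.pop
  end

  def alive?
    !@done
  end
end

fiber = ThreadFiber.new do |f, start|
  a = f.pause(start + 1)
  b = f.pause(a * 2)
  b.to_s
end

p [fiber.resume(1), fiber.resume(10), fiber.resume(:end)]   # [2, 20, "end"]
```

Only one of the two threads runs at any moment, so the semantics match a coroutine, but thread creation and stack memory are paid per fiber.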

Page 39: JRuby: The Hard Parts

Hard Decisions

• ObjectSpace walks heap, off by default

• Trace functions add overhead, off by default

• Full coroutines not possible

• C extension API too difficult to emulate

• Perhaps only item to really hurt us

Page 40: JRuby: The Hard Parts

Native Integration

• Process control

• More selectable IO

• FFI layer

• C extension API

• Misc

Page 41: JRuby: The Hard Parts

Ruby’s Roots

• Matz is/was a C programmer

• Early Ruby did little more than stitch C calls together

• Some of those roots remain

• ttys, fcntl, process control, IO, ext API

• We knew we needed a solution

Page 42: JRuby: The Hard Parts

JNA, and then JNR

• Started with jna-posix to map POSIX

• stat, symlink, etc needed to do basics

• JNR replaced JNA

• Wayne Meissner started his empire…

Page 43: JRuby: The Hard Parts

The Cancer

• Many off-platform runtimes are not as good as Hotspot

• Many of their users must turn to C for perf

• So, since many people use C exts on MRI, maybe we need to implement it?

• Or get a student to do it…

Page 44: JRuby: The Hard Parts

MRI C Extensions

• Very invasive API

• Direct pointer access, object internals, conservative GC, threading constraints

• Like bridging one JNI to another

• Experimental in JRuby 1.6, gone in 1.7

• Will not revisit unless new API

Page 45: JRuby: The Hard Parts

FFI

• Ruby API/DSL for binding C libs

• Additional tools for generating that code

• If you need to go native, it’s the best way

• In use in production JRuby apps

• ØMQ client, bson lib, sodium crypto, …

Page 46: JRuby: The Hard Parts

Ruby FFI example

    require 'ffi'

    class Timeval < FFI::Struct
      layout :tv_sec => :ulong,
             :tv_usec => :ulong
    end

    module LibC
      extend FFI::Library
      ffi_lib FFI::Library::LIBC
      attach_function :gettimeofday,
                      [ :pointer, :pointer ],
                      :int
    end

    t = Timeval.new
    LibC.gettimeofday(t.pointer, nil)

Page 47: JRuby: The Hard Parts

Layered Runtime

[Diagram: stack of native-integration libraries. Components shown: jffi, jnr-ffi, libffi, jnr-posix, jnr-constants, jnr-enxio, jnr-x86asm, jnr-unixsocket, etc.]

Page 48: JRuby: The Hard Parts

Native in JRuby

• POSIX stuff missing from Java

• Ruby FFI DSL for binding C libs

• Stdio

• selection, remove buffering, control tty

• Process launching and control


Page 49: JRuby: The Hard Parts

Process Control

• Java’s ProcessBuilder/Process are bad

• No channel access (no select!)

• Spins up at least one thread per process

• Drains child output ahead of you

• New process API based on posix_spawn

Page 50: JRuby: The Hard Parts

    in_c, in_p = IO.pipe
    out_p, out_c = IO.pipe

    pid = spawn('cat -n', :in => in_c, :out => out_c, :err => 'error.log')

    [in_c, out_c].each(&:close)

    in_p.puts("hello, world")
    in_p.close

    puts out_p.read # => " 1 hello, world"

    Process.waitpid(pid)
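Having a real pipe to the child is what makes `select` possible. A minimal sketch, assuming a Unix-like system with `echo` available; `first_line_from` and its timeout default are hypothetical names chosen for the example.

```ruby
# Returns the first line the child prints, or nil if nothing arrives in time.
def first_line_from(command, timeout = 5)
  reader, writer = IO.pipe
  pid = spawn(command, :out => writer)
  writer.close                                   # parent keeps only the read end
  ready, = IO.select([reader], nil, nil, timeout)
  ready ? reader.gets : nil
ensure
  reader.close if reader
  Process.waitpid(pid) if pid                    # sketch: waits even after a timeout
end

puts first_line_from('echo ready')
```

Nothing here needs a helper thread to drain the child's output: the caller decides when to read, and can wait on several children at once by passing more readers to `IO.select`.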

Page 51: JRuby: The Hard Parts

Usability

• Backtraces

• Command-line and launchers

• Startup time

Page 52: JRuby: The Hard Parts

Backtraces

• JVM backtraces make Rubyists’ eyes bleed

• Initially, Ruby trace maintained manually

• JIT emits mangled class to produce a Ruby trace element

• AOT produces single class, mangled method name

• Mixed-mode backtraces!

Page 53: JRuby: The Hard Parts

    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:86)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:234)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1061)
    at groovy.lang.ExpandoMetaClass.invokeMethod(ExpandoMetaClass.java:910)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:892)
    at groovy.lang.Closure.call(Closure.java:279)
    at org.codehaus.groovy.runtime.DefaultGroovyMethods.callClosureForMapEntry(DefaultGroovyMethods.java:1911)
    at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:1184)
    at org.codehaus.groovy.runtime.dgm$88.invoke(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:270)
    at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:52)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:124)
    at BootStrap.populateBootstrapData(BootStrap.groovy:786)
    at BootStrap.this$2$populateBootstrapData(BootStrap.groovy)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:86)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:234)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1061)
    at groovy.lang.ExpandoMetaClass.invokeMethod(ExpandoMetaClass.java:910)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:892)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1009)
    at groovy.lang.ExpandoMetaClass.invokeMethod(ExpandoMetaClass.java:910)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:892)
    at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.callCurrent(PogoMetaClassSite.jav

Page 54: JRuby: The Hard Parts

    at org.jruby.javasupport.JavaMethod.invokeStaticDirect(JavaMethod.java:362)
    at org.jruby.java.invokers.StaticMethodInvoker.call(StaticMethodInvoker.java:50)
    at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:306)
    at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:136)
    at org.jruby.ast.CallNoArgNode.interpret(CallNoArgNode.java:60)
    at org.jruby.ast.NewlineNode.interpret(NewlineNode.java:105)
    at org.jruby.ast.RootNode.interpret(RootNode.java:129)
    at org.jruby.evaluator.ASTInterpreter.INTERPRET_EVAL(ASTInterpreter.java:95)
    at org.jruby.evaluator.ASTInterpreter.evalWithBinding(ASTInterpreter.java:184)
    at org.jruby.RubyKernel.evalCommon(RubyKernel.java:1158)
    at org.jruby.RubyKernel.eval19(RubyKernel.java:1121)
    at org.jruby.RubyKernel$INVOKER$s$0$3$eval19.call(RubyKernel$INVOKER$s$0$3$eval19.gen)
    at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:210)
    at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:206)
    at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:599)
    at org.jruby.runtime.invokedynamic.InvocationLinker.invocationFallback(InvocationLinker.java:155)
    at ruby.__dash_e__.method__1$RUBY$bar(-e:1)
    at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:599)
    at org.jruby.runtime.invokedynamic.InvocationLinker.invocationFallback(InvocationLinker.java:138)
    at ruby.__dash_e__.block_0$RUBY$foo(-e:1)
    at ruby$__dash_e__$block_0$RUBY$foo.call(ruby$__dash_e__$block_0$RUBY$foo)
    at org.jruby.runtime.CompiledBlock19.yieldSpecificInternal(CompiledBlock19.java:117)
    at org.jruby.runtime.CompiledBlock19.yieldSpecific(CompiledBlock19.java:92)
    at org.jruby.runtime.Block.yieldSpecific(Block.java:111)
    at org.jruby.RubyFixnum.times(RubyFixnum.java:275)
    at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:599)
    at org.jruby.runtime.invokedynamic.InvocationLinker.invocationFallback(InvocationLinker.java:230)
    at ruby.__dash_e__.method__0$RUBY$foo(-e:1)
    at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:599)
    at org.jruby.runtime.invokedynamic.InvocationLinker.invocationFallback(InvocationLinker.java:138)
    at ruby.__dash_e__.__file__(-e:1)
    at ruby.__dash_e__.load(-e)

Page 55: JRuby: The Hard Parts
Page 56: JRuby: The Hard Parts

• org.jruby.RubyFixnum.times

• org.jruby.evaluator.ASTInterpreter.INTERPRET_EVAL

• rubyjit.Object$$foo_3AB1F5052668B3CD74A0B4CD4999CF6A65E92973271627940.__file__

• ruby.__dash_e__.method__0$RUBY$foo
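One way to picture how a Ruby-looking trace falls out of a JVM trace: keep only frames whose mangled names mark them as compiled Ruby code, and rebuild a Ruby-style line from the pieces. The demangler below is hypothetical and is based only on the `$RUBY$` frame shapes visible in the trace above; it ignores interpreter and `rubyjit` frames, which a real implementation also has to map.

```ruby
# Hypothetical demangler, derived only from the frame shapes on the slides.
RUBY_FRAME = /\Aruby\.(?<file>\w+)\.(?:method__\d+|block_\d+)\$RUBY\$(?<name>\w+)\((?<where>[^)]*)\)\z/

# Keep frames that look like compiled Ruby code and render them Ruby-style.
def ruby_frames(jvm_frames)
  jvm_frames.map { |frame| RUBY_FRAME.match(frame) }
            .compact
            .map { |m| "#{m[:where]}:in `#{m[:name]}'" }
end

JVM_TRACE = [
  'org.jruby.RubyFixnum.times(RubyFixnum.java:275)',
  'ruby.__dash_e__.block_0$RUBY$foo(-e:1)',
  'java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:599)',
  'ruby.__dash_e__.method__1$RUBY$bar(-e:1)'
]

puts ruby_frames(JVM_TRACE)
```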

Page 57: JRuby: The Hard Parts

Command Line

• Rubyists typically are at CLI

• Command line and tty must behave

• Epic bash and .bat scripts

• 300-500 lines of heinous shell script

• Unusable in shebang lines

• Repurposed NetBeans native launcher

Page 58: JRuby: The Hard Parts

    system ~/projects/jruby $ time bin/jruby.bash -v
    jruby 9000.dev-SNAPSHOT (2.1.2) 2014-07-27 9cca1ec Java HotSpot(TM) 64-Bit Server VM 24.45-b08 on 1.7.0_45-b18 [darwin-x86_64]

    real    0m0.126s
    user    0m0.092s
    sys     0m0.031s

    system ~/projects/jruby $ time bin/jruby.bash -v
    jruby 9000.dev-SNAPSHOT (2.1.2) 2014-07-27 9cca1ec Java HotSpot(TM) 64-Bit Server VM 24.45-b08 on 1.7.0_45-b18 [darwin-x86_64]

    real    0m0.124s
    user    0m0.089s
    sys     0m0.033s

    system ~/projects/jruby $ time jruby -v
    jruby 9000.dev-SNAPSHOT (2.1.2) 2014-07-27 9cca1ec Java HotSpot(TM) 64-Bit Server VM 24.45-b08 on 1.7.0_45-b18 [darwin-x86_64]

    real    0m0.106s
    user    0m0.080s
    sys     0m0.022s

    system ~/projects/jruby $ time jruby -v
    jruby 9000.dev-SNAPSHOT (2.1.2) 2014-07-27 9cca1ec Java HotSpot(TM) 64-Bit Server VM 24.45-b08 on 1.7.0_45-b18 [darwin-x86_64]

    real    0m0.110s
    user    0m0.085s
    sys     0m0.023s

Page 59: JRuby: The Hard Parts

Console Support

• Rubyists also typically use REPLs

• Readline support is a must

• jline has been forked all over the place

• Looking into JNA-based readline now

Page 60: JRuby: The Hard Parts

CLI == Startup Time

• BY FAR the #1 complaint

• May be the only reason we haven’t won!

• We’re trying everything we can

Page 61: JRuby: The Hard Parts

JRuby Startup

[Chart: time in seconds (lower is better, axis 0 to 10) for three commands: -e 1, gem --help, rake -T. Series: C Ruby, JRuby.]

Page 62: JRuby: The Hard Parts

Tweaking Flags

• -client mode

• -XX:+TieredCompilation -XX:TieredStopAtLevel=1

• -X-C to disable JRuby’s compiler

• Heap sizes, code verification, etc etc

Page 63: JRuby: The Hard Parts

Nailgun?

• Keep a single JVM running in background

• Toss commands over to it

• It stays hot, so code starts faster

• Hard to clean up all state (e.g. threads)

• Can’t get access to user’s terminal

• http://www.martiansoftware.com/nailgun/

Page 64: JRuby: The Hard Parts

Drip

[Diagram: three boxes, each an isolated JVM labeled "Application" and "Command #1".]

Page 65: JRuby: The Hard Parts

Drip

• Start a new JVM after each command

• Pre-boot JVM plus optional code

• Analyze command line for differences

• Age out unused instances

• https://github.com/flatland/drip

Page 66: JRuby: The Hard Parts

Drip Init

• Give Drip some code to pre-boot

• Load more libraries

• Warm up some code

• Pre-execution initialization

• Run as much as possible in background

• We also pre-load ./dripmain.rb if exists

Page 67: JRuby: The Hard Parts

    $ cat dripmain.rb
    # Preload some code Rails always needs
    require File.expand_path('../config/application', __FILE__)

Page 68: JRuby: The Hard Parts

JRuby Startup

[Chart: rake -T, time in seconds (lower is better, axis 0 to 10). Series: C Ruby, JRuby, JRuby (best), JRuby (drip), JRuby (drip init), JRuby (dripmain).]

Page 69: JRuby: The Hard Parts

CONCLUSION

Page 70: JRuby: The Hard Parts

Hard Parts

• 64k bytecode limit

• Falling over JIT limits

• String char[] pain

• Startup and warmup

• Coroutines

• FFI at JVM level

• Too many flags

• Tiered compiler slow

• Interpreter opto

• Bytecode is a blunt tool

• Indy has taken too long

• Charlie may burn out

Page 71: JRuby: The Hard Parts

Thank You!

• Charles Oliver Nutter

• @headius


• http://blog.headius.com