runtime innovation - nextgen ninja hacking of the jvm, by ryan sciampacone

88
© 2013 IBM Corporation Runtime Innovation: NextGen Ninja Hacking of the JVM Ryan Sciampacone – IBM Managed Runtime Architect 14 th June 2013

Upload: zeroturnaround

Post on 20-Aug-2015

1.527 views

Category:

Technology


1 download

TRANSCRIPT

© 2013 IBM Corporation

Runtime Innovation: NextGen Ninja Hacking of the JVM

Ryan Sciampacone – IBM Managed Runtime Architect 14th June 2013

© 2013 IBM Corporation 2

Important Disclaimers

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.

WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.

ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES.

ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE.

IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE.

IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.

NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:

- CREATING ANY WARRANT OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS

© 2013 IBM Corporation 3

Introduction to the speaker

■  16 years experience developing and deploying Java SDKs

■  Recent work focus: ■  Managed Runtime Architecture

■  Java Virtual Machine improvements

■  Multi-tenancy technology

■  Native data access and heap density

■  Footprint and performance

■  Garbage Collection

■  Scalability and pause time reduction

■  Advanced GC technology

■  My contact information: – [email protected]

© 2013 IBM Corporation 4

What should you get from this talk?

■  JVM proving to be a fertile ecosystem for languages

■  Plenty of opportunity to innovate in other spaces

■  Runtime is the gateway to this innovation

■  Largely ignored the last few years, but this is where the core inventions can occur

© 2013 IBM Corporation 5

The runtime isn’t boring!

© 2013 IBM Corporation 6

Packed Objects

© 2013 IBM Corporation 7

Problem? What problem?

■  JNI just isn’t a great way to marshal data

■  Locality in Java can matter (e.g., JEP 142)

■  Existing native and data placement stories aren’t very good

■  In many cases, legacy systems exist – the interop is just terrible

■  So we want something that integrates well with the Java language and helps us…

© 2013 IBM Corporation 8

What are we trying to solve?

Simple enough…

Hash

Array

Entry

(object)

(object)

Object header

Object field / data

table key

value

© 2013 IBM Corporation 9

What are we trying to solve?

Simple enough…

■  Header overhead

Hash

Array

Entry

(object)

(object)

Object header

Object field / data

table key

value

© 2013 IBM Corporation 10

What are we trying to solve?

Simple enough…

■  Header overhead

■  Pointer chasing

Hash

Array

Entry

(object)

(object)

Object header

Object field / data

table key

value

© 2013 IBM Corporation 11

What are we trying to solve?

Simple enough…

■  Header overhead

■  Pointer chasing

■  Locality

Hash

Array

Entry

(object)

(object)

Object header

Object field / data

table key

value

© 2013 IBM Corporation 12

What are we trying to solve?

int int

Object header

Fields / data

Java Native

int[] d

anObject

Fighting the Java/Native interface

int int

int …

© 2013 IBM Corporation 13

Ok so we have some criteria…

■  Ability to do away with headers

■  Ability to bring multiple objects close together

■  On heap / off heap seamless referencing of data

■  This actually sounds a lot like C structure types

■  Packed Objects!

struct Address { char[4] addr; short port; } struct Header { struct Address src; struct Address dst; }

struct Header

port addr Address src

Address dst

port addr

© 2013 IBM Corporation 14

Packed Objects: Under the covers

int y int x

aPoint

Object header

Object field / data

© 2013 IBM Corporation 15

Packed Objects: Under the covers

int y int x

aPoint

Object header

Object field / data

offset target

aPackedPoint

int y int x

© 2013 IBM Corporation 16

Packed Objects: Under the covers

int y int x

aPoint

Object header

Object field / data

offset target

aPackedPoint

int y int x

© 2013 IBM Corporation 17

Packed Objects: In Practice

int y int x

aPoint Object field / data

int y int x

aPoint Point e Point s

aLine

Object header

© 2013 IBM Corporation 18

Packed Objects: In Practice

int y int x

aPoint

Object header

Object field / data

int y int x

aPoint Point e Point s

aLine

aPackedLine

int y int x

int y int x

offset target

© 2013 IBM Corporation 19

Packed Objects: In Practice

int y int x

aPoint

Object header

Object field / data

int y int x

aPoint Point e Point s

aLine

int y int x

int y int x

aPackedPoint s

aPackedPoint e

aPackedLine

offset target

© 2013 IBM Corporation 20

Packed Objects: In Practice

int y int x

aPoint

Object header

Object field / data

int y int x

aPoint Point e Point s

aLine

int y int x

int y int x

aPackedPoint s

aPackedPoint e

@Packed final class PackedPoint extends PackedObject {

int x; int y;

} @Packed final class PackedLine extends PackedObject {

PackedPoint s; PackedPoint e;

}

aPackedLine

offset target

© 2013 IBM Corporation 21

Packed Objects: In Practice

int y int x

aPoint

Object header

Object field / data

int y int x

aPoint Point e Point s

aLine

int y int x

int y int x

aPackedLine

offset target

© 2013 IBM Corporation 22

Packed Objects: In Practice

int y int x

aPoint

Object header

Object field / data

int y int x

aPoint Point e Point s

aLine

int y int x

int y int x

aPackedLine.e

aPackedLine

offset target

© 2013 IBM Corporation 23

Packed Objects: In Practice

int y int x

aPoint

Object header

Object field / data

int y int x

aPoint Point e Point s

aLine

int y int x

int y int x

aPackedLine.e

offset target

aPackedPoint

aPackedLine

offset target

© 2013 IBM Corporation 24

Packed Objects: In Practice with Native Access

int y int x

Object header

Struct field / data

Java Native

struct Point { int x; int y; }

© 2013 IBM Corporation 25

Packed Objects: In Practice with Native Access

int y int x

Object header

Struct field / data

offset target

aPackedPoint

Java Native

struct Point { int x; int y; }

@Packed final class PackedPoint extends PackedObject {

int x; int y;

}

© 2013 IBM Corporation 26

Packed Objects: In Practice with Native Access

int y int x

Object header

Struct field / data

offset target

aPackedPoint

Java Native

struct Point { int x; int y; }

Ø

@Packed final class PackedPoint extends PackedObject {

int x; int y;

}

© 2013 IBM Corporation 27

Lets Build Something in C!

■  Nested substructures

■  Compact representation

■  Alignment aspects

struct Address { char[4] addr; short port; } struct Header { struct Address src; struct Address dst; }

struct Header

port addr Address src

Address dst

port addr

© 2013 IBM Corporation 28

Let’s Build the Same “Something” in Java!

■  Headers

■  No locality

■  Alignment

class Address { byte[] addr; short port;

} class Header {

Address src; Address dst;

}

Address

port addr addr Header

Address dst Address src

Address

port addr addr

byte[]

byte[]

© 2013 IBM Corporation 29

What does the Java code look like under the covers?

■  From a code point of view, this isn’t terrible…

Bytecodes: aload1 getfield Header.dest LAddress; getfield Address.addr [B iconst0 baload bipush 192 ificmpeq ...

JIT (32 bit): mov EBX, dword ptr -4[ECX] // load temp1 mov EBX, dword ptr 8[EBX] // load dest mov EBX, dword ptr 4[EBX] // load addr movsx EDI, byte ptr 8[EBX] // array[0] cmp EDI, 192

© 2013 IBM Corporation 30

What if we did this with Packed Objects?

■  The Java code is pretty clean… and a pretty good result!

@Packed final class Address extends PackedObject {

PackedByte[[4]] addr; short port;

} @Packed final class PacketHeader extends PackedObject {

Address src; Address dest;

}

port addr Address src

Address dst

port addr

offset target

aPackedHeader

© 2013 IBM Corporation 31

Ok, what about the code under the covers?

■  Bytecodes don’t change… JIT code is pretty good too!

JIT (32 bit): mov EBX, dword ptr -4[ECX] // load temp1 mov EAX, dword ptr 4[EBX] // load target mov EDX, dword ptr 8[EBX] // load offset lea EBX, dword ptr [EAX + EDX] movsx EDI, byte ptr 8[EBX] // array[0] cmp EDI, 192

Bytecodes: aload1 getfield PackedHeader.dest LAddress; getfield Address.addr [B iconst0 baload bipush 192 ificmpeq ...

© 2013 IBM Corporation 32

What about native access?

Java Native

anObject

How do we implement this normally?

port addr

port addr

© 2013 IBM Corporation 33

JNI implementation

■  Usual “stash pointers in long types” tricks

■  JNI costs tend to be high

© 2013 IBM Corporation 34

DirectByteBuffer implementation

■  No extra JNI to write (this is good)

■  Keeping your indices straight is never fun

© 2013 IBM Corporation 35

Unsafe implementation

■  You shouldn’t be here

■  Still playing the indices game

© 2013 IBM Corporation 36

PackedObject answer

■  Looks like natural Java code

■  Foregoes JNI

■  Same type capable of on-heap representation

port addr

port addr offset

target

aPackedHeader

Ø

© 2013 IBM Corporation 37

Active work and next steps

■  Experimenting with this now

■  Yes, there are security aspects to be aware of here

■  This is potentially part of a larger look at Java / Platform interop

■  Not specifically viewed as a cure to GC problems

■  This forms the basis for many other solutions to existing problems…

© 2013 IBM Corporation 38

Multitenancy

© 2013 IBM Corporation 39

Just what do you mean by “multitenancy”?

With a multitenant architecture, a software application is designed to virtually partition its data and configuration, and each client organization works with a customized virtual application instance.

■  Working Definition – A single instance of a software application that serves multiple customers

à Each customer is a tenant. – Tenants can customize some parts of the application (look and feel) but not the code. – Infrastructure usually opaque

à opportunity for provider

Why? Cost Savings: As compared to single-tenant deployment model

© 2013 IBM Corporation 40

S5. Shared Application

JDK Support for Spectrum of Sharing / Multitenancy (Level 1-5)

S1. No Sharing

Infrastructure

Middleware

Application

Tenant

OS

Middleware

Application

Tenant

OS

Infrastructure

Middleware

Application

Tenant

OS

Middleware

Application

Tenant

OS

Infrastructure

Data Center floor Data Center floor

Middleware

Application

Tenant

Middleware

Application

Tenant

Infrastructure

Data Center floor

OS

Application

Tenant

Application

Tenant

Infrastructure

Data Center floor

OS

Middleware

Application data

Tenant Tenant

Infrastructure

Data Center floor

OS

Middleware

Application

S2. Shared Hardware

S3. Shared Operating System

S4. Shared Middleware

Application data

Application data

Application data

Application data

Application data

Application data

Application data

Application data

Application data

Sharing servers storage, networks in a data center

Hypervisors (e.g. KVM, VMWare) are used to virtualize the hardware

Multiple applications sharing the same middleware

Sharing the same application

Multiple copies of middleware in a single operating system

Application Changes

Application Changes ? Application

Changes

© 2013 IBM Corporation 41

= L Application Changes

© 2013 IBM Corporation 42

Hardware Virtualization

■  Hypervisors run multiple applications side-by-side safely

■  Advantages – Capture idle CPU cycles – Automatic de-duplication (RAM) – Ability to meter and shift resource toward demand – No need to change tenant applications

Hypervisor

Hardware

tenant tenant tenant tenant

© 2013 IBM Corporation 43

Hardware Virtualization

■  Hypervisors JVMs can run multiple applications side-by-side safely

■  Advantages – Capture idle CPU cycles – Automatic de-duplication (ability to share Java artifacts) – Ability to meter and shift resource toward demand – No need to change tenant applications

Hypervsisor

Hardware

tenant tenant tenant tenant

Java VM

Operating System

© 2013 IBM Corporation 44

Multitenancy: Low (or no) barrier to entry

■  Multitenancy is all about reducing duplication by transparently sharing a JVM – 1 GC, 1 JIT, shared heap objects – plus: JVM-enforced resource constraints to meter and limit consumption

■  Ergonomics: Opt-in to multitenancy with a single flag: -Xmt (multitenancy) – no application changes required

javad

Tenant1

Tenant2

One copy of common code + data lives in the javad process.

© 2013 IBM Corporation 45

JVM: Separating State

■  Static variables are a problem for sharing

■  Consider use of System.out in code we want to share below

© 2013 IBM Corporation 46

JVM: Separating State

getstatic does 2 things 1.  Triggers class initialization on first contact

–  Danger: Each ‘tenant’ needs to do this 2.  Resolves a name (out) to a storage location and reads from it

–  Danger: Each tenant needs dedicated storage

© 2013 IBM Corporation 47

JVM: Standard Static Fields

getstatic <cpIndex> 0: if constantPool[cpIndex] is resolved then { 1: fetch field address from constantPool 2: read field value from class 3: return field value 4: } else { 5: initialize class 6: determine address of field 7: store field address into constantPool 8: goto 1 9: }

© 2013 IBM Corporation 48

JVM: Isolated Static Fields

getstatic <cpIndex> 0: if constantPool[cpIndex] is resolved then { 1: fetch field address index and class from constantPool 2: fetch tenant->data[class->initIndex] 3: if (!initialized) { 4: initialize class; 5: } 6: read field value from tenant->data[index] 7: return field value 8: } else { 9: initialize class 10: determine address index of field 11: store field address index into constantPool 12: goto 1 13: }

© 2013 IBM Corporation 49

JVM: Separating State

■  There are other initialization triggers too – putstatic – invokestatic – instanceof – new

■  Some of these warrant special handling as they are heavily used (invokestatic) and initialization checks are expensive.

– Separate stack frame build from opcode

© 2013 IBM Corporation 50

More that just JVM State…

■  Throttling of resources – Threads, GC, sockets, files (IO in general), native memory

■  Past and existing examples do exist! – Commercial / In house custom solutions – JSR 181 Isolates / 284 Resource Management

■  Security is of course huge

© 2013 IBM Corporation 51

Other Thoughts – Native Libraries and shared state

■  Use separate processes to manage different state

■  Each process now holds the context

■  Challenges: Latency

Tenant1

Tenant2

JVM

Proxy Library

Shared Library

Shared Library

Tenant1

Tenant2

JVM

Shared Library

Shared State!

■  Native libraries contain state that may not be shareable across tenants

© 2013 IBM Corporation 52

Heap Memory (not Garbage Collection!)

© 2013 IBM Corporation 53

Scaling Garbage Collection

■  Existing Garbage Collection (GC) technology defers expensive costs

Heap

Local Collection

Global Collection

Heap

■  Global approach introduces linear scaling problem (100ms @ 1GB → 10s @ 100GB)

■  Goal: Scale using result-based, incremental (through partitioned heap) garbage collection

■  Target appropriate data sets based on environment factors

■  Key: Region Based Garbage Collection approach

© 2013 IBM Corporation 54

Challenge: Segregate Memory by Differentiators

Heap Varying memory

characteristics and requirements per

application

JVM must dynamically

recognize and adapt given platform

resources

IaaS optimization + sharing (consolidate) SSD Exploitation (memory efficiency) Multi-core scalability (scale up)

Very Large Heap GC (low pause times) Soft Real-Time

■  Objects with common properties •  Locality •  Usage frequency •  Lifetime •  Levels of “Read-only”ness

■  Optimize: Memory characteristics •  Page sharing status •  Memory speed aware •  Flash / SSD exploitation

■  “Results based” incremental operations

•  Productive GC every cycle •  Localized garbage collect

Custom

Placement improves cache performance Allocation efficiency (thoughtput) Reduced size working set (paging)

© 2013 IBM Corporation 55

Memory Hierarchy and Performance

■  Increased core and HW thread counts are not silver bullets to scaling in a memory intensive environment such as Java

■  Pressure from increased cores / HW threads

■  Cache hierarchy does not scale Cache

Hierarchy

BUS

Vs.

Logical Processors

Memory

Memory

Memory

Memory

CPU CPU

CPU CPU

■  Memory speed varies based on execution context

■  Magnitude of variance related to CPU socket count

■  Uneven application performance over time •  Data access can be fast or slow (or both) •  Behavior can be variable run to run due to data placement •  Threads scheduling can be unpredictable •  Same deployments on different hardware cause unpredictable performance shifts

© 2013 IBM Corporation 56

Let’s look at transferring data

Heap

A

B

C

© 2013 IBM Corporation 57

Let’s look at transferring data

Heap

A

B

C

Remote Transfer

© 2013 IBM Corporation 58

Let’s look at transferring data

Heap

A

B

C

Remote Transfer

© 2013 IBM Corporation 59

Let’s look at transferring data

Heap

A

B

C

Remote Transfer

© 2013 IBM Corporation 60

PackedObjects could help…

Heap

A

Remote Transfer

C

B

© 2013 IBM Corporation 61

PackedObjects could help…

Heap

A

B

C

Remote Transfer

A

Packed

C

B

© 2013 IBM Corporation 62

PackedObjects could help…

Heap

A

B

C

Remote Transfer

A

Packed

B

C

A

Packed

C

B

© 2013 IBM Corporation 63

Making the data transfer easier…

Heap

Remote Transfer

© 2013 IBM Corporation 64

Making the data transfer easier…

Heap

Remote Transfer

Specialized Heap Area

© 2013 IBM Corporation 65

Making the data transfer easier…

Heap

B

Remote Transfer

A B

C

A

C

Specialized Heap Area

© 2013 IBM Corporation 66

Making the data transfer easier…

Heap

B

Remote Transfer

A B

C

A

C

Specialized Heap Area

© 2013 IBM Corporation 67

Making the data transfer seamless

Heap

Remote Transfer

Specialized Heap Area

© 2013 IBM Corporation 68

B

C

A

B

C

A

Making the data transfer seamless

Heap

Remote Transfer

B

C

A

Specialized Heap Area

© 2013 IBM Corporation 69

B

C

A

B

C

A

Making the data transfer seamless

Heap

Remote Transfer

B

C

A Packed

Specialized Heap Area

offset target

Packed

offset target

© 2013 IBM Corporation 70

Tip of the Iceberg

■  Just scratching the surface of what is possible

■  GC technology has an effect on what is possible – But this should be invisible to the application and seamless in the language

■  Innovation in the heap space is not limited to GC

© 2013 IBM Corporation 71

Virtualization

© 2013 IBM Corporation 72

Finding your way in The Cloud

■  Constant pressure to reduce cost and deliver more services for a given cost

■  Maximize the available resources (Hardware)

■  CPU and Memory are typically the biggest – Other critical ones as well, including I/O

■  Plenty of solutions – Cloud and Multitenancy just to name a few

■  And so by and large we’re in a Virtualization scenario

© 2013 IBM Corporation 73

The Layers of Liars

■  Your application could end up querying 3 different layers to get a (hopefully) correct picture

Hypervisor

Hardware

OS

API! availableProcessors() [no accuracy / understanding]

Thinks it knows but doesn’t (being lied to)

Knows and controls, but can change its mind

App

© 2013 IBM Corporation 74

The Layers of Liars

Hypervisor

Hardware

App

OS

App

Image source (Wikipedia Commons) CPU: https://en.wikipedia.org/wiki/Central_processing_unit DRAM: http://en.wikipedia.org/wiki/Dynamic_random-access_memory

RAM

CPU

© 2013 IBM Corporation 75

The Layers of Liars

Hypervisor

Hardware

App

OS

App

Image source (Wikipedia Commons) CPU: https://en.wikipedia.org/wiki/Central_processing_unit DRAM: http://en.wikipedia.org/wiki/Dynamic_random-access_memory

RAM

CPU

© 2013 IBM Corporation 76

The Layers of Liars

Hypervisor

Hardware

App

OS

App App

OS

Image source (Wikipedia Commons) CPU: https://en.wikipedia.org/wiki/Central_processing_unit DRAM: http://en.wikipedia.org/wiki/Dynamic_random-access_memory

RAM

CPU

© 2013 IBM Corporation 77

The Layers of Liars

■  You don’t even have a constant picture of the universe

Hypervisor

Hardware

App

OS

App App

OS

Image source (Wikipedia Commons) CPU: https://en.wikipedia.org/wiki/Central_processing_unit DRAM: http://en.wikipedia.org/wiki/Dynamic_random-access_memory

RAM

CPU

© 2013 IBM Corporation 78

Knowing what you don’t know

(What does this even mean anymore?)

© 2013 IBM Corporation 79

Interfaces… more importantly behavior

■  Provide interfaces to the “Layers of liars” to enable middleware “smarts”

■  Runtime reads and reacts (in co-operation with application / other runtimes)

Hypervisor

Hardware

App

OS

Hypervisor

Hardware

App

OS

App App

OS

Adjust –Xmx Adjust –Xgcthreads Adjust thread pool counts …

© 2013 IBM Corporation 80

…And after all that

© 2013 IBM Corporation 81

Tenant

Nimble Reactions to External Events

Hypervisor

Hardware

OS

Application

Tenant

© 2013 IBM Corporation 82

Tenant

Nimble Reactions to External Events

Hypervisor

Hardware

OS

Application

Tenant

© 2013 IBM Corporation 83

Tenant Tenant

Nimble Reactions to External Events

Hypervisor

Hardware

OS

Application App

OS

Tenant

© 2013 IBM Corporation 84

Tenant Tenant Tenant Tenant

Nimble Reactions to External Events

Hypervisor

Hardware

OS

Application App

OS

Tenant

Hypervisor

Hardware

OS

Application

© 2013 IBM Corporation 85

Tenant Tenant Tenant Tenant

Nimble Reactions to External Events

Hypervisor

Hardware

OS

Application App

OS

Tenant

Hypervisor

Hardware

OS

Application

Tenant

RDMA + Packed Objects

© 2013 IBM Corporation 86

Questions?

© 2013 IBM Corporation 87

References

■  Get Products and Technologies: – IBM Java Runtimes and SDKs:

•  https://www.ibm.com/developerworks/java/jdk/ – IBM Monitoring and Diagnostic Tools for Java:

•  https://www.ibm.com/developerworks/java/jdk/tools/

■  Learn: – IBM Java InfoCenter:

•  http://publib.boulder.ibm.com/infocenter/java7sdk/v7r0/index.jsp

■  Discuss: – IBM Java Runtimes and SDKs Forum:

•  http://www.ibm.com/developerworks/forums/forum.jspa?forumID=367&start=0

© 2013 IBM Corporation 88

Copyright and Trademarks

© IBM Corporation 2012. All Rights Reserved.

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., and registered in many jurisdictions worldwide.

Other product and service names might be trademarks of IBM or other companies.

A current list of IBM trademarks is available on the Web – see the IBM “Copyright and trademark information” page at URL: www.ibm.com/legal/copytrade.shtml