dynamic filtering: multi-purpose architecture support for language runtime systems

Post on 27-May-2015


Dynamic Filtering: Multi-Purpose Architecture Support for Language Runtime Systems

Tim Harris, Sasa Tomic, Adrian Cristal, Osman Unsal

Microsoft Research, BSC-Microsoft Research Center

Old-to-young References (1)

• Observations
  ▫ Most allocated objects will die young
  ▫ Few references from older to younger objects exist

Old-to-young References (2)

• Young Generation
  ▫ Most newly allocated objects are allocated in the young generation
  ▫ The number of objects that survive a minor collection is expected to be low

• Old Generation
  ▫ Objects that are longer-lived
  ▫ Major collections are infrequent

Card Table in Java HotSpot VM (1)

• Eliminating the need to scan the entire old generation
  ▫ The old generation is split into 512-byte chunks called cards
  ▫ The card table is an array with a one-byte entry per card in the heap
  ▫ When a field is updated, the card containing it is marked dirty
  ▫ A minor collection scans only the dirty cards
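
The card-marking scheme above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not HotSpot's code: the heap size, the names old_gen, card_table, mark_card, and scan_dirty_cards, and the clean/dirty encoding are all hypothetical. Only the 512-byte card size and the one-byte-per-card table come from the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CARD_SHIFT 9                      /* 2^9 = 512-byte cards, as on the slide */
#define HEAP_BYTES (64 * 1024)            /* hypothetical old-generation size */
#define NUM_CARDS  (HEAP_BYTES >> CARD_SHIFT)

enum { CARD_CLEAN = 0, CARD_DIRTY = 1 };  /* illustrative encoding, not HotSpot's */

static uint8_t old_gen[HEAP_BYTES];       /* stand-in for the old generation */
static uint8_t card_table[NUM_CARDS];     /* one byte per card */

/* Map an address inside old_gen to the index of the card that contains it. */
static size_t card_index(const void *addr) {
    size_t offset = (size_t)((const uint8_t *)addr - old_gen);
    assert(offset < HEAP_BYTES);
    return offset >> CARD_SHIFT;
}

/* Barrier action: mark the card containing the updated field as dirty. */
static void mark_card(const void *field_addr) {
    card_table[card_index(field_addr)] = CARD_DIRTY;
}

/* Minor collection: visit only the dirty cards, clean them, and return
 * how many cards had to be visited. */
static size_t scan_dirty_cards(void) {
    size_t scanned = 0;
    for (size_t i = 0; i < NUM_CARDS; i++) {
        if (card_table[i] == CARD_DIRTY) {
            /* A real collector would scan the 512 bytes of card i here. */
            card_table[i] = CARD_CLEAN;
            scanned++;
        }
    }
    return scanned;
}
```

Two updates that fall in the same 512-byte card dirty a single table entry, so the scanning cost depends on the number of dirty cards rather than on the number of updates.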

Card Table in Java HotSpot VM (2)

Card Table in Java HotSpot VM (3)

• A write barrier maintains the card table
  ▫ Executed every time a reference field is updated
  ▫ Slows execution slightly
  ▫ Allows for much faster minor collections

Observations

• Many barriers perform checks but no real work
  ▫ Old-to-young references are rare

• Many barriers are self-healing
  ▫ Once an old-to-young reference has been logged, it need not be checked again

Approach

• Accelerate barriers by keeping track of checks that have already been made
• Extend the instruction set with an operation, dyfl
  ▫ Tests whether the barrier’s inputs are already in the set

Original Barrier

• A simple original barrier in pseudo-code

    void writeBarrier(void **addr, void *tgt) {
      if (inOldGen(addr) && inYoungGen(tgt)) { // T1
        log(addr);                             // L1
      }
    }

• The barrier's work is unnecessary when
  ▫ the (addr, tgt) pair has already undergone the full check
  ▫ addr has already been logged
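
To make the redundancy concrete, here is a minimal runnable sketch of the original barrier. Everything outside writeBarrier itself is an assumption made for illustration: the toy heap layout, the range-based inOldGen and inYoungGen tests, the storeRef wrapper, and the fixed-size log. The logging routine is named log_addr rather than log to avoid clashing with the C math library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GEN_WORDS    1024   /* reference slots per generation (hypothetical) */
#define LOG_CAPACITY 256

/* Toy heap: old generation in the first half, young generation in the second. */
static void *heap[2 * GEN_WORDS];
static void **old_slot(size_t i)   { return &heap[i]; }
static void **young_slot(size_t i) { return &heap[GEN_WORDS + i]; }

static bool inOldGen(const void *p) {
    return (uintptr_t)p >= (uintptr_t)&heap[0] &&
           (uintptr_t)p <  (uintptr_t)&heap[GEN_WORDS];
}
static bool inYoungGen(const void *p) {
    return (uintptr_t)p >= (uintptr_t)&heap[GEN_WORDS] &&
           (uintptr_t)p <  (uintptr_t)&heap[2 * GEN_WORDS];
}

/* Log of old-generation slots that may hold old-to-young references. */
static void **remembered[LOG_CAPACITY];
static size_t log_len = 0;

static void log_addr(void **addr) {
    assert(log_len < LOG_CAPACITY);
    remembered[log_len++] = addr;
}

static void writeBarrier(void **addr, void *tgt) {
    if (inOldGen(addr) && inYoungGen(tgt)) {  /* T1 */
        log_addr(addr);                       /* L1 */
    }
}

/* Every reference store runs the barrier first. */
static void storeRef(void **addr, void *tgt) {
    writeBarrier(addr, tgt);
    *addr = tgt;
}
```

Storing two young targets into the same old slot runs T1 twice and logs the slot twice, which is exactly the repeated work the slide calls unnecessary.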

A Dynamic Filter Way

    void writeBarrierDyfl(void **addr, void *tgt) {
      if ((!dyfl_card_pair(addr, tgt, 0x1)) &&      // A1
          (!dyfl_addr(addr, 0x2))) {                // A2
        if (inOldGen(addr) && inYoungGen(tgt)) {    // T1
          dyfl_set_addr(addr, 0x2);                 // S2
          log(addr);                                // L1
        } else {
          dyfl_set_card_pair(addr, tgt, 0x1);       // S1
        }
      }
    }

Test if a full check has already been done on the (addr, tgt) pair

512-byte granularity for spatial locality

Tag: distinguishes this use from other uses

Single-address check

Significant
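
The filtered barrier can be tried out in software with a stand-in for the hardware filter. The sketch below is an emulation under stated assumptions, not the proposed hardware: the direct-mapped table, its hash, the toy heap, the full_checks counter, and the storeRef wrapper are all hypothetical. The table stores complete keys, so a lookup can miss after an eviction but can never report a pair that was not set. Both generations are sized as whole multiples of 512 bytes so that no card spans two generations, which is what makes caching a failed T1 per card pair safe here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GEN_WORDS      1024  /* slots per generation (hypothetical) */
#define CARD_SHIFT     9     /* 512-byte granularity */
#define FILTER_ENTRIES 2048  /* filter size from the evaluation slide */
#define LOG_CAPACITY   256

/* Toy heap: old generation in the first half, young generation in the second. */
static void *heap[2 * GEN_WORDS];
static void **old_slot(size_t i)   { return &heap[i]; }
static void **young_slot(size_t i) { return &heap[GEN_WORDS + i]; }

static bool inOldGen(const void *p) {
    return (uintptr_t)p >= (uintptr_t)&heap[0] &&
           (uintptr_t)p <  (uintptr_t)&heap[GEN_WORDS];
}
static bool inYoungGen(const void *p) {
    return (uintptr_t)p >= (uintptr_t)&heap[GEN_WORDS] &&
           (uintptr_t)p <  (uintptr_t)&heap[2 * GEN_WORDS];
}

/* Software stand-in for the filter: one slot per hash value, full keys kept. */
typedef struct { bool valid; uintptr_t a, b; unsigned tag; } FilterEntry;
static FilterEntry filter[FILTER_ENTRIES];

static FilterEntry *slot_for(uintptr_t a, uintptr_t b, unsigned tag) {
    return &filter[(a * 31u + b * 17u + tag) % FILTER_ENTRIES];
}
static bool filter_test(uintptr_t a, uintptr_t b, unsigned tag) {
    const FilterEntry *e = slot_for(a, b, tag);
    return e->valid && e->a == a && e->b == b && e->tag == tag;
}
static void filter_set(uintptr_t a, uintptr_t b, unsigned tag) {
    *slot_for(a, b, tag) = (FilterEntry){ true, a, b, tag };
}

/* Card number of an address, measured from the start of the toy heap. */
static uintptr_t card_of(const void *p) {
    return ((uintptr_t)p - (uintptr_t)heap) >> CARD_SHIFT;
}

static bool dyfl_card_pair(void **addr, void *tgt, unsigned tag) {
    return filter_test(card_of(addr), card_of(tgt), tag);
}
static void dyfl_set_card_pair(void **addr, void *tgt, unsigned tag) {
    filter_set(card_of(addr), card_of(tgt), tag);
}
static bool dyfl_addr(void **addr, unsigned tag) {
    return filter_test((uintptr_t)addr, 0, tag);
}
static void dyfl_set_addr(void **addr, unsigned tag) {
    filter_set((uintptr_t)addr, 0, tag);
}

static void **remembered[LOG_CAPACITY];
static size_t log_len = 0;
static size_t full_checks = 0;  /* how often the slow path (T1) actually ran */

static void log_addr(void **addr) {
    assert(log_len < LOG_CAPACITY);
    remembered[log_len++] = addr;
}

static void writeBarrierDyfl(void **addr, void *tgt) {
    if (!dyfl_card_pair(addr, tgt, 0x1) &&            /* A1 */
        !dyfl_addr(addr, 0x2)) {                      /* A2 */
        full_checks++;
        if (inOldGen(addr) && inYoungGen(tgt)) {      /* T1 */
            dyfl_set_addr(addr, 0x2);                 /* S2 */
            log_addr(addr);                           /* L1 */
        } else {
            dyfl_set_card_pair(addr, tgt, 0x1);       /* S1 */
        }
    }
}

static void storeRef(void **addr, void *tgt) {
    writeBarrierDyfl(addr, tgt);
    *addr = tgt;
}
```

In this emulation a second store to an already logged slot is skipped by A2, and a second old-to-old store within the same pair of cards is skipped by A1, so full_checks grows only when the filter has no record of the inputs.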

Dynamic Filtering in the ISA

dyfl(i1, i2, mask, tag) Test dynamic filter

dyfl_set(i1, i2, mask, tag) Set dynamic filter

dyfl_clear(i1, i2, mask, tag) Clear specific entry

dyfl_clear(tag) Clear all with tag
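
The four operations can be modelled in software to experiment with their behaviour. The sketch below is an assumption-laden emulation, not the ISA definition: it assumes the mask is ANDed with both inputs before they are stored or compared, it uses a direct-mapped table with an arbitrary hash, and it renames the one-argument form to dyfl_clear_tag because C has no overloading.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DYFL_ENTRIES 2048  /* filter size used in the evaluation */
#define DYFL_TAGS    16    /* number of tags from the design slide */

typedef struct { bool valid; uintptr_t i1, i2; unsigned tag; } DyflEntry;
static DyflEntry dyfl_table[DYFL_ENTRIES];

/* Pick the single slot a masked key may occupy; any hash would do. */
static DyflEntry *dyfl_slot(uintptr_t i1, uintptr_t i2, unsigned tag) {
    uintptr_t h = i1 ^ (i2 * 31u) ^ tag;
    h ^= h >> 9;
    return &dyfl_table[h % DYFL_ENTRIES];
}

static bool dyfl_match(const DyflEntry *e, uintptr_t i1, uintptr_t i2,
                       unsigned tag) {
    return e->valid && e->i1 == i1 && e->i2 == i2 && e->tag == tag;
}

/* Test dynamic filter: true only if this exact masked key was set earlier
 * and has not been evicted or cleared since. */
static bool dyfl(uintptr_t i1, uintptr_t i2, uintptr_t mask, unsigned tag) {
    assert(tag < DYFL_TAGS);
    i1 &= mask;
    i2 &= mask;
    return dyfl_match(dyfl_slot(i1, i2, tag), i1, i2, tag);
}

/* Set dynamic filter: record the masked key, evicting whatever was there. */
static void dyfl_set(uintptr_t i1, uintptr_t i2, uintptr_t mask, unsigned tag) {
    assert(tag < DYFL_TAGS);
    i1 &= mask;
    i2 &= mask;
    *dyfl_slot(i1, i2, tag) = (DyflEntry){ true, i1, i2, tag };
}

/* Clear specific entry, if it is present. */
static void dyfl_clear(uintptr_t i1, uintptr_t i2, uintptr_t mask, unsigned tag) {
    assert(tag < DYFL_TAGS);
    i1 &= mask;
    i2 &= mask;
    DyflEntry *e = dyfl_slot(i1, i2, tag);
    if (dyfl_match(e, i1, i2, tag)) {
        e->valid = false;
    }
}

/* Clear all entries with the given tag. */
static void dyfl_clear_tag(unsigned tag) {
    assert(tag < DYFL_TAGS);
    for (size_t i = 0; i < DYFL_ENTRIES; i++) {
        if (dyfl_table[i].valid && dyfl_table[i].tag == tag) {
            dyfl_table[i].valid = false;
        }
    }
}
```

Under the masking assumption, passing a mask of ~0x1FF makes two addresses in the same 512-byte block hit the same entry, while an all-ones mask gives an exact-address check.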

Implementation Sketch

Design Details

• Tag assignment
  ▫ 16 tags available

• Sharing
  ▫ The tag is implicitly extended to distinguish multiple hardware threads on the same core

• Implementation
  ▫ Independent of the caches
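
One way to picture the implicit tag extension is to prepend a hardware-thread number to the 4-bit software tag. The bit layout and the name effective_tag below are assumptions made purely for illustration; the slides do not specify how the extension is encoded.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS 4                 /* 16 software-visible tags */
#define NUM_TAGS (1u << TAG_BITS)

/* Combine the software tag with the issuing hardware thread's number so that
 * entries set by different threads on one core can never match each other. */
static uint32_t effective_tag(uint32_t hw_thread, uint32_t tag) {
    assert(tag < NUM_TAGS);
    return (hw_thread << TAG_BITS) | tag;
}
```

With this layout, software on each hardware thread keeps using tags 0 to 15, and the same tag value on two threads maps to two distinct filter tags.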

Using Dynamic Filtering

• Garbage Collection
• Transactional Memory
• Language-Based Security

Transactional Memory

• STM with eager updates
  ▫ Dynamic filtering can be used to check whether or not a location has already been accessed in the current transaction

• STM with deferred updates
  ▫ Less applicable, since slow-path work is needed in most cases
  ▫ Track locations that have not been written, which can be accessed directly
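
The eager-update case can be illustrated with a single-threaded toy. This is a sketch under stated assumptions and not the STM used in the talk: it omits all concurrency control, the names tx_begin, tx_write, tx_commit, and tx_abort are hypothetical, and a small software table stands in for the hardware filter. The filter records which locations already have an undo-log entry in the current transaction. A filter miss caused by eviction only adds a duplicate undo record, which is harmless because the log is replayed newest first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_ENTRIES 64
#define UNDO_CAPACITY  128

/* Software stand-in for the filter: one address per slot, NULL means empty. */
static int *seen[FILTER_ENTRIES];

static size_t slot_of(const int *addr) {
    return ((uintptr_t)addr / sizeof(int)) % FILTER_ENTRIES;
}
static bool filter_test(int *addr) { return seen[slot_of(addr)] == addr; }
static void filter_set(int *addr)  { seen[slot_of(addr)] = addr; }
static void filter_clear_all(void) {
    for (size_t i = 0; i < FILTER_ENTRIES; i++) {
        seen[i] = NULL;
    }
}

typedef struct { int *addr; int old_value; } UndoRecord;
static UndoRecord undo[UNDO_CAPACITY];
static size_t undo_len = 0;

static void tx_begin(void) {
    undo_len = 0;
    filter_clear_all();
}

/* Eager update: write in place, saving the old value on first access only. */
static void tx_write(int *addr, int value) {
    if (!filter_test(addr)) {            /* fast path skipped on first access */
        assert(undo_len < UNDO_CAPACITY);
        undo[undo_len++] = (UndoRecord){ addr, *addr };
        filter_set(addr);
    }
    *addr = value;
}

static void tx_commit(void) {
    undo_len = 0;                        /* in-place values become permanent */
    filter_clear_all();
}

static void tx_abort(void) {
    while (undo_len > 0) {               /* roll back, newest record first */
        undo_len--;
        *undo[undo_len].addr = undo[undo_len].old_value;
    }
    filter_clear_all();
}
```

Clearing the filter at every transaction boundary mirrors the clear-all-with-tag operation: entries describe the current transaction only and must not leak into the next one.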

Language-Based Security

• Control Flow Integrity (CFI)
  ▫ Record target-marker associations
  ▫ Record valid source-target address pairs

• XFI
  ▫ Data access check

Evaluation

• Simulator
  ▫ Based on an x86 simulator
  ▫ A single multi-threaded user-mode process
  ▫ Simple timing model: each instruction takes 1 cycle plus the number of cycles spent on memory accesses
  ▫ The dyfl implementation is separate from the caches
  ▫ 2048-entry filter
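
The timing model stated above amounts to a one-line sum. The sketch below spells it out; the Instr type and the total_cycles name are hypothetical, and the per-instruction memory cycles are supplied by the caller because the slides give no latency figures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One executed instruction, described only by its memory-access cost. */
typedef struct {
    uint32_t mem_cycles;  /* cycles spent on memory accesses, 0 if none */
} Instr;

/* Total time = 1 cycle per instruction + cycles spent on memory accesses. */
static uint64_t total_cycles(const Instr *trace, size_t n) {
    assert(trace != NULL || n == 0);
    uint64_t cycles = 0;
    for (size_t i = 0; i < n; i++) {
        cycles += 1u + trace[i].mem_cycles;
    }
    return cycles;
}
```

Under this model a barrier that the filter lets the program skip saves one cycle per avoided instruction plus whatever memory cycles those instructions would have cost.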

Simulator vs. Real Hardware

Generational GC Hit Rates (1)

Generational GC Hit Rates (2)

Generational GC Hit Rates (3)

GC Acceleration Performance

GC Acceleration Performance

STM Performance (1)

STM Performance (2)

Sensitivity

JBBAtomic, GC-STM

JBBAtomic, GC

JBBAtomic, STM

Conclusion

• Dynamic filtering
  ▫ An abstraction for accelerating read/write barriers used by runtime systems
  ▫ Provides a mechanism for testing whether or not a given runtime check has already been made
