R Environment and Variable Lookup

Apr. 2012

Outline

R Environment and Variable Lookup
R Byte-Code Interpreter Variable Cache Mechanism
Unboxed Value Cache Proposal
Others

R Environment Organization

Environment Frames
– Frames are connected in a tree structure

    // One variable binding cell structure
    struct listsxp_struct {
        struct SEXPREC *carval;  // value of the symbol
        struct SEXPREC *cdrval;  // next binding cell
        struct SEXPREC *tagval;  // symbol
    };

    // One frame structure
    struct envsxp_struct {
        struct SEXPREC *frame;
        struct SEXPREC *enclos;   // parent
        struct SEXPREC *hashtab;  // optional
    };

[Figure: Frames A, B, and C linked by enclos pointers. A frame's "frame" field points to a list of variable binding cells (tag = SEXP symbol, car = SEXP value, cdr = next cell, ending in R_NilValue); its "hashtab" field points to an optional hashtable.]

R Environment Organization (2)

Hashtable
– Implemented as a VECSXP structure
  • A vector in which each element is an R SEXP object (listsxp_struct)
– Calculating the bucket number
  • Hash(symbol) & hashTableMask

[Figure: the hashtable is a VECSXP object with buckets 0 to 3. Each bucket points to a chain of variable binding cells (tag = SEXP symbol, car = SEXP value, cdr = next cell) ending in R_NilValue.]

R Environment – Variable Lookup

Steps
– Get the environment frame
  • From the current execution frame
  • Or from the recursive lookup (the enclosing frame)
– Check whether the frame has a hashtable
  • No:
    – Start from the first binding cell, search the list, and compare symbols
  • Yes:
    – Calculate the hash bucket number
    – Get the first binding cell of that bucket, search the list, and compare symbols
– Not found: return R_NilValue
– Found: return the binding cell
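The lookup steps above can be sketched in C. This is a simplified model, not R's source: the `Cell` and `Frame` types, the `hash_symbol` function, and the use of C strings for symbols are assumptions made for illustration (R compares symbol pointers instead of strings), and `NULL` plays the role of R_NilValue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a binding cell (hypothetical, not R's SEXPREC). */
typedef struct Cell {
    const char *tag;      /* symbol name */
    double car;           /* value (a double here, for brevity) */
    struct Cell *cdr;     /* next binding cell; NULL plays R_NilValue */
} Cell;

/* Simplified stand-in for an environment frame. */
typedef struct Frame {
    Cell *frame;          /* list of cells, used when there is no hashtable */
    struct Frame *enclos; /* parent frame */
    Cell **hashtab;       /* optional array of bucket chains, or NULL */
    unsigned mask;        /* bucket count - 1 (count must be a power of 2) */
} Frame;

/* Hypothetical string hash; R's real hash function differs. */
unsigned hash_symbol(const char *s) {
    unsigned h = 0;
    while (*s) h = h * 31u + (unsigned char)*s++;
    return h;
}

/* Walk one chain of cells and compare symbols. */
Cell *search_chain(Cell *c, const char *sym) {
    for (; c != NULL; c = c->cdr)
        if (strcmp(c->tag, sym) == 0) return c;
    return NULL;
}

/* Look a symbol up in a single frame: list search or bucket search. */
Cell *find_in_frame(const Frame *f, const char *sym) {
    assert(f != NULL && sym != NULL);
    if (f->hashtab == NULL)
        return search_chain(f->frame, sym);
    return search_chain(f->hashtab[hash_symbol(sym) & f->mask], sym);
}

/* Recursive lookup: try the frame, then each enclosing frame in turn. */
Cell *find_var(const Frame *f, const char *sym) {
    for (; f != NULL; f = f->enclos) {
        Cell *c = find_in_frame(f, sym);
        if (c != NULL) return c;
    }
    return NULL;
}
```

Calling `find_var` on a hashed child frame whose parent is a plain list frame exercises both search paths: the child is probed through its bucket, and a miss continues into the parent's list.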

R Byte-Code Symbol

In R byte-code, each symbol has an index

A simple optimization
– Use the index value to look the symbol up directly

    run <- function() {
        b <- a + 202
        print(b)
    }

Byte-code compiling produces the following.

Instructions

    GETVAR 1
    LDCONST 2
    ADD 3
    SETVAR 4
    POP
    GETFUN 5
    MAKEPROM 6
    CALL 7
    RETURN

Constant table

    Idx  Value
    1    a
    2    202
    3    a+202
    4    b
    5    print
    6    list(.Code, list(7L, GETVAR.OP, 0L, RETURN.OP), list(b))
    7    print(b)

R byte-code Interpreter Variable Cache

A cache that stores the “bindings”
– Not the actual values

Cache size: 128
– “More than 90% of the closures in base have constant pools with fewer than 128 entries when compiled”

Wasted cache space
– “On average about 1/3 of constant pool entries are symbols”, so with direct indexing most cache slots are never used
– Optimization: re-order the constant table (not implemented)

[Figure: the cache as an array with indices 0, 1, 2, 3; each entry points to a variable binding cell (tag = SEXP symbol, car = SEXP value).]

R byte-code Interpreter Variable Cache (2)

Cache Storage
– On the stack (by default)

[Figure: stack layout with labels "Stack top", "Current frame", and "Previous frame". Each frame holds "One Var" slots; the 128 cache entries point to variable binding cells.]

R byte-code Interpreter Variable Cache (3)

The reason to cache the binding cell rather than the actual value
– Child frames can modify the value, and the change is visible through the cached cell

[Figure: Frames A, AA, and AAA linked by enclos pointers. AA's frame field points to a variable binding cell with tag = symbol a and car = value 5; the cache slots in AA's stack frame point to variable binding cells.]

Both cases assume that AA's frame encloses AAA's frame (AAA's enclos points to AA).

Case 1: AAA assigns to “a”

    AA <- function() {
        a <- 5
        AAA()
        print(a)
    }

    AAA <- function() {
        a <<- 100
    }

– In AAA: set the binding cell's value to 100
– Back in AA: the value read through the cache is the right value

Case 2: AAA removes “a”

    AAA <- function() {
        …  # remove the parent frame's variable “a”
    }

– In AAA: set the binding cell's value to the “unbound” value
– Back in AA: the value read through the cache is the unbound value, so AA tries to look up “a” in its parent frame
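The two cases can be modeled in a few lines of C. This is a minimal sketch under stated assumptions: the `Cell` and `Cache` types and all function names are hypothetical, values are plain integers, and an `unbound` flag stands in for R's unbound value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical binding cell: one symbol bound to one integer value. */
typedef struct Cell {
    const char *tag;
    int car;
    int unbound;   /* 1 when the binding has been removed */
} Cell;

/* AA's cache stores a pointer to the cell, not a copy of the value. */
typedef struct Cache {
    Cell *cell;
} Cache;

/* Case 1: the child frame assigns through the shared cell (a <<- 100). */
void child_assign(Cell *c, int v) {
    c->car = v;
    c->unbound = 0;
}

/* Case 2: the child frame removes the variable; the cell becomes unbound. */
void child_remove(Cell *c) {
    c->unbound = 1;
}

/* Read through the cache. Returns 1 and stores the value on a hit.
   Returns 0 when there is no cell or the cell is unbound, in which
   case the caller must fall back to a lookup in the enclosing frames. */
int cached_get(const Cache *cache, int *out) {
    assert(cache != NULL && out != NULL);
    if (cache->cell == NULL || cache->cell->unbound) return 0;
    *out = cache->cell->car;
    return 1;
}
```

Because the cache holds the cell's address, `child_assign` needs no knowledge of AA's cache for AA to see 100 afterwards. Had the cache held a copy of the value, AA would still read 5.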

R byte-code Interpreter Variable Cache (4)

Cache target: only variables defined in the current frame
– Not variables found in a parent frame
– Reason: the same variable may later be defined in a frame in between, which must then take precedence

Example: suppose AAA caches its grandparent A's “a”

[Figure: Frames A, AA, AAA, and AAAA linked by enclos pointers. Frame A holds the binding cell a = 5; frame AA holds the binding cell a = 100; the cache slots in AAA's stack frame point to variable binding cells.]

    A <- function() {
        a <- 5
        AA()
    }

    AA <- function() {
        AAA()
    }

    AAA <- function() {
        b <- a      # cache a
        AAAA()
        print(a)    # use a
    }

    AAAA <- function() {
        ...         # define “a” in AA's frame
    }

AAA caches A's binding at the first use of “a”; AAAA defines “a” in AA's frame later. The second use of “a” should resolve to the one in AA's frame, so using the cached binding would give incorrect semantics.

R byte-code Interpreter Variable Cache Steps

Used in SETVAR, GETVAR, and similar instructions

Two modes
– SmallCache: constant table size <= 128
  • Use the symbol index as a direct reference
  • Get the binding cell
– Normal: constant table size > 128
  • Symbol index % 128 gives the reference number
  • Get the binding cell
  • Compare the binding cell's symbol

Cache initial value
– R_NilValue
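The two indexing modes can be sketched in C as follows. This is an illustrative model only: `VarCache`, `cache_get`, and `cache_put` are hypothetical names, a symbol is identified by its constant-table index (assumed 0-based) rather than by an R symbol pointer, and `NULL` plays the role of R_NilValue.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SIZE 128   /* cache size stated in the slides */

/* Hypothetical binding cell; the symbol is represented by its
   constant-table index instead of an R symbol pointer. */
typedef struct Cell {
    int sym_index;
    double car;
} Cell;

typedef struct VarCache {
    Cell *slot[CACHE_SIZE];  /* NULL plays the role of R_NilValue */
    int small;               /* 1 if the constant table has <= 128 entries */
} VarCache;

/* All slots start out empty; the mode depends on the table size. */
void cache_init(VarCache *vc, int const_table_size) {
    for (int i = 0; i < CACHE_SIZE; i++) vc->slot[i] = NULL;
    vc->small = (const_table_size <= CACHE_SIZE);
}

/* Return the cached cell for a symbol index, or NULL on a miss. */
Cell *cache_get(const VarCache *vc, int sym_index) {
    assert(sym_index >= 0);
    if (vc->small) {
        assert(sym_index < CACHE_SIZE);
        return vc->slot[sym_index];          /* direct reference, no check */
    }
    Cell *c = vc->slot[sym_index % CACHE_SIZE];
    /* Two symbols can share a slot, so the symbol must be compared. */
    return (c != NULL && c->sym_index == sym_index) ? c : NULL;
}

/* Store a cell in the slot that belongs to the symbol index. */
void cache_put(VarCache *vc, int sym_index, Cell *c) {
    assert(sym_index >= 0);
    assert(!vc->small || sym_index < CACHE_SIZE);
    vc->slot[vc->small ? sym_index : sym_index % CACHE_SIZE] = c;
}
```

In Normal mode, indices 5 and 133 map to the same slot, which is why the symbol comparison is needed there and can be skipped in SmallCache mode.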

R byte-code Interpreter Variable Cache Steps (2)

SETVAR
– Finding Cell step
  • SmallCache
    – Get the binding cell directly by symbol index; this may return R_NilValue
  • Normal
    – Get the binding cell by symbol index and compare its symbol
      » If the cell has the right symbol and its value is not unbound, return the cell
      » Otherwise, use the base method to find the variable in the local frame
      » If a new cell is found in the current frame, update the local cache and return the new cell
      » If none is found, and the cell from the first step is not null but holds the unbound value, clear the local cache entry (the variable has been removed completely, so there is nothing to cache)
– Setting Value step
  • Use the cell to update the value directly
  • If the cell is R_NilValue, use the base method to define the variable in the local frame

R byte-code Interpreter Variable Cache Steps (3)

SETVAR cache update in Normal mode
– First SETVAR:
  • Finding Cell step: no valid binding cell
  • Setting Value step: use the base method to define the variable
– Second SETVAR:
  • Finding Cell step: find the cell and update the cache
  • Setting Value step: use the cell to update the value directly

SETVAR in SmallCache mode (pure SETVAR)
– First SETVAR:
  • Finding Cell step: no valid binding cell
  • Setting Value step: use the base method to define the variable
– Second SETVAR:
  • Finding Cell step: still no valid binding cell, because this mode only uses the direct lookup
    – The cache is still not updated
  • Setting Value step: use the base method to define the variable

R byte-code Interpreter Variable Cache Steps (4)

GETVAR
– SmallCache mode
  • Look the cell up directly
    – Invalid cell: go to Normal mode
    – Valid cell
      » Check the value type; may return the value directly or force a promise
– Normal mode
  • Get the binding cell by symbol index and compare its symbol
    – If the cell has the right symbol and its value is not unbound, return the cell
    – Otherwise, use the base method to find the variable in the local frame
    – If a new cell is found in the current frame, update the local cache and return the new cell
    – If none is found, and the cell from the first step is not null but holds the unbound value, clear the local cache entry (the variable has been removed completely, so there is nothing to cache)
  • Use the returned cell to get the value
    – May return a valid value or raise an error

R byte-code Interpreter Variable Cache Steps (5)

GETVAR in SmallCache mode after a SETVAR
– SETVAR: does not update the cache at all
– First GETVAR:
  • Goes to Normal mode, which updates the cache
  • Returns the value
– Second GETVAR:
  • Returns the value using the cache

    run <- function() {
        a <- 101
        b <- a    # get var first time
        b <- a    # get var second time
        b <- a    # get var third time
    }

    PC  STMT          Note
    1   LDCONST, 1
    3   SETVAR, 2     Set a
    5   POP
    6   GETVAR, 2     Get a: Normal mode, updates the cache
    8   SETVAR, 3
    10  POP
    11  GETVAR, 2     Get a: from the cache
    13  SETVAR, 3
    15  POP
    16  GETVAR, 2     Get a: from the cache
    18  SETVAR, 3
    20  INVISIBLE
    21  RETURN

R Byte-Code Interpreter Variable Cache Mechanism

Others
– There is some additional code to handle
  • Unbound values
  • Forcing a promise if the symbol's value is a promise
  • Missing value handling

Some conclusions
– The cache mechanism is correct
– But it is very complex, due to R's complex semantics
– Optimizing the cache mechanism is possible
  • E.g. caching a parent frame's variables
– But doing so would be very complex

Unboxed Value Cache Proposal

Basic Assumptions
– Do not change the current cache mechanism
– Add a separate cache only for unboxed values
  • Something like a local register file
– Rules should be simple

Basic Logic
– GETVAR: get the variable through the byte-code interpreter logic, unbox it, and populate the cache
– SETVAR: only update the register file
– Context change
  • Box the value and write it back using the byte-code interpreter logic
  • Context changes: function call, return, …

Basic Cache Design

One Cache and One Cache State
– The cache stores only the value, not the binding cell
  • Each cell is 64 bits wide and stores an unboxed Real, Int, or Logical
– Each Cache State
  • Not valid: no value available; it must be fetched and unboxed
  • Valid: an unboxed version is stored in the cache
  • Modified: the value in the cache has been modified and must be written back later
– The Cache State also stores the type of the value
– Global cache counter
  • NumModified: how many cache cells hold modified values
  • If > 0, a write-back is needed at a context change

[Figure: four Cache cells, each paired with its own Cache State.]

Code Transformation For The Cache

The current sequence (example)

    PC  STMT
    ...
    18  GETVAR, 2
    20  GUARD, 2, 20
    23  UNBOXREAL

– It is hard to populate the cache from this sequence

The instructions need to be combined
– GETVAR + UNBOXREAL

About GUARD
– Without a context change, no additional guard is needed

Define a new instruction to replace the sequence
– GETUNBOXREAL

Logic of GETUNBOXREAL

GETUNBOXREAL
– Check the cache's state
– Valid/Modified:
  • Return the unboxed value directly
– Not valid (first time or context changed)
  • Get the variable first
  • Execute the guard logic
    – May fall back to the un-optimized code
  • On success
    – Populate the cache with the unboxed value, set the Valid state, and return the value

Also define SETUNBOX
– If the value on top of the stack is unboxed, use SETUNBOX to replace SETVAR
– The shape of the stack is known at compile time
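The proposed GETUNBOXREAL and SETUNBOX logic might look roughly like the C sketch below. It is only an interpretation of the proposal: the `Boxed` and `UnboxCache` types, the four-entry cache, and both function names are assumptions, only reals are modeled, and the `var` argument stands for the result of the normal GETVAR lookup.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NOT_VALID, VALID, MODIFIED } CacheState;

/* Hypothetical boxed value: only reals are modeled here. */
typedef struct Boxed {
    int is_real;     /* the guard checks this flag */
    double real;
} Boxed;

typedef struct UnboxCache {
    double value[4];       /* unboxed values, one per variable index */
    CacheState state[4];
    int num_modified;      /* global counter: entries needing write-back */
} UnboxCache;

/* GETUNBOXREAL sketch: returns 1 and stores the unboxed value in *out,
   or returns 0 when the guard fails and un-optimized code must run. */
int get_unbox_real(UnboxCache *uc, int idx, const Boxed *var, double *out) {
    assert(idx >= 0 && idx < 4);
    if (uc->state[idx] != NOT_VALID) {            /* Valid or Modified */
        *out = uc->value[idx];
        return 1;
    }
    if (var == NULL || !var->is_real) return 0;   /* guard failed */
    uc->value[idx] = var->real;                   /* populate the cache */
    uc->state[idx] = VALID;
    *out = var->real;
    return 1;
}

/* SETUNBOX sketch: only the register-like cache is updated. The counter
   is bumped once per entry, when it first becomes Modified. */
void set_unbox(UnboxCache *uc, int idx, double v) {
    assert(idx >= 0 && idx < 4);
    if (uc->state[idx] != MODIFIED) uc->num_modified++;
    uc->value[idx] = v;
    uc->state[idx] = MODIFIED;
}
```

After a `set_unbox`, a later `get_unbox_real` on the same index returns the new value without touching the environment at all, which is the intended saving.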

Write-Back Policy

On a context change
– Function call, return
– Check the global state NumModified
  • = 0: no action
  • > 0: iterate over the cache
    – Use the index to look up the symbol
    – Box the value according to its type
    – Set the variable back
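A write-back pass along these lines could be sketched in C as below. The types, the four-entry cache, and the `write_back` function are hypothetical; `vars[i]` stands for the variable that cache index i maps to. Leaving a written-back entry in the Valid state is an assumption of this sketch, and whether entries must also be invalidated at a context change is left out.

```c
#include <assert.h>
#include <stddef.h>

#define NCACHE 4

typedef enum { NOT_VALID, VALID, MODIFIED } CacheState;
typedef enum { T_REAL, T_INT, T_LOGICAL } ValType;

/* A 64-bit cache cell holding an unboxed real, int, or logical. */
typedef union Raw {
    double real;
    long long i;
} Raw;

/* Hypothetical boxed variable living in the environment frame. */
typedef struct Boxed {
    ValType type;
    Raw payload;
} Boxed;

typedef struct UnboxCache {
    Raw value[NCACHE];
    CacheState state[NCACHE];
    ValType type[NCACHE];     /* the state also records the value type */
    int num_modified;         /* global counter of Modified entries */
} UnboxCache;

/* Called at a context change (function call, return, ...).
   Returns the number of values written back. */
int write_back(UnboxCache *uc, Boxed *vars[NCACHE]) {
    int written = 0;
    if (uc->num_modified == 0) return 0;      /* = 0: no action */
    for (int i = 0; i < NCACHE; i++) {
        if (uc->state[i] != MODIFIED) continue;
        assert(vars[i] != NULL);
        vars[i]->type = uc->type[i];          /* box according to the type */
        vars[i]->payload = uc->value[i];      /* set the variable back */
        uc->state[i] = VALID;                 /* assumption: stays cached */
        written++;
    }
    uc->num_modified = 0;
    return written;
}
```

The global counter makes the common case cheap: when nothing was modified, a context change costs a single comparison.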

Some Findings in the Latest R-2.15.0

R-2.15.0
– Released Mar. 2012
– Many function-level improvements
– No changes found in the R interpreter, byte-code interpreter, or runtime
  • Very rough performance evaluation: no big changes in micro-tests
  • Our current working version is R-2.14.1

Another finding
– Starting with R-2.14.0, there is a package called “parallel”
– A high-level parallel wrapper for some coarse-grained computation tasks
  • R-2.15.0: some new APIs, in something like a map/reduce style