Native Interfaces for R Seth Falcon Fred Hutchinson Cancer Research Center 20-21 May, 2010

DESCRIPTION

Presentation on native interfaces for the R programming language given as part of a course in advanced R programming at FHCRC: https://secure.bioconductor.org/SeattleMay10/

TRANSCRIPT

Page 1: Native interfaces for R

Native Interfaces for R

Seth Falcon

Fred Hutchinson Cancer Research Center

20-21 May, 2010

Page 2: Native interfaces for R

Outline

The .Call Native Interface

R types at the C-level

Memory management in R

Self-Study Exercises

Odds and Ends

Resources

Page 3: Native interfaces for R

which as implemented in R prior to R-2.11

    which <- function(x) {
        seq_along(x)[x & !is.na(x)]
    }

This one-line body makes five function calls:

1. is.na

2. !

3. &

4. seq_along

5. [

Page 5: Native interfaces for R

which in C

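    #include <string.h>       /* memcpy */
    #include <Rinternals.h>

    SEXP nid_which(SEXP v) {
        SEXP ans;
        int i, j = 0, len = Rf_length(v), *buf, *tf;
        buf = (int *) R_alloc(len, sizeof(int));   /* scratch space */
        tf = LOGICAL(v);
        for (i = 0; i < len; i++) {
            /* both statements belong to the if; NA is not TRUE, so it is skipped */
            if (tf[i] == TRUE) {
                buf[j] = i + 1;   /* R indices are 1-based */
                j++;
            }
        }
        ans = Rf_allocVector(INTSXP, j);
        memcpy(INTEGER(ans), buf, sizeof(int) * j);
        return ans;
    }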
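The loop in nid_which needs no separate is.na pass: at the C level a logical NA is stored as an integer that is neither TRUE (1) nor FALSE (0), so a comparison against TRUE skips it. The sketch below isolates that loop in plain C so it can be compiled without R. The names which_true and DEMO_NA are hypothetical; DEMO_NA stands in for R's NA_LOGICAL, which is assumed here to be INT_MIN.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stand-in for R's NA_LOGICAL (assumed to be INT_MIN); hypothetical name. */
#define DEMO_NA INT_MIN

/*
 * Write the 1-based positions of the elements equal to 1 (TRUE) into out,
 * which must have room for len ints. Returns the number of positions written.
 */
int which_true(const int *tf, int len, int *out)
{
    int i, j = 0;
    assert(tf != NULL && out != NULL && len >= 0);
    for (i = 0; i < len; i++) {
        if (tf[i] == 1) {      /* 0 (FALSE) and DEMO_NA both fail this test */
            out[j] = i + 1;    /* R indices start at 1 */
            j++;
        }
    }
    return j;
}
```

The caller supplies a buffer as long as the input, the worst case when every element is TRUE, and then copies only the first j entries into the result, which is the same shape as the buf/memcpy pair in nid_which.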

Page 6: Native interfaces for R

which is now 10x faster

system.time(which(v))

R Version   System time (s)   Elapsed time (s)
R-2.10      0.018             0.485
R-2.11      0.001             0.052

v is a logical vector with 10 million elements

Page 7: Native interfaces for R

Making use of existing code

Access algorithms

- RBGL
- RCurl
- Rsamtools

Interface to other systems

- RSQLite
- netcdf
- SJava

Page 8: Native interfaces for R

Review of C

    #include <stdio.h>

    /* arithmetic mean of len ints; len must be > 0 */
    double _nid_avg(int *data, int len) {
        int i;
        double ans = 0.0;
        for (i = 0; i < len; i++) {
            ans += data[i];
        }
        return ans / len;
    }

    int main(void) {
        int ex_data[5] = {1, 2, 3, 4, 5};
        double m = _nid_avg(ex_data, 5);
        printf("%f\n", m);
        return 0;
    }

Page 9: Native interfaces for R

The .Call interface

Page 10: Native interfaces for R

Symbolic Expressions (SEXPs)

- The SEXP is R’s fundamental data type at the C level

- SEXP is short for symbolic expression and is borrowed from Lisp

- History at http://en.wikipedia.org/wiki/S-expression

Page 11: Native interfaces for R

.Call Start to Finish Demo

Demo

Page 12: Native interfaces for R

.Call Dissection

R

    # R/somefile.R
    avg <- function(x) .Call(.nid_avg, x)

    # NAMESPACE
    useDynLib("yourPackage", .nid_avg = nid_avg)

C

    /* src/some.c */
    #include <Rinternals.h>
    SEXP nid_avg(SEXP data) { ... INTEGER(data) ... }

Page 13: Native interfaces for R

.Call C function

    #include <Rinternals.h>

    SEXP nid_avg(SEXP data)
    {
        SEXP ans;
        double v = _nid_avg(INTEGER(data), Rf_length(data));
        PROTECT(ans = Rf_allocVector(REALSXP, 1));
        REAL(ans)[0] = v;
        UNPROTECT(1);
        return ans;
        /* shortcut: return Rf_ScalarReal(v); */
    }
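The wrapper hands INTEGER(data) straight to _nid_avg, so an empty vector or an NA element gets no special treatment. A more defensive averaging kernel might look like the following sketch in plain C. The names demo_avg_checked and DEMO_NA_INT are hypothetical; DEMO_NA_INT stands in for R's NA_INTEGER, assumed here to be INT_MIN, and returning NaN for both cases is a design choice made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stddef.h>

/* Stand-in for R's NA_INTEGER (assumed to be INT_MIN); hypothetical name. */
#define DEMO_NA_INT INT_MIN

/*
 * Mean of len ints. Returns NaN for an empty input or when any element
 * is DEMO_NA_INT, instead of silently averaging the sentinel value.
 */
double demo_avg_checked(const int *data, int len)
{
    int i;
    double sum = 0.0;
    assert(len >= 0);
    if (len == 0)
        return NAN;
    assert(data != NULL);
    for (i = 0; i < len; i++) {
        if (data[i] == DEMO_NA_INT)
            return NAN;
        sum += data[i];
    }
    return sum / len;
}
```

Inside a .Call wrapper the NaN result could be translated to a missing value before it is returned; that step is left out here.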

Page 20: Native interfaces for R

Function Registration

- In NAMESPACE

- In C via R_CallMethodDef (see ShortRead/src/R_init_ShortRead.c for a nice example)

Page 22: Native interfaces for R

Just one thing to learn

- Everything is a SEXP

- But there are many different types of SEXPs

- INTSXP, REALSXP, STRSXP, LGLSXP, VECSXP, and more.

Page 23: Native interfaces for R

What’s a SEXP?

Page 24: Native interfaces for R

Common SEXP subtypes

R function    SEXP subtype   Data accessor
integer()     INTSXP         int *INTEGER(x)
numeric()     REALSXP        double *REAL(x)
logical()     LGLSXP         int *LOGICAL(x)
character()   STRSXP         CHARSXP STRING_ELT(x, i)
list()        VECSXP         SEXP VECTOR_ELT(x, i)
NULL          NILSXP         R_NilValue
externalptr   EXTPTRSXP      SEXP (accessor funcs)

Page 25: Native interfaces for R

STRSXP/CHARSXP Exercise

Try this:

.Internal(inspect(c("abc", "abc", "xyz")))

Page 26: Native interfaces for R

STRSXP/CHARSXP Exercise

    > .Internal(inspect(c("abc", "abc", "xyz")))
    @101d3f280 16 STRSXP g0c3 [] (len=3, tl=1)
      @101cc6278 09 CHARSXP g0c1 [gp=0x20] "abc"
      @101cc6278 09 CHARSXP g0c1 [gp=0x20] "abc"
      @101cc6308 09 CHARSXP g0c1 [gp=0x20] "xyz"

Page 27: Native interfaces for R

character vectors == STRSXPs + CHARSXPs

Page 28: Native interfaces for R

CHARSXPs are special
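In the inspect output above, both "abc" elements are printed with the same address (@101cc6278), which is what one would expect if equal strings share a single CHARSXP. The toy model below sketches that idea, string interning, in plain C. It is not R's implementation: the names demo_intern and DEMO_MAX, the fixed-size table, and the linear search are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX 64                     /* hypothetical table capacity */

static char *demo_strings[DEMO_MAX];    /* one stored copy per distinct string */
static int demo_count = 0;

/*
 * Return the cached copy of s, adding it to the table on first sight.
 * Equal strings always yield the same pointer. Returns NULL if the
 * table is full or memory cannot be allocated.
 */
const char *demo_intern(const char *s)
{
    int i;
    char *copy;
    assert(s != NULL);
    for (i = 0; i < demo_count; i++) {
        if (strcmp(demo_strings[i], s) == 0)
            return demo_strings[i];     /* already cached: share it */
    }
    if (demo_count == DEMO_MAX)
        return NULL;
    copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, s);
    demo_strings[demo_count++] = copy;
    return copy;
}
```

Because equal strings share storage in this model, testing two interned strings for equality reduces to comparing pointers.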

Page 29: Native interfaces for R

STRSXP/CHARSXP API Example

    SEXP s = Rf_allocVector(STRSXP, 5);
    PROTECT(s);
    SET_STRING_ELT(s, 0, mkChar("hello"));
    SET_STRING_ELT(s, 1, mkChar("goodbye"));
    SEXP c = STRING_ELT(s, 0);
    const char *v = CHAR(c);
    UNPROTECT(1);

Page 35: Native interfaces for R

R’s memory model (dramatization)

[Figure: diagram of all SEXPs with regions labeled PROTECT stack, precious list, and reachable; marked as a dramatization]

Page 36: Native interfaces for R

R’s memory model

- R allocates, tracks, and garbage collects SEXPs
- gc is triggered by R functions (any call that allocates may run it)
- SEXPs that are not in use are recycled
- A SEXP is in use if:
    - it is on the protection stack (PROTECT/UNPROTECT)
    - it is in the precious list (R_PreserveObject/R_ReleaseObject)
    - it is reachable from a SEXP in the in-use list

Page 37: Native interfaces for R

Preventing gc of SEXPs with the protection stack

PROTECT(s)     Push s onto the protection stack
UNPROTECT(n)   Pop the top n items off the protection stack
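The stack discipline described above can be sketched with a toy model in plain C. Like the diagram, this is a dramatization and not R's implementation: demo_protect, demo_unprotect, demo_is_protected, and the fixed capacity DEMO_STACK_MAX are hypothetical names, and objects are represented by plain pointers.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_STACK_MAX 128              /* hypothetical capacity */

static const void *demo_stack[DEMO_STACK_MAX];
static int demo_top = 0;                /* number of protected pointers */

/* Push p onto the toy protection stack and return it, like PROTECT(s). */
const void *demo_protect(const void *p)
{
    assert(demo_top < DEMO_STACK_MAX);
    demo_stack[demo_top++] = p;
    return p;
}

/* Pop the top n entries, like UNPROTECT(n). */
void demo_unprotect(int n)
{
    assert(n >= 0 && n <= demo_top);
    demo_top -= n;
}

/* Return 1 if p is currently on the stack, 0 otherwise. */
int demo_is_protected(const void *p)
{
    int i;
    for (i = 0; i < demo_top; i++) {
        if (demo_stack[i] == p)
            return 1;
    }
    return 0;
}
```

Note that the pop takes a count, not a pointer: it always removes the most recently protected items, which is why PROTECT and UNPROTECT calls are kept balanced within a function.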

Page 38: Native interfaces for R

Generic memory: When everything isn’t a SEXP

Allocation function   Lifecycle
R_alloc               Memory is freed when returning from .Call
Calloc/Free           Memory persists until Free is called

Page 39: Native interfaces for R

Case Study: C implementation of which

Code review of nidemo/src/which.c

Page 41: Native interfaces for R

Alphabet frequency of a text file

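[Figure: two bar charts of letter frequency. Panel "Yeast Gene YDL143W": bars for d, c, g, t, a; frequency axis from 0.00 to 0.30. Panel "English Dictionary": bars for r, o, a, i, e; frequency axis from 0.07 to 0.10.]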

Page 42: Native interfaces for R

Self-Study in nidemo package

1. Alphabet Frequency
    - Explore R and C implementations
    - Make C implementation robust
    - Attach names attribute in C
    - Multiple file, matrix return enhancement

2. External pointer example

3. Calling R from C example and enhancement

4. Debugging: a video how-to

Page 44: Native interfaces for R

Odds and Ends

1. Debugging

2. Public API

3. Calling R from C

4. Making sense of the remapped function names

5. Finding C implementations of R functions

6. Using a TAGS file

Page 45: Native interfaces for R

Debugging Native Code

Rprintf("DEBUG: v => %d, x => %s\n", v, x);

    $ R -d gdb
    (gdb) run
    > # now R is running

There are two ways to write error-free programs; only the third one works. – Alan Perlis, Epigrams on Programming

Page 46: Native interfaces for R

R’s Public API

    mkdir R-devel-build; cd R-devel-build
    ~/src/R-devel-src/configure && make
    ## hint: configure --help

    ls include

    R.h     Rdefines.h   Rinterface.h  S.h
    R_ext/  Rembedded.h  Rinternals.h

Details in Writing R Extensions (WRE)

Page 47: Native interfaces for R

Calling R functions from C

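1. Build a pairlist holding the function followed by its arguments (a call object)

2. Call Rf_eval(s, rho), where s is that call and rho is the environment to evaluate it in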

Page 48: Native interfaces for R

Remapped functions

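- Source is unadorned (for example, allocVector)

- Preprocessor adds Rf_ (Rinternals.h maps each name with a #define)

- Object code contains Rf_ (for example, Rf_allocVector)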
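The three points above can be reproduced with a plain #define. The sketch below is self-contained and uses hypothetical names (demo_length, Rf_demo_length) rather than anything from R's headers; it only mimics the pattern.

```c
#include <assert.h>
#include <string.h>

/* What the "header" would contain: the real, prefixed function ... */
int Rf_demo_length(const char *s);

/* ... and a macro that maps the unadorned name onto it. */
#define demo_length Rf_demo_length

/* The definition is written with the unadorned name; the preprocessor
 * rewrites it, so the compiler actually defines Rf_demo_length. */
int demo_length(const char *s)
{
    assert(s != NULL);
    return (int) strlen(s);
}
```

Both spellings reach the same function, but only the prefixed name exists after preprocessing, so that is the name a debugger breakpoint or a linker message would be expected to use.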

Page 49: Native interfaces for R

Finding C implementations

    cd R-devel/src/main

    grep '"any"' names.c
    {"any", do_logic3, 2, ...}

    grep -l do_logic3 *.c
    logic.c
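The names.c entry above pairs an R-level name with a C function and an integer. A toy version of such a lookup table in plain C is sketched below. All names are hypothetical (demo_entry, demo_logic3, demo_lookup), and the idea that one C function serves several R-level names, told apart by the integer code, is an assumption made for illustration, as are the code values themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*demo_fn)(int code, const int *x, int n);

typedef struct {
    const char *name;   /* name used at the "R level" */
    demo_fn fn;         /* C function implementing it */
    int code;           /* selects a variant inside fn */
} demo_entry;

/* code 1: are all elements nonzero?  code 2: is any element nonzero? */
static int demo_logic3(int code, const int *x, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        if (code == 1 && x[i] == 0) return 0;
        if (code == 2 && x[i] != 0) return 1;
    }
    return code == 1;   /* all: no zero found; any: no nonzero found */
}

static const demo_entry demo_names[] = {
    {"all", demo_logic3, 1},
    {"any", demo_logic3, 2},
    {NULL, NULL, 0}     /* sentinel */
};

/* Linear search by name; returns NULL if the name is not in the table. */
const demo_entry *demo_lookup(const char *name)
{
    const demo_entry *e;
    assert(name != NULL);
    for (e = demo_names; e->name != NULL; e++) {
        if (strcmp(e->name, name) == 0)
            return e;
    }
    return NULL;
}
```

With a table of this shape, the two-step search makes sense: the first grep finds the table row and so the C function name, and the second finds the file that defines it.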

Page 50: Native interfaces for R

R CMD rtags

    cd R-devel/src
    R CMD rtags --no-Rd
    # creates index file TAGS

Page 52: Native interfaces for R

Writing R Extensions

RShowDoc("R-exts")

Page 53: Native interfaces for R

Book: The C Programming Language (K & R)

Java in a Nutshell               1264 pages
The C++ Programming Language     1030 pages
The C Programming Language        274 pages

Page 54: Native interfaces for R

Resources

- Read WRE and “K & R”

- Use the sources for R and packages

- R-devel, bioc-devel mailing lists

- Patience