Poetry with R -- Dissecting the code

Post on 27-Jan-2015

DESCRIPTION

ERUG meeting -- December 11, 2012

TRANSCRIPT

introduction()

# Poetry is considered a form of literary art in which
# language is used for its aesthetic and evocative qualities. It
# contains multiple interpretations and therefore resonates
# differently in each reader.
#
# Code is the language used to communicate with computers. It has its
# own rules (syntax) and meaning (semantics). Like literature writers
# or poets, coders also have their own style that includes strategies
# for optimizing the code being read by a computer, and facilitating
# its understanding through visual organization and comments for other
# coders.
#
# Code can speak literature, logic, maths. It contains different
# layers of abstraction and it links them to the physical world of
# processors and memory chips. All these resources can contribute to
# expanding the boundaries of contemporary poetry by using code as a
# new language. Code to speak about life or death, love or hate. Code
# meant to be read, not run.

url("http://code-poems.com")

R version 2.15.2 (2012-10-26) -- "Trick or Treat"

Copyright (C) 2012 The R Foundation for Statistical Computing

ISBN 3-900051-07-0

Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.

You are welcome to redistribute it under certain conditions.

Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.

Type 'contributors()' for more information and

'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or

'help.start()' for an HTML browser interface to help.

Type 'q()' to quit R.

>

> print("Hello World")

[1] "Hello World"

>

Character vector of length 1 (its mode and type, see mode() and typeof(), come with it).
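The annotation can be checked by asking R directly. A minimal sketch using only base functions; the variable name hello is made up for illustration and is not on the slides:

```r
hello <- "Hello World"   # hypothetical name for the string printed above

is.character(hello)      # is it a character vector?
length(hello)            # number of elements (not number of letters)
nchar(hello)             # number of characters in that one element
typeof(hello)            # internal storage type
mode(hello)              # storage mode, the older S-compatible notion
```

Note that length() counts elements of the vector, so a single string has length 1 no matter how long the text is.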

> print

function (x, ...)

UseMethod("print")

<bytecode: 0x0000000010a7a2c8>

<environment: namespace:base>

The <bytecode> line means that the function is byte-compiled rather than interpreted from its source -- thus faster, but not for today…
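For the curious, byte compilation can be tried by hand with the compiler package. This is only a sketch: slow_sum and fast_sum are made-up names, and newer R versions may also byte-compile functions on the fly, so the speed difference is not demonstrated here.

```r
library(compiler)   # part of the standard R distribution

# A deliberately plain function; the name is hypothetical.
slow_sum <- function(n) {
    total <- 0
    for (i in seq_len(n)) total <- total + i
    total
}

fast_sum <- cmpfun(slow_sum)   # explicit byte compilation

fast_sum                       # printing now shows a <bytecode: ...> line
slow_sum(100) == fast_sum(100) # both versions compute the same value
```

The compiled function is a drop-in replacement: same arguments, same result, only the internal representation differs.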

The <environment> line shows the environment where the function is defined. Not for now…
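Without going deep into environments, a short sketch shows how to ask a function where it lives. The helper shout is hypothetical and only there for contrast:

```r
environment(print)                          # prints <environment: namespace:base>
environmentName(environment(print))         # the namespace name as a string
environmentName(environment(stats::lm))     # lm is defined in the stats namespace

# A function typed at the prompt belongs to the workspace instead.
shout <- function(x) toupper(x)             # hypothetical helper
environment(shout)                          # the global environment when run at top level
```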

> formals("print")

$x

$...

> ?print

starting httpd help server ... done

>

formals() gives the argument list; the help page also helps. x is the input object.

print prints its argument and returns it invisibly (via invisible(x)).

It is a generic function which means that new printing methods can be easily added for new classes.

UseMethod("print") makes print a generic function.
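To see what "new printing methods can be easily added" means in practice, here is a minimal sketch. The class name poem and its constructor are invented for this example; only print, structure, cat and invisible are real R functions.

```r
# Constructor for a made-up S3 class "poem": a list with a class attribute.
poem <- function(lines) structure(list(lines = lines), class = "poem")

# A print method for that class. print() dispatches here via UseMethod().
print.poem <- function(x, ...) {
    cat(x$lines, sep = "\n")
    invisible(x)    # follow the convention: return the argument invisibly
}

p <- poem(c("Code meant to be read,", "not run."))
p                   # typing the name auto-prints through print.poem
```

Naming the function generic.class (here print.poem) is all that S3 dispatch needs in an interactive session.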

summary(so_far)

# o R is an interpreted language, but
#   byte compiling is possible (see the
#   compiler package).
# o In the background, everything is
#   about environments (which are
#   similar to lists), but luckily, this
#   is hidden from the average user.
# o Everything is an object -- OO.
# o Objects come in classes.
# o Methods can be defined for classes of objects.

> set.seed(1234)

> x <- runif(10)

> y <- 2 + 5 * x + rnorm(10)

> plot(x, y)

>

Random numbers are important.

runif is r + unif (random uniform), not run + if; rnorm is its normal counterpart.

plot is another generic function.
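Why set.seed() comes first can be shown in a few lines. A small sketch; the names a, b and z are arbitrary:

```r
set.seed(1234)
a <- runif(3)          # three draws from the uniform distribution on (0, 1)

set.seed(1234)
b <- runif(3)          # same seed, so the same three numbers again

identical(a, b)        # the two sequences match
all(a > 0 & a < 1)     # uniform draws stay inside the unit interval

z <- rnorm(3)          # three draws from the standard normal distribution
```

Resetting the seed restarts the random number stream, which is what makes a simulated example reproducible.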

> (n <- cor.test(x, y))

Pearson's product-moment correlation

data: x and y

t = 5.9327, df = 8, p-value = 0.0003487

alternative hypothesis: true correlation is not equal to 0

95 percent confidence interval:

0.6325328 0.9770136

sample estimates:

cor

0.9026646

>

Parentheses around the assignment: print in short.
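What the parentheses change is the visibility of the result, which base R can report through withVisible(). A short sketch; the variable v is just a placeholder:

```r
withVisible(v <- 42)$visible     # a plain assignment returns its value invisibly
withVisible((v <- 42))$visible   # wrapping it in parentheses makes the value visible

(v <- 42)                        # so this line assigns and prints in one go
```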

> class(n)

[1] "htest"

> str(n)

List of 9

$ statistic : Named num 5.93

..- attr(*, "names")= chr "t"

$ parameter : Named int 8

..- attr(*, "names")= chr "df"

$ p.value : num 0.000349

$ estimate : Named num 0.903

..- attr(*, "names")= chr "cor"

$ null.value : Named num 0

..- attr(*, "names")= chr "correlation"

$ alternative: chr "two.sided"

$ method : chr "Pearson's product-moment correlation"

$ data.name : chr "x and y"

$ conf.int : atomic [1:2] 0.633 0.977

..- attr(*, "conf.level")= num 0.95

- attr(*, "class")= chr "htest"

>

It is a list, nothing more.

Attributes are important.

Class is an attribute.
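The three remarks above can be verified on the same object. A sketch that rebuilds n exactly as on the earlier slides and then pokes at it with base functions:

```r
set.seed(1234)
x <- runif(10)
y <- 2 + 5 * x + rnorm(10)
n <- cor.test(x, y)

is.list(n)                        # underneath, the test result is a list
names(attributes(n))              # its attributes include "names" and "class"
n$estimate                        # components are reached like any list element
attr(n$conf.int, "conf.level")    # the confidence level rides along as an attribute
class(unclass(n))                 # strip the class and a plain list remains
```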

> methods(class="htest")

[1] print.htest*

Non-visible functions are asterisked

> print.htest

Error: object 'print.htest' not found

Gimme that damn thing!

> getAnywhere("print.htest")
A single object matching ‘print.htest’ was found
It was found in the following places
  registered S3 method for print from namespace stats
  namespace:stats
with value

function (x, digits = 4, quote = TRUE, prefix = "", ...)
{
    cat("\n")
    cat(strwrap(x$method, prefix = "\t"), sep = "\n")
    cat("\n")
    cat("data: ", x$data.name, "\n")
    out <- character()
    if (!is.null(x$statistic))
        out <- c(out, paste(names(x$statistic), "=", format(round(x$statistic, 4))))
    if (!is.null(x$parameter))
        out <- c(out, paste(names(x$parameter), "=", format(round(x$parameter, 3))))
    if (!is.null(x$p.value)) {
        fp <- format.pval(x$p.value, digits = digits)
        out <- c(out, paste("p-value", if (substr(fp, 1L, 1L) == "<") fp else paste("=", fp)))
    }
    cat(strwrap(paste(out, collapse = ", ")), sep = "\n")
    if (!is.null(x$alternative)) {
        cat("alternative hypothesis: ")
        if (!is.null(x$null.value)) {
            if (length(x$null.value) == 1L) {
                alt.char <- switch(x$alternative, two.sided = "not equal to",
                    less = "less than", greater = "greater than")
                cat("true", names(x$null.value), "is", alt.char,
                    x$null.value, "\n")
            }
            else {
                cat(x$alternative, "\nnull values:\n")
                print(x$null.value, ...)
            }
        }
        else cat(x$alternative, "\n")
    }
    if (!is.null(x$conf.int)) {
        cat(format(100 * attr(x$conf.int, "conf.level")), "percent confidence interval:\n",
            format(c(x$conf.int[1L], x$conf.int[2L])), "\n")
    }
    if (!is.null(x$estimate)) {
        cat("sample estimates:\n")
        print(x$estimate, ...)
    }
    cat("\n")
    invisible(x)
}

<bytecode: 0x0000000010f7a3e0>

<environment: namespace:stats>

>

invisible(x) returns the value invisibly.

Defined in the stats package.
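Besides getAnywhere(), there are two other common ways to reach a non-visible method. A sketch, assuming the method is still registered in stats in your R version; the names f and g are arbitrary:

```r
f <- getS3method("print", "htest")   # look it up by generic and class
g <- stats:::print.htest             # or reach into the namespace with :::

is.function(f)
identical(f, g)                      # both routes should lead to the same function
environmentName(environment(f))      # the namespace it was defined in
```

The triple colon is handy for reading code, though relying on it in your own functions is generally discouraged because unexported objects may change.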

> m <- lm(y ~ x)

>

lm: linear regression.

y ~ x: a formula, let’s get back to this later.
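Until then, a small preview of what a formula is. This is a sketch built on the same simulated data; the name f is arbitrary:

```r
f <- y ~ x            # creating a formula does not need the data yet
class(f)              # a formula is an object with its own class
all.vars(f)           # the variable names it mentions

# The variables are looked up only when the model is fitted.
set.seed(1234)
x <- runif(10)
y <- 2 + 5 * x + rnorm(10)
m <- lm(f)
formula(m)            # the fitted model remembers its formula
```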

> m

Call:

lm(formula = y ~ x)

Coefficients:

(Intercept) x

1.740 4.857

This makes life so much easier,

talk about it next year.

> names(m)

[1] "coefficients" "residuals" "effects" "rank"

[5] "fitted.values" "assign" "qr" "df.residual"

[9] "xlevels" "call" "terms" "model"

> print.lm

function (x, digits = max(3, getOption("digits") - 3), ...)
{
    cat("\nCall:\n", paste(deparse(x$call), sep = "\n", collapse = "\n"),
        "\n\n", sep = "")
    if (length(coef(x))) {
        cat("Coefficients:\n")
        print.default(format(coef(x), digits = digits), print.gap = 2,
            quote = FALSE)
    }
    else cat("No coefficients\n")
    cat("\n")
    invisible(x)
}

<bytecode: 0x0000000010542380>

<environment: namespace:stats>

>

> (s <- summary(m))

Call:

lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-0.7372 -0.4189 -0.2076  0.2832  1.2928

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.7401     0.4539   3.834 0.004990 **
x             4.8571     0.8187   5.933 0.000349 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6751 on 8 degrees of freedom
Multiple R-squared: 0.8148, Adjusted R-squared: 0.7917
F-statistic: 35.2 on 1 and 8 DF, p-value: 0.0003487

>

> coef(m)

(Intercept) x

1.740131 4.857122

> coef(s)

Estimate Std. Error t value Pr(>|t|)

(Intercept) 1.740131 0.4538773 3.833924 0.0049900329

x 4.857122 0.8186989 5.932733 0.0003486669

> class(s)

[1] "summary.lm"
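The same accessor, coef(), returns different structures for the two classes. A sketch that refits the model from the slides and compares the two results:

```r
set.seed(1234)
x <- runif(10)
y <- 2 + 5 * x + rnorm(10)
m <- lm(y ~ x)
s <- summary(m)

is.null(dim(coef(m)))          # for the lm object: a plain named vector
is.matrix(coef(s))             # for the summary.lm object: a coefficient table
coef(s)["x", "Pr(>|t|)"]       # pick one cell by row and column name
coef(s)[, "Estimate"]          # this column repeats the values of coef(m)
```

Indexing the table by name rather than by position keeps the code readable and robust if the column order ever differs.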

> getAnywhere("print.summary.lm")

A single object matching ‘print.summary.lm’ was found

It was found in the following places

registered S3 method for print from namespace stats

namespace:stats

with value

function (x, digits = max(3, getOption("digits") - 3), symbolic.cor = x$symbolic.cor,
    signif.stars = getOption("show.signif.stars"), ...)
{
    cat("\nCall:\n", paste(deparse(x$call), sep = "\n", collapse = "\n"),
        "\n\n", sep = "")
    resid <- x$residuals
    df <- x$df
    rdf <- df[2L]
    cat(if (!is.null(x$weights) && diff(range(x$weights)))

print(summary(presentation))

# o The print method is the simplest.
# o It conveys meaning to the user.
# o Results are usually structured in
#   different ways,
# o need methods to access information:

> methods(class="lm")

[1] add1.lm* alias.lm* anova.lm

[4] case.names.lm* confint.lm* cooks.distance.lm*

[7] deviance.lm* dfbeta.lm* dfbetas.lm*

[10] drop1.lm* dummy.coef.lm* effects.lm*

[13] extractAIC.lm* family.lm* formula.lm*

[16] hatvalues.lm influence.lm* kappa.lm

[19] labels.lm* logLik.lm* model.frame.lm

[22] model.matrix.lm nobs.lm* plot.lm

[25] predict.lm print.lm proj.lm*

[28] qr.lm* residuals.lm rstandard.lm

[31] rstudent.lm simulate.lm* summary.lm

[34] variable.names.lm* vcov.lm*
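A few of the methods listed above in action. This is a sketch on the same simulated data; the new data point x = 0.5 is an arbitrary choice for illustration:

```r
set.seed(1234)
x <- runif(10)
y <- 2 + 5 * x + rnorm(10)
m <- lm(y ~ x)

residuals(m)                                  # observed minus fitted values
fitted(m)                                     # fitted values, one per observation
confint(m)                                    # confidence intervals for the coefficients
p <- predict(m, newdata = data.frame(x = 0.5))  # prediction at a new x value
p
```

Each call goes through a generic and lands on the matching .lm method, so the internal layout of the fitted object never has to be touched.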

> if (questions) {

+ answer(questions)

+ } else q("no")

# ERUG meeting -- December 11, 2012

# Poetry with R -- Dissecting the code

# P. Solymos
