introduction to r (& rstudio)mar 01, 2016  · introduction to r (& rstudio) fall r workshop...

55
Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016

Upload: others

Post on 29-Jul-2020

16 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Introduction to R (& Rstudio)

Fall R Workshop August 23-24, 2016

Page 2: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Why R? •  FREE •  Open source

–  Constantly updating the functions is has –  Constantly adding new functions

•  Learning R will help you learn other programming languages –  e.g., Matlab, Python, jullia

•  Flexible –  Can do anything you want it to do

•  Free

Page 3: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

SETTING UP R (RSTUDIO ENVIRONMENT)

Page 4: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

What is R?

•  R is a language program and environment for statistical learning and graphics

Page 5: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Installing R & RStudio

•  Install R First http://www.r-project.org/

•  Install Rstudio http://www.rstudio.com/products/rstudio/

Page 6: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Getting Oriented with RStudio

Page 7: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Overview

•  Today: – Objects (Nouns) – Functions (Verbs) – Packages and Files

•  Tomorrow* – Functions (Review+) – Hands on Exercise

*Psych students only.

Page 8: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

OBJECTS

Page 9: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Object

•  A basic concept in (statistical) programming is called an object.

•  An object allows you to store a value (e.g. 4) or a thing (e.g. a function or description) in R. You use the object’s name to easily access this value or thing.

Page 10: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Object

> pi

[1] 3.141593

Page 11: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

There are two ways of assigning objects

•  Myobject = thingtoassign

•  my_object <- thingtoassign

Page 12: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Exercise

•  Try creating your own object •  In the console, assign the number 42 to

‘myvariable’ •  Then type ‘myvariable’ in the console to

print out the assigned value

Look! Your environment has stored your object!

Page 13: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

REMOVE POST-ITS!!

(and save!)

Page 14: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Exercise

•  Now let’s try adding together two objects •  Assign the value of 5 to a object called

apples. •  Assign the value of 2 to a object called

oranges. •  Give the command apples + oranges

and find out how much fruit you have!

Page 15: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Exercise

•  Now let’s try adding together two objects •  5 apples, 2 oranges, how much fruit in

your basket?

> apples <- 5 > oranges <- 2 > apples + oranges [1] 7

> fruit <- apples + oranges > fruit [1] 7

_

Page 16: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

install.packages("babynames")install.packages("ggplot2")

library(babynames)library(ggplot2)

MyName <- "Sara"birthday <- 1990MySex <- "F"

data("babynames")colnames(babynames)myName.df <- subset(babynames, name == MyName)

ggplot(myName.df, aes(x = year, y = prop, color=sex)) + geom_line() + geom_point(aes(x = birthday,

y = myName.df[myName.df$name == MyName & myName.df$year == birthday & myName.df$sex == MySex, "prop"]),

color="black") + ggtitle(paste("Popularity of", MyName))

_

Page 17: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

REMOVE POST-ITS!!

(and save!)

Page 18: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Basic Object Classes •  Numeric – Decimals

•  Integer – Natural numbers

•  Character – Text or String Characters – Always inside quotations. – Factors (or categories)

•  Logical – Boolean values (True or False) – No quotations. –  2 possible values: TRUE or FALSE

Page 19: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Basic Data Class

•  To check what data class your object is, you can use the class() function > class(fruit) [1] “numeric”

Page 20: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Exercise

Assign values to the different objects •  Name (character) •  Age (numeric) •  Height (numeric) •  Psychologist (logical)

•  Check the class of each.

Page 21: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Exercise Output

> name <- “sara”

> age <- 25

> height <- 5.58

> psych <- TRUE

> class(name)

“character”

Page 22: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

install.packages("babynames")install.packages("ggplot2")

library(babynames)library(ggplot2)

MyName <- "Sara"birthday <- 1990MySex <- "F"

data("babynames")colnames(babynames)myName.df <- subset(babynames, name == MyName)

ggplot(myName.df, aes(x = year, y = prop, color=sex)) + geom_line() + geom_point(aes(x = birthday,

y = myName.df[myName.df$name == MyName & myName.df$year == birthday & myName.df$sex == MySex, "prop"]),

color="black") + ggtitle(paste("Popularity of", MyName))

Page 23: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

REMOVE POST-ITS!!

(and save!)

Page 24: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Vectors

•  Group of objects

•  c() function to create a vector

•  c() is combine or concatenate. •  Note: when you combine objects, the new

objects will have the class of the least specific object.

Page 25: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Exercise

•  5 Volunteers

Page 26: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Factors

•  Character objects – Character strings represent distinct groups.

•  Control, Treatment •  Male, Female

•  Functions to create – factor() – as.factor().

•  table() to list counts of each category

Page 27: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Example

•  Make a factor

Page 28: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing

•  Having a group of objects is great, but sometimes you only want one or a few of those objects.

Page 29: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing

•  To index a vector – []

Page 30: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing

•  To index a vector – []

•  For example > height[3]

[1] 5.2

Page 31: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing

•  You can select multiple objects. •  To select many in a row:

> height[3:5]

[1] 5.2 4.9 5.3

•  To select many not in a row: > height[c(1,2,5)]

[1] 5.2 6.5 5.3

Page 32: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing

•  You can go crazy and combine these! > height[c(2:3, 5)]

[1] 5.2 5.3 7.0

Page 33: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Exercise •  Create a new object called gender and

assign to it the genders of 5 people in this room. – Make sure you use c() to group the values

together. – Make sure you put “” around each value.

•  Tell R that gender is actually a factor.

•  Use table to find out how many men and how many women are in your group.

Page 34: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

REMOVE POST-ITS!!

(and save!)

Page 35: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Data Frames

•  Used for storing data by combining vectors of equal length.

Page 36: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Data Frames Name Gender Age Extraversion Performance

Lucas M 18 3.7 88

Doreen F 23 2.1 91

Will M 19 2.9 76

Erin F 24 3.1 85

Amy F 17 1.8 96

Bob M 22 3.2 89

Page 37: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Data Frames Name Gender Age Extraversion Performance

Lucas M 18 3.7 88

Doreen F 23 2.1 91

Will M 19 2.9 76

Erin F 24 3.1 85

Amy F 17 1.8 96

Bob M 22 3.2 89

Page 38: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Data Frames Name Gender Age Extraversion Performance

Lucas M 18 3.7 88

Doreen F 23 2.1 91

Will M 19 2.9 76

Erin F 24 3.1 85

Amy F 17 1.8 96

Bob M 22 3.2 89

Page 39: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Data Frames Name Gender Age Extraversion Performance

Lucas M 18 3.7 88

Doreen F 23 2.1 91

Will M 19 2.9 76

Erin F 24 3.1 85

Amy F 17 1.8 96

Bob M 22 3.2 89

Page 40: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Data Frames Name Gender Age Extraversion Performance

Lucas M 18 3.7 88

Doreen F 23 2.1 91

Will M 19 2.9 76

Erin F 24 3.1 85

Amy F 17 1.8 96

Bob M 22 3.2 89

Page 41: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Data Frames Name Gender Age Extraversion Performance

Lucas M 18 3.7 88

Doreen F 23 2.1 91

Will M 19 2.9 76

Erin F 24 3.1 85

Amy F 17 1.8 96

Bob M 22 3.2 89

Page 42: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Data Frames

•  Used for storing data tables by combining vectors and factors of equal length

•  data.frame()

Page 43: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Exercise

•  Class data frame

Page 44: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

install.packages("babynames")install.packages("ggplot2")

library(babynames)library(ggplot2)

MyName <- "Sara"birthday <- 1990MySex <- "F"

data("babynames")colnames(babynames)myName.df <- subset(babynames, name == MyName)

ggplot(myName.df, aes(x = year, y = prop, color=sex)) + geom_line() + geom_point(aes(x = birthday,

y = myName.df[myName.df$name == MyName & myName.df$year == birthday & myName.df$sex == MySex, "prop"]),

color="black") + ggtitle(paste("Popularity of", MyName))

Page 45: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

REMOVE POST-ITS!!

(and save!)

Page 46: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing (again)

•  Data frames can be indexed just like vectors

•  EXCEPT: data frames have 2 dimensions

Page 47: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing (again)

> dataset[1:5,4]

Age

1 18

2 18

3 19

4 18

5 21

Page 48: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing (again)

•  dataset[rows, columns]

Page 49: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing (again)

•  dataset[rows, columns] •  If you want all the columns, leave the

second part blank (and vice versa) > dataset[1:5,]

ID Name Sex Age

1 105 Jon M 18

2 106 Deanna F 18

3 107 Sara F 19

4 108 Debbie F 18

5 109 Randy M 21

Page 50: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing (again)

•  But your variables have names. Use those!

> dataset$Age

[1] 18 19 18 18 21

Page 51: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing (again)

•  But your variables have names. Use those!

> dataset$Age

[1] 18 19 18 18 21

> dataset[,“Age”]

[1] 18 19 18 18 21

Page 52: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Indexing (again)

•  But your variables have names. Use those!

> dataset[,c(“Age”,“Performance”)]

Age Performance

1 18 88

2 18 91

3 19 76

4 18 85

5 21 96

Page 53: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

install.packages("babynames")install.packages("ggplot2")

library(babynames)library(ggplot2)

MyName <- "Sara"birthday <- 1990MySex <- "F"

data("babynames")colnames(babynames)myName.df <- subset(babynames, name == MyName)

ggplot(myName.df, aes(x = year, y = prop, color=sex)) + geom_line() + geom_point(aes(x = birthday,

y = myName.df[myName.df$name == MyName & myName.df$year == birthday & myName.df$sex == MySex, "prop"]),

color="black") + ggtitle(paste("Popularity of", MyName))

Page 54: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Lists & Matrices

•  Lists: generic vector containing objects – list()

•  Matrices: mathematical matrix – matrix()

Page 55: Introduction to R (& Rstudio)Mar 01, 2016  · Introduction to R (& Rstudio) Fall R Workshop August 23-24, 2016 . Why R? • FREE • Open source – Constantly updating the functions

Questions?