Source: hofroe.net/stat480/07-r-intro.pdf

Intro to R

Stat 480

Heike Hofmann

Outline

• Getting Started & Setup

• R basics (refresher)

• Syntax

• Examining Objects

• Extracting Parts

• Basic Graphics: Scatterplots, Histograms, Boxplots

Install R

• On your own machine:

• Go to http://www.r-project.org/

• From CRAN, pick download site (ISU might be good)

• Download from base:

• Download newest R version

• Run the installation script

• Download RStudio from www.rstudio.org

• On a lab machine:

• Start RStudio by double-clicking the icon

Setting up

• Download file 07-r-intro.R

• Open the R script in RStudio

• Edit lines

• Cut and paste lines of code into the R interpreter window

Pieces of RStudio

[Figure: RStudio window with panes labeled Console, File Directory, and Working Environment]

The R language

• Learning a new language: grammar, vocabulary

• Loading, examining, summarizing data

• Creating data

• Getting help

• Miscellaneous useful stuff

Learning a new language is hard!

Learning a language

• Grammar / Syntax

• Vocabulary

• “Thinking in that language”

Grammar

• Basic algebra is the same

• but write 2*x, not 2x, and 2^p for 2ᵖ

• Applying a function is similar

• To make a variable, use <- instead of =

• Everything in R is a vector

• Index a vector using [ ]

Like mathematics
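The grammar points above can be tried directly in the console. The following is a minimal sketch; the names x and v are illustrative and do not come from the slides.

```r
# A single number is stored as a vector of length 1
x <- 2
length(x)      # 1

# Arithmetic is applied element by element
v <- c(1, 2, 3, 5)
2 * v          # 2 4 6 10
v^2            # 1 4 9 25

# Index with square brackets; positions start at 1
v[1]           # 1
v[c(2, 4)]     # 2 5
```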

Examples

• x = 2/3, in R: x <- 2 / 3

• √x, in R: sqrt(x)

• a = 2(x + 3)², in R: a <- 2 * (x + 3)^2

• y = (1 2 3 5)ᵀ, in R: y <- c(1, 2, 3, 5)

• y₁, in R: y[1]

• ∑y, in R: sum(y)

• 2y, in R: 2 * y

• f(y, 2) = 2y, in R: f <- function(x, y) return(x * y), then f(y, 2)

You try

• x = (4 1 3 9)ᵀ

• y = (1 2 3 5)ᵀ

• d = √(∑ (xᵢ - yᵢ)²)

• 2(y₁ + x₃)
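One possible way to write this exercise in R, shown as a sketch rather than the official solution. Only the vectors x and y come from the slide; the name d follows the formula above.

```r
x <- c(4, 1, 3, 9)
y <- c(1, 2, 3, 5)

# Euclidean distance: square the element-wise differences, sum, take the root
d <- sqrt(sum((x - y)^2))
d                    # sqrt(26), about 5.099

# Twice the sum of the first element of y and the third element of x
2 * (y[1] + x[3])    # 8
```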

Vocabulary

• What verbs (=functions) do you need to know?

• Loading data

• Accessing parts of things

• Statistical summaries

• ...

R Reference Card

• Download the R Reference Card from http://cran.r-project.org/doc/contrib/Short-refcard.pdf

• Open/Print so that you can glance at it while working

Loading data

• Import data with:

• read.csv() for csv files

• (and use file.choose() to help find your file)

• Save from Excel as a csv file (use Save As)

• Stored in a data.frame
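A minimal sketch of the import step. So that it runs without any download, it first writes a tiny csv file to a temporary location; the file contents, column names, and the names csv_path and dat are made up for illustration and are not the course data.

```r
# Made-up stand-in for a csv file saved from Excel
csv_path <- tempfile(fileext = ".csv")
writeLines(c("State,Year,Population",
             "A,2009,100",
             "B,2009,200"),
           csv_path)

# read.csv() returns a data.frame
dat <- read.csv(csv_path)
class(dat)    # "data.frame"
dim(dat)      # 2 rows, 3 columns

# In an interactive session, pick the file with a dialog instead:
# dat <- read.csv(file.choose())
```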

Your turn

• Download FBI data fbi.xls from the website, open in Excel and export as csv file

• Load it into R: fbi <- read.csv(file.choose())

• Did the data import work?

• Advanced: Try to break the data import by adding odd characters in Excel (try #, , “, ), then read ?read.csv and figure out what’s going on

Examining variables

• x

• head(x)

• summary(x)

• str(x)

• dim(x)

Try these commands out for the fbi object!
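For readers without the fbi file at hand, the same commands can be tried on a small stand-in. The data frame below is made up for illustration; its column names and values are assumptions, not the FBI data.

```r
# Made-up stand-in for the fbi object
x <- data.frame(State = c("A", "B", "C"),
                Year = c(2009, 2009, 2010),
                Burglary = c(10, NA, 30),
                stringsAsFactors = FALSE)

x            # print the whole object
head(x, 2)   # first two rows (head(x) shows up to six)
summary(x)   # per-column summaries, including a count of NAs
str(x)       # compact view of structure and column types
dim(x)       # number of rows, number of columns
```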

Navigating the R interpreter window

• Up/down arrow keys to retrieve previous lines

• Left/right arrow keys to move cursor along line

• Mouse click to set cursor position

• Delete to remove and re-type parts of command

Getting Help

• ?command

• help(command)

• help.search("command")

Getting Out

• q()

What do we have?

• A data.frame = a list of variables of the same length (but may be different types)

• Has row and column names
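These two properties can be checked on any data frame. The sketch below uses a small made-up example; the column names are assumptions chosen for illustration.

```r
x <- data.frame(State = c("A", "B", "C"),
                Year = c(2009, 2009, 2010),
                stringsAsFactors = FALSE)

is.list(x)           # TRUE: a data.frame is a list of columns
sapply(x, length)    # every column has the same length (3)
sapply(x, class)     # but the types differ: character and numeric

names(x)             # column names
rownames(x)          # row names, "1" "2" "3" by default
```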

Extracting bits of data.frame

• x$variable

• x[, "variable"]

• x[rows, columns]

• x[1:5, 2:3]

• x[c(1,5,6), c("State","Year")]

• x$variable[rows]
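A short sketch of these extraction forms on a made-up data frame. The State and Year columns echo the slide; the Burglary column and all values are assumptions for illustration.

```r
x <- data.frame(State = c("A", "B", "C", "D", "E", "F"),
                Year = 2005:2010,
                Burglary = c(10, 20, 30, 40, 50, 60),
                stringsAsFactors = FALSE)

x$Burglary                          # one column, by name
x[, "Burglary"]                     # the same column via [ ]
x[1:5, 2:3]                         # rows 1 to 5, columns 2 and 3
x[c(1, 5, 6), c("State", "Year")]   # selected rows, columns by name
x$Burglary[2:3]                     # elements 2 and 3 of one column: 20 30
```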

Statistical summaries

• mean, median, min, max, range

• sd, var, cor
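Each of these functions takes a numeric vector (cor takes two). A minimal sketch with made-up numbers; the names pop and burg are illustrative only.

```r
pop  <- c(100, 200, 300, 400)
burg <- c(12, 18, 33, 41)

mean(pop)      # 250
median(pop)    # 250
min(pop)       # 100
max(pop)       # 400
range(pop)     # 100 400

sd(pop)        # standard deviation
var(pop)       # variance, equal to sd(pop)^2
cor(pop, burg) # correlation between the two vectors
```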

Your turn

• Compute the correlation between Population and the number of burglaries

• Look at the first 10 data records

• Compute the mean and standard deviation for each variable. Why do you get NAs? (read ?NA)

• Advanced: Read ?mean and ?sd, and fix the missing value problem
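One way to approach the missing-value part, sketched on made-up data rather than the real FBI file. The data frame fbi_demo, its column names, and its values are assumptions for illustration.

```r
# Made-up stand-in with one missing value
fbi_demo <- data.frame(Population = c(100, 200, 300, 400),
                       Burglary = c(12, NA, 33, 41))

head(fbi_demo, 10)                       # first 10 records (here only 4 exist)

mean(fbi_demo$Burglary)                  # NA: one missing value makes the result missing
mean(fbi_demo$Burglary, na.rm = TRUE)    # drop NAs first: (12 + 33 + 41) / 3
sd(fbi_demo$Burglary, na.rm = TRUE)

# Mean of every column, ignoring NAs
sapply(fbi_demo, mean, na.rm = TRUE)

# Correlation using only rows where both values are present
cor(fbi_demo$Population, fbi_demo$Burglary, use = "complete.obs")
```

The na.rm argument is documented in ?mean and ?sd; the use argument is documented in ?cor.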