Intro to R
Stat 480
Heike Hofmann
Outline
• Getting Started & Setup
• R basics (refresher)
• Syntax
• Examining Objects
• Extracting Parts
• Basic Graphics: Scatterplots, Histograms, Boxplots
Install R
• On your own machine:
• Go to http://www.r-project.org/
• From CRAN, pick download site (ISU might be good)
• Download from base:
• Download newest R version
• Run the installation script
• Download RStudio from www.rstudio.org
• On a lab machine:
• Start RStudio by double-clicking the icon
Setting up
• Download file 07-r-intro.R
• Open the R script in RStudio
• Edit lines
• Cut and paste lines of code into the R interpreter window
Pieces of RStudio
[Figure: RStudio window with panes labeled Console, File Directory, and Working Environment]
The R language
• Learning a new language: grammar, vocabulary
• Loading, examining, summarizing data
• Creating data
• Getting help
• Miscellaneous useful stuff
Learning a new language is hard!
Learning a language
• Grammar / Syntax
• Vocabulary
• “Thinking in that language”
Grammar
• Basic algebra is the same
• but write 2*x, not 2x, and 2^p instead of $2^p$
• Applying a function is similar
• To create a variable, use <- instead of =
• Everything in R is a vector
• Index a vector using [ ]
Like mathematics
Examples
• $x = 2/3$
• $\sqrt{x}$
• $a = 2(x + 3)^2$
• $y = (1\ 2\ 3\ 5)^T$
• $y_1$
• $\sum_i y_i$
• $2y$
• $f(y, 2) = 2y$
In R:
x <- 2 / 3
sqrt(x)
a <- 2 * (x + 3)^2
y <- c(1, 2, 3, 5)
y[1]
sum(y)
2 * y
f <- function(x, y) return(x * y)
f(y, 2)
You try
• $x = (4\ 1\ 3\ 9)^T$
• $y = (1\ 2\ 3\ 5)^T$
• $d = \sqrt{\sum_i (x_i - y_i)^2}$
• $2(y_1 + x_3)$
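One possible way to write the exercise in R, as a sketch rather than the only solution. The names d and z are chosen here for illustration; the slide does not name the second result.

```r
# Vectors from the exercise
x <- c(4, 1, 3, 9)
y <- c(1, 2, 3, 5)

# Euclidean distance: subtraction and ^ work element by element
d <- sqrt(sum((x - y)^2))

# Twice the sum of the first element of y and the third element of x
z <- 2 * (y[1] + x[3])

print(d)
print(z)
```

Because arithmetic on vectors is element-wise, no loop over i is needed.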
Vocabulary
• What verbs (=functions) do you need to know?
• Loading data
• Accessing parts of things
• Statistical summaries
• ...
R Reference Card
• Download the R Reference Card from http://cran.r-project.org/doc/contrib/Short-refcard.pdf
• Open/Print so that you can glance at it while working
Loading data
• Import data with:
• read.csv() for csv files
• (and use file.choose() to help find your file)
• Save from Excel as a csv file (use Save As)
• The imported data is stored in a data.frame
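A minimal sketch of the import step. To stay self-contained it first writes a small csv file to a temporary location; the data frame toy, its column names, and its values are made up for illustration and are not the FBI data.

```r
# Toy data with made-up values (NOT the FBI data); column names are assumed
toy <- data.frame(
  State = c("Iowa", "Iowa", "Ohio"),
  Year = c(2000, 2001, 2000),
  Burglary = c(100, 120, 300)
)

# Write to a temporary csv file so the example runs anywhere
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# read.csv() returns a data.frame
dat <- read.csv(path)
class(dat)
names(dat)
```

In an interactive session, read.csv(file.choose()) opens a file dialog instead of using a fixed path.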
Your turn
• Download the FBI data fbi.xls from the website, open it in Excel and export it as a csv file
• Load it into R: fbi <- read.csv(file.choose())
• Did the data import work?
• Advanced: Try to break the data import by adding odd characters in Excel (try #, a comma, or a double quote), then read ?read.csv and figure out what's going on
Examining variables
• x
• head(x)
• summary(x)
• str(x)
• dim(x)
Try these commands out for the fbi object!
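A short sketch of these commands on USArrests, a small data set that ships with R. It is used here only as a stand-in so the example runs without the fbi file.

```r
# USArrests is a built-in data set (used here in place of fbi)
x <- USArrests

head(x)      # first six rows
summary(x)   # per-column summaries (min, quartiles, mean, max)
str(x)       # compact view of the structure and column types
dim(x)       # number of rows and columns
```

Typing x on its own prints the whole object, which gets unwieldy for large data sets; head() and str() are usually a better first look.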
Navigating the R interpreter window
• Up/down arrow keys to retrieve previous lines
• Left/right arrow keys to move cursor along line
• Mouse click to set cursor position
• Delete to remove and re-type parts of command
Getting Help
• ?command
• help(command)
• help.search("command")
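For illustration, the three forms applied to functions that come with R. The search phrase and the restriction to the stats package are choices made for this sketch, mainly to keep the search fast.

```r
?mean        # help page for a function you know by name
help(sd)     # the same lookup, written as a function call

# Search installed help pages when you do not know the function name;
# the search term must be a quoted string
hits <- help.search("standard deviation", package = "stats")
```

Printing hits lists the matching help pages.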
Getting Out
•q()
What do we have?
• A data.frame = a list of variables of the same length (but may be different types)
• Has row and column names
Extracting bits of data.frame
• x$variable
• x[, "variable"]
• x[rows, columns]
• x[1:5, 2:3]
• x[c(1,5,6), c("State","Year")]
• x$variable[rows]
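The indexing forms above, sketched on a toy data frame. The column names and all values are made up for illustration and are not taken from the fbi data.

```r
# Toy data frame with made-up values; column names are assumed
x <- data.frame(
  State = c("Iowa", "Iowa", "Ohio", "Ohio", "Utah", "Utah"),
  Year = c(2000, 2001, 2000, 2001, 2000, 2001),
  Count = c(10, 12, 30, 28, 5, 7)
)

x$Count                              # one column as a vector
x[, "Count"]                         # the same column, selected by name
x[1:2, 2:3]                          # rows 1 to 2, columns 2 to 3
x[c(1, 5, 6), c("State", "Year")]    # chosen rows, columns by name
x$Count[x$Year == 2001]              # elements of a column picked by a condition
```

Leaving the row or column slot empty, as in x[, "Count"], selects everything along that dimension.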
Statistical summaries
• mean, median, min, max, range
• sd, var, cor
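A brief sketch of these functions on the built-in USArrests data, used here as a stand-in for fbi; the choice of columns is arbitrary.

```r
x <- USArrests   # built-in data set, used here as a stand-in

mean(x$Murder)
median(x$Murder)
range(x$Murder)    # smallest and largest value
s <- sd(x$Murder)
v <- var(x$Murder)

# Correlation between two variables
r <- cor(x$Murder, x$Assault)
r
```

The standard deviation is the square root of the variance, so s^2 and v agree up to rounding.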
Your turn
• Compute the correlation between Population and the number of burglaries
• Look at the first 10 data records
• Compute the mean and standard deviation for each variable. Why do you get NAs? (read ?NA)
• Advanced: Read ?mean and ?sd, and fix the missing value problem
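One way to handle missing values, sketched on the built-in airquality data because its Ozone column contains NAs. The fbi columns may behave differently, so treat this as a pattern rather than the exercise solution.

```r
x <- airquality   # built-in data set; the Ozone column has missing values

head(x, 10)       # first 10 records

m_na <- mean(x$Ozone)                 # NA: one missing value makes the result missing
m_ok <- mean(x$Ozone, na.rm = TRUE)   # drop missing values before averaging
s_ok <- sd(x$Ozone, na.rm = TRUE)

sapply(x, mean, na.rm = TRUE)         # mean of every column
n_missing <- sum(is.na(x$Ozone))      # number of missing entries

# cor() has no na.rm argument; use = "complete.obs" drops incomplete pairs
r <- cor(x$Ozone, x$Temp, use = "complete.obs")
```

The argument na.rm = TRUE is documented in ?mean and ?sd.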