Page 1: An Introduction – UCF, Methods in Ecology, Fall 2008

An Introduction

By Danny K. Hunt & Eric D. Stolen

Working In R (with speaker notes)

Page 2: What We Will Learn

- More About Dataframes
- Slicing, Dicing and Sorting Data
- Manipulating Dataframes & Aggregating Data
- Simple Iterative Processing
- Basic Data Visualization

Page 3: More About Dataframes

Dataframes Reviewed
- Are objects that consist of rows and columns
- Closely related to MS Access or Excel tables
- Specific requirements for dataframes
    - Observations are in rows
    - The response variable and explanatory variables are in columns
    - Same variable results go into the same column

Page 4: More About Dataframes

Dataframes Reviewed
- Specific requirements for dataframes (continued)
    - Observations are in rows
    - The response variable and explanatory variables are in columns

Observation   Response   Treatment
Obs1          1.5        A
Obs2          1.6        B
Obs3          1.3        C
...           ...        ...

Page 5: More About Dataframes

Dataframes Reviewed
- Specific requirements for dataframes (continued)
    - Same variable results go into the same column
    - Good dataframe conventions

Responses spread across one column per treatment:

Control   X1    X2
1.3       1.4   1.3
1.1       1.3   1.6
1.4       1.5   1.8

The same responses with one observation per row:

ID   Response   Treatment
1    1.3        Control
2    1.1        Control
3    1.4        Control
4    1.4        X1
5    1.3        X1
6    1.5        X1
7    1.3        X2
8    1.6        X2
9    1.8        X2
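A minimal sketch of moving from the first layout to the second in base R. The numbers are the ones on the slide; the use of stack() is an illustration chosen here, not something the slides prescribe.

```r
# Wide layout from the slide: one column per treatment.
wide <- data.frame(
  Control = c(1.3, 1.1, 1.4),
  X1      = c(1.4, 1.3, 1.5),
  X2      = c(1.3, 1.6, 1.8)
)

# stack() puts every value in one column and the source column name in another.
long <- stack(wide)
names(long) <- c("Response", "Treatment")

# Add an ID column in front so each observation has its own row and identifier.
long <- data.frame(ID = seq_len(nrow(long)), long)
print(long)
```

After this step the response values share one column and the treatment is an explanatory column, which is the shape the later examples assume.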

Page 6: More About Dataframes

Loading Dataframes with Data
- Use the read.table() series of functions
    - Particularly simple and efficient is the read.delim() function
- Load the Snake dataset and get acquainted

    snake <- read.delim("c:\\pract_i-t\\snake_data.txt", row.names="ID")
    names(snake)
    snake[1:5,]
    summary(snake)
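The snake data file is not distributed with this transcript, so the sketch below writes a small stand-in file first. The column names follow those used later in the slides; every value is invented for illustration only.

```r
# Hypothetical stand-in for snake_data.txt; all values are invented.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "ID\tName\tSEX\tlandc\tmcp\ttimes",
  "1\tmolly\t1\t1\t250.5\t45",
  "2\trex\t2\t3\t120.0\t72",
  "3\tmax\t2\t1\t310.2\t58"
), path)

# read.delim() expects a header row and tab-separated fields by default.
# row.names = "ID" moves the ID column into the row names.
snake <- read.delim(path, row.names = "ID")

names(snake)     # column names; ID is no longer listed as a column
snake[1:2, ]     # first two rows, all columns
summary(snake)   # one summary per column
```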

Page 7: More About Dataframes

Loading Dataframes with Data (continued)
- Identifying columns as categorical variables
    - In R versions before 4.0.0, alphanumeric fields are automatically encoded as factors during read operations. From R 4.0.0 on they are read as character unless stringsAsFactors = TRUE is passed to the read function.
    - Numeric fields are automatically assumed to be continuous values
    - Identify numeric fields as factors using:

    snake$landc <- factor(snake$landc)
    snake$SEX <- factor(snake$SEX)
    summary(snake)
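A small sketch of what the conversion changes, using an invented four-row stand-in for the snake data. The point is only that summary() reports quartiles for a numeric column and per-level counts for a factor.

```r
# Invented stand-in data; landc and SEX are numeric codes, as on the slide.
snake <- data.frame(
  Name  = c("molly", "rex", "max", "milo"),
  landc = c(1, 3, 1, 2),
  SEX   = c(1, 2, 2, 1)
)

summary(snake$landc)   # treated as continuous: min, quartiles, mean, max

# Re-encode the numeric codes as categorical variables.
snake$landc <- factor(snake$landc)
snake$SEX   <- factor(snake$SEX)

summary(snake$landc)   # now a count of rows per level
levels(snake$landc)    # the distinct codes, stored as labels
```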

Page 8: Slicing, Dicing and Sorting Data

Subscripts
- Performing data extraction by indexing:

    snake[1:5,]
    snake[40,]
    snake[,1]
    snake[,c(7, 1, 2)]
    snake[-c(10, 20, 30, 40), c("Name", "SEX", "ha")]
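The same indexing forms on a five-row stand-in (invented values), with a comment on what each one returns. The drop = FALSE line is an addition not shown on the slide.

```r
# Invented stand-in data with three of the columns named on the slide.
snake <- data.frame(
  Name = c("molly", "rex", "max", "milo", "ada"),
  SEX  = c(1, 2, 2, 1, 1),
  ha   = c(10.2, 4.8, 12.5, 7.1, 9.9)
)

first_two <- snake[1:2, ]                     # rows 1 to 2, all columns
one_col   <- snake[, 1]                       # a single column comes back as a vector
reordered <- snake[, c(3, 1)]                 # columns by position, in the order given
dropped   <- snake[-c(2, 4), c("Name", "ha")] # negative indices drop rows; names pick columns
kept_df   <- snake[, 1, drop = FALSE]         # drop = FALSE keeps a one-column dataframe
```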

Page 9: Slicing, Dicing and Sorting Data

Subscripts (continued)
- Performing data extraction by using filtering:

    snake[snake$landc == 1,]
    snake[snake$landc %in% c(1, 3),]
    snake[snake$mcp > 200,]
    snake[snake$mcp > 200 & snake$times < 60,]
    snake[grep("^m", snake$Name),]
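A sketch of the filters on invented stand-in data. It uses grepl() in place of the slide's grep(): grepl() returns a logical vector, so it can be combined with other conditions using &, whereas grep() returns row positions.

```r
# Invented stand-in data.
snake <- data.frame(
  Name  = c("molly", "rex", "max", "milo"),
  landc = c(1, 3, 1, 2),
  mcp   = c(250.5, 120.0, 310.2, 205.0),
  times = c(45, 72, 58, 61)
)

by_class    <- snake[snake$landc == 1, ]                    # exact match on one code
in_set      <- snake[snake$landc %in% c(1, 3), ]            # match any of several codes
big_and_few <- snake[snake$mcp > 200 & snake$times < 60, ]  # both conditions must hold
m_names     <- snake[grepl("^m", snake$Name), ]             # names starting with "m"
```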

Page 10: Slicing, Dicing and Sorting Data

Sorting Data

    snake[order(snake$Name),]
    snake[order(snake$landc, -snake$mcp),]
    snake[order(-snake$times),][1:10,]
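The sorting calls on invented stand-in data. order() returns the row positions that would sort its arguments, and those positions are then used as a row index. The head() form in the last line is an alternative to the slide's [1:10,] that does not pad with NA rows when the dataframe is shorter than requested.

```r
# Invented stand-in data.
snake <- data.frame(
  Name  = c("molly", "rex", "max", "milo"),
  landc = c(1, 3, 1, 2),
  mcp   = c(250.5, 120.0, 310.2, 205.0),
  times = c(45, 72, 58, 61)
)

by_name <- snake[order(snake$Name), ]                    # alphabetical by Name
by_class_then_mcp <- snake[order(snake$landc, -snake$mcp), ]  # landc ascending, mcp descending within it
top2 <- head(snake[order(snake$times, decreasing = TRUE), ], 2)  # two largest values of times
```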

Page 11: Manipulating Dataframes & Aggregating Data

Adding Columns to a Dataframe
- Close out and restart R
- Load the rapid fish dataset

    rapid <- read.delim("c:\\pract_i-t\\rapid_fish.txt", row.names="ID")
    rapid$impoundment <- factor(rapid$impoundment)
    rapid$season <- factor(rapid$season)
    rapid$open_veg <- factor(rapid$open_veg)
    rapid$sea_code <- factor(rapid$sea_code)
    rapid$imp_code <- factor(rapid$imp_code)
    rapid$cov_code <- factor(rapid$cov_code)
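When several columns need the same conversion, the repeated assignments can be written once with lapply(). This is an optional shorthand shown on invented stand-in data, not the approach used on the slide.

```r
# Invented stand-in data; impoundment and season are numeric codes here.
rapid <- data.frame(
  impoundment = c(1, 1, 2, 2),
  season      = c(1, 2, 1, 2),
  count       = c(12, 0, 7, 30)
)

# Convert every listed column to a factor in one step.
code_cols <- c("impoundment", "season")
rapid[code_cols] <- lapply(rapid[code_cols], factor)

str(rapid)   # impoundment and season are factors; count stays numeric
```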

Page 12: Manipulating Dataframes & Aggregating Data

Adding Columns to a Dataframe (cont.)
- Add "unique" column

    rapid$unique <- paste(rapid$Point, rapid$season, sep = "")
    rapid$unique <- factor(rapid$unique)

- Add log transformation of count column

    rapid$lncount <- log(rapid$count + 1)
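A sketch of both derived columns on invented stand-in data (the Point and season values are placeholders). Adding 1 before taking the log keeps zero counts finite, since log(0) is -Inf; base R's log1p() computes the same quantity.

```r
# Invented stand-in data; Point and season values are placeholders.
rapid <- data.frame(
  Point  = c("P1", "P1", "P2", "P2"),
  season = c("dry", "wet", "dry", "wet"),
  count  = c(12, 0, 7, 30)
)

# Combine two columns into one identifier per point-season pair.
rapid$unique <- factor(paste(rapid$Point, rapid$season, sep = ""))

# log(count + 1): a zero count maps to 0 instead of -Inf.
rapid$lncount <- log(rapid$count + 1)

all.equal(rapid$lncount, log1p(rapid$count))   # the two forms agree
```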

Page 13: Manipulating Dataframes & Aggregating Data

Overview of the Rapid Fish Dataset

    names(rapid)
    rapid[1:5,]
    summary(rapid)

Filtered Summaries

    summary(rapid[rapid$open_veg == "open",])
    summary(rapid[rapid$open_veg == "vegetated",])

Page 14: Manipulating Dataframes & Aggregating Data

Cross Tabulation
- Using the table() function

    table(rapid$impoundment)
    table(rapid[, c("open_veg", "impoundment")])
    table(rapid[, c("open_veg", "impoundment", "season")])
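A sketch of one-way and two-way tabulation on six invented rows. The impoundment labels "A" and "B" are placeholders; "open" and "vegetated" are the open_veg values that appear in the slides.

```r
# Invented stand-in data.
rapid <- data.frame(
  impoundment = c("A", "A", "A", "B", "B", "B"),
  open_veg    = c("open", "open", "vegetated", "open", "vegetated", "vegetated")
)

one_way <- table(rapid$impoundment)                      # rows per impoundment
two_way <- table(rapid[, c("open_veg", "impoundment")])  # rows per habitat x impoundment

print(one_way)
print(two_way)
addmargins(two_way)   # same table with row and column totals added
```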

Page 15: Manipulating Dataframes & Aggregating Data

Data Aggregation
- Using the aggregate() function

    aggregate(rapid$count, list(impoundment=rapid$impoundment), mean)
    aggregate(rapid$count, list(open_veg=rapid$open_veg,
        impoundment=rapid$impoundment, season=rapid$season), mean)
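A sketch of aggregate() on invented stand-in data. The first call mirrors the slide's list form, where the summarized column is named x in the result. The second uses aggregate()'s formula interface, an alternative not shown on the slide, which keeps the original column name.

```r
# Invented stand-in data.
rapid <- data.frame(
  impoundment = c("A", "A", "A", "B", "B", "B"),
  open_veg    = c("open", "open", "vegetated", "open", "vegetated", "vegetated"),
  count       = c(4, 6, 10, 2, 8, 12)
)

# List form: mean count per impoundment; the result column is called "x".
by_imp <- aggregate(rapid$count, list(impoundment = rapid$impoundment), mean)

# Formula form: mean count per habitat within impoundment.
by_both <- aggregate(count ~ open_veg + impoundment, data = rapid, FUN = mean)

print(by_imp)
print(by_both)
```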

Page 16: Manipulating Dataframes & Aggregating Data

Merging Results
- Using the merge() function

    # Number of rows per habitat/impoundment/season combination
    n <- table(rapid[, c("open_veg", "impoundment", "season")])
    # Mean count per combination
    m <- aggregate(rapid$count, list(open_veg=rapid$open_veg,
        impoundment=rapid$impoundment, season=rapid$season), mean)
    names(m)[4] <- "mean"
    # Standard deviation of count per combination
    s <- aggregate(rapid$count, list(open_veg=rapid$open_veg,
        impoundment=rapid$impoundment, season=rapid$season), sd)
    names(s)[4] <- "sd"
    # merge() converts the table to a dataframe and joins on the shared columns
    nm <- merge(n, m)
    names(nm)[4] <- "n"
    habimpsea <- merge(nm, s)
    habimpsea
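The same count/mean/sd merge on a six-row stand-in with two grouping columns instead of three, so the result can be checked by hand. The values are invented; as.data.frame() with responseName is used here to name the count column directly, which is a variation on the slide's rename step.

```r
# Invented stand-in data; grouping columns are factors, as in the slides.
rapid <- data.frame(
  impoundment = factor(c("A", "A", "A", "B", "B", "B")),
  open_veg    = factor(c("open", "open", "vegetated", "open", "vegetated", "vegetated")),
  count       = c(4, 6, 10, 2, 8, 12)
)
groups <- list(open_veg = rapid$open_veg, impoundment = rapid$impoundment)

# Counts per group, converted to a dataframe with the count column named "n".
n <- as.data.frame(table(rapid[, c("open_veg", "impoundment")]), responseName = "n")

# Mean and standard deviation per group.
m <- aggregate(rapid$count, groups, mean)
names(m)[3] <- "mean"
s <- aggregate(rapid$count, groups, sd)
names(s)[3] <- "sd"

# merge() joins on the columns the two dataframes have in common.
stats <- merge(merge(n, m), s)
print(stats)   # sd is NA for groups that contain a single observation
```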

Page 17: Simple Iterative Processing

Repeating a Process
- Using the for() control-flow construct

    site <- sort(unique(rapid$impoundment))
    for (i in seq_along(site)) {
        print(summary(rapid[rapid$impoundment == site[i],]))
    }
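A sketch of the same loop pattern that stores one result per site instead of printing, on invented stand-in data. The tapply() line at the end is an aside showing that base R can produce the same per-group means without an explicit loop.

```r
# Invented stand-in data.
rapid <- data.frame(
  impoundment = c("B", "A", "A", "B"),
  count       = c(2, 4, 6, 10)
)

site <- sort(unique(rapid$impoundment))

# Pre-allocate a named result vector, then fill one element per site.
means <- numeric(length(site))
names(means) <- site
for (i in seq_along(site)) {
  means[i] <- mean(rapid$count[rapid$impoundment == site[i]])
}
print(means)

# Loop-free equivalent for this particular summary.
same <- tapply(rapid$count, rapid$impoundment, mean)
```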

Page 18: Basic Data Visualization

Visualizing Data
- Using the plot() function

    plot(as.numeric(rapid$sea_code), rapid$count)

- Using the boxplot() function

    boxplot(rapid$count~rapid$impoundment)

- Using the hist() function

    par(mfrow=c(1,2))
    hist(rapid$count)
    hist(rapid$lncount)
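A sketch that draws the boxplot and the paired histograms into a temporary PDF, so it runs without an interactive graphics window. The data are invented, and writing to a file with pdf() is a choice made for this example; in an interactive session the plotting calls can be run directly.

```r
# Invented stand-in data with a skewed count column.
rapid <- data.frame(
  impoundment = factor(c("A", "A", "A", "B", "B", "B")),
  count       = c(0, 3, 40, 1, 7, 120)
)
rapid$lncount <- log(rapid$count + 1)

out <- tempfile(fileext = ".pdf")
pdf(out)                                     # open a file-based graphics device

boxplot(count ~ impoundment, data = rapid)   # one box per impoundment

old_par <- par(mfrow = c(1, 2))              # two panels side by side
hist(rapid$count, main = "count")
hist(rapid$lncount, main = "log(count + 1)")
par(old_par)                                 # restore the previous layout

dev.off()                                    # close the device and write the file
```

Placing the raw and log-transformed histograms side by side makes it easy to compare how the transformation changes the shape of the distribution.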

Page 19: The End