A quick introduction to R
DESCRIPTION
A very quick introduction to the R language. It covers basic data structures, data manipulation steps, plots, control structures and more: enough material to get you started in R.
TRANSCRIPT
Introduction to R
What is R?
Getting Started
Data structures: Scalar (number, string, Boolean, Date-time), Vector, Matrix, Data frame, List
Input / Output
Plots
Control Logic
Working with Strings
Writing Functions
Angshuman Saha
What is R?
• R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.
• R can be downloaded and installed from the CRAN website, which is linked from the R project site (http://www.r-project.org/)
• CRAN stands for Comprehensive R Archive Network
• The installation comes with base, stats and a few other packages. Beyond that, there are many contributed packages that enable users to perform a variety of specialized computations on data
Getting Started in R
Getting Started
Double-click the R icon on your desktop to start R. This launches the R GUI window.
1. Using the command prompt
At the command prompt you can type your code directly and press Enter to run it. This runs the code one line at a time.
2. Using external text files
You can use a standard text editor such as Notepad to write your R code and save it in a text file. You can then copy the whole code from there and paste it into the R GUI window, which runs the whole code.
3. Using .r files
You may save your R code in a text file with the extension ".r" and then source this file to run the code. Use "File > Source R code" from the menu to do this. Alternatively, type the following command at the R prompt:
source("D:/myFirstRcode.r")
When using source(), specify the full path of your R code file within double quotes.
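The source() workflow can be tried without touching any real folder. The sketch below is only an illustration: it writes a two-line script to a temporary file (a stand-in for a path such as D:/myFirstRcode.r) and then sources it. The variable names a and b are made up for this example.

```r
# Hypothetical script location: a temporary file stands in for a real .r file
script_path <- tempfile(fileext = ".r")

# Write two lines of R code into the script file
writeLines(c("a <- 2 + 3", "b <- a * 10"), script_path)

# source() runs every line of the file, in order
source(script_path)

# The variables created by the script are now available in the workspace
print(b)
```

On Windows, paths passed to source() should use forward slashes ("D:/...") or doubled backslashes, because a single backslash starts an escape sequence inside an R string.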
Data Structure : Vector
Vector > Creation
x = c(10, 12.3, 45)               # create a vector of 3 numbers
x = c(FALSE, TRUE, TRUE, FALSE)   # create a vector of 4 logical (Boolean) values
x = c("red", "green", "blue")     # create a vector of 3 strings
x = c(1:15)                       # create a vector of integers 1 to 15
x = 1:15                          # equivalent to the previous line
x = rep(5.6, 10)                  # repeat 5.6 ten times: a vector of length 10, all entries equal to 5.6
x = rep(c(1,2), c(3,2))           # x = (1,1,1,2,2)
x = seq(10, 14, 2)                # sequence from 10 to 14 in steps of 2: x = (10,12,14)
x = vector(mode = "numeric", length = 0)   # initialize a zero-length numeric vector; values will be put inside it later
Vector > Accessing Elements
x = c(10, 12.3, 45, 55, 65, 75, 85)   # create a vector
y = x[2]                 # y has value 12.3
y = x[c(5,6,7)]          # y is a vector with the 5th, 6th and 7th values of x
y = x[-c(5,6,7)]         # y is a vector with all but the 5th, 6th and 7th values of x
y = x[c(1,1,3,4,7,7)]    # y = (10,10,45,55,85,85)
Vector > Naming
x = c(10, 45, 55)                          # create a vector
names(x) = c("first", "second", "third")   # name the elements of x
y = x["second"]                            # y = 45. Elements can be accessed by name.
a = "third" ; y = x[a]                     # y = 55. The name can be passed through another variable
Vector > operations
x = c(10, 45, 55 ) ; y = c(1, 5, 6 ) # create two vectors x and y
z = x + y          # z = (11,50,61). Element-wise addition
z = x - y          # z = (9,40,49). Element-wise subtraction
z = x * y          # z = (10,225,330). Element-wise multiplication
z = x / y          # z = (10,9,9.16667). Element-wise division
z = x^2            # z = (100,2025,3025). Element-wise squaring
z = x[x>20]        # z = (45,55). All elements of x that are > 20
z = which(x>20)    # z = (2,3). Indices of x where x > 20
z1 = x[x>20] ; z2 = x[which(x>20)] ; u = which(x>20) ; z3 = x[u]
# z1, z2 and z3 are all identical
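The three subsetting forms above agree for a vector without missing values. As a hedged aside, the sketch below uses a made-up vector containing an NA to show where logical indexing and which() can differ.

```r
# Illustrative vector (not from the slides) containing a missing value
x <- c(10, NA, 45, 55)

# Logical indexing: the comparison NA > 20 is NA, and an NA index yields an NA element
z1 <- x[x > 20]

# which() returns only the positions where the condition is TRUE, so the NA is dropped
z2 <- x[which(x > 20)]

print(z1)   # NA 45 55
print(z2)   # 45 55
```

Which behaviour is preferable depends on whether missing values should stay visible in the result.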
Data Structure : Matrix
Matrix > Creation
x = matrix(10, nrow = 3, ncol = 5)   # x is a 3-by-5 matrix with all entries = 10
A matrix can be created from a vector
x = 1:12 ; mat = matrix(x, nrow = 4, ncol = 3)
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
By default, numbers are stacked column-wise. To change that, use byrow = TRUE
x = 1:12 ; mat = matrix(x, nrow = 4, ncol = 3, byrow = TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
Row and column names can be assigned
colnames(mat) = c("col1", "col2", "col3")
rownames(mat) = paste("rowID", 1:4, sep = "_")
        col1 col2 col3
rowID_1    1    2    3
rowID_2    4    5    6
rowID_3    7    8    9
rowID_4   10   11   12
Matrix > Subsetting
Consider the matrix mat from the previous example
        col1 col2 col3
rowID_1    1    2    3
rowID_2    4    5    6
rowID_3    7    8    9
rowID_4   10   11   12
x = mat[2, ]                      # a vector containing the second row of mat
y = mat[, 3]                      # a vector containing the third column of mat
x = mat["rowID_3", ]              # third row of mat
x = mat[, "col2"]                 # second column of mat
newmat = mat[1:2, 2:3]            # sub-matrix of mat
newmat = mat[c(1,2,4), c(1,3)]    # sub-matrix of mat
diag_entries = diag(mat)          # vector (1,5,9)
Row / column names can be changed
rownames(mat)[3] = "third" ; colnames(mat)[2] = "Nm2"
        col1 Nm2 col3
rowID_1    1   2    3
rowID_2    4   5    6
third      7   8    9
rowID_4   10  11   12
Set all values > 9 to 99
mat[mat > 9] = 99
        col1 Nm2 col3
rowID_1    1   2    3
rowID_2    4   5    6
third      7   8    9
rowID_4   99  99   99
Matrix > Operations
Element-wise operations
mat1 = matrix(1:12, nrow = 4, ncol = 3)
mat2 = matrix(10*(1:12), nrow = 4, ncol = 3)
mat3 = mat1 + mat2   # element-wise addition
# Similarly we can have element-wise subtraction, multiplication and division
Matrix multiplication
mat1 = matrix(1:16, nrow = 4, ncol = 4)
mat2 = matrix(10*(1:16), nrow = 4, ncol = 4)
mat3 = mat1 %*% mat2   # matrix multiplication
Matrix inversion
mat1 = matrix(rnorm(16), 4, 4)
mat2 = solve(mat1)   # mat2 is the inverse of mat1
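A quick way to convince yourself that solve() returns the inverse is to multiply the result back and compare with the identity matrix. The sketch below does that; the fixed seed and the small matrix a are additions for illustration only. It also contrasts * with %*%, which are easy to confuse.

```r
set.seed(1)                        # fixed seed so the example is repeatable
mat1 <- matrix(rnorm(16), 4, 4)    # random 4-by-4 matrix, as on the slide
mat2 <- solve(mat1)                # its inverse

# mat1 %*% mat2 should equal the identity matrix up to rounding error
check <- mat1 %*% mat2
print(round(check, 10))
print(isTRUE(all.equal(check, diag(4))))

# For a square matrix, * and %*% both work but mean different things
a <- matrix(1:4, 2, 2)
print(a * a)      # element-wise product: every entry squared
print(a %*% a)    # matrix product
```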
Data Structure : Data Frame
Data Frame > Background
• A data frame can be thought of as a matrix whose columns may be of different types (e.g. text, date, number, logical)
• Most datasets we work with can be stored as a data frame
• Row / column subsetting works just as it does for matrices
• Row and column names can be assigned
Data Frame > Creation
Data frames can be created by stacking individual vectors column-wise
cust = c("Bob", "John", "Jane")
age = c(67, 45, 52)
ownHouse = c(FALSE, FALSE, TRUE)
cust_dat = data.frame(Name = cust, Age = age, ownHouse = ownHouse)
  Name Age ownHouse
1  Bob  67    FALSE
2 John  45    FALSE
3 Jane  52     TRUE
Data frames can also be created by reading data from a csv file
cust_dat = read.csv(file = "custData.csv", header = TRUE, stringsAsFactors = FALSE)
header = TRUE says that the first row of the file contains the column names
stringsAsFactors = FALSE says that character vectors should not be converted to "factors"
Data Frame > Creation
Two data frames can be stacked below each other
Consider two data frames, cust1 and cust2
cust1
  Name Age ownHouse
1  Bob  67    FALSE
2 John  45    FALSE
3 Jane  52     TRUE
cust2
  Name Age ownHouse
1 Bill  55     TRUE
2 Jack  75     TRUE
3  Deb  49     TRUE
cust = rbind(cust1, cust2)
  Name Age ownHouse
1  Bob  67    FALSE
2 John  45    FALSE
3 Jane  52     TRUE
4 Bill  55     TRUE
5 Jack  75     TRUE
6  Deb  49     TRUE
A new data frame can be created by subsetting an existing data frame
cust = cust[cust$Age > 60, ]
  Name Age ownHouse
1  Bob  67    FALSE
5 Jack  75     TRUE
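The stacking and subsetting steps can be reproduced end to end. The sketch below rebuilds cust1 and cust2 with the values shown on the slide; the name older and the row-name reset at the end are additions for illustration.

```r
# Rebuild the two data frames shown on the slide
cust1 <- data.frame(Name = c("Bob", "John", "Jane"),
                    Age = c(67, 45, 52),
                    ownHouse = c(FALSE, FALSE, TRUE),
                    stringsAsFactors = FALSE)
cust2 <- data.frame(Name = c("Bill", "Jack", "Deb"),
                    Age = c(55, 75, 49),
                    ownHouse = c(TRUE, TRUE, TRUE),
                    stringsAsFactors = FALSE)

# Stack the rows; both data frames must have the same column names
cust <- rbind(cust1, cust2)

# Keep only customers older than 60; the empty slot after the comma means "all columns"
older <- cust[cust$Age > 60, ]
print(older)              # row names are still 1 and 5

# Optional: renumber the rows of the subset from 1
rownames(older) <- NULL
print(older)
```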
Data Frame > Creation
An empty data frame can be created by specifying column names and types. It can be populated later.
cust0 = data.frame(Name = character(0), Age = numeric(0), ownHouse = logical(0))
[1] Name     Age      ownHouse
<0 rows> (or 0-length row.names)
An empty data frame can be created from an existing data frame
cust0 = cust[0, ]
[1] Name     Age      ownHouse
<0 rows> (or 0-length row.names)
Data Frame > Creation
Two data frames can be merged by a common column
cust1
  Name Age ownHouse
1  Bob  67    FALSE
2 John  45    FALSE
3 Jane  52     TRUE
cust2
  Name PetCount hasCar
1  Bob        1   TRUE
2 John        0  FALSE
3 Jill        5   TRUE
By default, only common records are returned.
cust = merge(cust1, cust2, by = "Name")
  Name Age ownHouse PetCount hasCar
1  Bob  67    FALSE        1   TRUE
2 John  45    FALSE        0  FALSE
Using the options all, all.x and all.y, different record sets are obtained. Records may contain missing values.
cust = merge(cust1, cust2, by = "Name", all = TRUE)
  Name Age ownHouse PetCount hasCar
1  Bob  67    FALSE        1   TRUE
2 Jane  52     TRUE       NA     NA
3 Jill  NA       NA        5   TRUE
4 John  45    FALSE        0  FALSE
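The slide shows all = TRUE; the all.x and all.y options it mentions behave as sketched below. The data frames repeat the slide's values, while the names left and right are chosen here for illustration.

```r
cust1 <- data.frame(Name = c("Bob", "John", "Jane"),
                    Age = c(67, 45, 52),
                    ownHouse = c(FALSE, FALSE, TRUE),
                    stringsAsFactors = FALSE)
cust2 <- data.frame(Name = c("Bob", "John", "Jill"),
                    PetCount = c(1, 0, 5),
                    hasCar = c(TRUE, FALSE, TRUE),
                    stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of the first data frame (Jane gets NA for the pet and car columns)
left <- merge(cust1, cust2, by = "Name", all.x = TRUE)
print(left)

# all.y = TRUE keeps every row of the second data frame (Jill gets NA for Age and ownHouse)
right <- merge(cust1, cust2, by = "Name", all.y = TRUE)
print(right)
```

In both cases the result is sorted by the merge column, so the row order can differ from the input order.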
Data Structure : List
List > Background
• A list can be thought of as a vector whose elements may be of different types
• For example, a single list may contain a vector, a matrix and another list
List > Creation
An empty list
mylist = list()                              # nothing is known about the list
mylist = vector(mode = "list", length = 5)   # length is known upfront
Non-empty list
mylist = list(c(1,5,7), "abc", matrix(0,3,3))
List with names
mylist = list(comp1 = c(1,5,7), comp2 = "abc", comp3 = matrix(0,3,3))
List > Accessing the entries
By Index
mylist = list(c(1,5,7), "abc", matrix(0,3,3))
x = mylist[[1]]   # x is a vector (1,5,7)
x = mylist[[2]]   # x is a string "abc"
x = mylist[[3]]   # x is a 3-by-3 matrix of zeros
By Name
mylist = list(comp1 = c(1,5,7), comp2 = "abc", comp3 = matrix(0,3,3))
x = mylist$comp1   # x is a vector (1,5,7)
x = mylist$comp2   # x is a string "abc"
x = mylist$comp3   # x is a 3-by-3 matrix of zeros
List > Updating entries
mylist = list(comp1 = c(1,5,7), comp2 = "abc", comp3 = matrix(0,3,3))
By Index
mylist[[4]] = 1024          # create a new entry, the number 1024, at the 4th position
mylist = mylist[-3]         # drop the third entry from mylist
mylist[[2]] = "New Entry"   # update the second entry
By Name
mylist$comp99 = 1024        # create a new entry at the 4th position, named "comp99"
mylist$comp1 = c(10,10)     # update the entry "comp1"
Renaming components
mylist = list(comp1 = c(1,5,7), comp2 = "abc", comp3 = matrix(0,3,3))
names(mylist)                      # returns the vector ("comp1", "comp2", "comp3")
names(mylist) = c("A", "B", "C")   # change the names of the components
names(mylist)[2] = "second"        # change only the name of the second component
Subsets
newlist = mylist[c(1,3,4)]   # new list contains the first, third and fourth entries of mylist
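The slides use double brackets to read an entry and single brackets to take a subset. The sketch below makes that difference explicit; the variable names a and b are illustrative.

```r
mylist <- list(comp1 = c(1, 5, 7), comp2 = "abc", comp3 = matrix(0, 3, 3))

a <- mylist[1]      # single brackets: a list of length 1 that still wraps comp1
b <- mylist[[1]]    # double brackets: the numeric vector itself

print(class(a))     # "list"
print(class(b))     # "numeric"

# Arithmetic works on the extracted vector, not on the one-element list
print(b * 2)

# Assigning NULL to a component removes it from the list
mylist$comp2 <- NULL
print(names(mylist))   # "comp1" "comp3"
```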
Data Structure : Date & Time
Data Structure: Dates
Sys.time()   # returns the current system date and time
String to Date-time
x = strptime("02-07-2012", format = "%m-%d-%Y")
x = strptime("02-feb-2012", format = "%d-%b-%Y")
x = strptime("02-feb-2012 15:45:10", format = "%d-%b-%Y %H:%M:%S")
Date-time to String
x = Sys.time()   # on typing x in the console you see: "2012-06-22 11:44:01 IST"
y = strftime(x, format = "%d-%b-%Y")   # "22-Jun-2012"
y = strftime(x, format = "date: %d-%b-%Y >> Time: %H+%M+%S")
# "date: 22-Jun-2012 >> Time: 11+44+01"
y = strftime(x, format = "%d-%b-%Y %a >> Time: %H hour %M min %S sec")
# "22-Jun-2012 Fri >> Time: 11 hour 44 min 01 sec"
Study the R help on date-time variables to learn about the large number of available format options.
Data Structure: Dates
The two main (internal) formats for date-time are POSIXct and POSIXlt
POSIXct: a short format of date-time, typically used to store date-time columns in a data frame
POSIXlt: a long format of date-time, from which various sub-units of time can be extracted
Examples
x = Sys.time()       # on typing x in the console you see: "2012-06-22 11:44:01 IST"
y = as.POSIXlt(x)    # convert from POSIXct to POSIXlt
z = c(y$mon, y$year, y$hour, y$min, y$wday)   # z = (5, 112, 11, 44, 5)
# mon counts months from 0 (June = 5) and year counts years since 1900 (2012 = 112)
difftime
x1 = strptime("02-07-2012 14:20:34", format = "%m-%d-%Y %H:%M:%S")
x2 = strptime("11-07-2012 14:20:34", format = "%m-%d-%Y %H:%M:%S")
y = x2 - x1   # y is a difftime object
x1 + as.difftime(1, units = "days")    # "2012-02-08 14:20:34 IST"
x1 + as.difftime(10, units = "mins")   # "2012-02-07 14:30:34 IST"
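When a difference is needed in a specific unit, difftime() can be called directly with a units argument. The sketch below reuses the two time stamps from the slide but, as an assumption made here for reproducibility, fixes the time zone to UTC so that the result does not depend on the machine's locale or daylight saving rules.

```r
# Same time stamps as on the slide, in an explicitly chosen time zone (assumption: UTC)
x1 <- as.POSIXct("2012-02-07 14:20:34", tz = "UTC")
x2 <- as.POSIXct("2012-11-07 14:20:34", tz = "UTC")

# Ask for the difference in days instead of letting R pick the unit
d <- difftime(x2, x1, units = "days")
print(d)
print(as.numeric(d))     # plain number of days

# POSIXlt components need an offset to become calendar values
lt <- as.POSIXlt(x1)
print(lt$mon + 1)        # month number, since mon counts from 0
print(lt$year + 1900)    # calendar year, since year counts from 1900
```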
Data Structure : Others
Data Structures: Others
Other than the data structures described so far, there are a few very useful data types.
NULL
NULL is typically used for initializing variables. The code "x = NULL" creates a variable x of length zero. It can later be overwritten with other values. The function is.null() returns TRUE or FALSE and tells whether a variable is NULL or not.
NA
NA is used for denoting missing values. The code "x = NA" creates a variable x whose value is missing. The function is.na() returns TRUE or FALSE and tells whether a value is NA or not.
NaN
NaN stands for "Not a Number". The code "x = sqrt(-10) ; y = log(-10)" sets the values of x and y to NaN and also prints a warning message in the console. The function is.nan() lets you check whether a value is NaN or not.
Inf
Inf stands for "Infinity". The code "x = 10/0 ; y = -3/0" sets the value of x to Inf and y to -Inf. The function is.finite() returns FALSE for Inf and -Inf (and also for NA and NaN); the function is.infinite() checks specifically whether a value is Inf or -Inf.
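The checking functions are easiest to compare side by side on one vector. The sketch below uses a made-up vector that contains each special value once.

```r
# Illustrative vector containing a regular number and each special value
x <- c(1, NA, NaN, Inf, -Inf)

print(is.na(x))         # FALSE  TRUE  TRUE FALSE FALSE  (NaN also counts as missing)
print(is.nan(x))        # FALSE FALSE  TRUE FALSE FALSE
print(is.finite(x))     #  TRUE FALSE FALSE FALSE FALSE
print(is.infinite(x))   # FALSE FALSE FALSE  TRUE  TRUE

# Missing values propagate through most computations unless they are removed
print(mean(c(1, NA, 3)))                 # NA
print(mean(c(1, NA, 3), na.rm = TRUE))   # 2

# NULL has length zero and is tested with is.null(), not is.na()
y <- NULL
print(is.null(y))
print(length(y))
```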
Input / output
Input
Read data (row-column format) from a csv file
x = read.csv(file = "D:/mydata.csv", header = TRUE, stringsAsFactors = FALSE)
# x is a data frame containing the data in the csv file
Read data (row-column format) from a delimited file
x = read.table(file = "D:/mydata.csv", sep = ",", header = TRUE, stringsAsFactors = FALSE)
# x is a data frame containing the data in the csv file
# read.csv is a special case of read.table with sep = ","
# In read.table you may specify any single character of your choice as the separator
Reading arbitrary data using a lower-level function: scan()
Using scan(), the user can read a file item by item into a vector or list.
These functions have many more optional input arguments that let the user control the way in which data is read.
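These readers can be tried without a file on disk, because read.table() and scan() also accept a text argument. The sketch below relies on that; the sample strings and the semicolon separator are made up for illustration.

```r
# read.table() with a non-default separator, reading from a string instead of a file
df <- read.table(text = "Name;Age\nBob;67\nJane;52",
                 sep = ";", header = TRUE, stringsAsFactors = FALSE)
print(df)

# scan() reads items one by one; by default it expects numbers separated by white space
vals <- scan(text = "3.5 4.0 7.25", quiet = TRUE)
print(vals)

# The 'what' argument sets the type of the items, 'sep' the separator
words <- scan(text = "red,green,blue", what = character(), sep = ",", quiet = TRUE)
print(words)
```

With a real file, replace the text argument by file = "..." and the rest of the call stays the same.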
Output
Write a data frame to a file on disk
# Assume: x is a data frame
# write.csv() writes it to a csv file on disk
# (write.csv() always writes the column names; it ignores a col.names argument with a warning)
write.csv(x, file = "D:/out.csv", row.names = FALSE, na = "")
# write.table() writes it to any user-specified delimited file.
# write.csv() is a special case of write.table()
write.table(x, file = "D:/out.txt", row.names = FALSE, col.names = TRUE, na = "", sep = "\t")
Write an R object in the R workspace to disk
# Assume: x is an object in the R workspace
save(x, file = "D:/out.RData")
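A round trip is a simple way to check what these functions actually write. The sketch below is an illustration under assumed names: it uses temporary files in place of the D:/ paths and a small made-up data frame with one missing value.

```r
# Small illustrative data frame with a missing value
x <- data.frame(Name = c("Bob", "Jane"), Age = c(67, NA), stringsAsFactors = FALSE)

# Hypothetical output locations: temporary files stand in for D:/out.csv and D:/out.RData
csv_path <- tempfile(fileext = ".csv")
rdata_path <- tempfile(fileext = ".RData")

# Write to csv with missing values as empty fields, then read the file back
write.csv(x, file = csv_path, row.names = FALSE, na = "")
y <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)
print(y)            # the empty Age field comes back as NA

# save() stores the object together with its name; load() restores it under that name
save(x, file = rdata_path)
rm(x)
load(rdata_path)
print(exists("x"))
```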
Plots
Plots – xy plot
x = rnorm(100, mean = 2, sd = 2)
y = rnorm(100, mean = 10, sd = 1)
plot(x, y, xlab = "x-variable", ylab = "y-variable", main = "scatter plot example", pch = 19, cex = 0.7, col = "blue")
[Figure: x-y scatter plot, with callouts marking the title (main) and the axis labels (xlab, ylab)]
A large number of options are available to control axes, tick marks, axis labels, legends, font type and size, etc.
Plots - overlay
Generate a plot
x = rnorm(100, mean = 2, sd = 2)
y = rnorm(100, mean = 10, sd = 1)
plot(x, y, xlab = "x-variable", ylab = "y-variable", main = "scatter plot example", pch = 19, cex = 0.7, col = "blue")
Add red points later
x1 = rnorm(30, mean = 0, sd = 1)
y1 = rnorm(30, mean = 12, sd = 0.5)
points(x1, y1, pch = 15, col = "red", cex = 1)
Plots - multi panel plot
x = rnorm(100, mean = 2, sd = 2)
y = rnorm(100, mean = 10, sd = 1)
par(mfrow = c(2,2))
plot(x, y, xlab = "x-variable", ylab = "y-variable", main = "scatter plot example", pch = 19, cex = 0.7, col = "blue")
hist(x, xlab = "x-variable", ylab = "frequency", main = "histogram-x", col = "grey", border = "blue", lwd = 2)
hist(y, xlab = "y-variable", ylab = "frequency", main = "histogram-y", col = "grey", border = "blue", lwd = 2)
plot(density(x), col = "limegreen", lwd = 2, xlab = "x", ylab = "density", main = "density plot")
par(mfrow = c(2,2)) splits the plot region into a 2-by-2 matrix.
The next 4 plot commands create plots in cells (1,1), (1,2), (2,1), (2,2)
Plots - saving to a file
x = rnorm(100, mean = 2, sd = 2)
y = rnorm(100, mean = 10, sd = 1)
png(filename = "D:/testplots.png")
par(mfrow = c(2,2))
plot(x, y, xlab = "X", ylab = "Y", main = " ", pch = 19, cex = 0.7, col = "blue")
plot(0, 0, type = "n", axes = FALSE, xlab = "", ylab = "", main = "")   # empty panel
text(0, 0, "NO DATA")
hist(y, xlab = "Y", ylab = "frequency", main = "histogram-y", col = "grey", border = "blue", lwd = 2)
plot(density(x), col = "limegreen", lwd = 2, xlab = "x", ylab = "density", main = "density plot (X) ")
dev.off()
The code creates a 2-by-2 panel of plots and saves it in a png file at the location D:/testplots.png
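The pattern is always the same: open a file device, draw, then close the device with dev.off(). The sketch below applies it with pdf() instead of png(), and with a temporary file instead of a D:/ path; both substitutions are choices made here so that the example runs on machines without png support.

```r
x <- rnorm(100, mean = 2, sd = 2)

# Hypothetical output location: a temporary file
out_path <- tempfile(fileext = ".pdf")

pdf(file = out_path)                 # 1. open the file device
old_par <- par(mfrow = c(1, 2))      #    remember the previous layout settings
hist(x, main = "histogram-x", col = "grey")     # 2. draw as usual
plot(density(x), main = "density plot")
par(old_par)                         #    restore the layout settings
invisible(dev.off())                 # 3. close the device; the file is complete only after this

print(file.exists(out_path))
```

Forgetting dev.off() is a common reason for an empty or unreadable output file.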
Control Logic
Control
while ()
# Generate k random numbers from N(0,1)
# k is not fixed a priori.
# Stop when the sum of the values exceeds 5
x = NULL ; stopIter = FALSE
while (!stopIter) {
  x = c(x, rnorm(1, mean = 0, sd = 1))
  sumx = sum(x)
  if (sumx > 5) { stopIter = TRUE }
}
for ()
# Example of a for loop
x = rnorm(100) ; y = rep(0, length(x))
for (i in 1:length(x)) { y[i] = x[i]^3 }
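Because arithmetic in R is element-wise, the for loop above can also be written without a loop. The sketch below compares both forms and adds a note on seq_along(), which is a common, safer alternative to 1:length(x); the names y_loop, y_vec, empty and count are illustrative.

```r
set.seed(42)          # fixed seed so the example is repeatable
x <- rnorm(100)

# Loop version, as on the slide, but iterating with seq_along()
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  y_loop[i] <- x[i]^3
}

# Vectorized version: one expression, no explicit loop
y_vec <- x^3
print(isTRUE(all.equal(y_loop, y_vec)))

# Why seq_along(): for an empty vector, 1:length(x) is c(1, 0) and the loop body would run twice
empty <- numeric(0)
count <- 0
for (i in seq_along(empty)) {
  count <- count + 1
}
print(count)                      # 0, the loop body never runs
print(length(1:length(empty)))    # 2
```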
Working with Strings
Working with Strings
x = nchar("WRA data Filtering")   # counts the number of characters: x = 18 in this case
MetID = 2 ; x = paste("Met", MetID, sep = ":")   # string concatenation: x = "Met:2"
x = substr("Met 12", start = 1, stop = 5)   # substring from position 1 to 5: x = "Met 1"
x = strsplit("Met1 has no data", split = " ")   # splits the string by " ". Returns a list
y = unlist(x)   # y is a vector with 4 elements: "Met1", "has", "no", "data"
x = sub(pattern = "Met1", replacement = "Met2", x = "Met1 is empty")   # replaces the first match: x = "Met2 is empty"
x = gsub("Met1", "Met2", x = "Met1 is empty. Met1 has no data.")   # replaces all matches: x = "Met2 is empty. Met2 has no data."
x = c("red", "Blue", "green", "skyblue")
y = grep(pattern = "blue", x = x, ignore.case = TRUE)   # y = (2,4), the positions of the matches
z = grep(pattern = "blue", x = x, ignore.case = TRUE, value = TRUE)
# z = ("Blue", "skyblue"), the actual strings that match the pattern
Regular Expressions
x = c("ht_10m", "ht:20m", " ht_30m")
y = gsub("^ht_", "HT:", x)   # y = ("HT:10m", "ht:20m", " ht_30m")
# Replace "ht_" at the beginning of the string with "HT:"
y = gsub("m$", "mtr", x)   # y = ("ht_10mtr", "ht:20mtr", " ht_30mtr")
# Replace "m" at the end of the string with "mtr"
y = gsub("[0-9]+", "XXX", x)   # y = ("ht_XXXm", "ht:XXXm", " ht_XXXm")
# Replace one or more consecutive digits with "XXX"
y = gsub("_[0-9]+", "XXX", x)   # y = ("htXXXm", "ht:20m", " htXXXm")
# Replace one or more consecutive digits preceded by "_" with "XXX"
u = grep("^ht_[0-9]+m", x) ; y = x ; y[-u] = "invalid!"   # y = ("ht_10m", "invalid!", "invalid!")
# Used for checking the validity of the format of a string
Regular expressions provide a vast number of options for manipulating strings. Study the R help on regular expressions to learn more.
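The validity check can also be written with grepl(), which returns one TRUE/FALSE per string. This avoids a corner case of the y[-u] form: when no string matches, u is empty and y[-u] selects nothing. The sketch below is one possible variant; the added "$" anchor, the use of trimws() and the names ok, x_trim and heights are choices made for this illustration.

```r
x <- c("ht_10m", "ht:20m", " ht_30m")

# grepl() returns a logical vector, one entry per string
ok <- grepl("^ht_[0-9]+m$", x)
y <- ifelse(ok, x, "invalid!")
print(y)          # "ht_10m" "invalid!" "invalid!"

# Leading or trailing spaces can be removed before checking
x_trim <- trimws(x)
ok_trim <- grepl("^ht_[0-9]+m$", x_trim)
print(ok_trim)    # TRUE FALSE TRUE

# Parentheses capture part of the match; "\\1" refers to the captured digits
heights <- as.numeric(sub("^ht_([0-9]+)m$", "\\1", x_trim[ok_trim]))
print(heights)    # 10 30
```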
Writing functions
Function
Define the function
GetSummary = function(x = NULL) {   # argument x, with default value NULL
  output = list(SumOfSqr = NA, Mean_x = NA, Failed = TRUE)
  # Input validation (a comment starts with #)
  if (is.null(x) || length(x) == 0) { return(output) }
  if (!is.numeric(x)) { return(output) }
  ###############
  output$SumOfSqr = sum(x^2, na.rm = TRUE)
  output$Mean_x = mean(x, na.rm = TRUE)
  output$Failed = FALSE
  return(output)   # return value
}
Use the function
x = rnorm(1000) ; out = GetSummary(x)
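It is worth calling such a function with both valid and invalid input to see the validation at work. The sketch below defines a compact variant under the hypothetical name GetSummary2 (so that it does not overwrite the slide's function) and tries three calls.

```r
# Hypothetical variant of the slide's function, written compactly for this illustration
GetSummary2 <- function(x = NULL) {
  output <- list(SumOfSqr = NA, Mean_x = NA, Failed = TRUE)
  # Return the "failed" result for missing, empty or non-numeric input
  if (is.null(x) || length(x) == 0 || !is.numeric(x)) {
    return(output)
  }
  output$SumOfSqr <- sum(x^2, na.rm = TRUE)
  output$Mean_x <- mean(x, na.rm = TRUE)
  output$Failed <- FALSE
  output    # the value of the last expression is returned
}

out1 <- GetSummary2(c(1, 2, 3, NA))   # valid input; the NA is ignored
out2 <- GetSummary2("abc")            # non-numeric input
out3 <- GetSummary2()                 # no input, so the default NULL is used

print(out1$SumOfSqr)   # 14
print(out1$Mean_x)     # 2
print(out2$Failed)     # TRUE
print(out3$Failed)     # TRUE
```

Returning a list with a Failed flag, as the slide does, lets the caller check the outcome without the function stopping the whole script.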
Further Resources
Further Help on R
- http://cran.r-project.org/
- http://www.r-project.org/search.html
  This page provides links to search engines specific to R
- Search for "R tutorial", "R forum" …
Have fun exploring the world of R