Jorge Cimentada and Basilio Moreno
6th of July 2019
Functions are R's black box…
Take the function mean as example.
mean(iris$Sepal.Length)
[1] 5.843333
Functions are just like other 'commands' in Stata, SPSS or SAS.
SPSS: mean()
Stata: mean; egen mean
SAS: MEAN
Around 250,000 to be more exact! 190 times more than SAS.
We don't have enough time to cover functions, for that, see here.
Today we'll cover the basics. Let's start!
Can anyone tell me what does the mean() function do?
sum() all numbers and divide by the total length() of the vector.mean_vector <- 1:100
sum(mean_vector)/length(mean_vector)
[1] 50.5
How can we turn this into a function?
our_mean <- function(x) {
sum(x)/length(x)
}
our_mean(mean_vector)
[1] 50.5
mean(mean_vector)
[1] 50.5
Great job!
our_mean <- function(x) {
sum(x)/length(x)
}
our_mean is the name of our functionx is the only argument (but there can be more!){} is the code to execute, more formally, the body of the function..1 Create a function called adder
.2 It accepts two arguments called x and y
.3 Inside the body, add y and x and don't give with it a name.
adder <- function(x, y) {
y + x
}
You often create function to avoid repeating code.
Example:
mtcars_two <- mtcars
mtcars_two$cyl <- as.character(mtcars$cyl)
mtcars_two$vs <- as.character(mtcars$vs)
mtcars_two$am <- as.character(mtcars$am)
mtcars_two$gear <- as.character(mtcars$gear)
mtcars_two$carb <- as.character(mtcars$carb)
Transforming, eh? Typical.
old_var and new_varFirst we start with the code that works
old_var <- "cyl"
new_var <- "cyl"
as.character(mtcars$old_var)
Does this work?
as.character(mtcars[, old_var])
Now we have to assign the new name.
mtcars$new_var <- as.character(mtcars[, old_var])
Does this work?
mtcars[new_var] <- as.character(mtcars[, old_var])
Okay, so we got this working…
old_var <- "cyl"
new_var <- "cyl"
mtcars[new_var] <- as.character(mtcars[, old_var])
Wrap it in a function!
to_character <- function(old_var, new_var) {
mtcars[new_var] <- as.character(mtcars[, old_var])
mtcars
}
our_mtcars <- to_character(new_var = "cyl", old_var = "cyl") # why did this order change?
class(our_mtcars$cyl)
[1] "character"
All good and well but this only works for the mtcars dataset!
df to the to_character functionmtcars with df inside the functionto_character <- function(df, old_var, new_var) {
df[new_var] <- as.character(df[, old_var])
df
}
Let's try it with the iris data!
This data frame is already available in the working environment.
Check head(iris)
our_iris <- to_character(iris, "Species", "Species") # why didn't I name the arguments?
class(our_iris$Species)
[1] "character"
Just as in our own function, functions can have many many arguments or options.
For example..
url <- "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
mtcars <- read.csv(file = url, sep = ",", header = TRUE, row.names = 1)
Answer this:
When you don't know what a function or its arguments do, search for its help page.
?read.csvThings to consider:
?mean?sdWith this vector
vec <- sample(c(1:100, NA), 1000, replace = T)
mean and sd (standard deviation)In R everything is a function, which means that you should learn how to understand functions.
x <- table(sample(1:5, 100, replace = T))
Using ?barplot and barplot(), reproduce the plot from below exactly.
barplot(x) to see what you're missingTake it a bit further and create a plot like this..
x <- rnorm(100)
y <- x + rnorm(100, sd = 2)
This will require to read ?plot in detail! That's the whole point of understanding functions.
Start simple by running plot(x, y)!
Help files have several sections you need to be aware of.
For example, let's create a data frame. This would be the function to use.
?data.frame
How many arguments have I used?
data.frame(num = 1:10, char = letters[1:10], sample(c(T, F), 10, replace = T))
What changed from the example in the help document?
data.frame(num = 1:10, char = letters[1:10], sample(c(T, F), 10, replace = T),
row.names = 1, check.rows = TRUE, fix.empty.names = FALSE)
In the RECSM seminars you'll be using some advanced R which is why we need to take you to the limit!
lm (Fitting linear models) function and the mtcars dataset.by to split mtcars by the factor cyl and apply the summary functionmtcars called mpg_mean using ifelse. It gives back a 1 when mpg is above or equal to the mean and 0 when it's not.Remember to use ?function
lm(mpg ~ vs + cyl, data = mtcars)
by(mtcars, mtcars$cyl, summary)
mtcars$mpg_mean <- ifelse(mtcars$mpg >= mean(mtcars$mpg), 1, 0)
Packages are one of the most important things in R.
Where are R packages? In something called CRAN (Comprehensive R Archive Network)
How do you install them?
install.packages("cowsay")
install.packages("lme4")
How do you use them? Once installed we will have to call them in order to get them running in the current session.
library("cowsay")
library("lme4")
Here we have some more info provided by the help documents.
?cowsay::say
?lme4::nlmer
Read a bit, and then check the examples!
How do we repeat things?
for (column in mtcars) {
if (is.numeric(column)) {
print(is.numeric(column))
} else {
message("Not numeric")
}
}
Let's explain it in the console…
I think you're ready for some real R programming…