Last time
nsim <- 1000
n <- 100 # sample size
beta0 <- 0.5 # intercept
beta1 <- 1 # slope
results <- rep(NA, nsim)
for(i in 1:nsim){
x <- runif(n, min=0, max=1)
noise <- rchisq(n, 1)
y <- beta0 + beta1*x + noise
lm_mod <- lm(y ~ x)
ci <- confint(lm_mod, "x", level = 0.95)
results[i] <- ci[1] < 1 & ci[2] > 1
}
mean(results)
What if I want to repeat my simulations with a different sample size n
?
Simulation code for multiple sample sizes
nsim <- 1000
beta0 <- 0.5 # intercept
beta1 <- 1 # slope
results <- rep(NA, nsim)
n <- 100 # sample size
for(i in 1:nsim){
...
}
n <- 200 # new sample size
for(i in 1:nsim){
...
}
Are there any issues with this code?
Coding best practices
So far:
- No magic numbers
- Comment your code
- Use informative names
- Set a seed for reproducibility
Also: don’t repeat the same chunk of code multiple times
Functions
assess_coverage <- function(n, nsim){
results <- rep(NA, nsim)
for(i in 1:nsim){
x <- runif(n, min=0, max=1)
noise <- rchisq(n, 1)
y <- 0.5 + 1*x + noise
lm_mod <- lm(y ~ x)
ci <- confint(lm_mod, "x", level = 0.95)
results[i] <- ci[1] < 1 & ci[2] > 1
}
return(mean(results))
}
assess_coverage(n = 100, nsim = 1000)
Functions
Now I can change the value of n
without re-writing all the code!
assess_coverage(n = 100, nsim = 1000)
assess_coverage(n = 200, nsim = 1000)
Function components
Here is a simple function to calculate the absolute value:
my_abs <- function(x){
return(ifelse(x >= 0, x, -1*x))
}
my_abs(-3)
- name:
my_abs
- arguments:
x
- body: everything in the curly braces
{ }
Function arguments
- The arguments
n
and nsim
allow us to change the sample size and number of simulations
- What other parts of the simulation might we want to change?
assess_coverage <- function(n, nsim){
results <- rep(NA, nsim)
for(i in 1:nsim){
x <- runif(n, min=0, max=1)
noise <- rchisq(n, 1)
y <- 0.5 + 1*x + noise
lm_mod <- lm(y ~ x)
ci <- confint(lm_mod, "x", level = 0.95)
results[i] <- ci[1] < 1 & ci[2] > 1
}
return(mean(results))
}
Function arguments
assess_coverage <- function(n, nsim, beta0, beta1){
results <- rep(NA, nsim)
for(i in 1:nsim){
x <- runif(n, min=0, max=1)
noise <- rchisq(n, 1)
y <- beta0 + beta1*x + noise
lm_mod <- lm(y ~ x)
ci <- confint(lm_mod, "x", level = 0.95)
results[i] <- ci[1] < beta1 & ci[2] > beta1
}
return(mean(results))
}
Ordering and arguments
my_power <- function(x, y){
return(x^y)
}
- If you don’t name the arguments when calling a function, R assumes you passed them in the order of the function definition
Function defaults
my_power <- function(x, y){
return(x^y)
}
What will happen when I run the following code?
Function defaults
my_power <- function(x, y){
return(x^y)
}
What will happen when I run the following code?
Error in my_power(3): argument "y" is missing, with no default
Function defaults
my_power <- function(x, y=2){
return(x^y)
}
What will happen when I run the following code?
Function defaults
my_power <- function(x, y=2){
return(x^y)
}
What will happen when I run the following code?
Function defaults
my_power <- function(x, y=2){
return(x^y)
}
What will happen when I run the following code?
Function defaults
my_power <- function(x, y=2){
return(x^y)
}
What will happen when I run the following code?
Function defaults
my_power <- function(x, y){
return(x^y)
}
What will happen when I run the following code?
Function defaults
my_power <- function(x, y){
return(x^y)
}
What will happen when I run the following code?
Error in my_power(3): argument "y" is missing, with no default
Function defaults
my_power <- function(x=2, y=4){
return(x^y)
}
What will happen when I run the following code?
Function defaults
my_power <- function(x=2, y=4){
return(x^y)
}
What will happen when I run the following code?
Function arguments
We can also pass functions as arguments!
assess_coverage <- function(n, nsim, beta0, beta1, noise_dist){
results <- rep(NA, nsim)
for(i in 1:nsim){
x <- runif(n, min=0, max=1)
noise <- noise_dist(n)
y <- beta0 + beta1*x + noise
lm_mod <- lm(y ~ x)
ci <- confint(lm_mod, "x", level = 0.95)
results[i] <- ci[1] < beta1 & ci[2] > beta1
}
return(mean(results))
}
assess_coverage(n = 100, nsim = 1000, beta0 = 0.5, beta1 = 1,
noise_dist = rexp)
Summary
- Functions can be used to avoid repeating code
- Arguments allow us specify the inputs when we call a function
- If inputs are not named when calling the function, R uses the ordering from the function definition
- All arguments must be specified when calling a function
- Default arguments can be specified when the function is defined
- The input to a function can be a function!