Class activity solutions, August 28
The birthday problem
- Creating a vector to store the days of the year (useful for sampling later):
days <- 1:365
- Choosing birthdays:
set.seed(33)
n_students <- 30
birthdays <- sample(days, n_students, replace=TRUE)
- Are there 30 unique birthdays, or do we have a repeated birthday?
length(unique(birthdays)) < n_students
[1] TRUE
We have at least one repeated birthday!
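To see which day was repeated, and not just that a repeat occurred, something like the following sketch could be used. It re-creates the same sample so that it runs on its own; `repeated_days` is a name introduced here for illustration and is not part of the original activity.

```r
set.seed(33)
days <- 1:365
n_students <- 30
birthdays <- sample(days, n_students, replace = TRUE)

# duplicated() flags every occurrence after the first, so this lists
# the day(s) of the year shared by more than one student
repeated_days <- unique(birthdays[duplicated(birthdays)])
repeated_days

# How many students have each repeated birthday
table(birthdays)[as.character(repeated_days)]
```

If no birthday is repeated, `repeated_days` is simply an empty vector.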
- Now let’s repeat the simulation many times:
set.seed(33)
days <- 1:365 # days of the year
n_students <- 30
nsim <- 10000
results <- rep(NA, nsim) # store the simulation results
for(i in 1:nsim){
  birthdays <- sample(days, n_students, replace=TRUE)
  results[i] <- length(unique(birthdays)) < n_students
}
mean(results)
[1] 0.7077
The probability of at least one shared birthday is approximately 71%.
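As a check on the simulation, the exact probability can be written down under the same assumption the code makes (365 equally likely birthdays, chosen independently). The probability that all n birthdays are different is a product of n fractions, so the probability of at least one shared birthday is:

```latex
P(\text{at least one shared birthday})
  = 1 - \prod_{k=0}^{n-1} \frac{365 - k}{365}
```

For n = 30 this works out to roughly 0.706, which agrees with the simulated value of 0.7077 to within simulation error.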
- How many students do we need for the probability to be approximately 50%? The answer is 23:
set.seed(213)
days <- 1:365 # days of the year
n_students <- 23
nsim <- 10000
results <- rep(NA, nsim) # store the simulation results
for(i in 1:nsim){
  birthdays <- sample(days, n_students, replace=TRUE)
  results[i] <- length(unique(birthdays)) < n_students
}
mean(results)
[1] 0.5076
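One way to confirm that 23 is the smallest class size, without trying values one at a time, is to compute the exact probability for a range of class sizes. This is a sketch rather than part of the original activity: `p_shared`, `n_grid`, and `p_exact` are names made up for this example, and it assumes 365 equally likely days, as the simulation does.

```r
# Exact probability that at least two of n people share a birthday,
# assuming n_days equally likely, independent birthdays
p_shared <- function(n, n_days = 365) {
  if (n > n_days) return(1)  # pigeonhole: a repeat is guaranteed
  # (365/365) * (364/365) * ... * ((365 - n + 1)/365) is P(all different)
  1 - prod((n_days - seq_len(n) + 1) / n_days)
}

n_grid <- 1:60
p_exact <- sapply(n_grid, p_shared)

# Smallest class size with probability of at least 50%
min(n_grid[p_exact >= 0.5])

# Exact values to compare with the two simulations
p_shared(23)
p_shared(30)
```

The smallest class size returned is 23, and the exact values (about 0.507 and 0.706) are close to the simulated 0.5076 and 0.7077. Base R's stats package also provides `pbirthday()` and `qbirthday()` for this kind of calculation.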