Data
We will work with data on global health and economic development. The data set is called gapminder and is part of the gapminder package in R. You will need to install the gapminder package before beginning the questions.
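One possible way to set up is sketched below. It assumes you install from CRAN (the mirror URL is an assumption; any mirror works) and that dplyr is already installed, since the questions rely on it.

```r
# Install gapminder only if it is not already available.
# The CRAN mirror below is an assumption; substitute your preferred mirror.
if (!requireNamespace("gapminder", quietly = TRUE)) {
  install.packages("gapminder", repos = "https://cloud.r-project.org")
}

library(gapminder)  # provides the gapminder data frame
library(dplyr)      # provides filter(), mutate(), count(), summarize(), ...

# Quick look at the columns and their types before starting the questions
glimpse(gapminder)
```

If glimpse() prints columns such as year, lifeExp, and gdpPercap, the data is loaded and ready to use.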
dplyr practice with gapminder data
In this activity, you will practice using the dplyr functions we learned in class.
- Fill in the following code to create a new data frame, containing only countries in 2007 with life expectancy at least 70 years and GDP per capita at most $20000.
new_gapminder <- gapminder |>
  filter(year ...,
         lifeExp ...,
         gdpPercap ...)
- Fill in the following code to count the number of countries in each continent in the data for 2007.
gapminder |>
  filter(...) |>
  count(...)
- Fill in the following code to create a data frame with a new column that is the natural log of GDP per capita. (Hint: in R, the natural log function is log.)
new_gapminder <- gapminder |>
  mutate(log_gdp_percap = ...)
- Fill in the following code to calculate the median natural log of GDP per capita in countries with a life expectancy of at least 70 years in 2007. (Hint: in R, the median function is median.)
gapminder |>
  mutate(log_gdp_percap = ...) |>
  filter(...) |>
  summarize(...)
- Fill in the following code to calculate the median natural log of GDP per capita in countries with a life expectancy of at least 70 years in 2007, broken down by continent.
gapminder |>
  mutate(...) |>
  filter(...) |>
  group_by(...) |>
  summarize(...)
- Does it matter whether we mutate or filter first in question 5?
- Calculate the median life expectancy for each continent in 2007, and the correlation between log GDP per capita and life expectancy for each continent in 2007.
- Using your summary statistics, describe the relationship between GDP per capita and life expectancy, and summarize the differences between continents.
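The question on medians and correlations by group asks for more than one summary statistic per group. As a hedged illustration of that general pattern, the sketch below uses a small made-up data frame (toy, with hypothetical columns group, x, and y) rather than gapminder, so it does not answer the question itself. It relies on the base R function cor, which computes the correlation between two numeric vectors.

```r
library(dplyr)

# Hypothetical toy data for illustration only; not taken from gapminder
toy <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  x     = c(1, 2, 3, 1, 2, 3),
  y     = c(2, 4, 6, 3, 2, 1)
)

# Several summaries can be computed in a single summarize() call;
# each one is evaluated separately within every group
toy_summary <- toy |>
  group_by(group) |>
  summarize(
    median_y = median(y),
    cor_xy   = cor(x, y)
  )

toy_summary
```

Here y rises exactly linearly with x in group a and falls exactly linearly in group b, so the two correlations come out as 1 and -1. Real data will give values strictly between those extremes.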