The airlines
SQL database contains information on 48 million flights from the Bureau of Transportation Statistics (you have worked with a small subset of this data in the nycflights13
package). Information from the database can be obtained through SQL queries. For example, the flights contains the following tables:
4 records
Tables_in_airlines |
---|
airports |
carriers |
flights |
planes |
Make sure the mdsr
and DBI
packages are installed, then run the following code in R:
Let’s try a quick query to see that it works:
## name
## 1 Titan Airways
## 2 Tradewind Aviation
## 3 Comlux Aviation, AG
## 4 Master Top Linhas Aereas Ltd.
## 5 Flair Airlines Ltd.
Here, we are querying SQL through R. We store the query in a string, and then use the dbGetQuery
function to send that query to the database and return the results.
tailnum
and year
columns from the planes
table. For safety, we are only going to return the first 5 rows.MIN()
function, find the oldest date of manufacture (year
) in the planes
table.WHERE
, find the tailnum
of the oldest plane in the planes
table.Questions 2 and 3 required two separate queries: first to find the oldest plane, then to find the tail number. We can be more efficient by using a subquery to choose only the years equal to the smallest year
query <- "
SELECT tailnum, year
FROM planes
WHERE year = (SELECT MIN(year) FROM planes)
LIMIT 0,5;
"
dbGetQuery(db, query)
## tailnum year
## 1 N381AA 1956
SUM(1)
or COUNT(...)
to count rows.