Read all instructions before starting.

Motivation

In this assignment you will have an opportunity to practice programming in R.

Learning Outcomes

Completing this assignment will provide you with practice opportunities to

Format

Working individually is recommended, but working in pairs may be helpful.

Prerequisites

Prior to working on this assignment, it is suggested that you review these lessons and refer to them during the assignment:

Tools Needed

Setup

Create a new project in R Studio and then, within that project, create a new R Notebook. Set the title parameter of the notebook to “Practice / Getting to know R”; set the author parameter of the notebook to your name; set the date parameter to today’s date.

Follow the instructions below and build an R code chunk for each of the questions. If you don’t know how to proceed or don’t understand the instructions, review the prerequisite tutorials.

You should not use any additional packages (such as purrr or tidyverse); you should learn to do the tasks using only ‘Base R’.

Use a level 2 header (using ##) for each new question and use the question number as your title, e.g., ## Question 3.

Label each code chunk with the question number and the objective, e.g.,

```{r Q1_LoadCSV}
   ... your code goes here ...
```

Tasks

  1. Load the CSV file FlightsWithAirlines.csv into a data frame called df.flights. Load the text (string) columns as text rather than as factors by passing stringsAsFactors = FALSE to the function that loads the data. Load the file from its URL rather than downloading it to your computer. To get the URL for the CSV, right-click on the link and then select “Copy Link Address” or a similar menu option for your browser; do not click on the link as that will cause the browser to attempt to download and display the file.

  2. Use the R function str() to understand the structure of the data frame.

  3. Use the R function head() to display the first 4 rows.

  4. Use the R function tail() to display the last 5 rows.

  5. Display only the carrier, flight, origin, and destination columns from the dataframe.

  6. Calculate the average (mean) departure delay (column named dep_delay) and display the result in R using the cat() function. Hint: Look up functions in Help in R Studio or online.

  7. Add a new column ‘tod’ to the data frame with a value of “am” or “pm” depending on whether the departure time was AM (before 12 noon) or PM (on or after 12 noon). The time in the data file is in 24-hour format. Hint: Look up how to use the ifelse function. Use the dep_hr column. Print the data frame to ensure the new column is there and is correct, but only display the carrier, flight, dep_hr, and the new tod columns.

  8. For each flight that was delayed, display the carrier, flight number, and the actual departure time (scheduled departure plus departure delay). Display the time in the format hh:mm in 24-hour format, e.g., display 23:20 rather than 11:20 or 11:20PM.
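
The Base R techniques the tasks call for can be tried out first on a small toy data set. The sketch below is not the assignment solution: the three rows are made-up values, the data is read from an inline string rather than from the course URL, and the dep_min column (scheduled departure minute) is an assumption about how a scheduled time might be stored, so check str() on the real file before adapting any of it.

```r
# Toy CSV text with hypothetical values; dep_min is an assumed column name.
csv.text <- "carrier,flight,dep_hr,dep_min,dep_delay
AA,101,9,15,-3
DL,202,12,0,20
UA,303,23,10,70"

# read.csv() accepts a file path, a URL, or (as here) inline text.
# stringsAsFactors = FALSE keeps text columns as character vectors.
df.toy <- read.csv(text = csv.text, stringsAsFactors = FALSE)

# Inspect the structure and the first / last n rows.
str(df.toy)
head(df.toy, 2)
tail(df.toy, 1)

# Select a subset of columns by name.
df.toy[, c("carrier", "flight")]

# Mean of a numeric column (ignoring missing values), printed with cat().
avg.delay <- mean(df.toy$dep_delay, na.rm = TRUE)
cat("Mean departure delay:", avg.delay, "minutes\n")

# ifelse() is vectorized: it evaluates the condition once per row.
df.toy$tod <- ifelse(df.toy$dep_hr < 12, "am", "pm")

# Keep only rows with a positive delay.
delayed <- df.toy[df.toy$dep_delay > 0, ]

# Convert to minutes after midnight, add the delay, and wrap past midnight.
total.min <- (delayed$dep_hr * 60 + delayed$dep_min + delayed$dep_delay) %% (24 * 60)

# sprintf() with %02d zero-pads each part to two digits (hh:mm).
delayed$actual <- sprintf("%02d:%02d",
                          as.integer(total.min %/% 60),
                          as.integer(total.min %% 60))

delayed[, c("carrier", "flight", "actual")]
```

Working in minutes after midnight keeps the arithmetic simple: integer division (%/%) recovers the hour and the modulus (%%) recovers the minute, and wrapping at 24 * 60 handles a delay that pushes a departure past midnight.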


Hints & Resources

None yet.


Solution

Only watch the solution commentary once you have spent sufficient time solving the problem for yourself.

Solution Code: A-6.103-Solution.Rmd