Read all instructions before starting.

Motivation

In this assignment you will practice programming in R, using various methods to normalize continuous predictor variables and to encode categorical predictor variables for use in distance-based machine learning algorithms such as kNN and k-means.
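To see why scaling matters for distance-based algorithms, here is a minimal sketch in base R. The toy data frame, its column names, and the helper functions `min_max()` and `z_score()` are hypothetical illustrations, not part of the assignment data set or a required approach. Min-max normalization and z-score standardization are shown only as two common choices.

```r
# Hypothetical toy data: two continuous predictors on very different scales.
df <- data.frame(
  age    = c(25, 40, 58),
  income = c(30000, 52000, 120000)
)

# Min-max normalization rescales a numeric vector to the range [0, 1].
# Assumes x is not constant (otherwise the denominator is zero).
min_max <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Z-score standardization centers on the mean and divides by the
# sample standard deviation.
z_score <- function(x) {
  (x - mean(x)) / sd(x)
}

df_minmax <- as.data.frame(lapply(df, min_max))
df_z      <- as.data.frame(lapply(df, z_score))

# Pairwise Euclidean distances before and after scaling. On the raw data,
# income dominates the distance because of its much larger numeric range.
dist(df)
dist(df_minmax)
```

Comparing the two `dist()` results shows how the raw distances are driven almost entirely by `income`, while the scaled version lets both predictors contribute.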

Learning Outcomes

Completing this assignment will provide you with practice opportunities to

Format

Working individually is recommended, but working in pairs may be helpful.

Prerequisites

Prior to working on this assignment, it is suggested that you review these lessons and refer to them during the assignment:

Tools Needed

Setup

Create a new project in RStudio and then, within that project, create a new R Notebook. Set the title parameter of the notebook to “Practice / Normalization and Encoding”; set the author parameter of the notebook to your name; set the date parameter to today’s date.

Follow the instructions below and build an R code chunk for each of the questions. If you don’t know how to proceed or don’t understand the instructions, review the prerequisite tutorials.

Use a level 3 header (using ###) for each part of the exercise, e.g., ### One-Hot Encoding. Label your code chunks.
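As an illustration of what might go under a header such as ### One-Hot Encoding, the following is a minimal sketch in base R. The `color` variable is hypothetical and not taken from the assignment data set, and `model.matrix()` is only one of several ways to build indicator columns. In a notebook, this code would sit inside a labeled chunk, for example one opened with `{r one-hot-encoding}` (the label name is an assumption).

```r
# Hypothetical categorical predictor; not part of the assignment data set.
df <- data.frame(color = factor(c("red", "green", "blue", "green")))

# model.matrix() expands a factor into indicator columns. The "- 1" term
# drops the intercept so that every level gets its own 0/1 column.
one_hot <- model.matrix(~ color - 1, data = df)

# Each row has exactly one 1, in the column matching that row's level.
one_hot
```

Factor levels are sorted alphabetically by default, so the resulting columns are named `colorblue`, `colorgreen`, and `colorred`.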

Tasks

  1. Download the data set and save it in your project folder.


Hints & Resources

None yet.


Solution

Not available.