Working in small groups or pairs, complete the following exercises.

An Icebreaker

First, introduce yourself briefly to the people at your table (or in your group). Decide on a name for your table/group. Now imagine that you and your group mates work together at a company as data analysts and have received the following data set:

id  age   marital   education  job          balance  day  month  duration
1   44    married   secondary  blue-collar  16178    21   nov    297
2   88    married   secondary  admin.       330      2    dec    357
3   36    divorced  secondary  blue-collar  853      20   jun    15
4   25<=  single    secondary  technician   616      28   jul    117
5   33    single    secondary  services     310      12   m      54
6   37    married   tertiary   management   0        16   jul    -268
7   42    married   tertiary   management   1205     15   mar    129
8   43    married   secondary  blue-collar  130      5    may    156
9   58    married   primary    u            99999    26   aug    168
10  41    married   secondary  admin.       3634     14   may    216
11  0     married   primary    management   92       2    feb    447
12  34    single    secondary  services     528D     2    sep    121
13  28    single    secondary  admin.       350      19   may    5
14  58    widowed   tertiary   management   136      8    jul    199
15  34    married   unknown    blue-collar  41       6    may    34

The data set is a random sample from the Bank Marketing data at the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Bank+Marketing), manipulated for the purpose of this task. It contains the following variables:

id: Customer ID number

age: Numerical variable

marital: Categorical variable with three levels (married, single, divorced, where widowed is counted as divorced)

education: Categorical variable with three levels (primary, secondary, tertiary)

job: Categorical variable containing type of jobs

balance: Numerical variable, balance in the bank account

day: Numerical variable, last contact day of the month

month: Categorical variable, month of the last contact

duration: Numerical variable, duration of the last contact

Tasks

  1. Identify possible problems/errors in this data set. As a group, decide on the three problems/errors that would be most problematic for your data analysis.

  2. Post your group’s opinion on the discussion board. Don’t forget to read and comment on other groups’ responses.
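Once your group has agreed on its list, you may want to see how a first inspection of the same table could look in R. The following is only a minimal sketch, not a required part of the exercise: the object names (raw, bank, numeric_cols, not_plain_integer) are made up for illustration, and every column is deliberately read as text so that unusual entries are kept exactly as printed above.

```r
# The sample exactly as printed in the exercise (whitespace-separated).
raw <- "id age marital education job balance day month duration
1 44 married secondary blue-collar 16178 21 nov 297
2 88 married secondary admin. 330 2 dec 357
3 36 divorced secondary blue-collar 853 20 jun 15
4 25<= single secondary technician 616 28 jul 117
5 33 single secondary services 310 12 m 54
6 37 married tertiary management 0 16 jul -268
7 42 married tertiary management 1205 15 mar 129
8 43 married secondary blue-collar 130 5 may 156
9 58 married primary u 99999 26 aug 168
10 41 married secondary admin. 3634 14 may 216
11 0 married primary management 92 2 feb 447
12 34 single secondary services 528D 2 sep 121
13 28 single secondary admin. 350 19 may 5
14 58 widowed tertiary management 136 8 jul 199
15 34 married unknown blue-collar 41 6 may 34"

# Read every column as character so nothing is silently coerced or dropped.
bank <- read.table(text = raw, header = TRUE, colClasses = "character")

# Overview of the columns and their first few values.
str(bank)

# For the columns documented as numerical, list entries that are not
# plain (optionally negative) whole numbers.
numeric_cols <- c("age", "balance", "day", "duration")
not_plain_integer <- lapply(bank[numeric_cols], function(x) {
  x[!grepl("^-?[0-9]+$", x)]
})
not_plain_integer

# For the categorical columns, count how often each value occurs so the
# values can be compared with the documented levels.
lapply(bank[c("marital", "education", "job", "month")], table)
```

A check like this only catches values that break the expected format. Entries that are well formed but implausible still need the judgement your group applied in the discussion.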

Hands-on R exercise: swirl

The swirl package (you can read more about the swirl project here) teaches you R programming interactively, at your own pace, and right in the R console! You will use swirl in class to complete the “R Programming - Basic Building Blocks” course.

In order to run swirl, you must have R 3.1.0 or later installed on your computer. In addition to R, it’s highly recommended that you install RStudio, which will make your experience with R much more enjoyable. If you need to install RStudio, you can do so here by selecting the appropriate installer for your operating system.
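If you are unsure which version of R you have, the following short sketch shows one way to check it from the console against the 3.1.0 minimum mentioned above. The variable name r_is_recent_enough is just an illustrative choice.

```r
# Show the version string of the running R installation.
R.version.string

# Compare the running version with the minimum stated above (3.1.0).
# getRversion() returns a version object that can be compared to a string.
r_is_recent_enough <- getRversion() >= "3.1.0"
r_is_recent_enough
```

If the result is FALSE, update R before installing swirl.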

Tasks

  1. Open RStudio (I assume you have already installed R and RStudio).

  2. Install swirl by typing the following into the console:

install.packages("swirl")

  3. Type the following into the console to load the package, install the “R Programming E” course, and start swirl:

library(swirl)
install_course("R Programming E")
swirl()

  4. When swirl opens a session, enter your name and make a course selection. Select the “1: Basic Building Blocks” topic by typing 1 in the console.

  5. Follow the instructions and complete the Basic Building Blocks course. Don’t rush; you will have plenty of time to complete it.

Finished?

If you have finished the above tasks, work through the weekly list of tasks posted on the Canvas announcement page.
