Week 12

class: center, middle, inverse, title-slide

# Week 12

---

class: inverse, center, middle

# Revision 2

---
# Question 1

Consider the following string. Which command would you use to replace each `x` with a blank (whitespace)?

```r
string <- c("169 millimeters x 117 millimeters x 9.1 millimeters")
```

- `A. chartr(string, x)`
- `B. chartr(string, "x", "~")`
- `C. chartr(string, old = "x", new=" ")`
- `D. chartr(string, "x", " - ")`

--

CORRECT ANSWER: C
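
A quick check of why option C works, as a minimal sketch in base R. The variable names `named` and `positional` are only illustrative:

```r
string <- c("169 millimeters x 117 millimeters x 9.1 millimeters")

# chartr() declares its arguments in the order (old, new, x), so naming
# old and new lets the unnamed string fall through to x.
named <- chartr(string, old = "x", new = " ")

# The same call with every argument in its positional slot.
positional <- chartr("x", " ", string)

named
identical(named, positional)
```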

---
# Question 2

What is the result of the following R code?

```r
df1 <- c("VIC", "NSW", "TAS", "WA", "SA")

df2 <- c("WA", "SA", "NSW", "TAS", "VIC")

identical(df1, df2)
```

- `A. TRUE`
- `B. FALSE`
- `C. "WA", "SA", "NSW"`
- `D. "TAS", "VIC"`

--

CORRECT ANSWER: B
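
A short sketch of the difference between an exact comparison and an order-free one. `setequal()` and `sort()` are base R and are shown here only for contrast:

```r
df1 <- c("VIC", "NSW", "TAS", "WA", "SA")
df2 <- c("WA", "SA", "NSW", "TAS", "VIC")

identical(df1, df2)              # FALSE: same values, different order
setequal(df1, df2)               # TRUE: compares the vectors as sets
identical(sort(df1), sort(df2))  # TRUE: order removed before comparing
```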

---
# Question 3

Which one of the following is NOT a printing function?

- `A. cat()`
- `B. print()`
- `C. noquote()`
- `D. quote()`

--

CORRECT ANSWER: D

---
# Question 4

Which one of the following removes all punctuation from the vector `x`?

```r
x <- c("hello!", "good-day.", "hi 5:)")
```

- `A. str_subset(x, "[:alnum:]")`
- `B. str_extract(x, "[:alnum:]")`
- `C. str_remove(x, "[:punct:]")`
- `D. str_replace_all(x, "[:punct:]", "")`

--
 
 
 
CORRECT ANSWER: D
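
The gap between options C and D is "first match" versus "every match". The sketch below uses the base R analogues `sub()` and `gsub()` rather than `stringr`, on the assumption that `str_remove()` behaves like `sub()` (first match only) and `str_replace_all()` like `gsub()`:

```r
x <- c("hello!", "good-day.", "hi 5:)")

# sub() replaces only the first match in each element,
# gsub() replaces every match.
first_only <- sub("[[:punct:]]", "", x)
all_punct  <- gsub("[[:punct:]]", "", x)

first_only  # "hello" "goodday." "hi 5)"
all_punct   # "hello" "goodday"  "hi 5"
```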

---
# Question 5

According to the following code, what will be the result of `y`?

```r
x <- "Now, I am HAPPY"

y <- length(x)

y
```

- `A. 4`
- `B. 1`
- `C. 2`
- `D. 5`

--
 
 
 
CORRECT ANSWER: B
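
A small sketch contrasting `length()` with two functions it is often confused with. Splitting on a single space is just one illustrative way to count words:

```r
x <- "Now, I am HAPPY"

length(x)                      # 1: one element in the vector
nchar(x)                       # 15: characters, spaces included
length(strsplit(x, " ")[[1]])  # 4: pieces after splitting on spaces
```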

---
# Question 6

Which one of the following functions from the `lubridate` package will convert `z` into a date format?

```r
z <- c("08.06.2018", "29062018", "23/03/2018", "30-01-2018")
```

- `A. ymd(z)`
- `B. dmy(z)`
- `C. ydm(z)`
- `D. hms(z)`

--
 
 
 
CORRECT ANSWER: B

---
# Question 7

In which one of the following are values divided by their standard deviation (or root mean square)?

- `A. Box-Cox transformation`
- `B. logarithmic transformation`
- `C. z-score standardisation`
- `D. square root transformation`

--
 
 
 
CORRECT ANSWER: C
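
A minimal worked example of z-score standardisation on a made-up vector (the values 2, 4, 6 are only illustrative). It centres on the mean and then divides by the standard deviation, which is also what base R's `scale()` does by default:

```r
x <- c(2, 4, 6)

# z-score: centre on the mean, then divide by the standard deviation
z <- (x - mean(x)) / sd(x)
z                                    # -1 0 1

# scale() applies the same centring and scaling by default
all.equal(z, as.numeric(scale(x)))   # TRUE
```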

---
# Question 8

According to the following code, what will be the result of `y`?

```r
minmaxnormalise <- function(x) {(x - min(x)) / (max(x) - min(x))}

x <- c(5, 4, NA, 2, 5)
y <- minmaxnormalise(x)
y
```

- `A. 1.00 1.00 NA 1.00 1.00`
- `B. 1.00 0.67 NA 0.00 1.00`
- `C. NA NA NA NA NA`
- `D. 0.00 0.00 NA 1.00 1.00`

--
 
 
 
CORRECT ANSWER: C
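
Every result is `NA` because `min(x)` and `max(x)` return `NA` when `x` contains a missing value. The sketch below adds a hypothetical variant, `minmax_narm()`, which is not part of the original code and simply ignores `NA` when finding the range:

```r
minmaxnormalise <- function(x) {(x - min(x)) / (max(x) - min(x))}

# Hypothetical variant: ignore NA when finding the range
minmax_narm <- function(x) {
  lo <- min(x, na.rm = TRUE)
  hi <- max(x, na.rm = TRUE)
  (x - lo) / (hi - lo)
}

x <- c(5, 4, NA, 2, 5)
minmaxnormalise(x)          # NA NA NA NA NA
round(minmax_narm(x), 2)    # 1.00 0.67 NA 0.00 1.00
```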

---
# Question 9

Which one of the following packages has a function to detect multivariate outliers?

- `A. library(dplyr)`
- `B. library(MVN)`
- `C. library(tidyr)`
- `D. library(validate)`

--
 
 
 
CORRECT ANSWER: B

---
# Question 10

Which of the following can be used to deal with outliers?

- `A. Capping`
- `B. Transforming`
- `C. Imputing`
- `D. All of them`

--
 
 
 
CORRECT ANSWER: D

---
# Question 11

Which one of the following is the reason for the error produced by the code below?

```r
df <- data.frame(col1 = c(2, 0 / 0, NA, 1 / 0, -Inf, Inf),
                 col2 = c(NA, Inf / 0, 2 / 0, NaN, -Inf, 4))

is.infinite(df)
```

- `A. is.infinite() function accepts only vector input.`
- `B. there is no infinite value in the data frame.`
- `C. data frame has missing values.`
- `D. there is a division by zero problem in the data frame.`

--
 
 
 
CORRECT ANSWER: A
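
One way around the error, sketched in base R: apply `is.infinite()` to each column, since each column is a vector. The names `failed` and `inf_counts` are only illustrative:

```r
df <- data.frame(col1 = c(2, 0 / 0, NA, 1 / 0, -Inf, Inf),
                 col2 = c(NA, Inf / 0, 2 / 0, NaN, -Inf, 4))

# is.infinite(df) stops with an error, so catch it to show the failure
failed <- inherits(try(is.infinite(df), silent = TRUE), "try-error")
failed

# Apply is.infinite() to each column (a vector) instead
inf_counts <- sapply(df, function(col) sum(is.infinite(col)))
inf_counts
```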

---
# Question 12

Consider the following data frame. What command would you use to find the total number of missing values in each column?

```r
df <- data.frame(col1 = c(1:3, NA),
 col2 = c("this", NaN, "is", "text"),
 col3 = c(TRUE, FALSE, TRUE, TRUE),
 col4 = c(2.5, 4.2, 3.2, NA))
```

- `A. sum(is.na(df))`
- `B. is.na(df)`
- `C. is.nan(df)`
- `D. colSums(is.na(df))`

--
 
 
 
CORRECT ANSWER: D
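
A short sketch comparing options A and D on the same data frame. The `NaN` in `col2` sits inside a character vector, so it is stored as the text `"NaN"` and is not counted as missing:

```r
df <- data.frame(col1 = c(1:3, NA),
                 col2 = c("this", NaN, "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE, TRUE),
                 col4 = c(2.5, 4.2, 3.2, NA))

sum(is.na(df))       # one grand total for the whole data frame
colSums(is.na(df))   # one count per column
```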

---
# Question 13

According to the following code, what will be the result of `y`?

```r
x <- c(1:3, NA, 5, NA)
y <- which(is.na(x))
y
```

- `A. 4 6`
- `B. TRUE`
- `C. FALSE FALSE FALSE TRUE FALSE TRUE`
- `D. NA`

--
 
 
 
CORRECT ANSWER: A

---

# Dataset scenario for Questions 14 & 15

A relational database contains two data sets, namely `sales` and `employees`.

The `sales` data set describes each sale: a sale id, followed by the salesperson id, customer id, product id, quantity of the item, and payment type. Here is the `sales` data set:

```r
sales
```

```
## # A tibble: 4 x 6
##   sales_id sales_person_id customer_id product_id quantity payment_type
##      <dbl> <chr>                 <dbl>      <dbl>    <dbl> <chr>
## 1      201 A1                        1        102        2 Debit
## 2      202 B3                        2        101        3 Credit
## 3      203 A1                        3        101        1 Cash
## 4      204 A2                        1        103        5 Debit
```

---

# Dataset scenario for Questions 14 & 15 Cont.

The `employees` data set lets you look up the first and last name of a salesperson using the salesperson id. Here is the `employees` data set:

```r
employees
```

```
## # A tibble: 6 x 3
##   sales_person_id first_name last_name
##   <chr>           <chr>      <chr>
## 1 A1              John       Doe
## 2 A2              Jane       Smith
## 3 A3              Micheal    Brown
## 4 B1              Jim        Johnson
## 5 B2              Karen      Wilson
## 6 B3              Kate       Taylor
```

* `employees` connects to `sales` via the `sales_person_id` variable.

???

```r
sales
```

```r
employees
```

```r
# Q14: How would you find the names of salespeople who made a sale while dropping all the information in the `sales` data set?

employees %>% semi_join(sales)
```

```
## Joining, by = "sales_person_id"
```

```
## # A tibble: 3 x 3
##   sales_person_id first_name last_name
##   <chr>           <chr>      <chr>
## 1 A1              John       Doe
## 2 A2              Jane       Smith
## 3 B3              Kate       Taylor
```

```r
# Q15: How would you find the names of salespeople who didn't make a sale?

employees %>% anti_join(sales)
```

```
## Joining, by = "sales_person_id"
```

```
## # A tibble: 3 x 3
##   sales_person_id first_name last_name
##   <chr>           <chr>      <chr>
## 1 A3              Micheal    Brown
## 2 B1              Jim        Johnson
## 3 B2              Karen      Wilson
```

---
# Question 14

According to the given information, how would you find the names of sales people (employees) who made a sale while dropping all the information in the sales data set?

- `A. anti_join(employees, sales)`
- `B. semi_join(employees, sales)`
- `C. union(employees, sales)`
- `D. bind_cols(employees, sales)`

--
 
 
 
CORRECT ANSWER: B

---
# Question 15

According to the given information, how would you find the names of sales people who didn't make a sale?

- `A. anti_join(employees, sales)`
- `B. semi_join(employees, sales)`
- `C. union(employees, sales)`
- `D. bind_cols(employees, sales)`

--
 
 
 
CORRECT ANSWER: A
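
The two filtering joins in Questions 14 and 15 can be approximated in base R with `%in%`. This is a sketch rather than the `dplyr` implementation: it rebuilds the two tables as plain data frames (only two of the `sales` columns are needed), and `has_sale` is an illustrative name:

```r
sales <- data.frame(
  sales_id        = c(201, 202, 203, 204),
  sales_person_id = c("A1", "B3", "A1", "A2")
)
employees <- data.frame(
  sales_person_id = c("A1", "A2", "A3", "B1", "B2", "B3"),
  first_name      = c("John", "Jane", "Micheal", "Jim", "Karen", "Kate"),
  last_name       = c("Doe", "Smith", "Brown", "Johnson", "Wilson", "Taylor")
)

# TRUE for employees whose id appears at least once in sales
has_sale <- employees$sales_person_id %in% sales$sales_person_id

employees[has_sale, ]    # like semi_join(employees, sales): A1, A2, B3
employees[!has_sale, ]   # like anti_join(employees, sales): A3, B1, B2
```

Both results keep only the `employees` columns, and A1 appears once even though that id has two sales.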

---
# For Questions 16 and 17

- Picture 1:

- Picture 2:

---

# For Questions 16 and 17

- Picture 3:

- Picture 4:

---
# Question 16

Consider the `id_lookup` and `ratings` data sets. What would be the result of:

```r
ratings %>% left_join(id_lookup)

# OR

left_join(ratings, id_lookup)
```

- `A. Picture 1`
- `B. Picture 2`
- `C. Picture 3`
- `D. Picture 4`

--
 
 
 
CORRECT ANSWER: A

---
# Question 17

Consider the `id_lookup` and `ratings` data sets. What would be the result of:

```r
id_lookup %>% anti_join(ratings)

# OR

anti_join(id_lookup, ratings)
```
 
- `A. Picture 1`
- `B. Picture 2`
- `C. Picture 3`
- `D. Picture 4`

--
 
 
 
CORRECT ANSWER: D

---
# Question 18

Which one of the following will sort this data frame in ascending order by `col2`, `col3` and `col1`, respectively?

```r
df <- data.frame(col1 = c(4, 3, 1),
 col2 = c(81, 12, 4),
 col3 = c(54, 22, 66))
```

- `A. df %>% select(col1, col2, col3)`
- `B. df %>% filter(col1, col2, col3)`
- `C. df %>% arrange(col1, col2, col3)`
- `D. df %>% arrange(col2, col3, col1)`

--
 
 
 
CORRECT ANSWER: D
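
For comparison, a base R sketch of the same ordering using `order()`, with the sort keys given in the same sequence as in `arrange()`. The name `sorted` is only illustrative:

```r
df <- data.frame(col1 = c(4, 3, 1),
                 col2 = c(81, 12, 4),
                 col3 = c(54, 22, 66))

# Sort by col2 first; col3 and col1 only break ties
sorted <- df[order(df$col2, df$col3, df$col1), ]
sorted$col2   # 4 12 81
```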

---

# Question 19

According to the following code, what will be the class of `df`?

```r
df <- data.frame(col1 = 1:3,
 col2 = c("this", "is", "text"),
 col3 = c(TRUE, FALSE, TRUE),
 col4 = c(25.5, 44.2, 54.9))

df <- as.matrix(df)
class(df)
```

- `A. list`
- `B. vector`
- `C. matrix`
- `D. data.frame`

--
 
 
 
CORRECT ANSWER: C
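
A short sketch of what the conversion does to the data. The result is stored in a new name, `m`, purely for illustration. From R 4.0.0 onwards, `class()` on a matrix reports both `"matrix"` and `"array"`:

```r
df <- data.frame(col1 = 1:3,
                 col2 = c("this", "is", "text"),
                 col3 = c(TRUE, FALSE, TRUE),
                 col4 = c(25.5, 44.2, 54.9))

m <- as.matrix(df)

is.matrix(m)   # TRUE
typeof(m)      # "character": mixed columns are coerced to text
class(m)       # R >= 4.0.0 reports "matrix" "array"
```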

---

# Question 20

According to the following code, what will be the ordering of the levels for `y`?

```r
y <- factor(c("low", "moderate", "low", "severe", "low", "high", "moderate", "severe"),
            levels = c("low", "moderate", "high", "severe"),
            ordered = TRUE)

y
```

- `A. moderate < high < severe < low`
- `B. low < severe < high < moderate`
- `C. low < moderate < high < severe`
- `D. severe < high < moderate < low`

--
 
 
 
CORRECT ANSWER: C

---

[Return to Course Website](../index.html)