Module 3 challenge: Data Analysis with R Programming (Google Data Analytics Professional Certificate) Answers 2025

Question 1

Why might an analyst use a tibble instead of a data frame?

Tibbles automatically only preview the first 10 rows of data
Tibbles automatically only preview as many columns as fit on screen
❌ Tibbles can create row names
❌ Tibbles can automatically change variable names

Explanation:
Tibbles are a modern version of data frames from the tidyverse, designed for large datasets.
They:

  • Show only the first 10 rows.

  • Display only the columns that fit your screen.

They never change variable names and never create row names.
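
The printing and naming behavior described above can be checked with a small sketch. It assumes the tibble package is installed; the data frame df and the column name `my var` are made up for illustration.

```r
library(tibble)

# Hypothetical 50-row data frame, made up for this illustration.
df <- data.frame(id = 1:50, squared = (1:50)^2)
tb <- as_tibble(df)

print(tb)  # prints a 10-row preview plus a note about the remaining rows

# Non-syntactic names: data.frame() rewrites them by default, tibble() keeps them.
df_names <- names(data.frame(`my var` = 1))
tb_names <- names(tibble(`my var` = 1))
print(df_names)
print(tb_names)
```

Printing df instead of tb would write all 50 rows to the console, which is the difference the question is getting at.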


Question 2

Which function(s) show the names of all columns?

str()
colnames()
❌ head()
❌ library()

Explanation:

  • colnames(df) shows all column names directly.

  • str(df) provides structure info, including names and types.

  • head(df) previews the first rows of data; it is not meant for listing column names, and a wide tibble may not print every column.

  • library() loads packages; it does not display column names.
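
A short sketch of the inspection functions above, run on the built-in ToothGrowth dataset. The variable names column_names and first_rows are only for illustration.

```r
# ToothGrowth ships with R's datasets package, so no library() call is needed.
column_names <- colnames(ToothGrowth)
print(column_names)

# str() prints one line per column: name, type, and the first few values.
str(ToothGrowth)

# head() previews the first six rows by default.
first_rows <- head(ToothGrowth)
print(first_rows)
```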


Question 3

How many variables does the ToothGrowth dataset contain?

3

Explanation:
ToothGrowth (from R’s datasets) has 3 variables:
len, supp, and dose.
Example (glimpse() comes from dplyr, so load that package first):

glimpse(ToothGrowth)

This shows 60 observations (rows) and 3 variables (columns).


Question 4

What will the column name be after running rename_with(employees, toupper)?

LAST_NAME
❌ Last_Name
❌ last_name
❌ Last_name

Explanation:
rename_with(df, toupper) applies toupper() to every column name, so a column such as last_name becomes LAST_NAME.
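
A minimal sketch of this renaming, assuming dplyr is installed. The employees data frame here is a made-up stand-in, since the quiz does not show its contents.

```r
library(dplyr)

# Hypothetical employees table; the rows are invented for illustration.
employees <- data.frame(
  last_name = c("Lee", "Patel"),
  first_name = c("Ana", "Raj")
)

# Apply toupper() to every column name; the data values are left unchanged.
employees_upper <- rename_with(employees, toupper)
print(colnames(employees_upper))
```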


Question 5

Sort penguins data by bill_length_mm:

arrange(penguins, bill_length_mm)
❌ arrange(=bill_length_mm)
❌ arrange(bill_length_mm, penguins)
❌ arrange(penguins)

Explanation:
The correct syntax for sorting in dplyr is:

arrange(dataset, column_name)
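
The sketch below illustrates this syntax, assuming dplyr is installed. penguins_small is a hypothetical stand-in with made-up values, used so the example does not depend on the palmerpenguins package.

```r
library(dplyr)

# Hypothetical stand-in for the penguins data (values are made up).
penguins_small <- data.frame(
  species = c("Adelie", "Gentoo", "Chinstrap"),
  bill_length_mm = c(39.1, 47.5, 46.5)
)

# arrange() sorts in ascending order by default.
shortest_first <- arrange(penguins_small, bill_length_mm)

# Wrap the column in desc() for descending order.
longest_first <- arrange(penguins_small, desc(bill_length_mm))

print(shortest_first)
print(longest_first)
```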

Question 6

Find the minimum bill depth for each species.

summarize(min_bill_depth = min(bill_depth_mm))

Minimum bill depth for Chinstrap species:
16.4

Explanation:

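library(dplyr)           # group_by(), summarize(), %>%
library(tidyr)           # drop_na()
library(palmerpenguins)  # penguins dataset

penguins %>%
  drop_na() %>%
  group_by(species) %>%
  summarize(min_bill_depth = min(bill_depth_mm))

The Chinstrap row of the result shows 16.4 mm.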


Question 7

Create a column is_large_animal that flags animals weighing more than 199 kg.

zoo_records %>% mutate(is_large_animal = weight > 199)
❌ Other options

Explanation:
mutate(new_column = condition) creates a logical (TRUE/FALSE) column from the condition.
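
A small sketch of this pattern, assuming dplyr is installed. The contents of zoo_records are invented for illustration; only the column names weight and is_large_animal come from the question.

```r
library(dplyr)

# Hypothetical zoo_records table; animals and weights (kg) are made up.
zoo_records <- data.frame(
  animal = c("elephant", "otter", "bison"),
  weight = c(5400, 11, 199)
)

# The comparison is evaluated row by row and stored as TRUE/FALSE.
flagged <- zoo_records %>% mutate(is_large_animal = weight > 199)
print(flagged)
```

Note that an animal weighing exactly 199 kg is flagged FALSE, because the comparison is strictly greater than.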


Question 8

Combine first and last names into full_name separated by space:

unite(users, "full_name", first_name, last_name, sep = " ")
❌ unite(users, first_name, last_name, "full_name", sep = " ")
❌ unite(users, "full_name", first_name, last_name, sep = ", ")
❌ merge(users, …)

Explanation:
The correct syntax for unite() is:

unite(data, "new_column", col1, col2, sep = " ")
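
As a rough illustration of that call, assuming tidyr is installed; the users table below is a made-up example.

```r
library(tidyr)

# Hypothetical users table; the names are invented for illustration.
users <- data.frame(
  first_name = c("Ada", "Alan"),
  last_name = c("Lovelace", "Turing")
)

# Paste the two columns together with a single space between them.
# By default unite() drops the input columns (remove = TRUE).
users_named <- unite(users, "full_name", first_name, last_name, sep = " ")
print(users_named)
```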

Question 9

Which statistical measure shows the spread of values around the mean?

Standard deviation
❌ Maximum
❌ Average
❌ Correlation

Explanation:
Standard deviation (sd) quantifies how far data points typically deviate from the mean.
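
A quick base R sketch of this idea: two made-up vectors share the same mean but differ in spread, and sd() picks up the difference.

```r
# Two hypothetical samples with the same mean but different spread.
tight <- c(9, 10, 11)
wide <- c(0, 10, 20)

print(mean(tight))  # 10
print(mean(wide))   # 10

# sd() computes the sample standard deviation (n - 1 in the denominator).
sd_tight <- sd(tight)
sd_wide <- sd(wide)
print(sd_tight)  # 1
print(sd_wide)   # 10
```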


Question 10

Which function measures how much predicted and actual outcomes differ?

bias()
❌ mean()
❌ sd()
❌ cor()

Explanation:
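bias() (from the SimDesign package) measures the average difference between actual and predicted outcomes. It is used to check model accuracy: the result is the mean of the residuals, so a value close to zero suggests an unbiased model.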
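
The idea can be sketched in base R without calling bias() itself. The vectors and the name mean_error below are made up for illustration, and the calculation is a hand-rolled mean of differences, not the package's implementation.

```r
# Hypothetical actual and predicted values, invented for this example.
actual <- c(10, 12, 14)
predicted <- c(9, 12, 13)

# Mean of the differences (residuals); closer to zero means less bias.
mean_error <- mean(actual - predicted)
print(mean_error)
```

A positive result means the predictions run low on average, and a negative result means they run high.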


🧾 Summary Table

Q# | ✅ Correct Answer(s) | Key Concept
1 | Tibbles preview 10 rows & fit columns | Tibble advantages
2 | str(), colnames() | Inspect column names
3 | 3 | ToothGrowth variables
4 | LAST_NAME | rename_with + toupper
5 | arrange(penguins, bill_length_mm) | Sorting data
6 | summarize(min_bill_depth = min(…)) → 16.4 | Group summary
7 | mutate(is_large_animal = weight > 199) | Create logical column
8 | unite(users, "full_name", first_name, last_name, sep = " ") | Combine columns
9 | Standard deviation | Spread from mean
10 | bias() | Model prediction error