Module 3 challenge: Go Beyond the Numbers: Translate Data into Insights (Google Advanced Data Analytics Professional Certificate) Answers 2025
Question 1
Fill in the blank: N/A and NaN are terms used to describe _____ data.
- nominal ❌
- qualitative ❌
- string ❌
- missing ✅
Explanation:
N/A (“not available”) and NaN (“not a number”) are standard indicators of missing data.
Question 2
Strategies to solve missing data problems (Select all that apply):
- Ask the film studio to fill in the missing values. ✅
- Create a NaN category. ✅
- Use their best judgment to add in values themselves. ❌
- Add in missing values using the average values from existing data. ✅
Explanation:
Appropriate approaches include: asking the data owner, imputing using statistical methods, or treating missingness as its own category.
Using “best judgment” without justification introduces bias.
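The two strategies that can be done in code, imputing with the average and treating missingness as its own category, might look like the sketch below. The film titles, column names, and numbers are made up for illustration, and the label "Missing" is an arbitrary choice for the extra category.

```python
import numpy as np
import pandas as pd

# Hypothetical film data; every value here is invented for illustration.
films = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genre": ["drama", None, "comedy", None],
    "budget_musd": [10.0, np.nan, 30.0, 20.0],
})

# Strategy: fill missing numbers with the average of the existing values.
# Series.mean() skips NaN by default, so only 10, 30, and 20 are averaged.
mean_budget = films["budget_musd"].mean()
films["budget_filled"] = films["budget_musd"].fillna(mean_budget)

# Strategy: keep missingness visible as its own category.
films["genre_filled"] = films["genre"].fillna("Missing")

print(films)
```

Writing the filled values to new columns keeps the original gaps available for later checks.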
Question 3
Which part refers to the dataframe being merged with df?
- df_zip ✅
- how='left' ❌
- merge ❌
- center_point_geom ❌
Explanation:
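In df.merge(df_zip, ...), df is the calling (left) dataframe and df_zip, the first argument, is the dataframe being merged with it. The result is returned as a new dataframe.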
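A minimal sketch of such a left merge follows. It assumes center_point_geom is the shared key column; the other column names and all values are hypothetical.

```python
import pandas as pd

# Left dataframe: the one calling .merge(). Values are invented.
df = pd.DataFrame({
    "center_point_geom": ["POINT(-75 27)", "POINT(-78 28)", "POINT(-80 30)"],
    "number_of_strikes": [3, 5, 2],
})

# Right dataframe: the one passed in as the first argument.
df_zip = pd.DataFrame({
    "center_point_geom": ["POINT(-75 27)", "POINT(-78 28)"],
    "zip_code": ["10001", "94105"],
})

# how="left" keeps every row of df; rows with no match in df_zip get NaN.
df_joined = df.merge(df_zip, how="left", on="center_point_geom")

print(df_joined)
```

The third row has no match in df_zip, so its zip_code is NaN after the merge.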
Question 4
Which pandas function pulls all missing values?
- pd.isnull() ✅
- pd.ofnull() ❌
- pd.findnull() ❌
- pd.getnull() ❌
Explanation:
pd.isnull() returns a Boolean mask that is True wherever a value is missing (NaN, None, or NaT).
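As a rough illustration, the mask from pd.isnull() can be used to count missing values or to pull the rows that contain them. The small dataframe here is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    "name": ["Ana", None, "Cy"],
    "score": [1.5, np.nan, 3.0],
})

# Boolean mask with the same shape as df: True where a value is missing.
mask = pd.isnull(df)

# Count missing values per column (True counts as 1).
missing_per_column = mask.sum()

# Select only the rows that have at least one missing value.
rows_with_missing = df[mask.any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```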
Question 5
Type of outliers that form a group with similar abnormal behavior:
- Global outliers ❌
- Collective outliers ✅
- Contextual outliers ❌
- Atypical outliers ❌
Explanation:
Collective outliers are a group of data points that together deviate from the rest of the dataset while behaving similarly to one another.
Question 6
Assigning numbers to categories (dog=1, cat=2, etc.) is:
- Data blending ❌
- Aliasing ❌
- Data partitioning ❌
- Label encoding ✅
Explanation:
Label encoding converts categories into numerical labels.
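A small sketch of label encoding in pandas, shown two ways. The dog=1, cat=2 mapping comes from the question; bird=3 and the sample values are assumptions added for illustration.

```python
import pandas as pd

pets = pd.Series(["dog", "cat", "dog", "bird"])

# Option 1: an explicit mapping, matching the dog=1, cat=2 scheme.
# bird=3 is an assumed extension of that scheme.
mapping = {"dog": 1, "cat": 2, "bird": 3}
encoded = pets.map(mapping)

# Option 2: let pandas assign integer codes automatically.
# Categories are sorted alphabetically, so bird=0, cat=1, dog=2.
codes = pets.astype("category").cat.codes

print(encoded.tolist())
print(codes.tolist())
```

An explicit mapping gives control over which number each category receives, while the automatic codes depend on the sorted order of the categories.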
Question 7
Heat map displays values using:
- a series of markers ❌
- slices ❌
- colors ✅
- vertical bars ❌
Explanation:
Heat maps express intensity or concentration using color gradients.
Question 8
What does pd.duplicated() return for non-duplicate values?
- Unique ❌
- False ✅
- Duplicate ❌
- True ❌
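Explanation:
duplicated() is called on a DataFrame or Series (for example, df.duplicated()) and returns a Boolean Series: False for values that are not duplicates, True for repeats of an earlier value.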
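The sketch below shows this behavior on a hypothetical transactions table, where one row repeats an earlier row exactly.

```python
import pandas as pd

# Hypothetical transactions; the third row repeats the first.
df = pd.DataFrame({
    "customer": ["ann", "bob", "ann", "cy"],
    "amount": [10, 20, 10, 30],
})

# With the default keep="first", the first occurrence is marked False
# and only later repeats are marked True.
flags = df.duplicated()

# Drop the repeats only if they are known to be errors.
deduped = df.drop_duplicates()

print(flags.tolist())
print(deduped)
```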
Question 9
A duplicate should be _____ if it is valid and meaningful.
- keep ✅
- eliminate ❌
- emphasize ❌
- filter ❌
Explanation:
Some datasets legitimately contain repeated entries (e.g., returning customers, repeated transactions).
Question 10
Term for thoroughly analyzing data to ensure it is complete and error-free:
- Input validation ❌
- Normalization ❌
- Data mapping ❌
- Verification ✅
Explanation:
Verification ensures data accuracy, completeness, and quality.
🧾 Summary Table
| Q# | Correct Answer(s) |
|---|---|
| 1 | missing |
| 2 | Ask owner, create NaN category, impute average |
| 3 | df_zip |
| 4 | pd.isnull() |
| 5 | Collective outliers |
| 6 | Label encoding |
| 7 | colors |
| 8 | False |
| 9 | keep |
| 10 | Verification |