Module 2 Challenge: Go Beyond the Numbers: Translate Data into Insights (Google Advanced Data Analytics Professional Certificate) Answers 2025
Question 1
What are some strategies data professionals use to understand the source of a dataset? (Select all that apply.)
-
Request relevant information from the team members who supplied the data. ✅
Explanation: Ask the providers about collection methods, provenance, and limitations. This is a primary way to understand the source and context.
-
Reduce outliers by ensuring data comes from a small sample. ❌
Explanation: Shrinking the sample to reduce outliers tells you nothing about where the data came from, and it generally harms representativeness.
-
Confirm the original data owner has no financial stake in the data’s output. ❌
Explanation: Checking for conflicts of interest can be relevant for ethics, but it’s not a primary strategy for understanding the dataset’s source (and it’s not always feasible or sufficient).
-
Determine where the data originally came from. ✅
Explanation: Tracing origin (instrument, system, owner, collection date/location) is essential for provenance and trust.
Question 2
What is the data storage file format for JavaScript?
-
CSV ❌
-
spreadsheet ❌
-
XML ❌
-
JSON ✅
Explanation: JSON (JavaScript Object Notation) is a text-based format derived from JavaScript object syntax, and it is the standard data interchange/storage format in JavaScript.
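To make the format concrete, here is a minimal sketch of writing and reading JSON from Python with the standard-library `json` module. The `restaurant` record is a made-up example, not data from the course.

```python
import json

# Hypothetical record, used only for illustration.
restaurant = {
    "name": "Example Cafe",
    "city": "Salzburg",
    "rating": 4.5,
    "tags": ["coffee", "cake"],
}

# Serialize the Python dict to a JSON-formatted string.
text = json.dumps(restaurant)

# Parse the JSON string back into Python objects.
restored = json.loads(text)

print(text)
print(restored["tags"][0])
```

JSON objects map to Python dicts and JSON arrays map to Python lists, so the round trip returns a structure equal to the original.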
Question 3
What type of data is gathered outside of an organization, but directly from the original source?
-
Second-party ✅
-
Third-party ❌
-
Fourth-party ❌
-
First-party ❌
Explanation: First-party = collected by the organization itself. Second-party = another organization’s first-party data obtained directly from that original source. Third-party = aggregated/resold via intermediaries.
Question 4
Which statement correctly uses head() to return the first 5 rows?
-
head=5 ❌
-
df.head(rows=5) ❌
-
df.head(5) ✅
-
df.head(5.df) ❌
Explanation: df.head(5) is the correct pandas syntax to show the first five rows. The parameter is named n, so df.head(rows=5) raises a TypeError.
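A short sketch of `head()` in use, assuming pandas is installed. The `df` contents below are hypothetical and exist only to give the call something to return.

```python
import pandas as pd

# Hypothetical ten-row data frame, for illustration only.
df = pd.DataFrame({
    "order_id": range(1, 11),
    "amount": [12.5, 8.0, 23.1, 5.75, 17.0, 9.9, 30.0, 4.2, 11.1, 19.5],
})

first_five = df.head(5)     # first five rows; df.head() is equivalent because n defaults to 5
first_three = df.head(n=3)  # the keyword argument is n, not rows

print(first_five)
```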
Question 5
Which statement will assign the name “Salzburg Restaurants” to a bar graph in Python?
-
plt.xlabel("Salzburg Restaurants") ❌
-
plt.title("Salzburg Restaurants") ✅
-
plt.show("Salzburg Restaurants") ❌
-
plt.name("Salzburg Restaurants") ❌
Explanation: plt.title() sets the chart title. plt.xlabel() labels the x-axis, plt.show() only displays the figure, and pyplot has no name() function.
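The difference between the title and the axis labels can be sketched as follows, assuming matplotlib is installed. The cuisine names and counts are invented, and the `Agg` backend is selected only so the script runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt

# Hypothetical categories and counts, for illustration only.
cuisines = ["Austrian", "Italian", "Asian"]
counts = [12, 7, 5]

fig, ax = plt.subplots()
ax.bar(cuisines, counts)

plt.title("Salzburg Restaurants")    # title shown above the chart
plt.xlabel("Cuisine")                # label under the x-axis
plt.ylabel("Number of restaurants")  # label beside the y-axis

# Render to an in-memory PNG instead of opening a window.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In an interactive session, `plt.show()` would take the place of the `savefig` call.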
Question 6
Which element of the code renders the graphic of the plot?
-
geo_scope= ❌
-
title_text = ❌
-
fig.update_layout ❌
-
fig.show() ✅
Explanation: fig.show() actually renders/displays the figure; update_layout configures layout but does not display by itself.
Question 7
Which structuring method combines two different data frames along a specified starting column?
-
Filtering ❌
-
Merging ✅
-
Sorting ❌
-
Grouping ❌
Explanation: Merge/join operations combine data frames along matching columns (keys).
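As a rough illustration of merging on a key column, the sketch below joins two small data frames with pandas. The frame names, the `restaurant_id` key, and all values are hypothetical.

```python
import pandas as pd

# Two hypothetical data frames that share the key column "restaurant_id".
restaurants = pd.DataFrame({
    "restaurant_id": [1, 2, 3],
    "name": ["A", "B", "C"],
})
ratings = pd.DataFrame({
    "restaurant_id": [1, 2, 4],
    "rating": [4.5, 3.8, 4.1],
})

# Inner merge: keep only the keys present in both frames (ids 1 and 2).
merged = pd.merge(restaurants, ratings, on="restaurant_id", how="inner")

# Left merge: keep every restaurant; a missing rating becomes NaN (id 3).
left = restaurants.merge(ratings, on="restaurant_id", how="left")

print(merged)
print(left)
```

The `how` argument controls which keys survive the merge. `"inner"` is the pandas default.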
Question 8
Fill in the blank: A _____ is a data visualization that depicts the locality, spread, and skew of groups of values within quartiles.
-
box plot ✅
-
density map ❌
-
scatter plot ❌
-
Gantt chart ❌
Explanation: A box plot (box-and-whisker) shows median, quartiles, spread, and outliers.
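The quantities a box plot draws can be computed directly. The sketch below uses NumPy on a made-up sample and applies the common 1.5 × IQR rule for flagging outliers. That rule is a convention (it matches matplotlib's default whisker setting), not something the quiz specifies.

```python
import numpy as np

# Hypothetical sample with one unusually large value.
values = np.array([2, 4, 4, 5, 6, 7, 8, 9, 10, 25])

# Quartiles: the box spans q1 to q3, with a line at the median.
q1, median, q3 = np.percentile(values, [25, 50, 75])

# Interquartile range and the conventional 1.5 * IQR fences.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points beyond the fences are drawn individually as outliers.
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(q1, median, q3)  # 4.25 6.5 8.75
print(outliers)        # [25]
```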
Question 9
What is the name of a graph that represents a frequency distribution (how frequently each value occurs)?
-
Box plot ❌
-
Heat map ❌
-
Histogram ✅
-
Scatter plot ❌
Explanation: A histogram bins numeric values and shows their frequencies.
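Binning can be sketched with `numpy.histogram`, which returns the count in each bin along with the bin edges. The values, the bin count, and the range below are arbitrary choices for illustration.

```python
import numpy as np

# Hypothetical numeric values.
values = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 9])

# Four equal-width bins over 0..12: [0,3), [3,6), [6,9), [9,12].
# The last bin also includes its right edge.
counts, edges = np.histogram(values, bins=4, range=(0, 12))

print(counts.tolist())  # [3, 6, 0, 1]
print(edges.tolist())   # [0.0, 3.0, 6.0, 9.0, 12.0]
```

Each count is the height of one histogram bar, and the counts sum to the number of values.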
Question 10
(Use the described histogram of Sudoku solve times.)
According to the histogram, which statements are true? (Select all that apply.)
-
Most people solved the puzzle in 7–13 minutes. ✅
Explanation: The histogram’s tallest bars are centered near 9–11 minutes, and counts decline symmetrically toward 7 and 13, so the majority fall in the 7–13 window.
-
The solve time for this puzzle follows an approximately normal distribution. ✅
Explanation: The bars rise to a central peak and descend roughly symmetrically on both sides. That pattern is consistent with an approximately normal (bell-shaped) distribution.
-
The solve time for this puzzle follows a bimodal distribution. ❌
Explanation: There is a single central peak (two adjacent high bars reflect the binning around the same mode), not two distinct modes, so the distribution is unimodal, not bimodal.
-
The mean solve time was approximately 10 minutes. ✅
Explanation: The center of the distribution is around 9–11 minutes. Because the shape is roughly symmetric, the mean sits near that center, at approximately 10 minutes.
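The reasoning about the mean can be checked with a little arithmetic: weight each bin midpoint by its count and divide by the total. The midpoints and counts below are invented to show the method. They are not read from the quiz's histogram.

```python
# Hypothetical bin midpoints (minutes) and bar heights; NOT the quiz's actual data.
midpoints = [6, 8, 10, 12, 14]
counts = [2, 8, 12, 8, 2]

# Estimate the mean by weighting each midpoint by its count.
total = sum(counts)
estimated_mean = sum(m * c for m, c in zip(midpoints, counts)) / total

# A symmetric histogram reads the same left-to-right and right-to-left.
is_symmetric = counts == counts[::-1]

print(estimated_mean)  # 10.0
print(is_symmetric)    # True
```

With symmetric counts around a single peak, the estimated mean lands on the central bin, which is why a symmetric, unimodal histogram lets you read the mean off its center.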
🧾 Summary Table
| Q# | Correct Answer(s) |
|---|---|
| 1 | Request info from data suppliers ✅ ; Determine original source ✅ |
| 2 | JSON ✅ |
| 3 | Second-party ✅ |
| 4 | df.head(5) ✅ |
| 5 | plt.title("Salzburg Restaurants") ✅ |
| 6 | fig.show() ✅ |
| 7 | Merging ✅ |
| 8 | box plot ✅ |
| 9 | Histogram ✅ |
| 10 | Most in 7–13 min ✅ ; Approximately normal ✅ ; Mean ≈ 10 min ✅ (the bimodal option is incorrect ❌) |