
Module 2 Challenge: Go Beyond the Numbers: Translate Data into Insights (Google Advanced Data Analytics Professional Certificate) Answers 2025

Question 1

What are some strategies data professionals use to understand the source of a dataset? (Select all that apply.)

  • Request relevant information from the team members who supplied the data. ✅
    Explanation: Asking the providers about collection methods, provenance, and limitations is a primary way to understand a dataset’s source and context.

  • Reduce outliers by ensuring data comes from a small sample. ❌
    Explanation: Shrinking the sample reveals nothing about where the data came from, and a small sample is generally less representative of the population.

  • Confirm the original data owner has no financial stake in the data’s output. ❌
    Explanation: Checking conflicts of interest can be relevant for ethics, but it’s not a primary strategy for understanding the dataset’s source (and it’s not always feasible or sufficient).

  • Determine where the data originally came from. ✅
    Explanation: Tracing origin (instrument, system, owner, collection date/location) is essential for provenance and trust.


Question 2

What is the data storage file format for JavaScript?

  • CSV ❌

  • spreadsheet ❌

  • XML ❌

  • JSON ✅

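Explanation: JSON (JavaScript Object Notation) is a text-based format derived from JavaScript object syntax. It is the standard way to store and exchange data in JavaScript, and most other languages can read and write it as well.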
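Because JSON is plain text, it can be written and read back from Python too. The following is a minimal sketch using the standard-library json module; the record and its field names are made up for illustration.

```python
import json

# Hypothetical record, invented for illustration only.
record = {"name": "Salzburg Restaurants", "ratings": [4.5, 3.8], "open": True}

# Serialize the Python dict to a JSON string, then parse it back.
text = json.dumps(record)
restored = json.loads(text)

print(text)
print(restored == record)
```

Note that Python's True is written as the JSON literal true in the serialized text.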


Question 3

What type of data is gathered outside of an organization, but directly from the original source?

  • Second-party ✅

  • Third-party ❌

  • Fourth-party ❌

  • First-party ❌

Explanation: First-party = collected by the organization itself. Second-party = another organization’s first-party data obtained directly from that original source. Third-party = aggregated/resold via intermediaries.


Question 4

Which statement correctly uses head() to return the first 5 rows?

  • head=5 ❌

  • df.head(rows=5) ❌

  • df.head(5) ✅

  • df.head(5.df) ❌

Explanation: df.head(5) is the correct pandas syntax to show the first five rows. The parameter is named n (default 5), so df.head(rows=5) raises a TypeError.
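A short sketch of head() in use, assuming pandas is installed. The DataFrame and its column names are hypothetical and exist only to have something to slice.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "id": list(range(1, 11)),
    "score": [x * 10 for x in range(1, 11)],
})

first_five = df.head(5)    # positional argument
same_rows = df.head(n=5)   # the keyword is n, not rows
default_rows = df.head()   # n defaults to 5

print(first_five)
```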


Question 5

Which statement will assign the name “Salzburg Restaurants” to a bar graph in Python?

  • plt.xlabel("Salzburg Restaurants") ❌

  • plt.title("Salzburg Restaurants") ✅

  • plt.show("Salzburg Restaurants") ❌

  • plt.name("Salzburg Restaurants") ❌

Explanation: plt.title() sets the chart title. plt.xlabel() labels the x-axis, and plt.show() displays the figure rather than naming it.
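As a rough illustration of how the title and axis labels fit together, here is a sketch that assumes matplotlib is installed. The restaurant names and ratings are hypothetical, and the Agg backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt

# Hypothetical data, invented for illustration only.
names = ["Cafe A", "Bistro B", "Diner C"]
ratings = [4.5, 3.8, 4.1]

plt.bar(names, ratings)
plt.title("Salzburg Restaurants")  # names the whole chart
plt.xlabel("Restaurant")           # labels only the x-axis
plt.ylabel("Average rating")       # labels only the y-axis

ax = plt.gca()
print(ax.get_title())
```

In an interactive session, calling plt.show() afterward would display the finished chart.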


Question 6

Which element of the code renders the graphic of the plot?

  • geo_scope= ❌

  • title_text = ❌

  • fig.update_layout ❌

  • fig.show() ✅

Explanation: fig.show() actually renders/displays the figure; update_layout configures layout but does not display by itself.


Question 7

Which structuring method combines two different data frames along a specified starting column?

  • Filtering ❌

  • Merging ✅

  • Sorting ❌

  • Grouping ❌

Explanation: Merging (a join) combines two data frames by matching rows on one or more shared key columns.
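A minimal sketch of a merge on a key column, assuming pandas is installed. Both frames, the restaurant_id key, and the values are hypothetical.

```python
import pandas as pd

# Hypothetical frames sharing a key column; all values are invented.
restaurants = pd.DataFrame({
    "restaurant_id": [1, 2, 3],
    "name": ["Cafe A", "Bistro B", "Diner C"],
})
ratings = pd.DataFrame({
    "restaurant_id": [1, 2, 4],
    "rating": [4.5, 3.8, 4.1],
})

# An inner merge keeps only the keys present in both frames (1 and 2 here).
merged = restaurants.merge(ratings, on="restaurant_id", how="inner")
print(merged)
```

Choosing how="left" instead would keep all three restaurants and leave the rating missing for the one without a match.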


Question 8

Fill in the blank: A _____ is a data visualization that depicts the locality, spread, and skew of groups of values within quartiles.

  • box plot ✅

  • density map ❌

  • scatter plot ❌

  • Gantt chart ❌

Explanation: A box plot (box-and-whisker) shows median, quartiles, spread, and outliers.
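The numbers behind a box plot can be computed directly. The sketch below assumes NumPy is installed, uses hypothetical values with one deliberate outlier, and applies the common 1.5 × IQR rule for flagging outliers.

```python
import numpy as np

# Hypothetical values, invented for illustration; 30 is a deliberate outlier.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])

# Quartiles define the box; the median is the line inside it.
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Points beyond 1.5 * IQR from the box are treated as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(q1, median, q3, outliers)
```

With matplotlib, plt.boxplot(values) draws the same box from Q1 to Q3 and, by default, extends whiskers to the most extreme points within 1.5 × IQR.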


Question 9

What is the name of a graph that represents a frequency distribution (how frequently each value occurs)?

  • Box plot ❌

  • Heat map ❌

  • Histogram ✅

  • Scatter plot ❌

Explanation: A histogram bins numeric values and shows their frequencies.
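To make the binning step concrete, here is a small sketch that assumes NumPy is installed. The solve times and bin edges are hypothetical and chosen only to keep the counts easy to check by hand.

```python
import numpy as np

# Hypothetical solve times in minutes, invented for illustration only.
solve_times = [7, 8, 9, 9, 10, 10, 10, 11, 11, 12, 13]

# Count how many values fall into each 2-minute bin.
counts, edges = np.histogram(solve_times, bins=[6, 8, 10, 12, 14])

# Print a simple text histogram, one row per bin.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right} min: {'#' * int(count)}")
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.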


Question 10

(Use the described histogram of Sudoku solve times.)
According to the histogram, which statements are true? (Select all that apply.)

  • Most people solved the puzzle in 7–13 minutes. ✅
    Explanation: The histogram’s tallest bars are centered near 9–11 minutes and counts decline symmetrically toward 7 and 13, so the majority fall in the 7–13 window.

  • The solve time for this puzzle follows an approximately normal distribution. ✅
    Explanation: The bars rise to a central peak and descend roughly symmetrically on both sides, a pattern consistent with an approximately normal (bell-shaped) distribution.

  • The solve time for this puzzle follows a bimodal distribution. ❌
    Explanation: There is a single central peak (two adjacent high bars reflect the binning around the same mode), not two distinct modes, so the distribution is unimodal, not bimodal.

  • The mean solve time was approximately 10 minutes. ✅
    Explanation: The center of the distribution is around 9–11 minutes. Because the distribution is roughly symmetric, the mean sits close to that center, at approximately 10 minutes.
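The mean can be estimated from a histogram by weighting each bin midpoint by its count. The sketch below uses hypothetical bin counts, not the actual values from the quiz histogram, chosen to be symmetric around 10 minutes.

```python
# Hypothetical bin midpoints (minutes) and counts, invented for illustration;
# these are not the values from the quiz histogram.
midpoints = [7, 8, 9, 10, 11, 12, 13]
counts = [2, 5, 9, 10, 9, 5, 2]

# Weighted average: sum of (midpoint * count) divided by the total count.
total = sum(counts)
estimated_mean = sum(m * c for m, c in zip(midpoints, counts)) / total

print(total, estimated_mean)
```

Because the counts mirror each other around the 10-minute bin, the estimate lands on the center of the distribution.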


🧾 Summary Table

Q# Correct Answer(s)
1 Request info from data suppliers ✅ ; Determine original source ✅
2 JSON ✅
3 Second-party ✅
4 df.head(5) ✅
5 plt.title("Salzburg Restaurants") ✅
6 fig.show() ✅
7 Merging ✅
8 box plot ✅
9 Histogram ✅
10 Most in 7–13 min ✅ ; Approximately normal ✅ ; Mean ≈ 10 min ✅ (bimodal is incorrect ❌)