Module 4 challenge: Process Data from Dirty to Clean (Google Data Analytics Professional Certificate) Answers 2025
Question 1
Fill in the blank: A data scientist keeps code for data analysis pipelines in a _____, which enables them to track the evolution of the pipelines over time.
✅ Version control system
❌ Changelog
❌ Dashboard
❌ Dataset
Explanation:
A version control system (VCS) like Git helps data scientists track, compare, and manage changes to code or pipelines, enabling collaboration and rollback if needed.
Question 2
A large number of survey respondents mention website loading times, which are not related to mobile app usability. What should you do first?
✅ Pause and reassess whether focusing on the desktop loading time comments aligns with the original project goal centered on the mobile app checkout usability.
❌ Double-check data cleaning
❌ Start separate analysis immediately
❌ Remove all unrelated responses
Explanation:
Verification includes checking relevance to the original project goal. Taking a big-picture view ensures focus remains on intended objectives before acting on unexpected findings.
Question 3
Which SQL clause returns “island” when the condition country = 'Barbados' is met?
✅ CASE WHEN country = 'Barbados' THEN 'island' END
❌ CASE country = 'Barbados' THEN 'island' END
❌ WHEN country = 'condition' CASE 'island' END
❌ WHEN CASE country = 'Barbados' THEN 'island' END
Explanation:
Correct CASE syntax in SQL:
CASE
WHEN country = 'Barbados' THEN 'island'
END
This returns “island” for Barbados rows.
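To see the CASE expression in action, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The table name destinations, the alias land_type, and the two sample rows are made up for illustration; they are not part of the quiz.

```python
import sqlite3

# Hypothetical in-memory table; the name and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE destinations (country TEXT)")
conn.executemany(
    "INSERT INTO destinations (country) VALUES (?)",
    [("Barbados",), ("Canada",)],
)

# CASE WHEN ... THEN ... END labels matching rows; with no ELSE branch,
# rows that do not match get NULL (None in Python).
rows = conn.execute(
    """
    SELECT country,
           CASE WHEN country = 'Barbados' THEN 'island' END AS land_type
    FROM destinations
    ORDER BY country
    """
).fetchall()
conn.close()

print(rows)  # [('Barbados', 'island'), ('Canada', None)]
```

Because the expression has no ELSE branch, every non-Barbados row comes back as NULL rather than an empty string.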
Question 4
What is the process of recording data cleaning efforts so you can recover from errors and confirm data quality?
✅ Documentation
❌ Examination
❌ Illumination
❌ Disclosure
Explanation:
Documentation means recording each step of cleaning or transformation for transparency, reproducibility, and accountability in data workflows.
Question 5
You start a complex SQL project that will take over a year. How should you document changes to your queries?
✅ Write a changelog
❌ Open a notepad
❌ Create a spreadsheet
❌ Visualize data
Explanation:
A changelog records updates, fixes, and modifications to SQL scripts in chronological order, which makes it essential for tracking progress in long-term projects.
Question 6
A junior data analyst wants to count how many times a product ID error occurs in Google Sheets. Which function should they use?
✅ COUNTA
❌ CONCAT
❌ CHECK
❌ CASE
Explanation:
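COUNTA() counts all non-empty cells in a range, whether they hold text or numbers. If each product ID error is logged as an entry in a column, COUNTA on that column gives the number of errors. (To count only cells that match one specific value, COUNTIF is the usual choice, but it is not among the options.)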
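The counting logic behind COUNTA can be sketched in Python as follows. This is only an approximation of the spreadsheet function: the helper name counta and the sample error_log column are hypothetical, and blank cells are represented as None.

```python
def counta(cells):
    """Count the cells that hold a value, mimicking the idea behind COUNTA."""
    return sum(1 for cell in cells if cell is not None)


# Hypothetical error-log column: one product ID per logged error,
# with None standing in for blank cells.
error_log = ["P-1001", None, "P-1002", None, "P-1001"]

total_errors = counta(error_log)
print(total_errors)  # 3
```

Note that the count includes repeated IDs, since every non-blank cell is counted once.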
Question 7
Positive outcomes of reporting on data cleaning and acting on feedback (select all that apply):
✅ It can uncover systemic issues and inefficiencies.
✅ It builds stakeholder confidence in the data’s reliability.
✅ It helps identify error patterns and improve data collection methods.
❌ It clears you and your team of blame for errors.
Explanation:
Transparent reporting encourages trust, continuous improvement, and process optimization — not blame-shifting.
Question 8
To change every instance of “Green Thumb Inc.” to “Farmer’s Friend”:
✅ Find and replace
❌ Formatting
❌ Remove duplicates
❌ TRIM
Explanation:
Find and replace automates text replacement in spreadsheets — perfect for rebranding, correcting typos, or updating values in bulk.
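The same bulk replacement can be sketched outside a spreadsheet. In the Python snippet below, the cells list is made-up sample data; only the two company names come from the question.

```python
OLD_NAME = "Green Thumb Inc."
NEW_NAME = "Farmer's Friend"

# Hypothetical column of cell values, for illustration only.
cells = ["Green Thumb Inc.", "Invoice for Green Thumb Inc.", "Acme Seeds"]

# str.replace swaps every occurrence of OLD_NAME inside each cell
# and leaves cells without a match unchanged.
updated = [cell.replace(OLD_NAME, NEW_NAME) for cell in cells]

print(updated)
```

Like the spreadsheet tool, this matches the exact text only, so variants such as "Green Thumb Inc" without the period would need their own pass.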
🧾 Summary Table
| Q# | ✅ Correct Answer(s) | Key Concept |
|---|---|---|
| 1 | Version control system | Code tracking & management |
| 2 | Pause and reassess project alignment | Verification & project focus |
| 3 | CASE WHEN country = 'Barbados' THEN 'island' END | SQL conditional logic |
| 4 | Documentation | Recording cleaning steps |
| 5 | Write a changelog | Version documentation |
| 6 | COUNTA | Count entries |
| 7 | Options 1, 2, and 3 (systemic issues, stakeholder confidence, error patterns) | Benefits of feedback on data quality |
| 8 | Find and replace | Bulk text replacement |