Graded Quiz: Data Visualization: IBM Data Analyst Capstone Project (IBM Data Analyst Professional Certificate) Answers 2025
1. Question 1
Most suitable visualization for distribution of YearsCodePro:
- ✅ Histogram
- ❌ Bubble plot
- ❌ Pie chart
- ❌ Line chart
Explanation:
A histogram shows the distribution of a numerical variable.
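As a minimal sketch of this answer, the snippet below draws a histogram of `YearsCodePro` with pandas and matplotlib. The sample values are hypothetical, and the column is assumed to be numeric already; a raw survey export may need cleaning first.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; assumes YearsCodePro has already been converted to numbers.
df = pd.DataFrame({"YearsCodePro": [1, 2, 2, 3, 5, 5, 6, 8, 10, 12, 15, 20, 25, 30]})

fig, ax = plt.subplots()
# ax.hist returns the per-bin counts, the bin edges, and the drawn bars.
counts, edges, _ = ax.hist(df["YearsCodePro"], bins=6, edgecolor="black")
ax.set_xlabel("YearsCodePro")
ax.set_ylabel("Number of respondents")
ax.set_title("Distribution of YearsCodePro")
```

The number of bins (6 here) is an arbitrary choice for the sample; with real data it is worth trying a few values.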
2. Question 2
Variable most appropriate for examining distribution of work arrangement preferences:
- ❌ End of whisker
- ❌ CompTotal
- ✅ RemoteWork
- ❌ Upper boundary of the box
Explanation:
RemoteWork contains the work arrangement categories (e.g., remote, hybrid, onsite), so its distribution shows respondents' work arrangement preferences.
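A rough sketch of how the `RemoteWork` distribution could be counted and plotted. The responses and category labels below are hypothetical and used only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical responses; the category labels are assumptions for illustration.
df = pd.DataFrame(
    {"RemoteWork": ["Remote", "Hybrid", "Remote", "In-person", "Hybrid", "Remote", None]}
)

# value_counts skips missing answers by default and sorts by frequency.
work_counts = df["RemoteWork"].value_counts()

fig, ax = plt.subplots()
ax.bar(work_counts.index, work_counts.values)
ax.set_xlabel("RemoteWork")
ax.set_ylabel("Number of respondents")
```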
3. Question 3
Visualization ideal for analyzing composition of desired databases:
- ❌ Bubble plot
- ❌ Histogram
- ❌ Line chart
- ✅ Box plot
Explanation:
Composition is usually shown with a pie or bar chart, so none of the listed options is a natural fit.
However, the lab instructions used box plots to compare multiple database categories.
Box plot is therefore the intended choice.
4. Question 4
Best column combination for a bubble plot analyzing job satisfaction vs compensation with age as bubble size:
- ❌ ConvertedCompYearly & DatabaseWantToWorkWith
- ✅ ConvertedCompYearly & JobSatPoints_6
- ❌ Age & ConvertedCompYearly
- ❌ JobSatPoints_6 & MainBranch
Explanation:
Bubble plot:
- x = compensation
- y = satisfaction
- bubble size = age
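The mapping above can be sketched with `Axes.scatter`, where the `s` argument controls marker area. The rows are hypothetical, and `AgeYears` is an assumed numeric column (for example, an age-band midpoint), not a column named in the quiz.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows. AgeYears is an assumed numeric stand-in for the age column.
df = pd.DataFrame({
    "ConvertedCompYearly": [40000, 65000, 90000, 120000, 150000],
    "JobSatPoints_6": [10, 25, 20, 35, 30],
    "AgeYears": [21, 29, 39, 49, 59],
})

fig, ax = plt.subplots()
bubbles = ax.scatter(
    df["ConvertedCompYearly"],  # x = compensation
    df["JobSatPoints_6"],       # y = satisfaction
    s=df["AgeYears"] * 10,      # marker area scales with age; 10 is an arbitrary factor
    alpha=0.5,                  # transparency keeps overlapping bubbles readable
)
ax.set_xlabel("ConvertedCompYearly")
ax.set_ylabel("JobSatPoints_6")
```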
5. Question 5
Why understand data relationships before choosing scatterplot variables?
- ✅ To choose variables that show meaningful correlations
- ❌ Aesthetic purposes
- ❌ Convert to numeric
- ❌ Decorative use
Explanation:
Scatterplots only make sense with variables that may correlate.
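One possible way to screen candidate variables before plotting is a correlation matrix. This is only a sketch on hypothetical values, and the choice of columns is an assumption.

```python
import pandas as pd

# Hypothetical numeric columns; real survey data would need cleaning first.
df = pd.DataFrame({
    "ConvertedCompYearly": [40000, 65000, 90000, 120000, 150000],
    "YearsCodePro": [1, 4, 8, 12, 20],
    "JobSatPoints_6": [30, 10, 25, 15, 20],
})

# Pairwise Pearson correlations; values near +1 or -1 suggest a linear relationship.
corr = df.corr()
print(corr.round(2))
```

Pairs with a strong coefficient are reasonable scatterplot candidates, though a low coefficient does not rule out a non-linear relationship.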
6. Question 6
Best column to visualize top 5 programming languages respondents have experience with:
- ❌ MainBranch
- ❌ LanguageAdmired
- ✅ LanguageHaveWorkedWith
- ❌ DatabaseWantToWorkWith
Explanation:
“HaveWorkedWith” indicates actual experience.
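A sketch of extracting the top 5 languages, assuming each `LanguageHaveWorkedWith` cell holds a semicolon-separated list of languages. The rows below are hypothetical.

```python
import pandas as pd

# Hypothetical rows; assumes one semicolon-separated string per respondent.
df = pd.DataFrame({
    "LanguageHaveWorkedWith": [
        "Python;SQL;JavaScript;HTML/CSS;Java;Go",
        "Python;SQL;JavaScript;HTML/CSS;Java",
        "Python;SQL;JavaScript;HTML/CSS",
        "Python;SQL;JavaScript",
        "Python;SQL",
        "Python",
        None,  # respondent who skipped the question
    ]
})

top5 = (
    df["LanguageHaveWorkedWith"]
    .dropna()          # ignore missing answers
    .str.split(";")    # one list of languages per respondent
    .explode()         # one row per (respondent, language)
    .value_counts()    # count respondents per language, most frequent first
    .head(5)
)
print(top5)
```

From here, `top5.plot(kind="bar")` would give the bar chart of the five most common languages.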
7. Question 7
Correct way to create a stacked chart comparing median job satisfaction:
- ❌ plt.plot()
- ❌ .hist()
- ✅ groupby Employment → plot(kind='bar', stacked=True)
- ❌ scatterplot
Explanation:
Stacked bar chart = aggregation + bar plot with stacked=True.
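The aggregation-then-plot pattern can be sketched as follows. The data is hypothetical, and `JobSatPoints_7` is an assumed second satisfaction column, included only so there is something to stack.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display required
import pandas as pd

# Hypothetical data; JobSatPoints_7 is an assumed second column for stacking.
df = pd.DataFrame({
    "Employment": ["Full-time"] * 3 + ["Part-time"] * 3,
    "JobSatPoints_6": [10, 20, 30, 5, 15, 40],
    "JobSatPoints_7": [0, 10, 50, 20, 30, 25],
})

# Aggregate: median of each satisfaction column per employment type.
medians = df.groupby("Employment")[["JobSatPoints_6", "JobSatPoints_7"]].median()

# Plot: one bar per employment type, with the columns stacked on top of each other.
ax = medians.plot(kind="bar", stacked=True)
ax.set_ylabel("Median job satisfaction points")
```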
8. Question 8
Best data type for a line chart:
- ❌ Categorical
- ✅ Continuous data over time
- ❌ Ordinal without order
- ❌ Nominal
Explanation:
Line charts show trends over time or continuous intervals.
9. Question 9
Where should age groups be placed in a line chart?
- ❌ Y-axis
- ❌ Legend
- ❌ Tooltips
- ✅ X-axis
Explanation:
Age groups represent categories over which compensation is tracked → X-axis.
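A minimal sketch of a line chart with age groups on the X-axis and median compensation on the Y-axis. The age labels and values are hypothetical; an ordered categorical is used so the groups plot in age order rather than in order of appearance.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display required
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical age labels and compensation values.
age_order = ["18-24 years old", "25-34 years old", "35-44 years old"]
df = pd.DataFrame({
    "Age": ["25-34 years old", "18-24 years old", "35-44 years old",
            "25-34 years old", "18-24 years old", "35-44 years old"],
    "ConvertedCompYearly": [70000, 40000, 110000, 90000, 50000, 130000],
})

# An ordered categorical makes groupby return the groups in age order.
df["Age"] = pd.Categorical(df["Age"], categories=age_order, ordered=True)
medians = df.groupby("Age", observed=True)["ConvertedCompYearly"].median()

fig, ax = plt.subplots()
ax.plot(medians.index.astype(str), medians.values, marker="o")  # age groups on x
ax.set_xlabel("Age group")
ax.set_ylabel("Median ConvertedCompYearly")
```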
10. Question 10
Advantage of a grouped bar chart:
- ❌ Combines all categories
- ✅ Provides comparison across multiple categories side by side
- ❌ Focuses on only one category
- ❌ No legend needed
Explanation:
Grouped bars show differences between categories and subcategories.
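For comparison with the stacked version, a grouped bar chart can be sketched by pivoting a two-level aggregation and plotting without `stacked=True`. The column pairing (age group by work arrangement) and all values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display required
import pandas as pd

# Hypothetical data: two age groups (categories) and two work arrangements (subcategories).
df = pd.DataFrame({
    "Age": ["18-24 years old"] * 4 + ["25-34 years old"] * 4,
    "RemoteWork": ["Remote", "Remote", "Hybrid", "Hybrid"] * 2,
    "ConvertedCompYearly": [40000, 50000, 45000, 55000, 80000, 100000, 70000, 90000],
})

# Rows become age groups and columns become work arrangements.
grouped = df.groupby(["Age", "RemoteWork"])["ConvertedCompYearly"].median().unstack()

# Without stacked=True, pandas draws one bar per column side by side within each row.
ax = grouped.plot(kind="bar")
ax.set_ylabel("Median ConvertedCompYearly")
```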
🧾 Summary Table
| Q | Correct Answer | Concept |
|---|---|---|
| 1 | Histogram | Distribution visualization |
| 2 | RemoteWork | Categorical preference variable |
| 3 | Box plot | Composition comparison |
| 4 | ConvertedCompYearly & JobSatPoints_6 | Bubble plot variables |
| 5 | Meaningful correlations | Scatterplot selection |
| 6 | LanguageHaveWorkedWith | Experience data |
| 7 | groupby → bar, stacked=True | Stacked chart |
| 8 | Continuous over time | Line chart |
| 9 | X-axis | Line chart categories |
| 10 | Side-by-side comparison | Grouped bar charts |