Graded Quiz: Building Unsupervised Learning Models: Machine Learning with Python (IBM AI Engineering Professional Certificate) Answers 2025
1. Question 1
Why run an unsupervised model first with no diagnosis labels?
- ❌ Rank patients by admission time
- ❌ Compress features into one index
- ❌ Assign diagnosis codes automatically
- ✅ To uncover natural patient groups that share similar vital sign patterns
Explanation:
Unsupervised learning finds hidden structure when no labels exist, which makes it well suited to grouping patients by similarity.
2. Question 2
How does hierarchical clustering help determine the number of clusters?
- ✅ Visualizes similarity levels using a dendrogram to help choose optimal cluster count
- ❌ Produces same clusters every time
- ❌ Automatically removes outliers
- ❌ Designed specifically for high-dimensional data
Explanation:
A dendrogram shows the distance at which clusters merge. Cutting the tree just below a large jump in merge distance suggests a reasonable number of groups.
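The idea of reading a cluster count off merge distances can be sketched in plain Python. This is a minimal illustration, not the course's code: it assumes single linkage, made-up 2D points, and a simple "largest gap" rule. The helper names `single_linkage_merge_heights` and `suggest_cluster_count` are hypothetical.

```python
import math

def single_linkage_merge_heights(points):
    """Agglomerative clustering with single linkage.

    Returns the distance at which each merge happens, in merge order.
    These are the heights a dendrogram would draw.
    """
    clusters = [[p] for p in points]  # every point starts as its own cluster
    heights = []
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        heights.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return heights

def suggest_cluster_count(heights):
    """Cut the tree just below the largest jump between consecutive merges."""
    gaps = [heights[k + 1] - heights[k] for k in range(len(heights) - 1)]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    # After merge k (0-based) there are n - (k + 1) clusters left.
    return len(heights) + 1 - (k + 1)

# Illustrative data (an assumption): three well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0),
          (20.0, 0.0), (21.0, 0.0)]
heights = single_linkage_merge_heights(points)
n_clusters = suggest_cluster_count(heights)
print(n_clusters)  # 3
```

The first four merges happen at distance 1.0 and the next at about 13.5, so the largest gap sits right where three clusters remain. On real data the gap is rarely this clean, and the choice usually needs some judgment.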
3. Question 3
What does a K-means centroid represent?
- ❌ Range of spending
- ❌ Distance between customer and center
- ❌ Number of customers improving
- ✅ The coordinate pair representing the average spend and purchase frequency
Explanation:
A centroid is the mean position of all points in a cluster, so here its coordinates are the cluster's average spend and average purchase frequency.
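To make "centroid = mean position" concrete, here is a small K-means sketch in plain Python. It is a simplified illustration under stated assumptions: the customer data is invented, the starting centroids are picked by hand rather than randomly, and the helper names `mean_point` and `kmeans` are hypothetical.

```python
import math

def mean_point(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def kmeans(points, initial_centroids, max_iter=100):
    """Plain Lloyd's algorithm: assign to nearest centroid, then recompute means."""
    centroids = list(initial_centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # Update step: each centroid becomes the mean of its group.
        # An empty group keeps its previous centroid.
        new_centroids = [mean_point(g) if g else centroids[i]
                         for i, g in enumerate(groups)]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, groups

# Illustrative data (an assumption): (average spend, purchase frequency).
customers = [(20.0, 2.0), (25.0, 3.0), (30.0, 1.0),
             (200.0, 10.0), (220.0, 12.0), (210.0, 14.0)]
centroids, groups = kmeans(customers, [customers[0], customers[3]])
print(centroids)  # [(25.0, 2.0), (210.0, 12.0)]
```

Each resulting centroid is exactly the pair (mean spend, mean frequency) of its group. In practice the features would usually be scaled first, because an unscaled spend column dominates the Euclidean distance.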
4. Question 4
Why is DBSCAN good for detecting unusual social media activity?
- ❌ Need exact number of clusters
- ✅ It finds clusters of any shape and detects outliers
- ❌ Assigns every user to a cluster
- ❌ Minimizes user-center distances
Explanation:
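DBSCAN groups points that lie in dense regions and labels points in sparse regions as noise. It needs no cluster count in advance, and isolated users stand out as anomalies instead of being forced into a cluster.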
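The density idea can be sketched with a minimal DBSCAN in plain Python. This is an illustrative implementation, not a library API: the activity data, the `eps` and `min_samples` values, and the function name `dbscan` are assumptions chosen for the example.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # The point itself counts toward min_samples.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster_id = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        labels[i] = cluster_id  # i is a core point: start a new cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also core: keep expanding
        cluster_id += 1
    return labels

# Illustrative data (an assumption): two dense groups of users and one outlier.
activity = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
            (9.0, 0.5)]
labels = dbscan(activity, eps=0.5, min_samples=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The last point has no neighbors within `eps`, so it is labeled `-1` instead of being attached to the nearest cluster, which is the behavior the correct answer describes.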
5. Question 5
Why use t-SNE for 2D scatterplots of customer behavior?
- ❌ Label each segment automatically
- ❌ Enforce linear projections
- ❌ Equal distance between points
- ✅ Maintain neighborhood similarities for visual discovery
Explanation:
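t-SNE preserves local structure: customers that are similar in the original feature space stay close together in the 2D plot, so groups become visible. Distances between well-separated groups in the plot are not reliable.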
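For readers who want the mechanism behind "preserves local structure", the standard t-SNE formulation can be summarized as below. The notation is an assumption of this sketch: $x_i$ are the original points, $y_i$ their 2D positions, $n$ the number of points, and $\sigma_i$ a per-point bandwidth set through the perplexity parameter.

```latex
p_{j|i} = \frac{\exp\!\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}

q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

Because $p_{ij}$ is large only for near neighbors, minimizing $C$ mainly penalizes placing close neighbors far apart. That is why neighborhoods survive the projection while larger distances in the plot carry little meaning.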
6. Question 6
Why is PCA suitable for environmental research?
- ❌ Finds nonlinear relationships
- ❌ Combines all variables into one
- ❌ Forces uncorrelated features
- ✅ Reduces data to key components for simpler analysis
Explanation:
PCA projects the data onto the directions of greatest variance, so a few components can summarize many correlated measurements and simplify the analysis.
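A rough sketch of what "direction of greatest variance" means, in plain Python. It finds only the first principal component, using power iteration on the covariance matrix; library implementations typically use a singular value decomposition instead. The readings and the helper names `center`, `covariance`, and `first_component` are assumptions made for illustration.

```python
import math

def center(rows):
    """Subtract each column's mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    return [[r[k] - means[k] for k in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]

def first_component(cov, iterations=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue.

    Assumes the start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
    return v, eigenvalue

# Illustrative readings (an assumption): three strongly correlated measurements.
readings = [(10.0, 20.5, 90.0),
            (12.0, 24.0, 88.5),
            (14.0, 27.5, 86.0),
            (16.0, 32.5, 84.5),
            (18.0, 35.5, 82.0)]
centered = center(readings)
cov = covariance(centered)
component, eigenvalue = first_component(cov)

total_variance = sum(cov[k][k] for k in range(len(cov)))
ratio = eigenvalue / total_variance  # share of variance on the first component
scores = [sum(r[k] * component[k] for k in range(len(component)))
          for r in centered]         # 1D coordinates of each reading
print(round(ratio, 3))
```

With these strongly correlated columns, almost all of the variance falls on a single component, so the three measurements can be replaced by one score per reading with little loss.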
7. Question 7
What is an advantage of using t-SNE to visualize app user interactions?
- ❌ Eliminates noise
- ❌ Guarantees linear transformation
- ✅ Allows similar users to form clusters in low-dimensional space
- ❌ Reduces to one dimension
Explanation:
t-SNE places users with similar interaction patterns near each other in a low-dimensional embedding, which makes groups in high-dimensional behavioral data visible.
🧾 Summary Table
| Q# | Correct Answer | Key Concept |
|---|---|---|
| 1 | Uncover natural patient groups | Unsupervised learning |
| 2 | Dendrogram helps choose clusters | Hierarchical clustering |
| 3 | Centroid = average values | K-means |
| 4 | Finds clusters of any shape and detects outliers | DBSCAN |
| 5 | Preserves neighborhood structure | t-SNE |
| 6 | Reduce to key components | PCA |
| 7 | Similar users cluster in low-D space | t-SNE visualization |