Graded Quiz: Building Unsupervised Learning Models: Machine Learning with Python (IBM Data Analyst Professional Certificate) Answers 2025
1️⃣ Question 1
Why run an unsupervised model when no diagnoses are available?
- ❌ Rank patients by admission time
- ❌ Compress features into one index
- ❌ Assign diagnosis codes
- ✅ To uncover natural patient groups that share similar vital sign patterns
Explanation:
Unsupervised learning finds structure in unlabeled data, so it fits cases where no diagnosis labels exist: clustering can group patients by similar vital sign patterns.
2️⃣ Question 2
How does hierarchical clustering (dendrogram) help determine the number of clusters?
- ✅ Visualizes similarity levels, helping decide an optimal cluster count
- ❌ Produces same clusters every time
- ❌ Removes outliers automatically
- ❌ Designed for high-dimensional data
Explanation:
A dendrogram shows the distance at which clusters merge; cutting the tree across a large gap between successive merges gives a natural cluster count.
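As a rough illustration of reading a cutoff from merge levels, the sketch below runs a from-scratch single-linkage clustering on made-up 1-D values and records the distance at each merge. The data, the function name `single_linkage_merges`, and the "largest gap" heuristic are assumptions for illustration, not part of the quiz. In practice a library such as SciPy (`scipy.cluster.hierarchy.linkage` and `dendrogram`) would compute and draw the tree.

```python
def single_linkage_merges(values):
    """Agglomerative clustering with single linkage on 1-D data.

    Returns the merge heights (distance at which two clusters join),
    in the order the merges happen.
    """
    clusters = [[v] for v in values]  # start with one cluster per point
    heights = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        heights.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return heights


# Hypothetical measurements with two obvious groups.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
heights = single_linkage_merges(values)

# Heuristic (an assumption): cut just before the largest jump in merge height.
gaps = [heights[k + 1] - heights[k] for k in range(len(heights) - 1)]
cut_index = gaps.index(max(gaps))
n_clusters = len(values) - (cut_index + 1)

print(heights)     # merge distances, smallest first
print(n_clusters)  # clusters left when cutting before the big jump
```

The merge heights are the vertical positions of the joins in a dendrogram, so the largest gap here corresponds to the longest uninterrupted vertical stretch in the plot.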
3️⃣ Question 3
What does a K-means centroid represent?
- ❌ Range of spending
- ❌ Distance between customers and center
- ❌ Number of customers improving
- ✅ Average spend and purchase frequency of the cluster
Explanation:
A centroid is the mean of all points in a cluster.
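To make the "mean of all points" idea concrete, here is a minimal sketch that computes the centroid of one cluster by hand. The customer values and the helper name `centroid` are hypothetical, chosen only for illustration.

```python
# Hypothetical customers already assigned to one cluster:
# (annual spend, purchases per month).
cluster = [(200.0, 2.0), (260.0, 3.0), (230.0, 4.0), (310.0, 3.0)]


def centroid(points):
    """Return the per-feature mean of a list of equal-length tuples."""
    n = len(points)
    n_features = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(n_features))


center = centroid(cluster)
print(center)  # (250.0, 3.0): average spend and average purchase frequency
```

K-means repeats this averaging after every reassignment step, so each final centroid summarizes the typical member of its cluster.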
4️⃣ Question 4
Why is DBSCAN good for detecting unusual activity?
- ❌ Needs exact number of clusters
- ✅ Identifies clusters of various shapes and detects outliers
- ❌ Assigns every user to a cluster
- ❌ Minimizes user-center distances
Explanation:
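DBSCAN groups points by density, so it finds clusters of arbitrary shape and labels points in sparse regions as noise, which is where unusual activity tends to show up.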
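The density idea can be sketched with a simplified, from-scratch DBSCAN. This is not scikit-learn's implementation; the parameter names `eps` and `min_samples` follow common usage, and the activity data is made up. A point counts itself among its neighbors here, which is an assumption of this sketch.

```python
import math

NOISE = -1


def dbscan(points, eps, min_samples):
    """Simplified DBSCAN: returns one label per point, -1 meaning noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_samples:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster_id += 1  # i is a core point: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_samples:
                queue.extend(j_nbrs)  # j is also core: keep expanding
    return labels


# Hypothetical user activity: two dense groups and one isolated user.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (5, 15)]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)  # the last point is labeled -1 (noise)
```

No cluster count is passed in: the two groups emerge from the density settings, and the isolated user is left unassigned rather than forced into a cluster.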
5️⃣ Question 5
Why is t-SNE used for 2D scatter plots?
- ❌ Auto-labels segments
- ❌ Enforces linear projections
- ❌ Equalizes distances
- ✅ Maintains neighborhood similarities for visual discovery
Explanation:
t-SNE preserves local structure: points that are close in the original space stay close in the 2D plot, revealing natural groupings.
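A minimal sketch of producing a 2D embedding with scikit-learn's `TSNE` follows, assuming scikit-learn and NumPy are installed. The two synthetic user groups and the choice of `perplexity=10` are assumptions for illustration; the perplexity must be smaller than the number of samples.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical data: two groups of 20 users, each described by 10 features.
group_a = rng.normal(loc=0.0, scale=1.0, size=(20, 10))
group_b = rng.normal(loc=8.0, scale=1.0, size=(20, 10))
X = np.vstack([group_a, group_b])

# Project to 2 dimensions; random_state makes the run repeatable.
embedding = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)
print(embedding.shape)  # one (x, y) pair per user
```

The two columns of `embedding` can be passed to any scatter plot function. Distances between far-apart groups in the plot should not be over-interpreted, since t-SNE mainly preserves who is near whom.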
6️⃣ Question 6
Why is PCA suitable for environmental factor research?
- ❌ Finds nonlinear relationships
- ✅ Reduces data to key components for simpler analysis
- ❌ Combines all variables into one
- ❌ Forces features to be uncorrelated (this is a result, not the purpose)
Explanation:
PCA reduces dimensionality while retaining most of the variance in the data.
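The following sketch shows PCA via a singular value decomposition in NumPy, under the assumption of three made-up environmental features that are mostly driven by one hidden factor. The variable names and the synthetic data are illustrative only; a library class such as scikit-learn's `PCA` would normally be used instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical readings: three features driven mostly by one hidden factor.
factor = rng.normal(size=200)
X = np.column_stack([
    factor + 0.1 * rng.normal(size=200),
    2.0 * factor + 0.1 * rng.normal(size=200),
    -factor + 0.1 * rng.normal(size=200),
])

# Center the data, then take the SVD; rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Share of total variance carried by each component.
explained_ratio = S**2 / np.sum(S**2)

# Keep the first two components as the reduced representation.
scores = Xc @ Vt[:2].T

print(explained_ratio)
print(scores.shape)
```

Because the three features move together, the first component carries almost all of the variance, so a one- or two-column summary loses little information. The resulting component scores are uncorrelated with each other, which is a by-product of the method rather than its goal.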
7️⃣ Question 7
What is an advantage of t-SNE for visualizing user interactions?
- ❌ Eliminates noise
- ❌ Guarantees linear transformation
- ✅ Allows clusters to appear clearly in low-dimensional space
- ❌ Reduces to a single dimension
Explanation:
t-SNE creates a meaningful lower-dimensional visualization where similar users cluster together.
🧾 Summary Table
| Q | Correct Answer |
|---|---|
| 1 | Uncover natural patient groups |
| 2 | Visualize similarity levels with dendrogram |
| 3 | Centroid = average of cluster |
| 4 | DBSCAN detects clusters + outliers |
| 5 | t-SNE preserves neighborhood similarities |
| 6 | PCA reduces data to key components |
| 7 | t-SNE lets clusters appear clearly in low-dimensional space |