Module 3 challenge: The Nuts and Bolts of Machine Learning (Google Advanced Data Analytics Professional Certificate) Answers 2025
1. Key aspects of k-means
❌ The clustering process has four steps that repeat until the model disperses evenly.
✔ K-means organizes data into clusters by creating a logical scheme to make sense of it.
✔ Poor clustering can be caused by local minima, which means the model has converged in a sub-optimal way.
✔ K-means groups unlabeled data into k clusters based on their similarities.
2. NOT a step of k-means
❌ Recalculate the centroid of each cluster based on the points assigned to it
✔ Determine the value of k by calculating the mean number of points you want in each cluster
❌ Assign all points to their nearest centroid
❌ Repeat steps two and three until the model converges
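The three ❌ options above are the real loop of k-means: assign, recalculate, repeat. A minimal sketch of that loop in plain Python, assuming 2-D points, Euclidean distance, and hand-picked starting centroids (the function name `kmeans` and the sample data are illustrative, not from the course):

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Toy k-means. k is fixed up front as len(centroids)."""
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recalculate each centroid as the mean of the points assigned to it.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(c)  # leave an empty cluster's centroid alone
        # Repeat until the centroids stop moving (convergence).
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Illustrative data: two well-separated blobs of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

Note that k enters only as the number of starting centroids. Nothing in the loop derives it from a "mean number of points per cluster", which is why that option is the odd one out.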
3. Fill in the blank: inertia is used to evaluate _____ space
❌ midpoint
❌ intercluster
✔ intracluster
❌ converged
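Inertia measures intracluster space as the sum of squared distances between each observation and the centroid of its cluster. A small sketch of that sum, assuming the clusters and centroids are already known (the helper name `inertia` and the numbers are made up for illustration):

```python
def inertia(clusters, centroids):
    """Sum of squared distances from each point to its cluster's centroid."""
    total = 0.0
    for members, c in zip(clusters, centroids):
        for p in members:
            total += sum((a - b) ** 2 for a, b in zip(p, c))
    return total

# Illustrative clusters with their centroids.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 0.0), (10.0, 4.0)]]
centroids = [(1.0, 0.0), (10.0, 2.0)]
value = inertia(clusters, centroids)  # (1 + 1) + (4 + 4) = 10.0
```

Tighter clusters give a smaller value. Distances between different clusters never enter the sum.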
4. Agglomerative clustering
✔ Agglomerative clustering works by first assigning every point to its own cluster, then progressively combining clusters based on intercluster distance.
✔ There are numerous hyperparameters available for agglomerative clustering.
❌ The algorithm will stop before an intercluster distance threshold is reached.
✔ The algorithm will stop when the specified number of clusters is met.
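The bottom-up process in the first option can be sketched in a few lines. This is a simplified illustration, assuming single linkage and a target number of clusters as the only stopping rule. The names `agglomerate` and `single_linkage` are hypothetical helpers, not a library API:

```python
import math

def single_linkage(a, b):
    """Intercluster distance: the closest pair of points across two clusters."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, n_clusters):
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]
    # Merge the two closest clusters until the requested count is reached.
    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Illustrative data: two tight pairs and one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]
result = agglomerate(points, n_clusters=3)
```

A distance-threshold variant would instead stop once the smallest intercluster distance reaches the threshold, not before it.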
5. Linkage: minimum pairwise distance
❌ Complete
✔ Single
❌ Average
❌ Ward
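To make the linkage options concrete, the sketch below computes three of them for two small clusters. The points are made up for illustration. Ward is left out because it is based on within-cluster variance, not on a summary of pairwise distances:

```python
import math
from statistics import mean

def pairwise(a, b):
    """All distances between a point in cluster a and a point in cluster b."""
    return [math.dist(p, q) for p in a for q in b]

a = [(0.0, 0.0), (1.0, 0.0)]
b = [(3.0, 0.0), (6.0, 0.0)]

single = min(pairwise(a, b))     # minimum pairwise distance -> 2.0
complete = max(pairwise(a, b))   # maximum pairwise distance -> 6.0
average = mean(pairwise(a, b))   # mean pairwise distance    -> 4.0
```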
6. Silhouette coefficient ≈ -1
✔ The observation may be in the wrong cluster.
❌ The observation is in the correct cluster.
❌ The observation is on the boundary between clusters.
❌ The observation is suitably within its own cluster and well separated from other clusters.
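The silhouette coefficient of one observation is (b - a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is its mean distance to the points in the nearest other cluster. A minimal sketch with a deliberately mislabeled point (the helper `silhouette` and the data are illustrative assumptions):

```python
import math
from statistics import mean

def silhouette(point, own_cluster, other_clusters):
    # a: mean distance to the other members of the point's own cluster
    a = mean(math.dist(point, q) for q in own_cluster if q != point)
    # b: mean distance to the members of the nearest other cluster
    b = min(mean(math.dist(point, q) for q in c) for c in other_clusters)
    return (b - a) / max(a, b)

left = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0)]   # (9, 0) is deliberately mislabeled
right = [(10.0, 0.0), (11.0, 0.0)]

misplaced = silhouette((9.0, 0.0), left, [right])      # about -0.82
well_placed = silhouette((10.0, 0.0), right, [left])   # 0.85
```

The mislabeled point sits far from its assigned cluster and close to the other one, so a is much larger than b and the score approaches -1. A score near 0 would indicate a point on the boundary between clusters.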
7. Using inertia to evaluate k
❌ Plot the silhouette score for different values of k to determine where the elbow is
✔ Plot the inertia for different values of k to determine where the elbow is
❌ Choose the number of clusters that results in the highest inertia
❌ Choose the number of clusters that results in the lowest inertia
8. Elbow method
✔ When using the elbow method, data professionals find the sharpest bend in the curve.
✔ The elbow method uses a line plot to visually compare the inertias of different models.
❌ There is always an obvious elbow.
✔ The elbow method helps data professionals decide which clustering gives the most meaningful model.
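The elbow method needs one inertia value per candidate k. The sketch below produces those values for a tiny 1-D dataset using a toy k-means with a deterministic start. The function name, the initialization rule, and the data are all assumptions chosen to keep the example reproducible:

```python
def kmeans_inertia(xs, k, iters=50):
    """Run a toy 1-D k-means and return the final inertia."""
    xs = sorted(xs)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [xs[(2 * i + 1) * len(xs) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            groups[nearest].append(x)
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    # Inertia: squared distance from each point to its nearest centroid.
    return sum(min((x - c) ** 2 for c in centroids) for x in xs)

# Illustrative data with three obvious groups near 1, 5, and 9.
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
inertias = {k: kmeans_inertia(data, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

On this data the inertia falls from roughly 96 at k=1 to about 0.24 at k=3 and then barely changes, so a line plot of these values flattens at k=3. The value at k=4 is still slightly lower than at k=3, which is why "choose the lowest inertia" is not a usable rule: inertia keeps shrinking as k grows.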
9. Algorithm choice for 3 long narrow strips
✔ Using k-means to cluster this data could be sub-optimal because it works using distance from centroids, and therefore is best used on clusters that are round.
✔ DBSCAN would probably perform well to cluster this data, because the DBSCAN algorithm uses data density to determine cluster membership, not Euclidean distance from centroids.
❌ Running a k-means model with k=3 would result in a greater silhouette score than a model with k=2.
✔ Running a k-means model with k=4 would result in lower inertia than a model with k=3.
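To illustrate why a density-based method suits long narrow strips, here is a compact plain-Python sketch of the DBSCAN idea. It is a teaching sketch, not the scikit-learn implementation. The strip layout, `eps=1.5`, and `min_samples=3` are assumptions chosen for the example:

```python
import math

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, 2, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means not yet visited

    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_samples:
            labels[i] = -1           # provisionally noise; may become a border point
            continue
        cluster += 1                 # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_samples:
                queue.extend(j_nbrs)  # only core points keep expanding the cluster
    return labels

# Three horizontal strips, 30 points long, 5 units apart (illustrative layout).
strips = [(float(x), float(y)) for y in (0, 5, 10) for x in range(30)]
labels = dbscan(strips, eps=1.5, min_samples=3)
```

Each strip comes out as one cluster because neighboring points chain together along its length, while the 5-unit gap between strips is wider than `eps`. Membership never depends on distance to a centroid, so the elongated shape is not a problem.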
10. NOT unsupervised learning
✔ Naive Bayes
❌ K-means
✔ Logistic regression
❌ Agglomerative clustering
✅ Summary Table
| Q No. | Correct answer(s) (numbers are option positions within the question) |
|---|---|
| 1 | 2, 3, 4 |
| 2 | Determine k by calculating mean number of points |
| 3 | intracluster |
| 4 | 1, 2, 4 |
| 5 | Single |
| 6 | Observation may be in wrong cluster |
| 7 | Plot inertia vs k to find elbow |
| 8 | 1, 2, 4 |
| 9 | 1, 2, 4 |
| 10 | Naive Bayes, Logistic regression |