Graded Quiz: Evaluating and Validating Machine Learning Models: Machine Learning with Python (IBM Data Analyst Professional Certificate) Answers 2025
1️⃣ Question 1
In medical diagnosis, avoiding missed true positives is critical. Which metric matters most?
- ✅ Recall
- ❌ Accuracy
- ❌ F1 Score
- ❌ Precision
Explanation:
Recall measures how many actual positive cases you successfully identify, making it essential when missing a true positive is dangerous.
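As a rough illustration of why recall is the metric to watch here, the sketch below computes recall, precision, and accuracy by hand. The labels are made up for the example (1 = disease present, 0 = absent) and do not come from the quiz.

```python
# Hypothetical labels for illustration: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # sick, flagged
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # sick, missed
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared

recall = tp / (tp + fn)            # share of actual positives that were found
precision = tp / (tp + fp)         # share of flagged cases that were truly positive
accuracy = (tp + tn) / len(pairs)  # share of all predictions that were right

print(f"recall={recall:.2f} precision={precision:.2f} accuracy={accuracy:.2f}")
```

In this toy data the accuracy looks acceptable while half of the actual positive cases are missed, which is exactly what recall exposes.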
2️⃣ Question 2
Which metric is the square root of MSE?
- ❌ R-squared
- ❌ MAE
- ❌ MSE
- ✅ Root Mean Squared Error (RMSE)
Explanation:
RMSE = √MSE. Taking the square root puts the error back in the same units as the target variable.
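A minimal sketch of the relationship, using made-up target values and predictions, with MAE included only for comparison:

```python
import math

# Hypothetical regression targets and predictions, for illustration only.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

errors = [p - t for p, t in zip(y_pred, y_true)]

mse = sum(e ** 2 for e in errors) / len(errors)  # mean of squared errors
rmse = math.sqrt(mse)                            # square root of MSE
mae = sum(abs(e) for e in errors) / len(errors)  # mean of absolute errors

print(f"MSE={mse:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```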
3️⃣ Question 3
Best metric to evaluate cluster separation?
- ✅ Silhouette score
- ❌ Elbow method
- ❌ Evaluation method
- ❌ Davies-Bouldin Index
Explanation:
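The silhouette score compares how close each point is to the other points in its own cluster with how far it is from the nearest other cluster. It ranges from -1 to 1, and higher values indicate better-separated clusters.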
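To make the definition concrete, here is a small plain-Python sketch of the mean silhouette score for one-dimensional points. The function name `silhouette`, the data, and the convention of scoring a single-point cluster as 0 are assumptions for this example. It expects at least two clusters.

```python
def silhouette(points, labels):
    """Mean silhouette score for 1-D points (needs at least two clusters)."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other points in the same cluster
        same = [abs(p - q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # assumed convention for single-point clusters
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the points of any other cluster
        b = min(
            sum(abs(p - q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels) if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


points = [1.0, 2.0, 8.0, 9.0]
good = silhouette(points, [0, 0, 1, 1])  # tight, well-separated clusters
bad = silhouette(points, [0, 1, 0, 1])   # clusters that interleave
print(f"good={good:.3f} bad={bad:.3f}")
```

The well-separated labelling scores close to 1, while the interleaved labelling scores below 0.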
4️⃣ Question 4
A model performs well on training data but poorly on test data. What is this called?
- ✅ Overfitting
- ❌ Cross-validation
- ❌ Train-test split
- ❌ Data snooping
Explanation:
Overfitting = memorizing training data instead of learning general patterns.
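One way to see memorization in action is a 1-nearest-neighbor classifier, which reproduces its training labels perfectly, including any noisy ones. The sketch below uses made-up data in which the assumed true rule is "label 1 when x >= 5" and two training labels are deliberately flipped. The helper names are hypothetical.

```python
def predict_1nn(x, train_x, train_y):
    """Return the label of the single closest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def accuracy(xs, ys, train_x, train_y):
    hits = sum(1 for x, y in zip(xs, ys) if predict_1nn(x, train_x, train_y) == y)
    return hits / len(xs)


# Training set: labels at x=3 and x=7 are flipped to simulate noise.
train_x = [1, 2, 3, 4, 6, 7, 8, 9]
train_y = [0, 0, 1, 0, 1, 0, 1, 1]

# Test set: clean labels that follow the assumed true rule (x >= 5).
test_x = [0.5, 1.6, 2.7, 3.4, 4.4, 5.6, 6.7, 7.4, 8.4, 9.5]
test_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

train_acc = accuracy(train_x, train_y, train_x, train_y)
test_acc = accuracy(test_x, test_y, train_x, train_y)
print(f"train accuracy={train_acc:.2f} test accuracy={test_acc:.2f}")
```

The gap between a perfect training score and a much lower test score is the signature of overfitting.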
5️⃣ Question 5
Difference between Lasso and Ridge?
- ❌ Lasso only for feature selection
- ✅ Lasso = L1 penalty, Ridge = L2 penalty
- ❌ Lasso uses larger datasets
- ❌ Ridge = L1, Lasso = L2
Explanation:
L1 (Lasso) can shrink coefficients exactly to zero → feature selection.
L2 (Ridge) shrinks coefficients toward zero but generally not exactly to zero.
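The difference is easiest to see in the special case of orthonormal features, where both penalized solutions have a closed form in terms of the ordinary least-squares coefficients. The sketch below assumes the objectives ½‖y − Xw‖² + λ‖w‖₁ for Lasso and ½‖y − Xw‖² + (λ/2)‖w‖² for Ridge. The coefficient values and λ are made up, and general datasets need an iterative solver instead.

```python
def ridge_shrink(w_ols, lam):
    """Ridge under orthonormal features: uniform rescaling toward zero."""
    return w_ols / (1.0 + lam)


def lasso_shrink(w_ols, lam):
    """Lasso under orthonormal features: soft-thresholding."""
    magnitude = max(abs(w_ols) - lam, 0.0)
    if magnitude == 0.0:
        return 0.0  # coefficient removed entirely
    return magnitude if w_ols > 0 else -magnitude


ols = [3.0, -0.4, 0.2, -1.5]  # hypothetical least-squares coefficients
lam = 0.5                     # hypothetical penalty strength

ridge = [ridge_shrink(w, lam) for w in ols]
lasso = [lasso_shrink(w, lam) for w in ols]

print("ridge:", [round(w, 3) for w in ridge])
print("lasso:", lasso)
```

Every Ridge coefficient is smaller but still nonzero, while Lasso sets the two small coefficients to exactly zero, which is why it doubles as a feature selector.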
6️⃣ Question 6
How to mitigate data leakage?
- ❌ Use one feature only
- ✅ Avoid including features derived from the entire dataset
- ❌ Shuffle data
- ❌ Only separate train/test sets
Explanation:
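Features computed from the entire dataset (such as global averages) or from future data leak test information into training. Compute such statistics from the training split only, then apply them to the test split.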
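A small sketch of the idea, using standardization as the derived feature. The values and the split are made up, with the last value playing the role of an unseen test point.

```python
import statistics

# Hypothetical feature values; the last one belongs to the test split.
feature = [1.0, 2.0, 3.0, 4.0, 100.0]
train, test = feature[:4], feature[4:]

# Leaky: the statistic is computed over train AND test together.
leaky_mean = statistics.mean(feature)

# Safe: statistics come from the training split only...
train_mean = statistics.mean(train)
train_std = statistics.pstdev(train)

# ...and are then reused, unchanged, on the test split.
scaled_train = [(v - train_mean) / train_std for v in train]
scaled_test = [(v - train_mean) / train_std for v in test]

print(f"leaky mean={leaky_mean} train-only mean={train_mean}")
print("scaled test value:", round(scaled_test[0], 2))
```

The leaky mean is pulled far from the training values by the test point, so any feature built from it would carry information about the test split into training.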
7️⃣ Question 7
Interpreting feature importance without considering relationships can cause:
- ❌ Scaling issues
- ❌ Minimal target variables
- ❌ Use only uncorrelated features
- ✅ Overlooking correlated features in importance scores
Explanation:
Correlated features can distort feature importance scores: the importance of a shared signal may be split between them, so each one looks less important than the signal really is.
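As an extreme illustration, the sketch below uses two identical features and a linear model whose weights are read as importance. The data and the candidate weight pairs are made up. Every pair fits the target equally well, so the way importance is divided between the two features is arbitrary.

```python
# Hypothetical data: x2 is an exact copy of x1, and y depends on that shared signal.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = list(x1)
y = [2.0 * v for v in x1]


def mse(weights):
    """Mean squared error of the linear model w1*x1 + w2*x2."""
    w1, w2 = weights
    preds = [w1 * a + w2 * b for a, b in zip(x1, x2)]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


# Three ways of dividing the same total weight between the two features.
candidates = {
    "all on x1": (2.0, 0.0),
    "split evenly": (1.0, 1.0),
    "all on x2": (0.0, 2.0),
}

for name, weights in candidates.items():
    print(f"{name:13s} weights={weights} mse={mse(weights)}")
```

Reading only the weights, x1 could appear fully important, half as important, or irrelevant, even though the model's predictions are identical in all three cases.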
🧾 Summary Table
| Q | Correct Answer |
|---|---|
| 1 | Recall |
| 2 | RMSE |
| 3 | Silhouette score |
| 4 | Overfitting |
| 5 | Lasso = L1, Ridge = L2 |
| 6 | Avoid features derived from the entire dataset |
| 7 | Overlooking correlated features |