Quiz 4: Practical Machine Learning (Data Science Specialization) Answers 2025
1. Question 1
Train Random Forest (RF) and Boosted Tree (GBM) models on the vowel dataset (classification problem).
✅ RF Accuracy = 0.6082, GBM Accuracy = 0.5152, Agreement Accuracy = 0.6361
❌ RF Accuracy = 0.9987, GBM Accuracy = 0.5152, Agreement Accuracy = 0.9985
❌ RF Accuracy = 0.6082, GBM Accuracy = 0.5152, Agreement Accuracy = 0.5152
❌ RF Accuracy = 0.9881, GBM Accuracy = 0.8371, Agreement Accuracy = 0.9983
Explanation:
When both models (rf and gbm via caret::train) are trained on vowel data with seed 33833:
- RF ≈ 0.6082 accuracy
- GBM ≈ 0.5152 accuracy
- On the subset where both models agree, accuracy ≈ 0.6361
Restricting attention to the cases where the two models agree yields higher accuracy than either model's predictions alone.
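The numbers above come from the standard caret workflow for this question. A minimal sketch, assuming the ElemStatLearn package (which hosts the vowel data, now archived on CRAN) and caret are installed:

```r
# Assumes ElemStatLearn (archived on CRAN) and caret are installed
library(ElemStatLearn)
library(caret)
data(vowel.train)
data(vowel.test)

# The outcome must be a factor for classification
vowel.train$y <- factor(vowel.train$y)
vowel.test$y  <- factor(vowel.test$y)

set.seed(33833)
fit_rf  <- train(y ~ ., data = vowel.train, method = "rf")
fit_gbm <- train(y ~ ., data = vowel.train, method = "gbm", verbose = FALSE)

pred_rf  <- predict(fit_rf,  vowel.test)
pred_gbm <- predict(fit_gbm, vowel.test)

confusionMatrix(pred_rf,  vowel.test$y)$overall["Accuracy"]   # ~0.61
confusionMatrix(pred_gbm, vowel.test$y)$overall["Accuracy"]   # ~0.52

# Accuracy on the subset where the two models agree
agree <- pred_rf == pred_gbm
confusionMatrix(pred_rf[agree], vowel.test$y[agree])$overall["Accuracy"]  # ~0.64
```

Exact accuracies can drift slightly across package versions even with the seed fixed, which is why the answer options are spaced far apart.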
2. Question 2
Alzheimer’s disease dataset — models: RF, GBM, LDA; stacked using RF.
✅ Stacked Accuracy: 0.80 — better than random forests and lda and the same as boosting.
❌ Stacked Accuracy: 0.80 is better than all three
❌ Stacked Accuracy: 0.76 is better than RF and boosting but not lda
❌ Stacked Accuracy: 0.76 is better than lda but not RF or boosting
Explanation:
Stacking combines model strengths.
- RF ≈ 0.74
- GBM ≈ 0.80
- LDA ≈ 0.76
- Stacked RF ≈ 0.80 — matching GBM and outperforming RF and LDA individually.
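The stacking workflow can be sketched as follows, assuming the AppliedPredictiveModeling and caret packages and the seeds specified in the quiz:

```r
# Assumes AppliedPredictiveModeling and caret are installed
library(caret)
library(AppliedPredictiveModeling)
set.seed(3433)
data(AlzheimerDisease)
adData   <- data.frame(diagnosis, predictors)
inTrain  <- createDataPartition(adData$diagnosis, p = 3/4)[[1]]
training <- adData[inTrain, ]
testing  <- adData[-inTrain, ]

set.seed(62433)
fit_rf  <- train(diagnosis ~ ., data = training, method = "rf")
fit_gbm <- train(diagnosis ~ ., data = training, method = "gbm", verbose = FALSE)
fit_lda <- train(diagnosis ~ ., data = training, method = "lda")

pred_rf  <- predict(fit_rf,  testing)
pred_gbm <- predict(fit_gbm, testing)
pred_lda <- predict(fit_lda, testing)

# Stack the three sets of test-set predictions with a random forest
stack_df   <- data.frame(pred_rf, pred_gbm, pred_lda,
                         diagnosis = testing$diagnosis)
fit_stack  <- train(diagnosis ~ ., data = stack_df, method = "rf")
pred_stack <- predict(fit_stack, stack_df)
confusionMatrix(pred_stack, testing$diagnosis)$overall["Accuracy"]  # ~0.80
```

The stacked model uses the base models' predictions as its only features, so it can at best exploit where the base learners disagree; here it ties the strongest base learner (GBM).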
3. Question 3
Concrete dataset — Lasso model (lasso via caret or elasticnet).
✅ Cement
❌ CoarseAggregate
❌ Water
❌ Age
Explanation:
In the lasso model, as λ increases, coefficients shrink to zero.
The last coefficient to be driven to zero along the penalty path (i.e., the strongest effect) is Cement, the most predictive variable for compressive strength.
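The penalty path can be inspected directly by plotting the fitted lasso object. A sketch, assuming AppliedPredictiveModeling, caret, and elasticnet are installed:

```r
# Assumes AppliedPredictiveModeling, caret, and elasticnet are installed
library(caret)
library(AppliedPredictiveModeling)
library(elasticnet)
set.seed(3523)
data(concrete)
inTrain  <- createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
training <- concrete[inTrain, ]
testing  <- concrete[-inTrain, ]

set.seed(233)
fit_lasso <- train(CompressiveStrength ~ ., data = training, method = "lasso")

# Coefficient paths vs. penalty: Cement's coefficient is the last to hit zero
plot.enet(fit_lasso$finalModel, xvar = "penalty", use.color = TRUE)
```

Reading the plot from right to left (increasing penalty), the Cement curve is the last one remaining away from zero.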
4. Question 4
Tumblr blog visitors — bats() time series forecast model.
✅ 96%
❌ 92%
❌ 94%
❌ 100%
Explanation:
After fitting the bats() model on 2011 and earlier data and forecasting for 2012, about 96% of test observations fall within the 95% prediction interval. This indicates an accurate model with well-calibrated uncertainty.
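The coverage check above can be computed directly from the forecast object. A sketch, assuming the quiz-provided `gaData.csv` (with a `date` column and `visitsTumblr` counts) plus the lubridate and forecast packages:

```r
# Assumes the quiz-provided gaData.csv, plus lubridate and forecast
library(lubridate)
library(forecast)
dat      <- read.csv("gaData.csv")
training <- dat[year(dat$date) < 2012, ]
testing  <- dat[year(dat$date) > 2011, ]
tstrain  <- ts(training$visitsTumblr)

fit   <- bats(tstrain)
fcast <- forecast(fit, h = nrow(testing), level = 95)

# Fraction of 2012 observations inside the 95% prediction interval
mean(testing$visitsTumblr >= fcast$lower &
     testing$visitsTumblr <= fcast$upper)   # ~0.96
```

Note the distinction being tested: a well-calibrated 95% interval should cover roughly 95% of future observations, and ~96% coverage is consistent with that.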
5. Question 5
Concrete dataset — Support Vector Machine (e1071::svm) regression model, RMSE on test set.
✅ 6.72
❌ 11543.39
❌ 45.09
❌ 6.93
Explanation:
Fitting an SVM with the default radial kernel and default parameters yields a test-set RMSE ≈ 6.72 — strong predictive performance compared with simpler linear models.
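A sketch of the fit and RMSE computation, assuming e1071 plus the same AppliedPredictiveModeling/caret concrete split used in Question 3:

```r
# Assumes e1071, caret, and AppliedPredictiveModeling are installed
library(e1071)
library(caret)
library(AppliedPredictiveModeling)
set.seed(3523)
data(concrete)
inTrain  <- createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
training <- concrete[inTrain, ]
testing  <- concrete[-inTrain, ]

set.seed(325)
fit_svm <- svm(CompressiveStrength ~ ., data = training)  # radial kernel by default
pred    <- predict(fit_svm, testing)

# Root mean squared error on the held-out test set
sqrt(mean((pred - testing$CompressiveStrength)^2))   # ~6.72
```

The implausible distractor (11543.39) is roughly the mean *squared* error scale — forgetting the square root is the classic mistake this question probes.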
🧾 Summary Table
| Q# | ✅ Correct Answer | Key Concept |
|---|---|---|
| 1 | RF=0.6082, GBM=0.5152, Agreement=0.6361 | Comparing ensemble accuracy |
| 2 | Stacked Accuracy=0.80, matches GBM | Model stacking improves over base learners |
| 3 | Cement | Lasso shrinkage path variable retention |
| 4 | 96% | Forecast coverage accuracy with bats() |
| 5 | 6.72 | SVM regression RMSE on test data |