
Quiz 4: Practical Machine Learning (Data Science Specialization) Answers 2025

1. Question 1

Train Random Forest (RF) and Boosted Tree (GBM) models on the vowel dataset (classification problem).

RF Accuracy = 0.6082, GBM Accuracy = 0.5152, Agreement Accuracy = 0.6361
❌ RF Accuracy = 0.9987, GBM Accuracy = 0.5152, Agreement Accuracy = 0.9985
❌ RF Accuracy = 0.6082, GBM Accuracy = 0.5152, Agreement Accuracy = 0.5152
❌ RF Accuracy = 0.9881, GBM Accuracy = 0.8371, Agreement Accuracy = 0.9983

Explanation:
When both models (rf and gbm via caret::train) are trained on the vowel training data with set.seed(33833):

  • RF ≈ 0.6082 accuracy

  • GBM ≈ 0.5152 accuracy

  • When both agree, accuracy ≈ 0.6361.
    Restricting to the cases where the two models agree gives higher accuracy than either model achieves on its own.
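A sketch of this setup in R (assuming the ElemStatLearn and caret packages are installed; training takes a while and the accuracies are approximate):

```r
library(caret)
library(ElemStatLearn)  # provides the vowel data

data(vowel.train); data(vowel.test)
vowel.train$y <- factor(vowel.train$y)   # treat response as a class label
vowel.test$y  <- factor(vowel.test$y)

set.seed(33833)
fit_rf  <- train(y ~ ., data = vowel.train, method = "rf")
fit_gbm <- train(y ~ ., data = vowel.train, method = "gbm", verbose = FALSE)

pred_rf  <- predict(fit_rf,  vowel.test)
pred_gbm <- predict(fit_gbm, vowel.test)

confusionMatrix(pred_rf,  vowel.test$y)$overall["Accuracy"]   # ~0.6082
confusionMatrix(pred_gbm, vowel.test$y)$overall["Accuracy"]   # ~0.5152

# Accuracy restricted to test cases where the two models agree
agree <- pred_rf == pred_gbm
mean(pred_rf[agree] == vowel.test$y[agree])                    # ~0.6361
```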


2. Question 2

Alzheimer’s disease dataset — models: RF, GBM, LDA; stacked using RF.

Stacked Accuracy: 0.80 — better than random forest and LDA, and equal to boosting.
❌ Stacked Accuracy: 0.80 is better than all three
❌ Stacked Accuracy: 0.76 is better than RF and boosting but not lda
❌ Stacked Accuracy: 0.76 is better than lda but not RF or boosting

Explanation:
Stacking combines model strengths.

  • RF ≈ 0.74

  • GBM ≈ 0.80

  • LDA ≈ 0.76

  • Stacked RF ≈ 0.80, matching GBM, outperforming RF and LDA individually.
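The stacking step can be sketched in R (assuming the AppliedPredictiveModeling and caret packages; seeds follow the quiz setup and accuracies are approximate):

```r
library(caret)
library(AppliedPredictiveModeling)

set.seed(3433)
data(AlzheimerDisease)
adData  <- data.frame(diagnosis, predictors)
inTrain <- createDataPartition(adData$diagnosis, p = 3/4)[[1]]
training <- adData[inTrain, ]
testing  <- adData[-inTrain, ]

set.seed(62433)
fit_rf  <- train(diagnosis ~ ., data = training, method = "rf")
fit_gbm <- train(diagnosis ~ ., data = training, method = "gbm", verbose = FALSE)
fit_lda <- train(diagnosis ~ ., data = training, method = "lda")

pred_rf  <- predict(fit_rf,  testing)
pred_gbm <- predict(fit_gbm, testing)
pred_lda <- predict(fit_lda, testing)

# Stack the three prediction vectors with another random forest
stackDF    <- data.frame(pred_rf, pred_gbm, pred_lda,
                         diagnosis = testing$diagnosis)
fit_stack  <- train(diagnosis ~ ., data = stackDF, method = "rf")
pred_stack <- predict(fit_stack, stackDF)

confusionMatrix(pred_stack, testing$diagnosis)$overall["Accuracy"]  # ~0.80
```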


3. Question 3

Concrete dataset — Lasso model (lasso via caret or elasticnet). Which variable is the last coefficient to shrink to zero as the penalty increases?

Cement
❌ CoarseAggregate
❌ Water
❌ Age

Explanation:
In the lasso model, as λ increases, coefficients shrink to zero.
The last variable to remain (strongest effect) is Cement, as it’s the most predictive of compressive strength.
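One way to see this is to fit the lasso and plot the coefficient paths (a sketch assuming the caret, elasticnet, and AppliedPredictiveModeling packages):

```r
library(caret)
library(elasticnet)
library(AppliedPredictiveModeling)

set.seed(3523)
data(concrete)
inTrain <- createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
training <- concrete[inTrain, ]

set.seed(233)
fit_lasso <- train(CompressiveStrength ~ ., data = training, method = "lasso")

# Coefficient paths as the penalty grows; the Cement curve
# is the last to reach zero
plot.enet(fit_lasso$finalModel, xvar = "penalty", use.color = TRUE)
```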


4. Question 4

Tumblr blog visitors — bats() time series forecast model. What fraction of test observations fall within the 95% prediction interval?

96%
❌ 92%
❌ 94%
❌ 100%

Explanation:
After fitting the bats() model on 2011 and earlier data and forecasting for 2012, about 96% of test observations fall within the 95% prediction interval. This indicates an accurate model with well-calibrated uncertainty.
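A sketch of the coverage calculation (assuming the forecast and lubridate packages; "gaData.csv" stands in for the quiz-supplied visitor file, and the date column is assumed to be in a standard format):

```r
library(lubridate)  # for year()
library(forecast)   # for bats() and forecast()

dat <- read.csv("gaData.csv")        # quiz-supplied data; path is an assumption
dat$date <- as.Date(dat$date)        # assumes ISO-style date strings

training <- dat[year(dat$date) < 2012, ]
testing  <- dat[year(dat$date) > 2011, ]

tstrain  <- ts(training$visitsTumblr)
fit_bats <- bats(tstrain)
fc       <- forecast(fit_bats, h = nrow(testing), level = 95)

# Fraction of 2012 observations inside the 95% prediction interval
mean(testing$visitsTumblr >= fc$lower &
     testing$visitsTumblr <= fc$upper)   # ~0.96
```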


5. Question 5

Concrete dataset — Support Vector Machine (e1071::svm) regression model, RMSE on test set.

6.72
❌ 11543.39
❌ 45.09
❌ 6.93

Explanation:
Fitting an SVM with the default radial kernel and default parameters yields RMSE ≈ 6.72 on the test set — strong predictive performance compared with simpler linear models.
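A sketch of the fit and RMSE calculation (assuming the e1071, caret, and AppliedPredictiveModeling packages; seeds follow the quiz setup):

```r
library(caret)
library(e1071)
library(AppliedPredictiveModeling)

set.seed(3523)
data(concrete)
inTrain <- createDataPartition(concrete$CompressiveStrength, p = 3/4)[[1]]
training <- concrete[inTrain, ]
testing  <- concrete[-inTrain, ]

set.seed(325)
fit_svm <- svm(CompressiveStrength ~ ., data = training)  # default radial kernel

pred <- predict(fit_svm, testing)
sqrt(mean((pred - testing$CompressiveStrength)^2))        # RMSE ~6.72
```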


🧾 Summary Table

Q# | ✅ Correct Answer                        | Key Concept
1  | RF=0.6082, GBM=0.5152, Agreement=0.6361 | Comparing ensemble accuracy
2  | Stacked Accuracy=0.80, matches GBM      | Model stacking improves over base learners
3  | Cement                                  | Lasso shrinkage path variable retention
4  | 96%                                     | Forecast coverage accuracy with bats()
5  | 6.72                                    | SVM regression RMSE on test data