
Quiz 1: Practical Machine Learning (Data Science Specialization) Answers 2025

1. Question 1

Which of the following are components in building a machine learning algorithm?

Statistical inference
Collecting data to answer the question
Training and test sets
❌ Artificial intelligence
❌ Machine learning

Explanation:
Machine learning uses statistical inference, data collection, and splitting into training/testing sets to build predictive models. “Artificial intelligence” and “machine learning” are broader umbrella terms, not components themselves.


2. Question 2

Suppose we build a prediction algorithm that is 100% accurate on the training data. Why might it not work well on new data?

Our algorithm may be overfitting the training data, predicting both the signal and the noise.
❌ We have used neural networks which have notoriously bad performance.
❌ We may be using bad variables that don’t explain the outcome.
❌ We are not asking a relevant question that can be answered with machine learning.

Explanation:
Overfitting occurs when the model memorizes noise in the training data instead of learning general patterns — resulting in poor generalization on new data.
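A minimal sketch of this failure mode, using a hypothetical 1-nearest-neighbor "memorizer" on pure-noise labels: the model is 100% accurate on the training data (every point is its own nearest neighbor) yet no better than chance on new data, since there was never any signal to learn.

```python
import numpy as np

rng = np.random.default_rng(42)

# Binary labels that are pure noise given x: there is no real signal to learn.
x_train = rng.normal(size=(100, 2))
y_train = rng.integers(0, 2, 100)
x_test = rng.normal(size=(100, 2))
y_test = rng.integers(0, 2, 100)

def predict_1nn(x):
    # Memorization: return the label of the closest training point.
    dists = np.linalg.norm(x_train - x, axis=1)
    return y_train[np.argmin(dists)]

train_acc = np.mean([predict_1nn(x) == y for x, y in zip(x_train, y_train)])
test_acc = np.mean([predict_1nn(x) == y for x, y in zip(x_test, y_test)])

# train_acc is exactly 1.0 (perfect memorization);
# test_acc hovers around 0.5 (chance), because the "fit" was all noise.
print(train_acc, test_acc)
```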


3. Question 3

What are typical sizes for the training and test sets?

80% training set, 20% test set
90% training set, 10% test set
❌ 0% training set, 100% test set
❌ 50% in the training set, 50% in the testing set

Explanation:
A typical split keeps most data (80–90%) for training to fit the model and 10–20% for testing to evaluate performance. Exact ratios vary by dataset size.
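An 80/20 split can be sketched in a few lines; the shuffle before splitting matters, since ordered data (e.g. sorted by date) would otherwise leak a systematic difference between the two sets. The 1,000-row dataset here is a stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(1000)  # stand-in for 1,000 observations

# Shuffle first so the split is random, then take 80% train / 20% test.
shuffled = rng.permutation(data)
split = int(0.8 * len(shuffled))
train, test = shuffled[:split], shuffled[split:]

print(len(train), len(test))  # 800 200
```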


4. Question 4

What are common error metrics for predicting binary variables?

Accuracy
❌ Root mean squared error
❌ Correlation
❌ Median absolute deviation
❌ R²

Explanation:
For binary outcomes (yes/no, clicked/didn’t click), typical performance metrics include accuracy, sensitivity, specificity, precision, and AUC — not regression-based metrics like RMSE or R².
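These metrics all fall out of the 2×2 confusion matrix. A small sketch with made-up predictions (the arrays below are illustrative, not from the quiz):

```python
import numpy as np

# Hypothetical truth vs. predictions for a binary outcome (1 = clicked).
actual    = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 0])
predicted = np.array([1, 1, 0, 0, 0, 1, 0, 0, 1, 0])

# The four cells of the confusion matrix.
tp = np.sum((predicted == 1) & (actual == 1))  # true positives
tn = np.sum((predicted == 0) & (actual == 0))  # true negatives
fp = np.sum((predicted == 1) & (actual == 0))  # false positives
fn = np.sum((predicted == 0) & (actual == 1))  # false negatives

accuracy    = (tp + tn) / len(actual)   # fraction correct overall
sensitivity = tp / (tp + fn)            # true positive rate (recall)
specificity = tn / (tn + fp)            # true negative rate

print(accuracy, sensitivity, specificity)
```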


5. Question 5

A link is clicked on 1 in 1,000 visits (prevalence = 0.001). A model predicts clicks with 99% sensitivity and 99% specificity.
If the model predicts "clicked," what is the probability the link actually was clicked?

9%
❌ 89.9%
❌ 90%
❌ 50%

Explanation:
Using Bayes’ theorem:

$$P(\text{Clicked} \mid \text{Predicted Click}) = \frac{0.99 \times 0.001}{0.99 \times 0.001 + 0.01 \times 0.999} \approx 0.09$$

So the positive predictive value (PPV) is only about 9%. Even with high sensitivity and specificity, rare events yield many false positives.
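The Bayes calculation above is simple enough to verify directly. Plugging in the quiz's numbers:

```python
# Positive predictive value for the rare-click scenario via Bayes' theorem.
prevalence  = 0.001   # 1 click per 1,000 visits
sensitivity = 0.99    # P(predict click | clicked)
specificity = 0.99    # P(predict no click | not clicked)

true_pos_rate  = sensitivity * prevalence              # clicks caught
false_pos_rate = (1 - specificity) * (1 - prevalence)  # non-clicks flagged
ppv = true_pos_rate / (true_pos_rate + false_pos_rate)

print(round(ppv, 3))  # 0.09
```

The false positives (1% of 999 non-clicks) swamp the true positives (99% of 1 click), which is why the PPV collapses to ~9% despite both rates being 99%.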


🧾 Summary Table

| Q# | ✅ Correct Answer(s) | Key Concept |
|----|----------------------|-------------|
| 1 | Statistical inference; Collecting data; Training/test sets | ML building components |
| 2 | Overfitting explains poor generalization | Overfitting and generalization |
| 3 | 80/20 or 90/10 split | Common dataset partition ratios |
| 4 | Accuracy | Binary classification error metric |
| 5 | 9% | Bayes' theorem & base rate effect |