
Graded Quiz: Building Supervised Learning Models: Machine Learning with Python (IBM Data Analyst Professional Certificate) Answers 2025

1️⃣ Question 1

A telecom company predicts service cancellations (a classification task). Which model should they use?

  • ❌ Decision trees

  • ❌ Neural networks

  • Naïve Bayes

  • ❌ K-nearest neighbors

Correct Answer:
Naïve Bayes

Explanation:
All four models can perform classification, but for churn prediction (a textbook classification use case), Naïve Bayes is the intended answer: it is a common baseline model thanks to its simplicity and strong performance on probability-based tasks.
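To make the idea concrete, here is a minimal Gaussian Naïve Bayes sketch on a made-up churn dataset (the features `monthly_minutes` and `support_calls` and all values are hypothetical, not from the course):

```python
import math

# Hypothetical toy churn data: (monthly_minutes, support_calls), label 1 = churned
X = [(120, 5), (150, 4), (130, 6), (300, 0), (280, 1), (310, 0)]
y = [1, 1, 1, 0, 0, 0]

def fit_gaussian_nb(X, y):
    """Estimate a per-class prior and per-feature mean/variance."""
    params = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        stats = []
        for j in range(len(X[0])):
            vals = [r[j] for r in rows]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals) + 1e-9  # smoothing
            stats.append((mu, var))
        params[c] = (prior, stats)
    return params

def nb_predict(params, x):
    """Pick the class with the highest log posterior (Bayes' rule)."""
    best, best_lp = None, -math.inf
    for c, (prior, stats) in params.items():
        lp = math.log(prior)
        for xj, (mu, var) in zip(x, stats):
            lp += -0.5 * math.log(2 * math.pi * var) - (xj - mu) ** 2 / (2 * var)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

model = fit_gaussian_nb(X, y)
print(nb_predict(model, (140, 5)))  # low usage, many support calls -> 1 (churn)
```

The model simply compares class-conditional probabilities, which is why it makes such a cheap, interpretable baseline for churn.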


2️⃣ Question 2

In a one-versus-one classification strategy, how is the final class chosen?

  • ❌ Maximal margin vote

  • Popularity vote

  • ❌ Confidence-based ranking

  • ❌ Probability average

Explanation:
A binary classifier is trained for each pair of classes, and each one votes for a winner.
The class with the most votes wins.
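The vote-counting step can be sketched in a few lines; the three class names and the pairwise winners below are made up for illustration:

```python
from collections import Counter

# Hypothetical winners named by the one-vs-one classifiers for a 3-class
# problem: cat-vs-dog, dog-vs-bird, cat-vs-bird.
pairwise_votes = ["cat", "dog", "cat"]

def popularity_vote(votes):
    """Final class = the one named by the most pairwise classifiers."""
    return Counter(votes).most_common(1)[0][0]

print(popularity_vote(pairwise_votes))  # -> "cat" (2 of 3 votes)
```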


3️⃣ Question 3

What does entropy measure in a decision tree?

  • The level of disorder or randomness in a node

  • ❌ Count of final nodes

  • ❌ Average feature value

  • ❌ Depth of the tree

Explanation:
Entropy measures how impure or mixed the data is at a node.
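Entropy is easy to compute directly; this short sketch uses the standard Shannon formula (the label lists are toy examples):

```python
import math

def entropy(labels):
    """Shannon entropy of the class labels at a node, in bits."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

print(entropy(["yes"] * 5))        # pure node: no disorder
print(entropy(["yes", "no"] * 5))  # 50/50 mix: maximum disorder (1.0 bit)
```

A pure node scores 0 and an even mix scores 1 bit, which is exactly the "disorder" a decision tree tries to reduce with each split.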


4️⃣ Question 4

Which method evaluates all possible split thresholds but does not scale well?

  • ❌ Midpoints method

  • ❌ MSE method

  • ❌ Entropy reduction

  • Exhaustive search method

Explanation:
Exhaustive search tries every possible split, which is computationally expensive on large datasets.
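A minimal sketch of why this is expensive: every distinct feature value is tried as a threshold and scored (here with Gini impurity; the toy data is hypothetical). The work grows with the number of distinct values times the cost of each evaluation.

```python
import math

def gini(labels):
    """Gini impurity of a set of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split_exhaustive(values, labels):
    """Try EVERY distinct value as a threshold; keep the lowest impurity."""
    best_t, best_score = None, math.inf
    for t in sorted(set(values)):  # this loop is the scaling bottleneck
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(values)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Toy data: the label flips from 0 to 1 above value 3
print(best_split_exhaustive([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))  # -> 3
```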


5️⃣ Question 5

Why can increasing K too far in KNN reduce accuracy?

  • ❌ Scaling errors

  • Too much smoothing of patterns

  • ❌ Too small training data

  • ❌ Irrelevant features

Explanation:
If K is too large, predictions become overly smoothed and important local patterns are lost, hurting accuracy.
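A toy 1-D KNN shows the effect: a small local cluster of the minority class is classified correctly with a small K but gets "smoothed away" once K covers the whole (hypothetical) dataset:

```python
from collections import Counter

# Made-up 1-D training data: a small "B" cluster inside a sea of "A"s
train = [(1, "A"), (2, "A"), (3, "A"), (10, "B"), (11, "B"),
         (20, "A"), (21, "A"), (22, "A"), (23, "A")]

def knn_predict(x, k):
    """Majority label among the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

print(knn_predict(10.5, k=1))  # -> "B": small K sees the local B cluster
print(knn_predict(10.5, k=9))  # -> "A": K spanning all points smooths B away
```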


6️⃣ Question 6

What does epsilon (ε) control in SVR?

  • ❌ Number of support vectors

  • Maximum allowed error within the margin

  • ❌ Kernel choice

  • ❌ Decision boundary complexity

Explanation:
ε defines the width of the no-penalty zone around the regression line.
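The "no-penalty zone" is captured by SVR's epsilon-insensitive loss, sketched below (the numbers are illustrative):

```python
def epsilon_insensitive_loss(y_true, y_pred, eps):
    """SVR loss: errors smaller than eps cost nothing; beyond the tube,
    cost grows linearly with the excess error."""
    return max(0.0, abs(y_true - y_pred) - eps)

print(epsilon_insensitive_loss(5.0, 5.3, eps=0.5))  # inside the tube -> 0.0
print(epsilon_insensitive_loss(5.0, 6.0, eps=0.5))  # outside -> 0.5
```

Widening ε tolerates larger errors for free (a flatter, simpler fit); shrinking it forces the regression line to track the data more tightly.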


7️⃣ Question 7

What is the primary goal of AdaBoost?

  • ❌ Reduce overfitting with deep trees

  • Create a strong learner from many weak learners by reducing bias

  • ❌ Combine parallel models

  • ❌ Dimensionality reduction

Explanation:
AdaBoost trains weak learners sequentially, each correcting errors of the previous one, creating a strong ensemble model.
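The sequential reweighting can be sketched with a minimal 1-D AdaBoost using threshold "stumps" as weak learners (toy data; no single stump can separate it, but three boosted stumps can):

```python
import math

X = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 1, -1, -1, 1, 1, -1, -1]  # alternating blocks: unsplittable by one stump

def make_stump(t, sign):
    """Weak learner: predict `sign` for x <= t, else -sign."""
    return lambda x: sign if x <= t else -sign

def train_adaboost(X, y, rounds=3):
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest WEIGHTED error this round
        best, best_err = None, 1.0
        for t in X:
            for sign in (1, -1):
                stump = make_stump(t, sign)
                err = sum(wi for wi, xi, yi in zip(w, X, y) if stump(xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
        alpha = 0.5 * math.log((1 - best_err) / max(best_err, 1e-10))
        ensemble.append((alpha, best))
        # Re-weight: boost the examples this stump got wrong
        w = [wi * math.exp(-alpha * yi * best(xi)) for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def ada_predict(ensemble, x):
    """Strong learner: weighted vote of all weak learners."""
    return 1 if sum(a * h(x) for a, h in ensemble) > 0 else -1

model = train_adaboost(X, y)
print([ada_predict(model, x) for x in X])  # -> [1, 1, -1, -1, 1, 1, -1, -1]
```

Each round focuses the next weak learner on the previous one's mistakes; the weighted vote of all of them classifies the toy data perfectly even though every individual stump is weak.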


🧾 Summary Table

| Q | Correct Answer | Key Concept |
|---|----------------|-------------|
| 1 | Naïve Bayes | Classic baseline model for churn |
| 2 | Popularity vote | One-vs-one strategy |
| 3 | Disorder / randomness | Entropy meaning |
| 4 | Exhaustive search | Split threshold method |
| 5 | Too much smoothing | KNN large-K drawback |
| 6 | Allowed error margin | SVR epsilon |
| 7 | Strong learner from weak learners | AdaBoost purpose |