1️⃣ Question 1

SVM with binary decisions extended to 3 classes:

❌ Combining supervised + unsupervised
❌ One classifier per class (one-vs-rest)
❌ Single classifier
✅ One classifier per pair of classes (one-vs-one)

Explanation:
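An SVM is a binary classifier. A common multi-class extension is one-vs-one, which trains one classifier per pair of classes (3 classifiers for 3 classes) and predicts by majority vote.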
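
Illustration:
A minimal sketch of the one-vs-one idea, assuming scikit-learn is installed. The Iris dataset and the linear kernel are illustrative choices, not part of the exam question.

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

# Iris has 3 classes, matching the question.
X, y = load_iris(return_X_y=True)

# Wrap a binary SVM so that one classifier is trained per pair of classes.
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)

# 3 classes -> 3 * (3 - 1) / 2 = 3 pairwise classifiers.
n_pairwise = len(ovo.estimators_)
print(n_pairwise)
```

In general, k classes need k * (k - 1) / 2 pairwise classifiers, so 4 classes would need 6.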

2️⃣ Question 2

Why use median instead of mean for skewed salary data?

❌ Minimizes MSE
✅ Reduces impact of extreme values
❌ Mean is inaccurate
❌ Mean is hard to compute
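
Illustration:
A small sketch of why the median is more robust, using NumPy. The salary values are made up for the example (in thousands), with one extreme value at the end.

```python
import numpy as np

# Hypothetical salaries in thousands; the last value is an extreme outlier.
salaries = np.array([40, 45, 50, 55, 60, 1000])

mean_salary = salaries.mean()        # pulled upward by the outlier
median_salary = np.median(salaries)  # middle of the sorted values

print(mean_salary, median_salary)
```

Here the median is 52.5, while the mean is above 200 even though five of the six salaries are 60 or less.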

3️⃣ Question 3

Decision tree increases complexity → What happens?

✅ Bias decreases, variance increases
❌ Bias increases, variance decreases
❌ Both constant
❌ Both decrease
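
Illustration:
A rough sketch of the trade-off, assuming scikit-learn is installed. The synthetic dataset, the noise level, and the two depths are arbitrary choices for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with some label noise (flip_y) so overfitting can show up.
X, y = make_classification(n_samples=500, n_features=10, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for depth in (2, None):  # None lets the tree grow until its leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    # Store (training accuracy, test accuracy) for this depth.
    scores[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))

print(scores)
```

The unrestricted tree fits the training set more closely (lower bias). A gap between its training and test accuracy is the usual sign of higher variance.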

4️⃣ Question 4

Best ML task for detecting unusual banking transactions:

❌ Predicting monthly trends
✅ Identifying patterns that deviate from normal transactions (anomaly detection)
❌ Predefined risk classification
❌ Grouping transactions

5️⃣ Question 5

Productivity increases, slows, levels off:

❌ Exponential
❌ Logarithmic
✅ Polynomial regression
❌ Linear

Explanation:
A nonlinear trend that rises, slows, and then levels off can be modeled with polynomial regression.
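
Illustration:
A minimal sketch comparing a straight-line fit with a degree-2 polynomial fit using NumPy. The hours and productivity values are invented for the example.

```python
import numpy as np

# Hypothetical data: productivity rises quickly, then flattens out.
hours = np.arange(1, 11, dtype=float)
productivity = np.array([2.0, 3.8, 5.2, 6.3, 7.1, 7.7, 8.1, 8.3, 8.4, 8.45])

def sse(degree):
    """Sum of squared errors of a least-squares polynomial fit of this degree."""
    coeffs = np.polyfit(hours, productivity, degree)
    residuals = productivity - np.polyval(coeffs, hours)
    return float(np.sum(residuals ** 2))

sse_linear = sse(1)
sse_quadratic = sse(2)
print(sse_linear, sse_quadratic)
```

The quadratic fit has the smaller error on this curved data, because the extra term lets the curve bend where the line cannot.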

6️⃣ Question 6

Binary classification based on proximity to neighbors:

❌ Logistic regression
❌ Decision tree
✅ K-nearest neighbors (KNN)
❌ SVM
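
Illustration:
A minimal sketch of proximity-based classification, assuming scikit-learn is installed. The six training points and k = 3 are made up for the example.

```python
from sklearn.neighbors import KNeighborsClassifier

# Two small groups of 2-D points, one per class.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]

# Each new point takes the majority class of its 3 nearest training points.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
predictions = knn.predict([[0.5, 0.5], [5.5, 5.5]])
print(predictions)
```

The first query point sits among the class-0 points and the second among the class-1 points, so they are labeled 0 and 1.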

7️⃣ Question 7

Benefit of PCA before clustering:

✅ Transforms features into principal axes with highest variance
❌ Removes all less important features
❌ Automatically segments customers
❌ Reduces to one component
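
Illustration:
A short sketch of PCA as a preprocessing step, assuming scikit-learn is installed. The random "customer" matrix and the choice of 2 components are assumptions for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical customer features: 200 rows, 5 columns.
X = rng.normal(size=(200, 5))
# Make column 1 strongly correlated with column 0.
X[:, 1] = X[:, 0] * 2 + rng.normal(scale=0.1, size=200)

# Project onto the 2 directions of highest variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
ratios = pca.explained_variance_ratio_
print(X_reduced.shape, ratios)
```

The components come out ordered by explained variance. `X_reduced` could then be passed to a clustering algorithm in place of the original features.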

8️⃣ Question 8

Fast, scalable logistic regression training method:

❌ Backpropagation
❌ Grid search
❌ Least squares
✅ Stochastic Gradient Descent (SGD)
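
Illustration:
A hand-written sketch of SGD for logistic regression using NumPy only. The data, learning rate, and number of epochs are arbitrary choices for the example. It is meant to show the per-sample update, not to be a production implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two well-separated 2-D Gaussian blobs.
X = np.vstack([rng.normal(-2, 1, size=(100, 2)), rng.normal(2, 1, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

w = np.zeros(2)
b = 0.0
learning_rate = 0.1

for epoch in range(5):
    for i in rng.permutation(len(X)):               # visit samples in random order
        p = 1.0 / (1.0 + np.exp(-(X[i] @ w + b)))   # predicted probability
        grad = p - y[i]                             # log-loss gradient w.r.t. the logit
        w -= learning_rate * grad * X[i]            # update from this single sample
        b -= learning_rate * grad

# Classify as 1 when the logit is positive (probability above 0.5).
accuracy = np.mean(((X @ w + b) > 0).astype(int) == y)
print(accuracy)
```

Each update uses one sample rather than the full dataset, which is what keeps the per-step cost low on large data. scikit-learn offers this style of training through its `SGDClassifier` class.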

9️⃣ Question 9

Model misclassifies loyal customers as churn risks:

❌ PCA
❌ SVM
❌ More churn data
✅ Adjust the classification threshold
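
Illustration:
A small sketch of how moving the threshold changes false positives, using NumPy. The labels and predicted probabilities are invented for the example.

```python
import numpy as np

# Hypothetical labels (1 = churn) and predicted churn probabilities.
y_true = np.array([0, 0, 0, 0, 1, 1])
churn_prob = np.array([0.10, 0.30, 0.55, 0.65, 0.70, 0.90])

def false_positives(threshold):
    """Count loyal customers (label 0) flagged as churn at this threshold."""
    flagged = churn_prob >= threshold
    return int(np.sum(flagged & (y_true == 0)))

fp_default = false_positives(0.5)
fp_raised = false_positives(0.7)
print(fp_default, fp_raised)
```

Raising the threshold from 0.5 to 0.7 removes both false positives in this toy data. In practice a higher threshold can also miss real churners, so the value is usually tuned against both error types.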

🔟 Question 10

Clustering method starting with individuals → merging:

❌ Density-based
❌ Divisive
✅ Agglomerative clustering
❌ Partition-based
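
Illustration:
A minimal sketch of bottom-up merging, assuming scikit-learn is installed. The six points and the target of 2 clusters are made up for the example.

```python
from sklearn.cluster import AgglomerativeClustering

# Two obvious groups of 2-D points.
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]

# Each point starts as its own cluster; the closest clusters are merged
# repeatedly until only 2 remain.
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)
print(labels)
```

The first three points end up in one cluster and the last three in the other. Which group gets label 0 or 1 is arbitrary.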

1️⃣1️⃣ Question 11

Why is DBSCAN ideal for the marketing team’s use case?

❌ Daily travel routines
❌ Purchase trends forecasting
❌ Detect satellite green cover
✅ Isolate rare sensor events in IoT data

Explanation:
DBSCAN groups points in dense regions and labels points in sparse regions as noise, which makes it well suited to isolating rare events. It is not a forecasting method.
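
Illustration:
A minimal sketch of DBSCAN marking an isolated point as noise, assuming scikit-learn is installed. The points and the `eps` and `min_samples` values are assumptions chosen for the example.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Five readings packed closely together, plus one far-away reading.
X = np.array([
    [0.0, 0.0], [0.0, 0.1], [0.1, 0.0], [0.1, 0.1], [0.05, 0.05],
    [10.0, 10.0],
])

# Points with too few neighbors within eps are labeled -1 (noise).
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)
print(labels)
```

The dense group receives a cluster label, while the isolated reading is labeled -1.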

1️⃣2️⃣ Question 12

Method preserving local + global structure:

❌ PCA
✅ UMAP
❌ Not used
❌ t-SNE

Explanation:
t-SNE focuses mainly on preserving local structure;
UMAP preserves local structure while also retaining more of the global structure.

1️⃣3️⃣ Question 13

Tool for interactive ML dashboards:

❌ Pandas
✅ Matplotlib (intended answer)
❌ Scikit-learn
❌ NumPy

Intended answer:
➡️ This question is tricky: none of the options is a dedicated dashboard tool.
Pandas and NumPy handle data, scikit-learn handles modeling, and Matplotlib plots are static by default.

Closest option (from course context):
✅ Matplotlib (used for ML visualization)

1️⃣4️⃣ Question 14

Difference between ML & traditional programming:

❌ Writes code faster
❌ Generates random rules
✅ Learns from data to make predictions
❌ Uses hand-coded rules

1️⃣5️⃣ Question 15

Matrix operations and linear algebra:

❌ Scikit-learn
❌ Pandas
❌ Matplotlib
✅ NumPy
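
Illustration:
A short sketch of common linear algebra operations in NumPy. The matrix and vector values are made up for the example.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

product = A @ A.T               # matrix multiplication
x = np.linalg.solve(A, b)       # solve the linear system A x = b
determinant = np.linalg.det(A)  # 2*3 - 1*1 = 5

print(product, x, determinant)
```

Here `x` is approximately [0.8, 1.4], which can be checked by substituting it back into `A @ x`.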

🧾 Summary Table

Q	Correct Answer
1	One-vs-one classifiers
2	Median reduces extreme value impact
3	Bias ↓, Variance ↑
4	Anomaly detection
5	Polynomial regression
6	KNN
7	PCA extracts variance-rich axes
8	Stochastic Gradient Descent
9	Adjust threshold
10	Agglomerative clustering
11	Isolate rare sensor events
12	UMAP
13	Matplotlib
14	ML learns from data
15	NumPy

Final Exam: Machine Learning with Python (IBM Data Analyst Professional Certificate) Answers 2025