Final Exam: Machine Learning with Python (IBM AI Engineering Professional Certificate) Answers 2025
1. Question 1
SVM multi-class classification strategy?
- ❌ Combine supervised + unsupervised
- ❌ One classifier per class
- ❌ Single combined classifier
- ✅ One classifier per pair of classes (One-vs-One)
Explanation:
Binary SVMs → use One-vs-One for multi-class problems.
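A minimal sketch of the One-vs-One strategy with scikit-learn's `OneVsOneClassifier` (the Iris dataset and linear kernel are illustrative choices, not part of the question):

```python
# One-vs-One trains one binary SVM per PAIR of classes:
# for k classes that is k*(k-1)/2 classifiers (3 for the 3-class Iris data).
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)
n_pairs = len(ovo.estimators_)  # 3 binary classifiers for 3 classes
```

Note that `SVC` applies this pairwise scheme internally as well; wrapping it in `OneVsOneClassifier` just makes the strategy explicit.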
2. Question 2
Why use median at leaf nodes with skewed salary data?
- ❌ Minimizes MSE
- ✅ Reduces impact of extreme values
- ❌ Mean is inaccurate
- ❌ Mean is hard to compute
Explanation:
Median is robust against outliers in skewed data.
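A quick sketch of why the median is preferred at a leaf (the salary figures below are made up for illustration):

```python
import numpy as np

# Hypothetical leaf node: most salaries cluster near 50k, one extreme outlier.
salaries = np.array([48_000, 50_000, 52_000, 51_000, 1_000_000])

mean_pred = salaries.mean()        # dragged far upward by the single outlier
median_pred = np.median(salaries)  # stays at the typical value, 51_000
```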
3. Question 3
Effect of increasing decision tree complexity?
- ✅ Bias decreases, variance increases
- ❌ Bias increases
- ❌ Both constant
- ❌ Both decrease
Explanation:
More complex trees → fit training data better → overfitting.
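A sketch of the bias side of the trade-off: as `max_depth` grows, a tree fits its training data ever more closely (the synthetic dataset is illustrative):

```python
# Deeper trees -> lower training error (bias down); the growing gap between
# train and test performance is what "variance up" refers to.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

train_scores = {}
for depth in (1, 20):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(Xtr, ytr)
    train_scores[depth] = tree.score(Xtr, ytr)  # training accuracy
```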
4. Question 4
Finding unusual transactions?
- ❌ Predicting trends
- ✅ Identifying patterns that deviate from normal transactions (Anomaly Detection)
- ❌ Predefined classification
- ❌ Simple grouping
Explanation:
Goal = detect outliers/anomalies.
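One common anomaly-detection approach is scikit-learn's `IsolationForest`; the "transaction amounts" below are synthetic and the `contamination` value is an assumed tuning choice:

```python
# IsolationForest isolates points that deviate from the bulk of the data
# and labels them -1 (anomaly) vs 1 (normal).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=50, scale=5, size=(200, 1))  # typical amounts
fraud = np.array([[500.0], [480.0]])                 # extreme outliers
X = np.vstack([normal, fraud])

labels = IsolationForest(contamination=0.01, random_state=0).fit_predict(X)
```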
5. Question 5
Productivity increases → slows → stabilizes. Best regression?
- ❌ Exponential
- ❌ Logarithmic
- ✅ Polynomial regression
- ❌ Linear
Explanation:
Nonlinear curve with rise → plateau → polynomial fits well.
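A sketch of polynomial regression on a rise-then-plateau curve (the synthetic "productivity" data is made up to mimic the scenario):

```python
# A degree-2 polynomial captures curvature that a straight line cannot.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

hours = np.linspace(0, 10, 50).reshape(-1, 1)
productivity = 10 * hours.ravel() - 0.8 * hours.ravel() ** 2  # rises, then levels off

model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(hours, productivity)
r2 = model.score(hours, productivity)  # near-perfect fit on this curve
```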
6. Question 6
Binary classification based on proximity?
- ❌ Logistic regression
- ❌ Decision tree
- ✅ K-nearest neighbors (KNN)
- ❌ SVM
Explanation:
KNN classifies based on nearest neighbors.
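A minimal KNN sketch (the 2-D points are made up; each query point takes the majority label of its 3 nearest neighbors):

```python
from sklearn.neighbors import KNeighborsClassifier

X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]  # two tight groups
y = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
pred = knn.predict([[0.5, 0.5], [5.5, 5.5]])  # labeled by proximity: [0, 1]
```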
7. Question 7
Advantage of PCA before clustering?
- ✅ Transforms into high-variance principal axes revealing key features
- ❌ Removes all unimportant features
- ❌ Automatically segments
- ❌ Reduces to one component
Explanation:
PCA keeps maximum variance directions → simplifies clustering.
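A sketch of PCA keeping the high-variance directions before clustering (the synthetic data is illustrative, with one dominant direction):

```python
# PCA projects onto orthogonal axes ordered by variance; the top components
# retain most of the spread, simplifying downstream clustering.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 0] *= 10  # one direction dominates the variance

pca = PCA(n_components=2).fit(X)
explained = pca.explained_variance_ratio_.sum()  # share of variance retained
```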
8. Question 8
Faster alternative to gradient descent for large datasets?
- ❌ Backpropagation
- ❌ Grid search
- ❌ Least squares
- ✅ Stochastic Gradient Descent (SGD)
Explanation:
SGD updates weights per sample (or small mini-batch) instead of the full dataset → much faster per update.
9. Question 9
Model misclassifies loyal customers as churn risks — fix?
- ❌ PCA
- ❌ Use SVM
- ❌ Add more churn data
- ✅ Adjust the classification threshold
Explanation:
Shifting threshold reduces false positives.
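A sketch of threshold adjustment with `predict_proba` (the dataset and the 0.8 cutoff are illustrative assumptions):

```python
# Raising the threshold above the default 0.5 makes the model more
# conservative about flagging "churn", cutting false positives.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
clf = LogisticRegression().fit(X, y)

proba = clf.predict_proba(X)[:, 1]    # P(churn) per customer
default_flags = (proba >= 0.5).sum()  # default threshold
strict_flags = (proba >= 0.8).sum()   # stricter threshold flags fewer customers
```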
10. Question 10
Start with each customer as its own cluster → merge upward?
- ❌ Density-based
- ❌ Divisive
- ✅ Agglomerative clustering
- ❌ Partition-based
Explanation:
Agglomerative = bottom-up clustering.
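A minimal sketch with scikit-learn's `AgglomerativeClustering` (the four points are made up to form two obvious groups):

```python
# Agglomerative clustering starts with every point as its own cluster
# and repeatedly merges the closest pair, bottom-up, until n_clusters remain.
from sklearn.cluster import AgglomerativeClustering

X = [[0, 0], [0, 1], [10, 10], [10, 11]]
labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)
```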
11. Question 11
Why is DBSCAN ideal?
- ❌ Daily travel routines
- ❌ Forecast purchase trends
- ❌ Satellite green cover
- ✅ To isolate rare sensor events in IoT data
Explanation:
DBSCAN excels at detecting outliers/anomalies.
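A sketch of DBSCAN isolating a rare event (the "sensor readings" are synthetic, and `eps`/`min_samples` are assumed tuning values):

```python
# DBSCAN groups dense regions into clusters and labels sparse points
# as noise (-1) -- exactly what isolating rare events needs.
import numpy as np
from sklearn.cluster import DBSCAN

dense = np.random.default_rng(0).normal(0, 0.2, size=(50, 2))  # routine readings
rare = np.array([[5.0, 5.0]])                                  # isolated event
X = np.vstack([dense, rare])

labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)  # rare point -> -1
```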
12. Question 12
Preserve local AND global structure in high-dimensional data?
- ❌ PCA
- ✅ UMAP
- ❌ Dimensionality reduction not used
- ❌ t-SNE
Explanation:
UMAP preserves both local + global structure better than t-SNE.
13. Question 13
Tool for visualizing ML insights?
- ❌ Pandas
- ✅ Matplotlib
- ❌ Scikit-learn
- ❌ NumPy
Explanation:
Matplotlib is the core Python visualization library.
14. Question 14
How is ML different from traditional programming?
- ❌ Writes code faster
- ❌ Generates random rules
- ✅ Learns from data to make predictions
- ❌ Hand-coded trees
Explanation:
ML learns patterns instead of using explicit rules.
15. Question 15
Library for matrix operations and linear algebra?
- ❌ Scikit-learn
- ❌ Pandas
- ❌ Matplotlib
- ✅ NumPy
Explanation:
NumPy = fast vectorized operations + linear algebra.
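A minimal sketch of the matrix operations in question (the 2×2 system is made up for illustration):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
b = np.array([1, 1])

product = A @ b                    # matrix-vector product -> [3, 7]
inverse = np.linalg.inv(A)         # matrix inverse
solution = np.linalg.solve(A, b)   # solve the linear system A x = b
```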
🧾 Summary Table
| Q# | Correct Answer | Key Concept |
|---|---|---|
| 1 | One-vs-One | SVM multi-class |
| 2 | Median reduces impact of outliers | Skewed data |
| 3 | Bias ↓, Variance ↑ | Overfitting trees |
| 4 | Detect unusual transactions | Anomaly detection |
| 5 | Polynomial regression | Nonlinear productivity curve |
| 6 | KNN | Proximity-based classification |
| 7 | PCA finds variance-rich axes | Dimensionality reduction |
| 8 | SGD | Fast optimization |
| 9 | Adjust threshold | Reduce false positives |
| 10 | Agglomerative | Hierarchical clustering |
| 11 | Isolate rare sensor events | DBSCAN |
| 12 | UMAP | Preserve local + global structure |
| 13 | Matplotlib | Visualization |
| 14 | Learns from data | ML vs Programming |
| 15 | NumPy | Matrix & algebra operations |