
Final Exam: Machine Learning with Python (IBM Data Analyst Professional Certificate) Answers 2025

1️⃣ Question 1

SVM with binary decisions extended to 3 classes:

  • ❌ Combining supervised + unsupervised

  • ❌ One classifier per class (one-vs-rest)

  • ❌ Single classifier

  • One classifier per pair of classes (one-vs-one)

Explanation:
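A binary SVM is commonly extended to multi-class problems with the one-vs-one scheme: one classifier is trained for each pair of classes and the final label is chosen by majority vote. With 3 classes that means 3 pairwise classifiers.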
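
To make the one-vs-one idea concrete, here is a minimal sketch in plain Python. The class labels and the pairwise "winners" are hypothetical placeholders, not outputs of a trained SVM; the point is only to show how the pairs are formed and how votes are combined.

```python
from collections import Counter
from itertools import combinations

classes = ["A", "B", "C"]  # hypothetical class labels

# One binary classifier is trained for every unordered pair of classes.
pairs = list(combinations(classes, 2))
print(pairs)  # [('A', 'B'), ('A', 'C'), ('B', 'C')]


def ovo_predict(pairwise_winners):
    """Return the class that wins the most pairwise contests."""
    votes = Counter(pairwise_winners.values())
    return votes.most_common(1)[0][0]


# Hypothetical decisions of the three pairwise classifiers for one sample.
winners = {("A", "B"): "B", ("A", "C"): "C", ("B", "C"): "B"}
prediction = ovo_predict(winners)
print(prediction)  # B
```

In general, k classes need k(k-1)/2 pairwise classifiers under this scheme.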


2️⃣ Question 2

Why use median instead of mean for skewed salary data?

  • ❌ Minimizes MSE

  • Reduces impact of extreme values

  • ❌ Mean is inaccurate

  • ❌ Mean is hard to compute
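
A quick way to see why the median is preferred here is to compare both statistics on a small skewed sample. The salary figures below are invented for illustration only.

```python
from statistics import mean, median

# Hypothetical salaries: five typical values and one extreme outlier.
salaries = [42_000, 45_000, 48_000, 50_000, 52_000, 1_000_000]

mean_salary = mean(salaries)      # pulled far upward by the outlier
median_salary = median(salaries)  # midpoint of the two middle values

print(round(mean_salary, 2))  # 206166.67
print(median_salary)          # 49000.0
```

The single extreme value moves the mean above every typical salary in the sample, while the median stays near the middle of the bulk of the data.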


3️⃣ Question 3

Decision tree increases complexity → What happens?

  • Bias decreases, variance increases

  • ❌ Bias increases, variance decreases

  • ❌ Both constant

  • ❌ Both decrease


4️⃣ Question 4

Best ML task for detecting unusual banking transactions:

  • ❌ Predicting monthly trends

  • Identifying patterns that deviate from normal transactions (anomaly detection)

  • ❌ Predefined risk classification

  • ❌ Grouping transactions
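
As a rough illustration of anomaly detection, the sketch below flags transaction amounts that lie far from the median, measured in units of the median absolute deviation (MAD). The amounts and the cutoff of 5 are assumptions made for this example; real systems typically use richer features and models.

```python
from statistics import median

# Hypothetical transaction amounts; one is far from the usual range.
amounts = [20, 25, 22, 19, 24, 21, 23, 950]

center = median(amounts)
mad = median(abs(a - center) for a in amounts)  # median absolute deviation

CUTOFF = 5  # assumed cutoff; tune for the data at hand

# A transaction is flagged when it deviates from the median by more
# than CUTOFF times the typical deviation.
anomalies = [a for a in amounts if abs(a - center) / mad > CUTOFF]
print(anomalies)  # [950]
```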


5️⃣ Question 5

Productivity increases, slows, levels off:

  • ❌ Exponential

  • ❌ Logarithmic

  • Polynomial regression

  • ❌ Linear

Explanation:
Nonlinear trend that increases then stabilizes → polynomial regression.


6️⃣ Question 6

Binary classification based on proximity to neighbors:

  • ❌ Logistic regression

  • ❌ Decision tree

  • K-nearest neighbors (KNN)

  • ❌ SVM
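
The KNN rule can be sketched in a few lines of plain Python: find the k closest training points and take a majority vote. The points, labels, and the choice k=3 are illustrative assumptions.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbors."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance
    )
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D training data with two well-separated classes.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(points, labels, (1.8, 1.6)))  # 0
print(knn_predict(points, labels, (6.2, 6.4)))  # 1
```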


7️⃣ Question 7

Benefit of PCA before clustering:

  • Transforms features into principal axes with highest variance

  • ❌ Removes all less important features

  • ❌ Automatically segments customers

  • ❌ Reduces to one component
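
The following NumPy sketch shows what "principal axes with highest variance" means in practice: the data is centered, the covariance matrix is decomposed, and the points are projected onto the leading eigenvectors. The synthetic data and variable names are assumptions for illustration; a clustering step would then run on `X_reduced`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two strongly correlated features plus one low-variance feature.
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + 0.1 * rng.normal(size=200),
    0.1 * rng.normal(size=200),
])

# Center the data, then eigendecompose the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues

# Sort the axes from highest to lowest variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()  # share of variance per principal axis
X_reduced = Xc @ eigvecs[:, :2]      # keep the two leading axes

print(X_reduced.shape)  # (200, 2)
```

Note that PCA does not delete original features; each principal axis is a linear combination of all of them.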


8️⃣ Question 8

Fast, scalable logistic regression training method:

  • ❌ Backpropagation

  • ❌ Grid search

  • ❌ Least squares

  • Stochastic Gradient Descent (SGD)
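
To show why SGD scales well, here is a minimal sketch of logistic regression trained one sample at a time, so each update touches a single example rather than the full dataset. The toy data, learning rate, and epoch count are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_sgd(xs, ys, lr=0.1, epochs=200, seed=0):
    """Fit a 1-D logistic regression with per-sample gradient updates."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit samples in random order each epoch
        for i in indices:
            error = sigmoid(w * xs[i] + b) - ys[i]  # gradient of log loss w.r.t. z
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b


# Hypothetical toy data: negative inputs are class 0, positive inputs class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w, b = train_logistic_sgd(xs, ys)
predictions = [int(sigmoid(w * x + b) >= 0.5) for x in xs]
print(predictions)  # [0, 0, 1, 1]
```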


9️⃣ Question 9

Model misclassifies loyal customers as churn risks:

  • ❌ PCA

  • ❌ SVM

  • ❌ More churn data

  • Adjust the classification threshold
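
The effect of the threshold can be sketched with a handful of made-up predicted probabilities. Raising the cutoff from 0.5 to 0.7 (both values are assumptions for this example) removes the false churn alarms on loyal customers.

```python
def predict_churn(probabilities, threshold):
    """Label a customer as churn (1) when the probability reaches the threshold."""
    return [int(p >= threshold) for p in probabilities]


def false_positives(y_true, y_pred):
    """Count loyal customers (0) that were flagged as churn (1)."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)


# Hypothetical model outputs and true labels (1 = churned, 0 = loyal).
churn_prob = [0.15, 0.55, 0.62, 0.48, 0.91, 0.83, 0.58, 0.30]
actual = [0, 0, 0, 0, 1, 1, 0, 0]

default_preds = predict_churn(churn_prob, 0.5)
stricter_preds = predict_churn(churn_prob, 0.7)

print(false_positives(actual, default_preds))   # 3
print(false_positives(actual, stricter_preds))  # 0
```

A higher threshold can also miss genuine churners, so in practice the cutoff is usually chosen by weighing precision against recall.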


🔟 Question 10

Clustering method starting with individuals → merging:

  • ❌ Density-based

  • ❌ Divisive

  • Agglomerative clustering

  • ❌ Partition-based
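
The bottom-up merging can be sketched as follows: every point starts as its own cluster, and the two closest clusters are merged until the requested number remains. This sketch assumes 1-D data and single linkage (distance between the closest members), which is only one of several possible linkage choices.

```python
def agglomerative_1d(values, n_clusters):
    """Single-linkage agglomerative clustering on 1-D values."""
    clusters = [[v] for v in values]  # start: one cluster per point
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical 1-D observations.
result = agglomerative_1d([1.0, 1.2, 5.0, 5.1, 9.0], n_clusters=3)
print(result)  # [[1.0, 1.2], [5.0, 5.1], [9.0]]
```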


1️⃣1️⃣ Question 11

Why is DBSCAN ideal for the marketing team’s use case?

  • ❌ Daily travel routines

  • ❌ Purchase trends forecasting

  • ❌ Detect satellite green cover

  • Isolate rare sensor events in IoT data

Explanation:
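DBSCAN groups points that lie in dense regions and labels points in sparse regions as noise, which makes it well suited to isolating rare events. It is not a forecasting method.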
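
The noise-labeling behavior can be seen in a compact pure-Python sketch of the DBSCAN procedure. The sensor readings, `eps`, and `min_samples` values are assumptions for illustration; a point counts itself among its neighbors here.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later become a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:  # core point: keep expanding
                queue.extend(j_neighbors)
        cluster_id += 1
    return labels


# Hypothetical sensor readings: two dense groups and one isolated event.
readings = [
    (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (9.0, 1.0),
]
labels = dbscan(readings, eps=0.3, min_samples=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```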


1️⃣2️⃣ Question 12

Method preserving local + global structure:

  • ❌ PCA

  • UMAP

  • ❌ Not used

  • ❌ t-SNE

Explanation:
t-SNE focuses mainly on preserving local neighborhoods;
UMAP preserves local structure while retaining more of the global structure.


1️⃣3️⃣ Question 13

Tool for interactive ML dashboards:

  • ❌ Pandas

  • Matplotlib

  • ❌ Scikit-learn

  • ❌ NumPy

Explanation:
This question is tricky because none of the options is a dedicated dashboard tool. Pandas and NumPy handle data manipulation and numerical computing, scikit-learn handles modeling, and Matplotlib mainly produces static plots.

Matplotlib is the only visualization library among the options, so it is the intended answer in the course context.


1️⃣4️⃣ Question 14

Difference between ML & traditional programming:

  • ❌ Writes code faster

  • ❌ Generates random rules

  • Learns from data to make predictions

  • ❌ Uses hand-coded rules
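
The contrast can be sketched with a toy spam check: one function uses a cutoff fixed by the programmer, the other derives its cutoff from labeled examples. The data, the feature (number of links), and the midpoint-of-class-means rule are all assumptions made for illustration, not a recommended model.

```python
from statistics import mean


def rule_based_is_spam(num_links):
    """Traditional programming: the cutoff is hand-coded."""
    return num_links > 10


def learn_threshold(values, labels):
    """A very simple 'learner': midpoint between the two class means."""
    positives = [v for v, y in zip(values, labels) if y == 1]
    negatives = [v for v, y in zip(values, labels) if y == 0]
    return (mean(positives) + mean(negatives)) / 2


# Hypothetical training data: link counts and spam labels.
link_counts = [0, 1, 2, 1, 6, 7, 8, 9]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

threshold = learn_threshold(link_counts, is_spam)
print(threshold)  # 4.25


def learned_is_spam(num_links):
    """Machine learning: the cutoff comes from the data."""
    return num_links > threshold


print(rule_based_is_spam(6))  # False
print(learned_is_spam(6))     # True
```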


1️⃣5️⃣ Question 15

Matrix operations and linear algebra:

  • ❌ Scikit-learn

  • ❌ Pandas

  • ❌ Matplotlib

  • NumPy
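
A few typical NumPy linear algebra calls, shown on a small made-up matrix: matrix multiplication, solving a linear system, and computing a determinant.

```python
import numpy as np

# Hypothetical 2x2 system A @ x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

product = A @ A             # matrix multiplication
x = np.linalg.solve(A, b)   # solve the linear system
det = np.linalg.det(A)      # determinant

print(product)
print(x)
print(det)
```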


🧾 Summary Table

Q Correct Answer
1 One-vs-one classifiers
2 Median reduces extreme value impact
3 Bias ↓, Variance ↑
4 Anomaly detection
5 Polynomial regression
6 KNN
7 PCA extracts variance-rich axes
8 Stochastic Gradient Descent
9 Adjust threshold
10 Agglomerative clustering
11 Isolate rare sensor events
12 UMAP
13 Matplotlib
14 ML learns from data
15 NumPy