Graded Quiz: Model Evaluation and Refinement - Data Analysis with Python (IBM Data Analyst Professional Certificate) Answers 2025

1. Question 1

What does cross_val_predict(lr_model, X_train, y_train, cv=3) return?

  • ❌ Predicted values of the test set

  • ❌ List of residual errors

  • ❌ Average R² score

  • ✅ Predicted values for each training point using cross-validation

Explanation:

cross_val_predict returns the cross-validated prediction for every training sample (each prediction comes from the fold in which that sample was held out), not scores or residuals.
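
For illustration, here is a minimal runnable sketch. A toy dataset from make_regression stands in for the course's lr_model, X_train and y_train; it shows that the call returns one out-of-fold prediction per training sample:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

# Toy stand-ins for the course variables (assumption, not the course data).
X_train, y_train = make_regression(n_samples=100, n_features=3, noise=10, random_state=0)
lr_model = LinearRegression()

# Each prediction is made by a model fitted on the other 2 of the 3 folds.
y_hat = cross_val_predict(lr_model, X_train, y_train, cv=3)
print(y_hat.shape)  # (100,) -> one prediction per training point, not a score
```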


2. Question 2

Which is the correct way to define the list of alpha values for a Ridge regression grid search?

  • ✅ parameter = [{'alpha': [1, 10, 100]}]

  • ❌ grid = alpha:[1,10,100]

  • ❌ alpha = Ridge([1, 10, 100])

  • ❌ parameter = [alpha: 1, 10, 100]

Explanation:

GridSearchCV expects the parameter grid as a dictionary (or a list of dictionaries) whose keys are parameter names and whose values are lists of settings to try.
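
A minimal sketch of how that grid is passed to GridSearchCV (toy data; the alpha values are the ones from the question):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=100, n_features=3, noise=10, random_state=0)

# A list holding a dictionary: key = parameter name, value = settings to try.
parameter = [{'alpha': [1, 10, 100]}]

grid = GridSearchCV(Ridge(), parameter, cv=4)
grid.fit(X, y)
print(grid.best_params_)  # e.g. {'alpha': 1}, whichever alpha scored best
```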


3. Question 3

A model reaches R² = 0.99 on the training data with a 100-degree polynomial. How should you check for overfitting?

  • ✅ Evaluate the model on the test dataset

  • ❌ Reduce features first

  • ❌ Use cross_val_predict on training

  • ❌ Select model based only on training score

Explanation:

Overfitting is detected when the training score is high but the test score is low.
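
A minimal sketch of that check on toy data (a degree-15 polynomial stands in for the question's 100-degree model, which would be numerically unstable):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
x = rng.uniform(-3, 3, 60)
y = np.sin(x) + rng.normal(scale=0.3, size=x.shape)

X_train, X_test, y_train, y_test = train_test_split(
    x.reshape(-1, 1), y, test_size=0.3, random_state=0)

# A very flexible polynomial can chase the noise in the training data.
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_train, y_train)

print("train R^2:", model.score(X_train, y_train))  # typically close to 1
print("test  R^2:", model.score(X_test, y_test))    # typically much lower -> overfitting
```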


4. Question 4

Why choose Ridge Regression?

  • ✅ To reduce overfitting by penalizing large coefficients

  • ❌ To remove irrelevant features

  • ❌ To increase flexibility

  • ❌ To reduce complexity only

Explanation:

Ridge adds an L2 penalty that shrinks large coefficients, which reduces overfitting.
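
A minimal sketch of that shrinkage effect (toy data; the polynomial degree and alpha are arbitrary choices for illustration): fit plain least squares and Ridge on the same expanded features and compare coefficient sizes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.RandomState(1)
x = rng.uniform(-3, 3, 40)
y = np.sin(x) + rng.normal(scale=0.3, size=x.shape)

# Expand to correlated polynomial features, then standardise them.
features = make_pipeline(PolynomialFeatures(degree=10, include_bias=False), StandardScaler())
Xp = features.fit_transform(x.reshape(-1, 1))

ols = LinearRegression().fit(Xp, y)
ridge = Ridge(alpha=10).fit(Xp, y)

print("max |coef|, plain OLS:", np.abs(ols.coef_).max())   # typically large
print("max |coef|, Ridge    :", np.abs(ridge.coef_).max()) # shrunk by the L2 penalty
```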


5. Question 5

(Image: blue curve follows noise, orange is true function)

  • ❌ Good fit

  • ❌ No conclusion

  • ✅ It displays overfitting

  • ❌ Underfitting

Explanation:

The blue curve wiggles excessively and follows the noise in the data, which is classic overfitting.


🧾 Summary Table

Q | Correct Answer               | Key Concept
1 | Predicted CV values          | cross_val_predict output
2 | [{'alpha': [1, 10, 100]}]    | GridSearchCV parameters
3 | Evaluate on the test set     | Detecting overfitting
4 | Penalize large coefficients  | Ridge regression benefit
5 | Overfitting                  | Model follows the noise