Module-level Graded Quiz: Linear Regression — Introduction to Neural Networks and PyTorch (IBM AI Engineering Professional Certificate) Answers 2025
1. Question 1
What is wrong with the code?
- ❌ "LR" not required
- ❌ nn.Module not required
- ❌ super not required
- ✅ "linear" should be self.linear
Explanation:
Inside forward, PyTorch modules must be referenced using self.linear, not linear.
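A minimal sketch of the corrected module (the class name `LR` is assumed from the question's options; the exact quiz code is not shown here):

```python
import torch
import torch.nn as nn

class LR(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        # Correct: reference the layer as self.linear.
        # Writing just `linear(x)` would raise a NameError,
        # because `linear` is an attribute of the module, not a local name.
        return self.linear(x)

model = LR(1, 1)
yhat = model(torch.tensor([[1.0]]))
```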
2. Question 2
Noise in linear regression refers to:
- ❌ Variation in parameters
- ✅ Random errors added to data points
- ❌ Lack of linearity
- ❌ Data collection errors
Explanation:
Noise = unavoidable random variability in real-world data.
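A common way to illustrate this is to generate synthetic data: a clean line plus a random error term per point (the slope, intercept, and noise scale below are illustrative choices):

```python
import torch

torch.manual_seed(0)
X = torch.linspace(-3, 3, 100).view(-1, 1)
f = 2 * X - 1                        # the true, noise-free line
noise = 0.3 * torch.randn_like(f)    # random errors added to each data point
Y = f + noise                        # the observed (noisy) targets
```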
3. Question 3
Purpose of Mean Squared Error:
- ❌ Model complexity
- ❌ Standard deviation
- ❌ Accuracy
- ✅ Average squared difference between predicted & actual values
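In PyTorch, `nn.MSELoss` computes exactly this average of squared differences; a quick sketch comparing it with the manual formula:

```python
import torch
import torch.nn as nn

yhat = torch.tensor([2.0, 4.0, 6.0])   # predictions
y    = torch.tensor([1.0, 4.0, 8.0])   # actual values

criterion = nn.MSELoss()
loss = criterion(yhat, y)              # mean of (yhat - y)**2

manual = ((yhat - y) ** 2).mean()      # (1 + 0 + 4) / 3
```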
4. Question 4
Primary goal of gradient descent:
- ❌ Standardize features
- ❌ Compute gradient of inputs
- ❌ Find maximum
- ✅ Minimize the cost function
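A minimal sketch of gradient descent minimizing an MSE cost for a one-parameter model `y = w*x` (data and learning rate are illustrative):

```python
import torch

X = torch.tensor([1.0, 2.0, 3.0])
Y = 2.0 * X                            # true slope is 2
w = torch.tensor(0.0, requires_grad=True)
lr = 0.05

for _ in range(100):
    loss = ((w * X - Y) ** 2).mean()   # cost function (MSE)
    loss.backward()                    # gradient of the cost w.r.t. w
    with torch.no_grad():
        w -= lr * w.grad               # step downhill on the cost surface
        w.grad.zero_()
```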
5. Question 5
If learning rate is too large:
- ❌ Takes longer to converge
- ❌ Converges suboptimally
- ✅ The algorithm may miss the minimum
- ❌ Converges too fast
Explanation:
A learning rate that is too large makes each update overshoot the minimum, and the cost can even diverge.
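This is easy to see on the toy cost f(w) = w², whose gradient is 2w (the learning rates below are illustrative):

```python
# Plain gradient descent on f(w) = w**2, starting from w = 1.
def descend(lr, steps=20):
    w = 1.0
    for _ in range(steps):
        w -= lr * 2 * w   # gradient of w**2 is 2w
    return w

good = descend(0.1)   # |w| shrinks toward the minimum at 0
bad  = descend(1.1)   # |w| grows every step: the update jumps past 0
```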
6. Question 6
Why set requires_grad=True?
- ❌ For visualization
- ❌ Make immutable
- ❌ Improve performance
- ✅ To automatically compute gradients
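A minimal autograd sketch: with `requires_grad=True`, PyTorch records the operations on the tensor so `backward()` can compute the derivative automatically:

```python
import torch

w = torch.tensor(3.0, requires_grad=True)
y = w ** 2      # autograd records this operation
y.backward()    # computes dy/dw = 2w automatically
print(w.grad)   # tensor(6.)
```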
7. Question 7
If learning rate is too small:
- ❌ Oscillations
- ❌ Large updates
- ❌ Rapid increases
- ✅ Convergence becomes very slow
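The same toy cost f(w) = w² shows the slowdown: counting how many steps gradient descent needs to get within a tolerance of the minimum (the learning rates and tolerance are illustrative):

```python
def steps_to_converge(lr, tol=1e-3, max_steps=100_000):
    # Gradient descent on f(w) = w**2, starting from w = 1.
    w = 1.0
    for step in range(1, max_steps + 1):
        w -= lr * 2 * w
        if abs(w) < tol:
            return step
    return max_steps

fast = steps_to_converge(0.1)     # a reasonable learning rate
slow = steps_to_converge(0.001)   # a tiny learning rate: far more steps
```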
8. Question 8
Cost surface represents:
- ❌ Data point plot
- ❌ Matrix
- ❌ Gradient
- ✅ Plot showing how parameters affect the cost function
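A sketch of what lies behind such a plot: evaluating the MSE cost over a grid of candidate (w, b) pairs gives the surface, whose lowest point sits at the true parameters (the data and grid ranges below are illustrative):

```python
import numpy as np

# Data from the line y = 2x + 1
x = np.linspace(-1, 1, 20)
y = 2 * x + 1

# Grid of candidate slopes and intercepts
ws = np.linspace(0, 4, 41)
bs = np.linspace(-1, 3, 41)
W, B = np.meshgrid(ws, bs)

# cost[i, j] = MSE of the line with slope W[i, j] and intercept B[i, j]
cost = ((W[..., None] * x + B[..., None] - y) ** 2).mean(axis=-1)

# The surface bottoms out at the true parameters (w = 2, b = 1)
i, j = np.unravel_index(cost.argmin(), cost.shape)
```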
9. Question 9
Role of the forward function:
- ❌ Initialize parameters
- ❌ Transform data
- ❌ Compute loss
- ✅ Compute predictions (forward pass)
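A minimal sketch: calling a module like a function dispatches to its `forward` method, which produces the predictions:

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)        # forward computes y = w*x + b
x = torch.tensor([[2.0]])

yhat = model(x)                # model(x) runs the forward pass
same = model.forward(x)        # same computation (hooks aside)
```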
10. Question 10
Significance of contour plots:
- ❌ Show data distribution
- ✅ Slices of the cost surface at different heights
- ❌ 3D view
- ❌ Visualize gradient directly
🧾 Summary Table
| Q# | Correct Answer |
|---|---|
| 1 | linear → self.linear |
| 2 | Random errors/noise in data |
| 3 | Average squared prediction error |
| 4 | Minimize cost |
| 5 | Miss the minimum |
| 6 | Compute gradients automatically |
| 7 | Very slow convergence |
| 8 | How parameters affect cost |
| 9 | Forward pass → predictions |
| 10 | Cost surface slices (contour) |