Linear Regression Quiz: Fitting Statistical Models to Data with Python (Statistics with Python Specialization) Answers 2025
1. For which scatterplot(s) would fitting a linear regression model be appropriate? (Select all that apply.)
- ✅ a
- ❌ b
- ✅ c
- ✅ d
- ❌ e
Explanation:
Linear regression is appropriate when the cloud of points follows an approximately straight-line pattern (positive or negative). Plots a, c, and d show reasonably linear patterns; b and e look non-linear or random.
2. Which scatterplot(s) would have a correlation coefficient close to 0? (Select all that apply.)
- ❌ a
- ✅ b
- ❌ c
- ❌ d
- ✅ e
Explanation:
A correlation near 0 occurs when there is little to no linear association. Plots b and e show no clear linear trend (random or circular patterns).
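The idea above can be sketched with simulated data: a strong linear pattern, pure noise, and a strong but nonlinear (quadratic) pattern. All data here are simulated for illustration, not taken from the quiz plots.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)

y_linear = 2 * x + rng.normal(scale=0.5, size=200)  # clear linear trend
y_noise = rng.normal(size=200)                      # no relationship at all
y_quad = x**2 + rng.normal(scale=0.2, size=200)     # strong but NON-linear pattern

# Pearson correlation measures only *linear* association
r_linear = np.corrcoef(x, y_linear)[0, 1]  # close to 1
r_noise = np.corrcoef(x, y_noise)[0, 1]    # close to 0
r_quad = np.corrcoef(x, y_quad)[0, 1]      # also close to 0, despite a clear pattern
```

The quadratic case is why r ≈ 0 does not mean "no relationship", only "no *linear* relationship".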
3. Which scatterplot has the highest absolute correlation (strongest linear relationship)?
- ❌ a
- ✅ b
- ❌ c
- ❌ d
Explanation:
The chosen plot shows the tightest straight-line pattern (points closely clustered around a line), indicating the largest absolute correlation.
4. What distribution do the true errors need to follow to perform inference procedures in linear regression?
- ❌ True errors must be N(0,1)
- ✅ True errors must be N(0, σ²)
- ❌ True errors must be Uniformly distributed
- ❌ True errors do not need any specific distribution
Explanation:
For standard t-tests and confidence intervals in linear regression, we assume the errors are normally distributed with mean 0 and constant variance σ². (Large-sample inference can be robust to departures, but the classical assumption is Normal(0, σ²).)
5. Which assumptions are needed for a hypothesis test on the population slope? (Select all that apply.)
- ✅ True errors must be normally distributed.
- ✅ True errors have constant variance.
- ✅ The population relationship between the dependent and explanatory variable is linear.
Explanation:
All three are part of the classical linear regression assumptions needed for valid inference on the slope: normality of errors, homoscedasticity, and a correctly specified linear form.
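In practice these assumptions are checked on the residuals. A minimal sketch on simulated data follows; the Shapiro-Wilk test and the split-sample spread comparison are just two of many possible diagnostics, chosen here for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 3.0 + 0.5 * x + rng.normal(scale=1.0, size=100)  # errors ~ N(0, sigma^2)

# Fit the simple linear regression and form residuals
slope, intercept = np.polyfit(x, y, 1)
resid = y - (intercept + slope * x)

# Normality check (Shapiro-Wilk): a large p-value gives no evidence against normality
p_normal = stats.shapiro(resid).pvalue

# Rough constant-variance check: compare residual spread in low-x vs high-x halves
lo = resid[x < np.median(x)].std()
hi = resid[x >= np.median(x)].std()
```

With an intercept in the model, the residuals always average to zero by construction; the normality and constant-variance checks are what carry real diagnostic information.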
6. Interpretation of the estimated slope b1 = 0.21 (predicting hotel rating from nightly cost)
- ❌ When a hotel’s nightly cost is $0 the hotel’s rating is expected to be 0.21 points.
- ❌ When a hotel rating is 0 points the hotel’s nightly cost is expected to be $0.21 dollars.
- ✅ The hotel rating is estimated to increase by 0.21 points for every additional dollar spent on nightly hotel cost, on average.
- ❌ The nightly hotel cost is estimated to increase by $0.21 dollars for every additional hotel rating point, on average.
Explanation:
The slope is the estimated change in the response (rating, in points) per one-unit change in the predictor (nightly cost, in dollars).
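The "per-dollar" reading can be made concrete: the difference between predictions one dollar apart is exactly the slope. The intercept b0 below is an assumed value for illustration only; the quiz gives only the slope.

```python
# Hypothetical fitted line from the quiz scenario: rating = b0 + 0.21 * cost.
# b0 is an assumed intercept, used only so the line is fully specified.
b0, b1 = 1.5, 0.21

def predicted_rating(cost):
    return b0 + b1 * cost

# Raising nightly cost by $1 changes the predicted rating by exactly the slope
diff = predicted_rating(101) - predicted_rating(100)  # ≈ 0.21 points
```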
7. Residual for a subject with head size 3500 cm³ and brain weight 1430.86 grams
- ✅ -183.3 grams
- ❌ 183.3 grams
- ❌ -4195.752 cm³
- ❌ 4195.752 cm³
Explanation:
Residual = observed − predicted. The observed brain weight minus the model’s predicted brain weight at head size 3500 cm³ gives −183.3 g. Note that a residual carries the units of the response (grams), which rules out the cm³ options immediately.
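A sketch of the computation. The quiz's fitted coefficients are not reproduced in this answer sheet, so b0 and b1 below are illustrative placeholders, not the actual regression output:

```python
# Hypothetical coefficients (NOT the quiz's actual fitted values)
b0, b1 = 325.0, 0.26   # intercept in grams, slope in grams per cm^3

head_size = 3500.0     # cm^3 (predictor)
observed = 1430.86     # grams (response)

predicted = b0 + b1 * head_size   # model's fitted brain weight, in grams
residual = observed - predicted   # observed minus predicted, in grams
```

Whatever the coefficients, the residual is always observed minus predicted, in the response's units (grams here).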
8. Interpretation of R² = 0.6393
- ❌ 0.6393% of the variation …
- ✅ 63.93% of the variation in brain weight can be accounted for by the linear relationship with head size.
- ❌ We would expect brain weight to increase by 0.6393 grams for every additional cm³ …
- ❌ We would expect head size to increase by 0.6393 cm³ for every additional gram …
Explanation:
R² (0.6393) means ~63.93% of the variance in the response is explained by the predictor.
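R² comes from the variance decomposition: 1 minus the unexplained over the total sum of squares. The data below are simulated purely for illustration (the actual head-size/brain-weight data are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(2700, 4700, 120)                      # simulated head sizes (cm^3)
y = 300 + 0.26 * x + rng.normal(scale=60, size=120)   # simulated brain weights (g)

slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x

sse = np.sum((y - y_hat) ** 2)      # unexplained variation
sst = np.sum((y - y.mean()) ** 2)   # total variation
r_squared = 1 - sse / sst

# In simple linear regression, R^2 equals the squared correlation coefficient
r = np.corrcoef(x, y)[0, 1]
```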
9. Appropriate p-value for testing a significant positive linear relationship
- ✅ 4.61e-11
- ❌ <2e-16
- ❌ 2.305e-11
- ❌ <1e-16
Explanation:
The reported p-value from the regression output is 4.61 × 10⁻¹¹, a very small value and strong evidence of a relationship.
10. How does the 95% prediction interval width compare to the 95% confidence interval for the mean (both at head size 3400 cm³)?
- ✅ Wider
- ❌ Narrower
- ❌ Stays the same
Explanation:
Prediction intervals for an individual observation are wider than confidence intervals for the mean because they must account for both the uncertainty in the estimated mean and the residual variability of individual observations.
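The standard-error formulas make the comparison explicit: the prediction interval's SE has an extra "1 +" term for the scatter of an individual observation around the mean. A sketch on simulated data (all values illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.uniform(2700, 4700, 100)                      # simulated head sizes (cm^3)
y = 300 + 0.26 * x + rng.normal(scale=60, size=100)   # simulated brain weights (g)

n = len(x)
slope, intercept = np.polyfit(x, y, 1)
resid = y - (intercept + slope * x)
s = np.sqrt(np.sum(resid**2) / (n - 2))   # residual standard error
sxx = np.sum((x - x.mean()) ** 2)
t_crit = stats.t.ppf(0.975, df=n - 2)

x0 = 3400.0
# SE for the *mean* response at x0
se_mean = s * np.sqrt(1 / n + (x0 - x.mean()) ** 2 / sxx)
# SE for a *new individual* observation at x0: note the extra "1 +"
se_pred = s * np.sqrt(1 + 1 / n + (x0 - x.mean()) ** 2 / sxx)

ci_width = 2 * t_crit * se_mean
pi_width = 2 * t_crit * se_pred   # always wider than ci_width
```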
11. How would the 95% confidence interval width at 3600 cm³ compare to that at 3400 cm³?
- ✅ Wider
- ❌ Narrower
- ❌ Stays the same
Explanation:
Since 3600 cm³ is farther from the mean head size than 3400 cm³, the CI for the mean at 3600 is wider: the variance of the fitted mean grows as x moves away from the mean of the observed x-values.
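This can be demonstrated numerically. In the simulation below the mean head size is placed near 3350 cm³ by assumption, so 3600 sits farther from the mean than 3400, mirroring the quiz scenario; all data are simulated for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.uniform(2700, 4000, 100)   # simulated head sizes; mean near 3350 cm^3
y = 300 + 0.26 * x + rng.normal(scale=60, size=100)

n = len(x)
slope, intercept = np.polyfit(x, y, 1)
s = np.sqrt(np.sum((y - (intercept + slope * x)) ** 2) / (n - 2))
sxx = np.sum((x - x.mean()) ** 2)
t_crit = stats.t.ppf(0.975, df=n - 2)

def ci_width(x0):
    # Width of the 95% CI for the mean response at x0:
    # it grows with (x0 - x.mean())**2, so points farther
    # from the mean of x get wider intervals
    return 2 * t_crit * s * np.sqrt(1 / n + (x0 - x.mean()) ** 2 / sxx)

w_3400 = ci_width(3400.0)
w_3600 = ci_width(3600.0)   # wider, since 3600 is farther from the mean here
```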
12. Cautions when predicting brain weight for an 8-year-old with head size 1800 cm³ (select all that apply)
- ❌ Correlation does not imply causation for brain weight.
- ✅ Extrapolation — A head size of 1800 cm³ is outside the range of our data.
- ✅ Extrapolation — The model was created using only data for adults, not children.
- ❌ We do not know if the child is male or female.
- ❌ No cautions needed
Explanation:
The main issues are extrapolation (a value outside the observed adult range) and the model being built on adults only; predictions for a child are unreliable. (While causation is a general caveat, the two extrapolation cautions are the most directly relevant here.)
13. Interpretation of the estimated coefficient for age = −23.97
- ❌ The average brain weight for younger subjects is estimated to be 23.97 grams less than the average brain weight for older subjects.
- ❌ Keeping head size and sex constant, the average brain weight for younger subjects is estimated to be 23.97 grams less than the average brain weight for older subjects.
- ❌ The average brain weight for older subjects is estimated to be 23.97 grams less than the average brain weight for younger subjects.
- ✅ Keeping head size and sex constant, the average brain weight for older subjects is estimated to be 23.97 grams less than the average brain weight for younger adults.
Explanation:
Age is coded 0 = younger, 1 = older; the negative coefficient means older subjects (age = 1) have an average brain weight about 23.97 g less than younger subjects (age = 0), holding head size and sex constant.
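The dummy-coded interpretation can be verified on simulated data where the "true" older-vs-younger gap is built in as −24 g; the fitted coefficient on the age dummy recovers roughly that gap. Everything below (coefficients, sample size, noise level) is an assumption chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
head_size = rng.uniform(2700, 4700, n)
age = rng.integers(0, 2, n)   # 0 = younger, 1 = older (dummy coding)
sex = rng.integers(0, 2, n)

# Simulated truth: older subjects ~24 g lighter, holding head size and sex fixed
y = 300 + 0.26 * head_size - 24.0 * age + 10.0 * sex + rng.normal(scale=30, size=n)

# Multiple regression via least squares: intercept, head size, age, sex
X = np.column_stack([np.ones(n), head_size, age, sex])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
b_age = coefs[2]   # estimated older-minus-younger gap, other predictors held fixed
```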
🧾 Summary Table
| Q# | Answer | Key idea |
|---|---|---|
| 1 | a, c, d (linear fits) | Linear pattern required |
| 2 | b, e (corr ≈ 0) | No linear trend |
| 3 | b (strongest linear) | Tightest linear clustering |
| 4 | N(0, σ²) | Classical error assumption |
| 5 | All three | Normality, homoscedasticity, linearity |
| 6 | Rating ↑ 0.21 pts per $1 | Slope = change in response per unit predictor |
| 7 | −183.3 g | Residual = observed − predicted |
| 8 | 63.93% | R² interpretation |
| 9 | 4.61e-11 | Very small p-value |
| 10 | Wider | Prediction interval > CI for mean |
| 11 | Wider | Further from x̄ → wider CI |
| 12 | Extrapolation & adult-only model | Predictions for child unreliable |
| 13 | Older ≈ 23.97 g less (holding other vars) | Interpretation of coded categorical coef. |