Quiz 3: Regression Models (Data Science Specialization) Answers 2025
-
Question 1
Consider the mtcars data set. Fit mpg ~ factor(cyl) + wt. Give the adjusted estimate for the expected change in mpg comparing 8 cylinders to 4.
✅ -6.071
❌ -3.206
❌ -4.256
❌ 33.991
Explanation: This is the coefficient for the factor level difference (8 cyl vs 4 cyl) in the regression that adjusts for weight. The adjusted effect is about -6.071 mpg.
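As a quick check, the adjusted contrast can be read straight off the fitted model. The sketch below assumes only base R and its built-in mtcars data; the variable names are illustrative.

```r
# mtcars ships with base R (datasets package), so no download is needed.
fit_adj <- lm(mpg ~ factor(cyl) + wt, data = mtcars)

# 4 cylinders is the reference level, so "factor(cyl)8" is the
# 8 vs 4 cylinder contrast at a fixed weight.
adj_8_vs_4 <- unname(coef(fit_adj)["factor(cyl)8"])
print(round(adj_8_vs_4, 3))
```

The other coefficients of the same fit (intercept, 6 cylinder contrast, wt slope) correspond to the distractor options.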
-
Question 2
Compare effect of 8 vs 4 cylinders on mpg, adjusted vs unadjusted by weight. What can be said?
✅ Holding weight constant, cylinder appears to have less of an impact on mpg than if weight is disregarded.
❌ Including or excluding weight does not appear to change anything regarding the estimated impact of number of cylinders on mpg.
❌ Within a given weight, 8 cylinder vehicles have an expected 12 mpg drop in fuel efficiency.
❌ Holding weight constant, cylinder appears to have more of an impact on mpg than if weight is disregarded.
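Explanation: Weight is a strong confounder: heavier cars tend to have more cylinders and lower mpg. Without weight in the model the 8 vs 4 cylinder difference is about -11.56 mpg; with weight included it shrinks to about -6.07 mpg, so part of the cylinder-mpg association is explained by weight.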
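To see the attenuation directly, one can fit the model with and without weight and compare the 8 vs 4 cylinder coefficient. This is a minimal sketch using base R and the built-in mtcars data; the object names are illustrative.

```r
# Unadjusted: cylinders only. Adjusted: cylinders plus weight.
fit_unadj <- lm(mpg ~ factor(cyl), data = mtcars)
fit_adj   <- lm(mpg ~ factor(cyl) + wt, data = mtcars)

unadj_8_vs_4 <- unname(coef(fit_unadj)["factor(cyl)8"])
adj_8_vs_4   <- unname(coef(fit_adj)["factor(cyl)8"])

# Both are negative; the adjusted one is smaller in magnitude.
print(c(unadjusted = unadj_8_vs_4, adjusted = adj_8_vs_4))
```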
-
Question 3
Fit model with mpg ~ factor(cyl) + wt and compare to model with interaction factor(cyl):wt. What is the LRT p-value conclusion at α=0.05?
✅ The P-value is larger than 0.05. So, according to our criterion, we would fail to reject, which suggests that the interaction terms may not be necessary.
❌ The P-value is small (less than 0.05). So, according to our criterion, we reject, which suggests that the interaction term is necessary.
❌ The P-value is small (less than 0.05). Thus it is surely true that there is no interaction term in the true model.
❌ The P-value is small (less than 0.05). Thus it is surely true that there is an interaction term in the true model.
❌ The P-value is larger than 0.05. So, according to our criterion, we would fail to reject, which suggests that the interaction terms are necessary.
❌ The P-value is small (less than 0.05). So, according to our criterion, we reject, which suggests that the interaction term is not necessary.
Explanation: Comparing the additive model with the interaction model gives a p-value above 0.05 (about 0.12 from anova() on the two fits), so there is not enough evidence that the interaction terms improve the fit. Failing to reject does not prove the interaction is absent.
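The comparison can be sketched in base R as follows. Two assumptions are worth flagging: for lm fits, anova() reports an F test rather than a chi-square test, and the hand-computed chi-square likelihood-ratio test below is an illustrative alternative, not necessarily what the quiz used. Both lead to the same conclusion here.

```r
fit_add <- lm(mpg ~ factor(cyl) + wt, data = mtcars)
fit_int <- lm(mpg ~ factor(cyl) * wt, data = mtcars)

# Nested-model comparison; for lm objects anova() reports an F test.
cmp <- anova(fit_add, fit_int)
p_f <- cmp[["Pr(>F)"]][2]

# Chi-square likelihood-ratio test computed by hand from the log-likelihoods.
ll_add  <- logLik(fit_add)
ll_int  <- logLik(fit_int)
lr_stat <- 2 * (as.numeric(ll_int) - as.numeric(ll_add))
lr_df   <- attr(ll_int, "df") - attr(ll_add, "df")  # two extra interaction terms
p_lrt   <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(F_test = p_f, LRT = p_lrt))
```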
-
Question 4
Model: lm(mpg ~ I(wt * 0.5) + factor(cyl), data=mtcars). How is the wt coefficient interpreted?
✅ The estimated expected change in MPG per one ton increase in weight for a specific number of cylinders (4, 6, 8).
❌ The estimated expected change in MPG per one ton increase in weight.
❌ The estimated expected change in MPG per half ton increase in weight.
❌ The estimated expected change in MPG per half ton increase in weight for the average number of cylinders.
❌ The estimated expected change in MPG per half ton increase in weight for a specific number of cylinders (4, 6, 8).
Explanation: In mtcars, wt is recorded in units of 1000 lb, i.e. half a (2000 lb) ton. A one-unit increase in I(wt * 0.5) therefore corresponds to a two-unit increase in wt, which is 2000 lb or one ton, so the coefficient is the change in mpg per one ton. Because factor(cyl) is in the model, that weight effect is conditional on (i.e., for a given) cylinder group.
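The effect of the rescaling can be checked numerically. In mtcars, wt is recorded in units of 1000 lb, so a one-unit change in wt * 0.5 equals a two-unit (2000 lb) change in wt. The sketch below, with illustrative object names, shows that the rescaled coefficient is exactly twice the plain wt coefficient.

```r
fit_wt   <- lm(mpg ~ wt + factor(cyl), data = mtcars)
fit_half <- lm(mpg ~ I(wt * 0.5) + factor(cyl), data = mtcars)

b_wt   <- unname(coef(fit_wt)["wt"])
b_half <- unname(coef(fit_half)[2])  # second coefficient is the I(wt * 0.5) term

# Change in mpg per 1000 lb versus per 2000 lb, within a cylinder group.
print(c(per_1000_lb = b_wt, per_2000_lb = b_half))
```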
-
Question 5
Data: x = c(0.586, 0.166, -0.042, -0.614, 11.72), y = c(0.549, -0.026, -0.127, -0.751, 1.344). Give the hat diagonal for the most influential point.
✅ 0.9946
❌ 0.2287
❌ 0.2804
❌ 0.2025
Explanation: Leverage depends only on the x values. The extreme value 11.72 lies far from the other four points, so its hat diagonal is close to 1, about 0.9946.
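Because hat values depend only on the predictor, they can be computed from x alone. The following sketch builds the hat matrix H = X (X'X)^-1 X' for a simple regression with an intercept; the matrix names are illustrative.

```r
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)

# Design matrix for intercept + slope, then the hat (projection) matrix.
X <- cbind(1, x)
H <- X %*% solve(t(X) %*% X) %*% t(X)
h <- diag(H)

# The last point (x = 11.72) has by far the largest leverage.
print(round(h, 4))
```

With a fitted model object, hatvalues(fit) returns the same diagonal.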
-
Question 6
Same data; give the slope dfbeta for the point with the highest hat value.
✅ -134
❌ -0.378
❌ -0.00134
❌ 0.673
Explanation: The point with the highest hat value (x = 11.72) dominates the fit. R's dfbetas(), which reports the standardized change in each coefficient when a point is dropped, gives about -134 for the slope at that point.
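The slope influence can be reproduced with base R's dfbetas(). This sketch assumes the response vector y from the original quiz statement, y = c(0.549, -0.026, -0.127, -0.751, 1.344); dfbetas() returns the standardized version of the leave-one-out coefficient change.

```r
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)  # assumed from the original quiz

fit <- lm(y ~ x)

# Index of the highest-leverage observation, then its standardized
# influence on the slope coefficient.
i_max <- which.max(hatvalues(fit))
slope_dfbetas <- unname(dfbetas(fit)[i_max, "x"])
print(round(slope_dfbetas))
```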
-
Question 7
Consider comparing the regression coefficient between Y and X with and without adjustment for Z. Which of the following is true?
✅ It is possible for the coefficient to reverse sign after adjustment. For example, it can be strongly significant and positive before adjustment and strongly significant and negative after adjustment.
❌ Adjusting for another variable can only attenuate the coefficient toward zero. It can’t materially change sign.
❌ The coefficient can’t change sign after adjustment, except for slight numerical pathological cases.
❌ For the coefficient to change sign, there must be a significant interaction term.
Explanation: Adjustment can change magnitude and sign of an association (e.g., Simpson’s paradox). Including confounders can flip the regression coefficient’s sign.
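A small simulation illustrates the sign reversal. The data below are entirely synthetic and the setup is an assumption made for illustration: z drives both x and y, the true effect of x on y is -1, and there is no interaction term.

```r
set.seed(1)
n <- 1000
z <- rnorm(n)
x <- z + rnorm(n, sd = 0.3)           # x is strongly correlated with z
y <- -1 * x + 3 * z + rnorm(n)        # true x effect is negative, no interaction

b_unadj <- unname(coef(lm(y ~ x))["x"])      # ignores z
b_adj   <- unname(coef(lm(y ~ x + z))["x"])  # adjusts for z

# The unadjusted slope is positive, the adjusted slope is negative.
print(c(unadjusted = b_unadj, adjusted = b_adj))
```

The reversal arises purely from the additive confounder z, which is why a significant interaction is not required.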
🧾 Summary Table
| Q# | ✅ Correct Answer | Key Concept |
|---|---|---|
| 1 | -6.071 | Adjusted factor-level effect (8 cyl vs 4) controlling for weight |
| 2 | Holding weight constant, cylinder appears to have less impact | Confounding: adjustment reduces apparent effect |
| 3 | Fail to reject (p > 0.05) — interaction not necessary | Likelihood-ratio test for interaction |
| 4 | Change in MPG per one-ton increase (for a specific cyl) | Interpretation after rescaling a predictor (I(wt * 0.5)) |
| 5 | 0.9946 | High leverage (hat value) for outlier in x |
| 6 | -134 | Influence on slope (dfbetas) for the highest-leverage point |
| 7 | Coefficient can reverse sign after adjustment | Simpson’s paradox / confounding can change sign |