Logistic Regression Quiz: Fitting Statistical Models to Data with Python (Statistics with Python Specialization) Answers 2025

1. Which collected variables could be predicted using a logistic regression model?

(Recall that logistic regression models the probability of a binary, i.e. dichotomous, outcome.)

  • ❌ Sex (male vs. female) — No. (Sex is binary, so logistic regression could model it if it were the response; here it is treated as a predictor rather than as the outcome being predicted.)

  • ✅ Whether a shot on goal traveled more than 20 feet: Yes. (This is a binary outcome: >20 ft = yes/no.)

  • ❌ Height — No. (Continuous — better predicted with linear regression.)

  • ✅ Scoring a soccer goal on a given shot: Yes. (Binary outcome: goal vs. no goal.)

  • ❌ Age (years) — No. (Continuous — linear regression is appropriate.)

Explanation: Logistic regression is for binary outcomes (or probabilities of categories). Choose variables that are binary as the response.


2. Which of the illustrated graphs could be a possible form/shape for a logistic regression model?

(Select the sigmoid (S-shaped), monotone curve bounded between 0 and 1.)

  • ✅ Graph that shows a monotone S-shaped (sigmoid) curve bounded between 0 and 1: Yes.

  • ❌ Graphs that are linear, U-shaped, or unbounded — No.

Explanation: The logistic function maps real-valued inputs to probabilities in (0,1) and produces an S-shaped, monotone curve. (Pick the graph that is a bounded sigmoid; the other shapes are not logistic.)
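
To make the shape concrete, here is a minimal sketch that evaluates the standard logistic function on a few arbitrarily chosen inputs and checks the two properties named above (bounded in (0, 1), monotone). The function name `logistic` and the grid `xs` are illustrative choices, not part of the quiz.

```python
import math

def logistic(x: float) -> float:
    """Standard logistic (sigmoid) function: maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative grid of linear-predictor values (an arbitrary choice).
xs = [-6, -3, -1, 0, 1, 3, 6]
ps = [logistic(x) for x in xs]

for x, p in zip(xs, ps):
    print(f"x = {x:>2}  ->  p = {p:.4f}")

# Every value lies strictly between 0 and 1 ...
is_bounded = all(0.0 < p < 1.0 for p in ps)
# ... and the values increase as x increases.
is_increasing = all(a < b for a, b in zip(ps, ps[1:]))
print(is_bounded, is_increasing)
```

With a negative slope coefficient the fitted curve is the mirror image: still S-shaped and bounded, but monotone decreasing.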


3. Of the two logit-transformed values, which corresponds to a higher original probability?

  • ❌ -2

  • ✅ 0.25

  • ❌ They are the same

  • ❌ Can’t tell

Explanation: The logit function is monotone increasing: a larger logit corresponds to a larger probability. 0.25 > −2, so 0.25 maps to the higher probability.
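
As a quick check, the two logit values can be mapped back to probabilities with the inverse logit. This is a small sketch; the helper name `inverse_logit` is an illustrative choice.

```python
import math

def inverse_logit(z: float) -> float:
    """Convert a log-odds value z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

p_low = inverse_logit(-2.0)    # about 0.119
p_high = inverse_logit(0.25)   # about 0.562

print(f"logit = -2.00 -> p = {p_low:.3f}")
print(f"logit =  0.25 -> p = {p_high:.3f}")
```

A logit of 0 corresponds to p = 0.5, so any negative logit is below 0.5 and any positive logit is above it.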


4. Interpretation of coefficient 0.0037 (single-variable logistic model with BMI predicting smoking 100+ cigarettes)

  • ❌ For each increase by one in BMI, the probability increases by about 0.0037.

  • ❌ For each increase by one in BMI, the odds increases by about 0.0037.

  • ✅ For each increase by one in BMI, the log odds of smoking 100 cigarettes increases by about 0.0037, on average.

  • ❌ For each increase of one in BMI, the odds increases multiplicatively by about 0.0037.

Explanation: In logistic regression the raw coefficient is the change in log odds per one-unit increase in the predictor.
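
The sketch below illustrates why the other options are wrong: exponentiating the coefficient gives the multiplicative change in odds (about 1.0037, not 0.0037), and the change in probability depends on where you start. The two baseline log-odds values are hypothetical, chosen only for illustration, since the single-variable model's intercept is not given here.

```python
import math

def inverse_logit(z: float) -> float:
    """Convert a log-odds value z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

beta_bmi = 0.0037  # coefficient from the question: change in log odds per BMI unit

# Odds are multiplied by exp(beta) for each one-unit increase in BMI.
odds_ratio = math.exp(beta_bmi)  # about 1.0037
print(f"odds ratio per BMI unit: {odds_ratio:.4f}")

# Hypothetical baseline log odds (NOT from the quiz output).
baselines = [-2.0, 0.0]
prob_changes = [inverse_logit(b + beta_bmi) - inverse_logit(b) for b in baselines]
for b, dp in zip(baselines, prob_changes):
    print(f"baseline log odds {b:+.1f}: probability changes by {dp:.6f}")
```

The probability change differs between baselines and is well below 0.0037 in both cases, which is why only the log-odds statement is correct.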


5. Interpretation of coefficient 0.0169 for Age in the model with BMI and Age

  • ❌ For BMI → odds change 0.0169

  • ❌ For Age → odds change 0.0169

  • ✅ For each increase of one in Age, the log odds of smoking 100 cigarettes increases by about 0.0169 while holding BMI constant, on average.

  • ❌ For each increase of one in Age, the log odds increases by 0.0169 (without conditioning)

Explanation: In a multivariable logistic model the coefficient for Age is the change in log odds per year of age controlling for BMI.
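
"Holding BMI constant" can be illustrated with a short sketch. It uses the intercept and coefficients quoted in the explanation for question 8 (intercept ≈ −1.259, BMI ≈ 0.0037, Age ≈ 0.0169); the BMI values and the function name `log_odds` are illustrative choices.

```python
# Values as quoted in the explanation for question 8 (rounded).
INTERCEPT = -1.259
B_BMI = 0.0037
B_AGE = 0.0169

def log_odds(bmi: float, age: float) -> float:
    """Linear predictor (log odds) of the two-predictor model."""
    return INTERCEPT + B_BMI * bmi + B_AGE * age

# Increase Age by one year at several fixed BMI values (arbitrary choices).
diffs = []
for bmi in (18.5, 22.0, 30.0):
    diff = log_odds(bmi, 46) - log_odds(bmi, 45)
    diffs.append(diff)
    print(f"BMI = {bmi}: change in log odds for +1 year = {diff:.4f}")
```

The difference is the Age coefficient regardless of which BMI value is held fixed, because the model has no interaction term.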


6. At two-sided 10% significance level, which coefficients are statistically significant?

  • ❌ Both coefficients are significant

  • ❌ Neither coefficient is significant

  • ❌ Only the coefficient for BMI is significant

  • ✅ Only the coefficient for Age is significant

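Explanation: The 95% CI for Age (0.014, 0.020) excludes 0, so Age is significant at the 5% level and therefore also at the less strict 10% level. The 95% CI for BMI (−0.005, 0.011) includes 0, and the narrower 90% interval implied by the same estimate and standard error (roughly −0.004 to 0.010) still includes 0, so BMI is not significant at the 10% level.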
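
The following sketch approximates the two-sided p-values from the reported numbers. It assumes the intervals are Wald intervals (estimate ± 1.96·SE) and recovers the standard errors from the rounded CI endpoints, so the results are approximate rather than the exact values in the model output. The helper name `wald_p_value` is hypothetical.

```python
from statistics import NormalDist

def wald_p_value(estimate: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided p-value, assuming a 95% Wald CI (estimate +/- 1.96*SE)."""
    z_95 = NormalDist().inv_cdf(0.975)        # about 1.96
    se = (ci_high - ci_low) / (2 * z_95)      # SE recovered from the CI width
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

ALPHA = 0.10  # two-sided 10% significance level

p_age = wald_p_value(0.0169, 0.014, 0.020)
p_bmi = wald_p_value(0.0037, -0.005, 0.011)

print(f"Age: p = {p_age:.4g}, significant at 10%: {p_age < ALPHA}")
print(f"BMI: p = {p_bmi:.4g}, significant at 10%: {p_bmi < ALPHA}")
```

Under these assumptions the Age p-value is far below 0.10 while the BMI p-value is well above it, matching the selected answer.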


7. If the 95% CI for Age is (0.014, 0.020), how would a 90% CI change?

  • ✅ It would be narrower

  • ❌ It would be wider

  • ❌ It would stay the same

  • ❌ Can’t tell

Explanation: A 90% confidence interval uses a smaller critical value and therefore is narrower than a 95% interval (less coverage → less width).
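
A small sketch of the same idea: rescale the reported 95% interval to 90% by swapping the normal critical value (about 1.96) for the smaller one (about 1.645). It assumes a symmetric normal-based interval centered at the midpoint of the reported endpoints, which are rounded; `rescale_ci` is a hypothetical helper.

```python
from statistics import NormalDist

def rescale_ci(lower, upper, old_level, new_level):
    """Rescale a symmetric normal-based CI from one confidence level to another."""
    std_normal = NormalDist()
    z_old = std_normal.inv_cdf(0.5 + old_level / 2)   # e.g. about 1.96 for 95%
    z_new = std_normal.inv_cdf(0.5 + new_level / 2)   # e.g. about 1.645 for 90%
    center = (lower + upper) / 2
    half_width = (upper - lower) / 2 * (z_new / z_old)
    return center - half_width, center + half_width

ci95_age = (0.014, 0.020)
ci90_age = rescale_ci(*ci95_age, old_level=0.95, new_level=0.90)

width95 = ci95_age[1] - ci95_age[0]
width90 = ci90_age[1] - ci90_age[0]
print(f"95% CI: {ci95_age}, width = {width95:.4f}")
print(f"90% CI: ({ci90_age[0]:.4f}, {ci90_age[1]:.4f}), width = {width90:.4f}")
```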


8. Predicted log odds for BMI = 22 and Age = 45 using the model with BMI and Age (pick the closest)

  • ✅ -0.417

  • ❌ 0.8265

  • ❌ 0.327

  • ❌ -0.7367

  • ❌ Can’t tell

Explanation: Using the model intercept and coefficients reported in the output (intercept ≈ −1.259, BMI ≈ 0.0037, Age ≈ 0.0169), the linear predictor ≈ −1.259 + 0.0037·22 + 0.0169·45 ≈ −0.417 (closest choice).
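
The arithmetic can be reproduced in a few lines. The coefficients are the rounded values quoted above; converting the result to a probability is an extra step added here for illustration and is not asked for by the question.

```python
import math

# Rounded values quoted in the explanation above.
INTERCEPT = -1.259
B_BMI = 0.0037
B_AGE = 0.0169

bmi, age = 22, 45

# Linear predictor = predicted log odds.
predicted_log_odds = INTERCEPT + B_BMI * bmi + B_AGE * age   # about -0.417

# Optional: convert log odds to a probability with the inverse logit.
predicted_prob = 1.0 / (1.0 + math.exp(-predicted_log_odds))

print(f"predicted log odds:    {predicted_log_odds:.4f}")
print(f"predicted probability: {predicted_prob:.4f}")
```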


9. Is the predicted log odds for BMI = 22 and Age = 45 trustworthy, and is it interpolation or extrapolation?

  • ❌ No, this is extrapolation

  • ❌ No, this is interpolation

  • ❌ Yes, this is extrapolation

  • ✅ Yes, this is interpolation

Explanation: The sample covers Age 20–80 and BMI 14.5–64.6, so Age=45 and BMI=22 fall well within the observed ranges — this is interpolation and thus the prediction is reasonable (subject to model assumptions).
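
The range check described above can be sketched as follows. The ranges are the ones stated in the explanation; the dictionary layout and the function name `is_interpolation` are illustrative choices.

```python
# Observed ranges as stated in the explanation above.
OBSERVED_RANGES = {"age": (20, 80), "bmi": (14.5, 64.6)}

def is_interpolation(point: dict, ranges: dict = OBSERVED_RANGES) -> bool:
    """Return True if every predictor value lies inside its observed range."""
    return all(low <= point[name] <= high for name, (low, high) in ranges.items())

inside = is_interpolation({"age": 45, "bmi": 22})    # both values within range
outside = is_interpolation({"age": 95, "bmi": 22})   # age beyond observed maximum

print(inside, outside)
```

This only checks each variable's range separately. A combination of values can sit inside both ranges and still be rare in the data, so a scatter plot of the predictors is a useful complement.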


10. Fill in the blanks. With 95% confidence, the increase in log odds of smoking 100+ cigarettes for each +1 BMI (holding Age constant) is between ____ and ____.

  • ❌ -1.2435 and 0.149

  • ❌ 0.014 and 0.020

  • ❌ -1.535 and -0.952

  • ✅ -0.005 and 0.011

  • ❌ Can’t tell

Explanation: The reported 95% CI for the BMI coefficient in the multivariable model is (−0.005, 0.011), which contains 0 and therefore indicates no statistically significant BMI effect at the 5% level.
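
The same interval can be expressed on the odds-ratio scale by exponentiating both endpoints. This sketch uses the rounded endpoints quoted above; "contains 0" on the log-odds scale corresponds to "contains 1" on the odds-ratio scale.

```python
import math

# Reported 95% CI for the BMI coefficient (log-odds scale, rounded).
ci_log_odds = (-0.005, 0.011)

# Exponentiate the endpoints to get a CI for the odds ratio per BMI unit.
ci_odds_ratio = tuple(math.exp(bound) for bound in ci_log_odds)

contains_zero = ci_log_odds[0] <= 0.0 <= ci_log_odds[1]
contains_one = ci_odds_ratio[0] <= 1.0 <= ci_odds_ratio[1]

print(f"odds-ratio CI: ({ci_odds_ratio[0]:.4f}, {ci_odds_ratio[1]:.4f})")
print(contains_zero, contains_one)
```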


🧾 Summary Table

Q# | Answer (selected) | Key point
1 | Whether >20 ft; Scoring a goal ✅ | Logistic for binary outcomes
2 | The S-shaped, monotone sigmoid graph ✅ | Logistic maps to (0,1)
3 | 0.25 ✅ | Logit is monotone ↑ → larger logit → larger p
4 | Log-odds increases by 0.0037 ✅ | Coefficient = change in log odds
5 | Age coef = log-odds change of 0.0169 (holding BMI) ✅ | Multivariable interpretation
6 | Only Age significant ✅ | Age CI excludes 0; BMI CI includes 0
7 | 90% CI is narrower ✅ | Less coverage → narrower
8 | Predicted log odds ≈ −0.417 ✅ | Using intercept and coefficients (closest choice)
9 | Yes — interpolation ✅ | Values lie inside observed ranges
10 | BMI 95% CI = (−0.005, 0.011) ✅ | Contains 0 → not significant