
Name That Model: Fitting Statistical Models to Data with Python (Statistics with Python Specialization) Answers 2025

1

You are predicting a binary outcome (win = 1 / lose = 0) with one observation per team and no hierarchical or repeated-measures structure.

  • ❌ Linear regression model

  • ✅ Logistic regression model

  • ❌ Multilevel linear regression model with random team effects

  • ❌ Multilevel logistic regression model with random team effects

  • ❌ Marginal linear model, fitted using GEE

  • ❌ Marginal logistic model, fitted using GEE

Explanation:
The outcome is binary and the teams are independent observations (one row per team, no clustering or repeated measures), so a standard logistic regression is appropriate.
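As a rough illustration of this choice, the sketch below fits an ordinary logistic regression with statsmodels on simulated data. The data frame `teams`, the predictor `rating`, and the coefficient values are made up for the example; only the model type comes from the question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data: one row per team (hypothetical columns, not from the quiz).
rng = np.random.default_rng(0)
n_teams = 500
rating = rng.normal(size=n_teams)
p_win = 1 / (1 + np.exp(-(0.2 + 0.8 * rating)))
teams = pd.DataFrame({"rating": rating, "win": rng.binomial(1, p_win)})

# Ordinary logistic regression: every row is treated as independent.
logit_fit = smf.logit("win ~ rating", data=teams).fit(disp=False)
print(logit_fit.params)
```

Because each team contributes a single row, no grouping variable is passed to the model.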


2

The dependent variable is binary (ever had MDD), the data come from area cluster sampling, and you want to estimate the between-cluster variance and explain it with cluster-level covariates.

  • ❌ Linear regression model

  • ❌ Logistic regression model

  • ❌ Multilevel linear regression model with random cluster effects

  • ✅ Multilevel logistic regression model with random cluster effects

  • ❌ Marginal linear model, fitted using GEE

  • ❌ Marginal logistic model, fitted using GEE

Explanation:
The outcome is binary and respondents are clustered by sampling area. Because the goal is to estimate the between-cluster variance and explain it with cluster-level covariates, a multilevel (mixed-effects) logistic model with random cluster effects is appropriate.
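One possible way to sketch such a model in Python is statsmodels' `BinomialBayesMixedGLM`, which fits a random-intercept logistic model with a variational Bayes approximation. This is a swapped-in Bayesian approximation rather than the method the quiz prescribes, and the data frame `survey`, the covariate `urban`, and all simulated values are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

# Simulated survey: 40 sampled areas (clusters), 25 respondents each.
rng = np.random.default_rng(1)
n_clusters, n_per = 40, 25
cluster = np.repeat(np.arange(n_clusters), n_per)
urban = rng.integers(0, 2, size=n_clusters)   # hypothetical cluster-level covariate
u = rng.normal(0.0, 0.8, size=n_clusters)     # true random cluster intercepts
eta = -1.0 + 0.5 * urban[cluster] + u[cluster]
survey = pd.DataFrame({
    "cluster": cluster,
    "urban": urban[cluster],
    "mdd": rng.binomial(1, 1 / (1 + np.exp(-eta))),
})

# Fixed effect for the cluster-level covariate, random intercept per cluster.
model = BinomialBayesMixedGLM.from_formula(
    "mdd ~ urban", {"cluster": "0 + C(cluster)"}, survey
)
mlogit_fit = model.fit_vb()  # variational Bayes approximation

# vcp_mean is the posterior mean of the log SD of the random intercepts,
# so exponentiating gives a rough point estimate of the between-cluster SD.
cluster_sd = float(np.exp(mlogit_fit.vcp_mean[0]))
print(mlogit_fit.summary())
print("approx. between-cluster SD:", cluster_sd)
```

The between-cluster spread is reported on the log standard deviation scale, so it needs to be exponentiated (and squared, if a variance is wanted) before interpretation.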


3

The outcome is continuous (birth weight), the data come from a simple random sample (no clustering), and the predictors are measured at the individual level.

  • ✅ Linear regression model

  • ❌ Logistic regression model

  • ❌ Multilevel linear regression model with random hospital effects

  • ❌ Multilevel logistic regression model with random hospital effects

  • ❌ Marginal linear model, fitted using GEE

  • ❌ Marginal logistic model, fitted using GEE

Explanation:
The outcome is continuous and the data come from a simple random sample with no clustering, so a standard linear regression (fitted by OLS) is appropriate for predicting birth weight.
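A minimal sketch of this case, assuming simulated data and a single hypothetical individual-level predictor `gest_c` (gestational age in weeks, centered), fits the model with statsmodels' OLS formula interface:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated simple random sample of births (values are illustrative only).
rng = np.random.default_rng(2)
n = 300
gest_c = rng.normal(0.0, 1.5, size=n)   # gestational age, centered (weeks)
weight_kg = 3.3 + 0.15 * gest_c + rng.normal(0.0, 0.4, size=n)
births = pd.DataFrame({"gest_c": gest_c, "weight_kg": weight_kg})

# Ordinary least squares: independent observations, continuous outcome.
ols_fit = smf.ols("weight_kg ~ gest_c", data=births).fit()
print(ols_fit.params)
```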


4

The data now come from many hospitals, and you want to estimate the variance in expected birth weight between hospitals and explain it with hospital-level covariates.

  • ❌ Linear regression model

  • ❌ Logistic regression model

  • ✅ Multilevel linear regression model with random hospital effects

  • ❌ Multilevel logistic regression model with random hospital effects

  • ❌ Marginal linear model, fitted using GEE

  • ❌ Marginal logistic model, fitted using GEE

Explanation:
The outcome is continuous and the goal is to estimate the between-hospital variance (a variance component) and explain it with hospital-level covariates, so a multilevel linear model with random hospital effects is appropriate.
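The following sketch shows how a random-intercept linear model might be fitted with statsmodels' `mixedlm`. The data are simulated, and the names `births`, `gest_c`, and the hospital-level covariate `teaching` are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data: 30 hospitals, 40 births each (values are illustrative only).
rng = np.random.default_rng(3)
n_hosp, n_per = 30, 40
hospital = np.repeat(np.arange(n_hosp), n_per)
teaching = rng.integers(0, 2, size=n_hosp)          # hospital-level covariate
hosp_effect = rng.normal(0.0, 0.15, size=n_hosp)    # true random hospital intercepts
gest_c = rng.normal(0.0, 1.5, size=n_hosp * n_per)  # gestational age, centered (weeks)
weight_kg = (3.3 + 0.15 * gest_c + 0.10 * teaching[hospital]
             + hosp_effect[hospital]
             + rng.normal(0.0, 0.4, size=n_hosp * n_per))
births = pd.DataFrame({
    "hospital": hospital,
    "teaching": teaching[hospital],
    "gest_c": gest_c,
    "weight_kg": weight_kg,
})

# Random intercept for each hospital; fixed effects at both levels.
mixed_fit = smf.mixedlm(
    "weight_kg ~ gest_c + teaching", data=births, groups=births["hospital"]
).fit()

# cov_re holds the estimated variance of the random hospital intercepts.
between_var = float(mixed_fit.cov_re.iloc[0, 0])
print(mixed_fit.summary())
print("between-hospital variance:", between_var)
```

Comparing `between_var` from fits with and without the hospital-level covariate is one way to see how much of the between-hospital variance that covariate explains.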


5

The outcome is a binary forced choice with multiple respondents per neighborhood. You are not interested in the cluster variance, but you want the population-averaged effect and need to account for within-neighborhood correlation.

  • ❌ Linear regression model

  • ❌ Logistic regression model

  • ❌ Multilevel linear regression model with random neighborhood effects

  • ❌ Multilevel logistic regression model with random neighborhood effects

  • ❌ Marginal linear model, fitted using GEE

  • ✅ Marginal logistic model, fitted using GEE

Explanation:
The goal is the population-averaged (marginal) relationship. The within-neighborhood correlation only needs to be accounted for, not estimated as a variance component, so a marginal logistic model fitted using GEE is appropriate.


🧾 Summary Table

| Q# | Selected model | Why |
| --- | --- | --- |
| 1 | Logistic regression | Binary outcome, independent teams |
| 2 | Multilevel logistic (random cluster) | Binary outcome + clusters + want between-cluster variance |
| 3 | Linear regression | Continuous outcome + simple random sample |
| 4 | Multilevel linear (random hospital) | Continuous outcome + multiple hospitals + estimate between-hospital variance |
| 5 | Marginal logistic (GEE) | Binary outcome, clustered data, want population-averaged effect (not variance components) |
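The pattern in the summary can be written as a small decision rule. The function below is a hypothetical helper, not part of the course material, and it covers only the six answer options used in this quiz: the outcome type picks linear versus logistic, and the clustering and inferential goal pick standard, multilevel, or marginal.

```python
def name_that_model(outcome: str, clustered: bool,
                    want_cluster_variance: bool = False) -> str:
    """Return the quiz's model name for a given design (illustrative rule only)."""
    if outcome not in ("binary", "continuous"):
        raise ValueError("outcome must be 'binary' or 'continuous'")
    kind = "logistic" if outcome == "binary" else "linear"
    if not clustered:
        # Independent observations: a standard regression model is enough.
        return f"{kind.capitalize()} regression model"
    if want_cluster_variance:
        # Between-cluster variance is of interest: use random cluster effects.
        return f"Multilevel {kind} regression model with random cluster effects"
    # Clustered data, but only population-averaged effects are of interest.
    return f"Marginal {kind} model, fitted using GEE"


# The five quiz scenarios, in order.
scenarios = [
    ("binary", False, False),
    ("binary", True, True),
    ("continuous", False, False),
    ("continuous", True, True),
    ("binary", True, False),
]
for number, args in enumerate(scenarios, start=1):
    print(number, name_that_model(*args))
```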