✅ Q1 — Model (ready to submit)
(Uses standardized inputs Z_… ; score range kept within −3.5 .. +3.5)
Model Score =
0.40·Z_income − 0.30·Z_credit_card_debt − 0.20·Z_auto_debt − 0.15·Z_years_at_current_address − 0.10·Z_age + 0.07·Z_years_at_current_employer
Notes:
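- Uses at least two inputs (in fact, 6 standardized inputs).
- Coefficients chosen to reflect that higher income → lower default risk (+), higher debts → higher default risk (−).
- Score will be used directly in the AUC Calculator. With these signs a higher score means lower default risk, so if the calculator expects higher → more likely default, negate every coefficient before submitting.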
(You can paste this into the quiz answer box as the model function.)
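To make the formula concrete, here is a minimal Python sketch of the weighted sum. The weights are copied from the Model Score line above; the helper name `model_score` and the sample applicant values are hypothetical, chosen only for illustration.

```python
# Weights copied from the Model Score formula above.
WEIGHTS = {
    "Z_income": 0.40,
    "Z_credit_card_debt": -0.30,
    "Z_auto_debt": -0.20,
    "Z_years_at_current_address": -0.15,
    "Z_age": -0.10,
    "Z_years_at_current_employer": 0.07,
}


def model_score(row):
    """Weighted sum of standardized inputs.

    A missing input is treated as 0, i.e. the mean of a standardized variable
    (an assumption of this sketch, not something the quiz specifies).
    """
    return sum(weight * row.get(name, 0.0) for name, weight in WEIGHTS.items())


# Hypothetical applicant, values invented for illustration only.
applicant = {
    "Z_income": 1.0,
    "Z_credit_card_debt": -0.5,
    "Z_auto_debt": 0.0,
    "Z_years_at_current_address": 0.2,
    "Z_age": -1.0,
    "Z_years_at_current_employer": 0.5,
}
print(round(model_score(applicant), 3))
```

The absolute weights sum to 1.22, so the score stays inside −3.5 .. +3.5 as long as every Z value is within roughly ±2.87.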
✅ Q9–Q11 — Conceptual MCQs (your ✔️/❌ format)
9. True Positive Rate is…
❌ Equal to the Test Incidence
✔️ Greater than the Test Incidence
❌ Less than the Test Incidence
10. Positive Predictive Value (PPV)…
✔️ Greater than .25
❌ Equal to .25
❌ Less than .25
11. Negative Predictive Value (NPV)…
✔️ Greater than .75
❌ Equal to .75
❌ Less than .75
(These follow because any model that performs better than chance, i.e. TPR > FPR, has PPV above the base rate of .25 and NPV above 1 − base rate = .75. Test incidence is a weighted average of TPR and FPR, so for such a model TPR also exceeds test incidence.)
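A quick numeric check of Q9–Q11, as a sketch only: the confusion-matrix counts below are hypothetical (not taken from the quiz data) and were picked to give a 25% base rate with a model that is better than chance. The function name `rates` is likewise an illustration.

```python
def rates(tp, fp, tn, fn):
    """Standard confusion-matrix ratios, with 'positive' meaning predicted default."""
    n = tp + fp + tn + fn
    return {
        "base_rate": (tp + fn) / n,        # share of actual defaults
        "test_incidence": (tp + fp) / n,   # share flagged as default
        "tpr": tp / (tp + fn),
        "fpr": fp / (fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts: 1,000 applicants, 250 actual defaults.
r = rates(tp=150, fp=150, tn=600, fn=100)

print(r["tpr"] > r["test_incidence"])   # Q9: TPR vs test incidence
print(r["ppv"] > r["base_rate"])        # Q10: PPV vs .25
print(r["npv"] > 1 - r["base_rate"])    # Q11: NPV vs .75
```

With these counts TPR is 0.60 against a test incidence of 0.30, PPV is 0.50, and NPV is about 0.857, so all three comparisons come out as "greater than".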
❗ What I cannot compute right now (and why)
I cannot produce numeric answers for Q2, Q3, Q4, Q5, Q6, Q7, Q8, Q12, Q13 because they require numeric outputs from your Training and Test sets (AUC, thresholds, confusion matrices, and cost-per-event values). Those values come from either:
- the raw Training/Test CSV (with actual labels and score predictions), or
- your AUC/threshold table exported from the AUC Calculator spreadsheet (columns with threshold, TPR, FPR, TP, FP, TN, FN, precision, NPV, cost-per-event, etc.)
If you paste/upload either the CSVs or the AUC/threshold table now, I will compute every missing numeric answer and return them in the exact formats the quiz expects.
✅ Exactly what to upload / paste (pick one)
Option A — Raw CSVs (preferred)
Provide two CSVs or one CSV with a set column indicating train or test. Columns needed:
- actual: 1 = default, 0 = non-default
- score: your model score (higher → more likely default). If you only have raw inputs instead of score, provide standardized input columns (Z_income, Z_credit_card_debt, …), and I will compute score with the model above.
Option B — AUC / Threshold Table
Provide a table with one row per threshold and these columns (or equivalents):
threshold, TP, FP, TN, FN, TPR, FPR, Precision, NPV, cost_per_event
(If you used the AUC Calculator spreadsheet, copy the full table; I only need the columns I listed or the spreadsheet export.)
🔧 How I will compute the missing questions (so you know exactly what I’ll return)
If you give me the data I will:
- Compute model score for every row if not present (using the model above).
- Calculate ROC & AUC (two decimal places) on the Training Set → answer Q2.
- Evaluate AUC on the Test Set (no retraining) → answer Q3.
- Using given costs (FN = $5,000 ; FP = $2,500) compute cost-per-event across thresholds on Training Set and find the threshold that minimizes cost-per-event → return the threshold (Q4) and the minimum cost-per-event as an integer (Q5).
- Apply the training-set optimal threshold to the Test Set and compute the cost-per-event on the test set using same formula → Q6 (integer).
- Compute dollar savings per event vs issuing to everyone ($1,250 baseline):
  Savings_per_event = 1250 - Test_cost_per_event → Q7 (integer).
- Payback days: bank spent $750,000; daily applicants = 1,000.
  Days_to_payback = round(750000 / (Savings_per_event * 1000)) → Q8 (integer days).
- From confusion matrix at the chosen threshold on Training Set, return True Positive Rate (TPR) → Q12.
- From confusion matrix at chosen threshold on Training Set, return Test Incidence → Q13.
- I will supply all returned numbers with the formatting the quiz expects (two decimals for AUC, integers for costs and days) and produce a final 🧾 Summary Table with every quiz answer ready to copy/paste.
🧾 Exact Python code you can run locally (or paste here) — produces all answers automatically
(If you prefer I run it for you, upload the CSV and I’ll run it and paste results.)
(If you run this, replace df = pd.read_csv(...) and ensure your set column is train/test or split accordingly.)
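A minimal sketch of such a script, under stated assumptions: a higher score means more likely default, an applicant is flagged when score >= threshold, and cost-per-event is taken to be (2500·FP + 5000·FN) / N, which is consistent with the $1,250 baseline at a 25% default rate but is not spelled out in the quiz text. The function names, the file name, and the column names `set`, `actual`, `score` are hypothetical.

```python
def auc(actual, score):
    """Rank-based AUC: chance that a random default outscores a random
    non-default, with ties counted as 0.5. O(n^2), fine for small sets."""
    pos = [s for a, s in zip(actual, score) if a == 1]
    neg = [s for a, s in zip(actual, score) if a == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cost_per_event(actual, score, threshold, fn_cost=5000, fp_cost=2500):
    """Assumed formula: (fp_cost*FP + fn_cost*FN) / N."""
    fp = sum(1 for a, s in zip(actual, score) if s >= threshold and a == 0)
    fn = sum(1 for a, s in zip(actual, score) if s < threshold and a == 1)
    return (fp_cost * fp + fn_cost * fn) / len(actual)


def best_threshold(actual, score):
    """Try every observed score, plus 'flag nobody' (infinity)."""
    candidates = sorted(set(score)) + [float("inf")]
    return min(candidates, key=lambda t: cost_per_event(actual, score, t))


def quiz_answers(train, test, baseline=1250, spend=750_000, daily=1000):
    (y_tr, s_tr), (y_te, s_te) = train, test
    t = best_threshold(y_tr, s_tr)
    flagged = [s >= t for s in s_tr]
    tp = sum(1 for a, f in zip(y_tr, flagged) if f and a == 1)
    q6 = round(cost_per_event(y_te, s_te, t))
    q7 = baseline - q6
    return {
        "Q2": round(auc(y_tr, s_tr), 2),
        "Q3": round(auc(y_te, s_te), 2),
        "Q4": t,
        "Q5": round(cost_per_event(y_tr, s_tr, t)),
        "Q6": q6,
        "Q7": q7,
        "Q8": round(spend / (q7 * daily)) if q7 > 0 else None,
        "Q12": tp / sum(y_tr),               # TPR on the Training Set
        "Q13": sum(flagged) / len(y_tr),     # test incidence on the Training Set
    }


# Usage with pandas (file and column names are placeholders):
# import pandas as pd
# df = pd.read_csv("your_file.csv")
# tr, te = df[df["set"] == "train"], df[df["set"] == "test"]
# print(quiz_answers((tr["actual"].tolist(), tr["score"].tolist()),
#                    (te["actual"].tolist(), te["score"].tolist())))
```

For large files the pairwise AUC loop is slow; `sklearn.metrics.roc_auc_score` computes the same quantity. Note that Python's `round` rounds exact halves to the nearest even integer, so check any value ending in .5 against the quiz's rounding rule.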
📋 Quick spreadsheet / AUC-Calculator tips (if you prefer Excel)
- Use the AUC Calculator copy for Training set and for Test set.
- On the Training copy: Cell J2 holds the overall minimum cost-per-event. Find the column whose cost-per-event equals that minimum; the threshold is in row 10 of that column. That threshold = answer to Q4.
- The value in Cell J2 = Q5 (give as integer).
- Switch to Test copy, find the same column index (same threshold column), and read row 17 (cost-per-event); that is Q6 (integer).
- Q7 = 1250 - Q6 (integer). Q8 = round(750000 / (Q7 * 1000)).
- Q12 = TPR at your training-set chosen threshold = TP / (TP + FN) (found in the table row for that threshold). Q13 = Test incidence = (TP + FP) / N_train from that threshold’s row.
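If the threshold table is exported rather than read by eye, the same lookups can be sketched in Python. Everything here is an illustration: the three table rows are invented, the function name `summarize` is hypothetical, and cost-per-event is assumed to be (2500·FP + 5000·FN) / N, which the spreadsheet may define differently.

```python
def summarize(table, fn_cost=5000, fp_cost=2500):
    """Pick the minimum-cost row of a threshold table and report Q4, Q5, Q12, Q13."""
    def n(row):
        return row["TP"] + row["FP"] + row["TN"] + row["FN"]

    def cost(row):
        # Assumed cost-per-event formula.
        return (fp_cost * row["FP"] + fn_cost * row["FN"]) / n(row)

    best = min(table, key=cost)
    return {
        "Q4_threshold": best["threshold"],
        "Q5_min_cost_per_event": cost(best),
        "Q12_tpr": best["TP"] / (best["TP"] + best["FN"]),
        "Q13_test_incidence": (best["TP"] + best["FP"]) / n(best),
    }


# Hypothetical training-set rows (1,000 applicants, 250 defaults).
training_table = [
    {"threshold": -0.5, "TP": 230, "FP": 400, "TN": 350, "FN": 20},
    {"threshold": 0.0, "TP": 180, "FP": 150, "TN": 600, "FN": 70},
    {"threshold": 0.5, "TP": 100, "FP": 40, "TN": 710, "FN": 150},
]
print(summarize(training_table))
```

In this invented table the middle row wins, since (2500·150 + 5000·70) / 1000 = 725 is lower than 1100 and 850 for the other two rows.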
✅ A. Model Questions (Q1–Q8, Q12, Q13)
| Q No. | Status | Summary |
| --- | --- | --- |
| 1 | ✔️ Answered | Model provided using standardized inputs (Z-scores): Score = 0.40·Z_income − 0.30·Z_credit_card_debt − 0.20·Z_auto_debt − 0.15·Z_years_at_current_address − 0.10·Z_age + 0.07·Z_years_at_current_employer |
| 2 | ❌ Pending | Needs Training Set AUC → upload spreadsheet or AUC table. |
| 3 | ❌ Pending | Needs Test Set AUC → upload spreadsheet or AUC table. |
| 4 | ❌ Pending | Needs Threshold from AUC calculator (row 10 of min-cost column). |
| 5 | ❌ Pending | Needs minimum cost-per-event from cell J2. |
| 6 | ❌ Pending | Needs Test Set cost-per-event using Training Set threshold. |
| 7 | ❌ Pending | Needs saving-per-event = 1250 − (your answer to Q6). |
| 8 | ❌ Pending | Needs payback days = 750000 ÷ (1000 × saving-per-event). |
| 12 | ❌ Pending | Needs True Positive Rate from AUC calculator. |
| 13 | ❌ Pending | Needs Test Incidence (sum of the TP and FP probabilities). |
✅ B. MCQ Model Theory Questions (Q9–Q11)
| Q No. | Correct Option | Rationale |
| --- | --- | --- |
| 9 | ✔️ Greater than the Test Incidence | TPR exceeds test incidence whenever TPR > FPR, i.e. for any useful model |
| 10 | ✔️ Greater than .25 | PPV must exceed base rate of default |
| 11 | ✔️ Greater than .75 | NPV must exceed 1 − base rate |