Graded Quiz: Data Wrangling: Data Analysis with Python (Applied Data Science Specialization) Answers 2025
1. Replace missing value for continuous attribute
- ❌ Use an educated guess
- ❌ Use the mean square error
- ❌ Use the difference between min and max
- ✅ Use the average of the other values in the column
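Explanation: For a continuous numeric attribute, a common approach is to replace missing values with the mean of the observed values in that column. The mean square error is a model-evaluation metric, not a value to impute.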
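A minimal sketch of mean imputation in pandas follows. The `horsepower` column and its values are made up for illustration; they are not taken from the quiz.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one continuous column with a missing entry.
df = pd.DataFrame({"horsepower": [100.0, 120.0, np.nan, 140.0]})

# mean() skips NaN by default, so this is the average of the observed values.
mean_hp = df["horsepower"].mean()

# Fill the gap and assign the result back to the column.
df["horsepower"] = df["horsepower"].fillna(mean_hp)
```

Here the missing entry becomes 120.0, the average of the three observed values.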
2. First step before deciding bin values
- ❌ Divide average by standard deviation
- ✅ Visualize the distribution with a histogram
- ❌ Convert object types
- ❌ Use IQR
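Explanation: Look at the distribution before choosing bin boundaries. A histogram shows where the values cluster and how wide the range is, which tells you whether a given set of bins is sensible.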
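The idea can be sketched without a plotting library by computing the histogram counts with NumPy and then binning with `pd.cut()`. The price values, the choice of three bins, and the labels are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical values for a continuous column.
price = pd.Series([5000, 7000, 8000, 12000, 15000, 18000, 30000, 45000], name="price")

# Count how many values fall into each of three equal-width intervals.
# (With matplotlib installed, price.plot(kind="hist", bins=3) draws the same counts.)
counts, edges = np.histogram(price, bins=3)

# Reuse the same edges as bin boundaries; include_lowest keeps the minimum value.
labels = ["low", "medium", "high"]
price_binned = pd.cut(price, bins=edges, labels=labels, include_lowest=True)
```

In this made-up sample most values land in the first interval, which is the kind of skew a histogram makes visible before you commit to equal-width bins.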
3. Best data type for inconsistent city names (“N.Y.”, “Ny”, “New York”)
- ✅ object
- ❌ float
- ❌ DataFrame
- ❌ int
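Explanation: In pandas, text is traditionally stored with the object dtype. float and int are numeric types, and DataFrame is a container rather than a column dtype.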
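As a small sketch, the spellings from the question can be held in an object column and then mapped to one canonical name. The extra "Boston" entry and the mapping dictionary are assumptions added for illustration, and the dtype is set explicitly because the default dtype for strings can differ between pandas versions.

```python
import pandas as pd

# Hypothetical column with several spellings of the same city, stored as object.
cities = pd.Series(["N.Y.", "Ny", "New York", "Boston"], dtype=object)

# Map the known variants to one spelling; values not in the mapping are left alone.
cities_clean = cities.replace({"N.Y.": "New York", "Ny": "New York"})
```

After the replacement the column contains only two distinct values, "New York" and "Boston".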
4. Primary purpose of normalization
- ❌ Make features identical
- ❌ Remove outliers
- ✅ Ensure features have similar ranges for fair comparison
- ❌ Remove missing values
Explanation: Normalization rescales features so no feature dominates because of its scale.
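One way to do this is min-max scaling, sketched below on two made-up columns with very different ranges. Min-max is only one option; z-score standardization is another common choice.

```python
import pandas as pd

# Hypothetical features on very different scales.
df = pd.DataFrame({
    "length": [150.0, 175.0, 200.0],
    "price": [10000.0, 20000.0, 40000.0],
})

# Min-max scaling: (x - min) / (max - min) maps each column onto [0, 1].
df_scaled = (df - df.min()) / (df.max() - df.min())
```

After scaling, both columns run from 0 to 1, so `price` no longer outweighs `length` purely because of its units.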
5. Convert categorical values for ML
- ❌ Converts numerical to categorical
- ✅ Turns categorical values into numerical values
- ❌ Divide values into bins
- ❌ Change data type
Explanation: Encoding converts categories (e.g., “red”, “blue”) into numeric form for ML algorithms.
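A minimal sketch of one encoding scheme, integer category codes, is shown below. The color values are illustrative, and the quiz does not say which encoding scheme it has in mind.

```python
import pandas as pd

# Hypothetical categorical column.
colors = pd.Series(["red", "blue", "red", "green"])

# Convert to the category dtype, then read the integer code of each value.
# Categories are sorted alphabetically: blue=0, green=1, red=2.
codes = colors.astype("category").cat.codes
```

Integer codes imply an ordering that may not exist in the data, which is one reason one-hot encoding with get_dummies() is often preferred for unordered categories.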
6. First step in data preparation
- ✅ Cleaning missing or inconsistent values
- ❌ Normalizing values
- ❌ Running models
- ❌ Encoding categorical variables
Explanation: Cleaning comes first because missing or inconsistent values would distort later steps such as normalization, encoding, and modeling.
7. Prepare “fuel type” column (“gas”, “diesel”)
- ❌ cut()
- ✅ get_dummies()
- ❌ dropna()
- ❌ astype()
Explanation: get_dummies() performs one-hot encoding for categorical variables.
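A short sketch of this on a made-up fuel type column follows. The column name `fuel-type` and the three rows are assumptions; `dtype=int` is passed so the indicator columns hold 0/1 instead of booleans.

```python
import pandas as pd

# Hypothetical column with the two categories from the question.
df = pd.DataFrame({"fuel-type": ["gas", "diesel", "gas"]})

# One indicator column per category, in alphabetical order: diesel, gas.
dummies = pd.get_dummies(df["fuel-type"], dtype=int)

# Attach the indicator columns next to the original data.
df = pd.concat([df, dummies], axis=1)
```

Each row has a 1 in exactly one of the new columns, marking which fuel type it uses.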
8. Convert “N/A” to NaN
- ✅ replace()
- ❌ astype()
- ❌ dropna()
- ❌ fillna()
Explanation: replace("N/A", np.nan) converts placeholder strings to actual missing values.
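The call can be sketched on a made-up column as follows. The `price` column and its values are assumptions; the final cast to float is an extra step that only works once the placeholder string is gone.

```python
import numpy as np
import pandas as pd

# Hypothetical column read as text because of the "N/A" placeholder.
df = pd.DataFrame({"price": ["13495", "N/A", "16500"]})

# Turn the placeholder into a real missing value.
df["price"] = df["price"].replace("N/A", np.nan)

# The entry is now recognized by isna(), dropna(), and fillna().
n_missing = df["price"].isna().sum()

# With the placeholder removed, the column can be converted to a numeric type.
df["price"] = df["price"].astype(float)
```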
🧾 Summary Table
| Q | Correct Answer | Key Concept |
|---|---|---|
| 1 | Average (mean) | Imputing continuous missing values |
| 2 | Visualize histogram | Bin selection |
| 3 | object | Text data type |
| 4 | Similar feature ranges | Feature scaling |
| 5 | Convert categorical → numerical | Encoding |
| 6 | Clean data first | Data preparation order |
| 7 | get_dummies() | One-hot encoding |
| 8 | replace() | Fix inconsistent missing values |