Graded Quiz: Fundamentals of Language Understanding: Gen AI Foundational Models for NLP & Language Understanding (IBM AI Engineering Professional Certificate) Answers 2025
1. How to create a fixed-length input from “I enjoy reading”?
❌ Replace all words with POS tags
❌ Merge words using punctuation rules
✅ Add one-hot vectors for ‘I’, ‘enjoy’, and ‘reading’
❌ Add token positions with vocabulary count
Explanation:
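Each one-hot vector has the same length as the vocabulary. Adding the vectors for ‘I’, ‘enjoy’, and ‘reading’ element-wise gives a single vector of that same length, no matter how many words the sentence contains, which makes it suitable for feeding into a neural network that expects a fixed-size input.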
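As a minimal sketch of the idea above, the following adds one-hot vectors into a single fixed-length vector. The five-word vocabulary and the helper names `one_hot` and `bag_of_words` are assumptions made for illustration; they are not taken from the course material.

```python
# Hypothetical toy vocabulary; a real one would be built from the training corpus.
vocab = ["I", "enjoy", "reading", "books", "music"]
index = {word: i for i, word in enumerate(vocab)}


def one_hot(word):
    """Return a vector of len(vocab) with a 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[index[word]] = 1
    return vec


def bag_of_words(tokens):
    """Add the one-hot vectors of all tokens element-wise."""
    total = [0] * len(vocab)
    for token in tokens:
        for i, value in enumerate(one_hot(token)):
            total[i] += value
    return total


sentence_vec = bag_of_words(["I", "enjoy", "reading"])
print(sentence_vec)  # [1, 1, 1, 0, 0]
```

The output length depends only on the vocabulary size, so a one-word sentence and a ten-word sentence produce vectors of the same shape.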
2. What does argmax do in document classification?
❌ Converts logits into probabilities
✅ Selects the index of the output neuron with the highest logit
❌ Tokenizes raw text
❌ Determines number of layers
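Explanation:
argmax returns the index of the output neuron with the largest logit, and that index is the predicted class. It does not convert logits into probabilities; it only picks the highest score for the final label.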
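A small sketch of that selection step, written in plain Python. The label names and logit values are made up for illustration.

```python
def argmax(values):
    """Return the index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical class labels and output logits for one document.
labels = ["World", "Sports", "Business", "Sci/Tech"]
logits = [0.3, 2.1, -1.0, 0.7]

predicted = argmax(logits)
print(predicted, labels[predicted])  # 1 Sports
```

Because softmax preserves the ordering of the logits, taking argmax over raw logits gives the same class as taking it over the softmax probabilities.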
3. What improves training efficiency by adjusting learning rate over time?
❌ Change loss function every epoch
✅ Apply a learning rate scheduler each epoch
❌ Increase batch size
❌ Use different optimizers per batch
Explanation:
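A learning rate scheduler adjusts the learning rate as training progresses, typically lowering it after each epoch, so that early updates are large and later updates are small enough to keep training stable and efficient.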
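One common schedule is step decay, where the learning rate is multiplied by a fixed factor every few epochs. The sketch below is a hand-written version of that rule; the function name `step_decay` and the values chosen for `step_size` and `gamma` are illustrative assumptions, not settings from the course.

```python
def step_decay(initial_lr, epoch, step_size=2, gamma=0.5):
    """Multiply the learning rate by `gamma` once every `step_size` epochs."""
    return initial_lr * gamma ** (epoch // step_size)


# Learning rate used in each of six epochs (assumed starting value 0.1).
schedule = [step_decay(0.1, epoch) for epoch in range(6)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.4f}")
```

In a framework such as PyTorch the same behaviour is usually obtained by calling the scheduler's `step()` once per epoch rather than computing the rate by hand.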
4. Issue when training data is not shuffled?
❌ Overfit due to early stopping
❌ Skip validation
❌ Batch size increases
✅ Gradient descent may converge to a suboptimal local minimum
Explanation:
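Without shuffling, the model sees the samples in the same fixed order every epoch (for example, grouped by class), so consecutive gradient updates are correlated. This biases optimization, can steer gradient descent toward a suboptimal local minimum, and harms generalization.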
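The following sketch shows one way to reshuffle the sample order at the start of every epoch before cutting it into batches. The dataset, the helper name `epoch_batches`, and the seed are assumptions for illustration only.

```python
import random


def epoch_batches(samples, batch_size, rng):
    """Yield mini-batches for one epoch, visiting samples in a fresh random order."""
    order = list(range(len(samples)))
    rng.shuffle(order)  # new permutation on every call, i.e. every epoch
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


# Hypothetical dataset stored sorted by label: all 0s first, then all 1s.
data = [(f"doc{i}", 0) for i in range(4)] + [(f"doc{i}", 1) for i in range(4, 8)]

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
for epoch in range(2):
    batches = list(epoch_batches(data, batch_size=4, rng=rng))
    print(f"epoch {epoch}:", [[label for _, label in batch] for batch in batches])
```

Every sample is still visited exactly once per epoch; only the order changes, so batches tend to mix labels instead of containing a single class.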
5. Proper way to compute a single context vector for N-gram model?
❌ Replace vocabulary with embeddings
✅ Sum the one-hot vectors of the context words into one
❌ Use embedding vectors of context words
❌ Multiply by attention weights
Explanation:
Summing one-hot vectors merges multiple context tokens into one fixed vector representation in classic N-gram neural models.
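To make the N-gram case concrete, the sketch below slides a context window over a token sequence and, for each position, sums the one-hot vectors of the context words into a single vector paired with the target word. The vocabulary, the context size of 2, and the helper names are assumptions for illustration.

```python
# Hypothetical toy vocabulary and sentence.
vocab = ["I", "enjoy", "reading", "good", "books"]
index = {word: i for i, word in enumerate(vocab)}


def context_vector(context_words):
    """Sum the one-hot vectors of the context words into one vector."""
    vec = [0] * len(vocab)
    for word in context_words:
        vec[index[word]] += 1  # same as adding that word's one-hot vector
    return vec


def ngram_examples(tokens, context_size=2):
    """Build (context vector, target word) pairs from a token sequence."""
    examples = []
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]
        examples.append((context_vector(context), tokens[i]))
    return examples


tokens = ["I", "enjoy", "reading", "good", "books"]
examples = ngram_examples(tokens)
for vec, target in examples:
    print(vec, "->", target)
# [1, 1, 0, 0, 0] -> reading
# [0, 1, 1, 0, 0] -> good
# [0, 0, 1, 1, 0] -> books
```

Note that summing discards the order of the context words: `["I", "enjoy"]` and `["enjoy", "I"]` map to the same vector.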
6. Best KPI for choosing the best model on unseen text?
❌ Prediction
❌ Context
✅ Accuracy
❌ Loss
Explanation:
Accuracy is the fraction of predictions on unseen data that are correct, so it directly measures how well each candidate model performs and is a primary metric for model selection.
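A minimal sketch of using accuracy to pick between models on a held-out set. The model names, predictions, and target labels below are invented for illustration.

```python
def accuracy(predictions, targets):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Hypothetical true labels of a held-out set and two models' predictions.
targets = [0, 1, 1, 0, 2]
candidates = {
    "model_a": [0, 1, 0, 0, 2],
    "model_b": [0, 1, 1, 0, 2],
}

scores = {name: accuracy(preds, targets) for name, preds in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)  # {'model_a': 0.8, 'model_b': 1.0} model_b
```

The comparison is only meaningful when every candidate is scored on the same held-out examples, none of which were used for training.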
🧾 Summary Table
| Q# | Correct Answer | Key Concept |
|---|---|---|
| 1 | One-hot vectors | Fixed-length inputs |
| 2 | Argmax | Class selection |
| 3 | LR scheduler | Efficient training |
| 4 | Suboptimal convergence | Importance of shuffling |
| 5 | Sum one-hot vectors | Context vector computation |
| 6 | Accuracy | Model evaluation KPI |