1. How do you create a fixed-length input from “I enjoy reading”?

❌ Replace all words with POS tags
❌ Merge words using punctuation rules
✅ Add one-hot vectors for ‘I’, ‘enjoy’, and ‘reading’
❌ Add token positions with vocabulary count

Explanation:
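Each word's one-hot vector has as many entries as the vocabulary. Adding the vectors for ‘I’, ‘enjoy’, and ‘reading’ yields a single numeric vector of that same length, no matter how many words the sentence contains, so it can be fed into a neural network that expects fixed-size input.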
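
As a rough illustration of the idea, the sketch below sums one-hot vectors into one fixed-length vector. The five-word vocabulary and the helper names `one_hot` and `sentence_vector` are assumptions made for this example; they do not come from the quiz.

```python
# Hypothetical toy vocabulary (an assumption for illustration only).
vocab = {"I": 0, "enjoy": 1, "reading": 2, "books": 3, "daily": 4}


def one_hot(index, size):
    """Return a list of zeros with a single 1 at the given index."""
    vec = [0] * size
    vec[index] = 1
    return vec


def sentence_vector(tokens, vocab):
    """Sum the one-hot vectors of all tokens, element by element."""
    size = len(vocab)
    total = [0] * size
    for tok in tokens:
        hot = one_hot(vocab[tok], size)
        total = [a + b for a, b in zip(total, hot)]
    return total


short = sentence_vector("I enjoy reading".split(), vocab)
longer = sentence_vector("I enjoy reading books daily".split(), vocab)

print(short)                      # [1, 1, 1, 0, 0]
print(len(short) == len(longer))  # True: length depends only on the vocabulary
```

Both sentences map to vectors of length 5 here, even though one has three words and the other has five.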

2. What does `argmax` do in document classification?

❌ Converts logits into probabilities
✅ Selects the index of the output neuron with the highest logit
❌ Tokenizes raw text
❌ Determines number of layers

Explanation:
argmax returns the index of the largest logit, which is the predicted class, making it essential for final label prediction. It does not convert logits into probabilities; that is what softmax does.
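
A minimal sketch of the difference between softmax and argmax, written in plain Python. The class labels and logit values are made up for illustration, and the helper functions are hypothetical rather than taken from any particular library.

```python
import math

# Hypothetical class labels and logits (assumptions for illustration).
labels = ["sports", "politics", "science"]
logits = [1.2, -0.3, 2.7]


def softmax(values):
    """Convert logits into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


pred = argmax(logits)
probs = softmax(logits)

print(labels[pred])           # science
print(argmax(probs) == pred)  # True: softmax preserves the ordering
```

Because softmax preserves the ordering of the logits, applying argmax to the raw logits or to the probabilities selects the same class.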

3. What improves training efficiency by adjusting learning rate over time?

❌ Change loss function every epoch
✅ Apply a learning rate scheduler each epoch
❌ Increase batch size
❌ Use different optimizers per batch

Explanation:
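Learning rate schedulers adjust the learning rate as training progresses, typically lowering it after each epoch or at fixed intervals, so that early updates are large and later updates are small enough to keep training stable and efficient.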
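
One common schedule is step decay, sketched below in plain Python. The function `step_decay` and its default values are assumptions chosen for this example; the idea is similar in spirit to PyTorch's `torch.optim.lr_scheduler.StepLR`, which multiplies the learning rate by `gamma` every `step_size` epochs.

```python
def step_decay(initial_lr, epoch, step_size=2, gamma=0.5):
    """Hypothetical helper: scale the learning rate by `gamma`
    once every `step_size` epochs."""
    return initial_lr * gamma ** (epoch // step_size)


# Learning rate used in each of six epochs, starting from 0.1.
schedule = [step_decay(0.1, epoch) for epoch in range(6)]

for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.4f}")
```

With these assumed settings the rate stays at 0.1 for two epochs, then halves to 0.05, then to 0.025.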

4. What issue can arise when training data is not shuffled?

❌ Overfit due to early stopping
❌ Skip validation
❌ Batch size increases
✅ Gradient descent may converge to a suboptimal local minimum

Explanation:
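Without shuffling, the model sees the data in the same fixed order every epoch, so consecutive batches can be unrepresentative of the dataset as a whole (for example, all drawn from one class). This biases the gradient updates and harms generalization.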
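
The effect is easy to see on a toy dataset that is sorted by label. The sketch below is an illustration under assumed data; `batches` and `shuffled_batches` are hypothetical helpers, not part of any library.

```python
import random

# Hypothetical dataset sorted by label: four positives, then four negatives.
data = [("good", 1)] * 4 + [("bad", 0)] * 4


def batches(examples, batch_size):
    """Split the examples into consecutive batches."""
    return [examples[i:i + batch_size]
            for i in range(0, len(examples), batch_size)]


def shuffled_batches(examples, batch_size, epoch, seed=0):
    """Reshuffle a copy of the data each epoch before batching."""
    rng = random.Random(seed + epoch)  # different order per epoch
    copy = list(examples)
    rng.shuffle(copy)
    return batches(copy, batch_size)


# Without shuffling, every batch contains a single class.
labels_per_batch = [{label for _, label in batch} for batch in batches(data, 4)]
print(labels_per_batch)  # [{1}, {0}]

# With shuffling, batches usually mix both classes.
for batch in shuffled_batches(data, 4, epoch=0):
    print([label for _, label in batch])
```

In frameworks such as PyTorch, the same effect is usually obtained by passing `shuffle=True` to the training `DataLoader`.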

5. What is the proper way to compute a single context vector for an N-gram model?

❌ Replace vocabulary with embeddings
✅ Sum the one-hot vectors of the context words into one
❌ Use embedding vectors of context words
❌ Multiply by attention weights

Explanation:
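Summing the one-hot vectors of the context words merges multiple context tokens into one fixed-length vector (of vocabulary size) in classic N-gram neural models. The sum records which words appear in the context, not their order.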
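
The following sketch shows how (context vector, target) training pairs might be built this way. The sentence, the context size of 2, and the helper names `context_vector` and `ngram_examples` are assumptions for illustration.

```python
def context_vector(context_tokens, vocab):
    """Sum of the context words' one-hot vectors.
    Adding 1 at each word's index is equivalent to summing one-hot vectors."""
    vec = [0] * len(vocab)
    for tok in context_tokens:
        vec[vocab[tok]] += 1
    return vec


def ngram_examples(tokens, vocab, context_size=2):
    """Pair each summed context vector with the index of the next word."""
    examples = []
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]
        examples.append((context_vector(context, vocab), vocab[tokens[i]]))
    return examples


# Hypothetical sentence and vocabulary (assumptions for illustration).
tokens = "I enjoy reading books".split()
vocab = {tok: idx for idx, tok in enumerate(tokens)}

examples = ngram_examples(tokens, vocab)
print(examples)  # [([1, 1, 0, 0], 2), ([0, 1, 1, 0], 3)]
```

Each context of two words becomes one vector of vocabulary size, paired with the index of the word to predict.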

6. Which KPI is best for choosing the best model on unseen text?

❌ Prediction
❌ Context
✅ Accuracy
❌ Loss

Explanation:
Accuracy directly measures the fraction of predictions that are correct on unseen data, which makes it a primary metric for model selection.
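
A minimal sketch of accuracy-based model selection on a held-out set follows. The labels, the two sets of predictions, and the model names are invented for illustration.

```python
def accuracy(predictions, targets):
    """Fraction of predictions that match the true labels."""
    if not targets or len(predictions) != len(targets):
        raise ValueError("predictions and targets must be non-empty and equal in length")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Hypothetical held-out labels and predictions from two candidate models.
targets = [0, 1, 2, 1]
model_a = [0, 1, 2, 2]  # 3 of 4 correct
model_b = [0, 0, 2, 2]  # 2 of 4 correct

scores = {
    "model_a": accuracy(model_a, targets),
    "model_b": accuracy(model_b, targets),
}
best = max(scores, key=scores.get)

print(scores)  # {'model_a': 0.75, 'model_b': 0.5}
print(best)    # model_a
```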

🧾 Summary Table

Q#	Correct Answer	Key Concept
1	One-hot vectors	Fixed-length inputs
2	Argmax	Class selection
3	LR scheduler	Efficient training
4	Suboptimal convergence	Importance of shuffling
5	Sum one-hot vectors	Context vector computation
6	Accuracy	Model evaluation KPI

Graded Quiz: Fundamentals of Language Understanding: Gen AI Foundational Models for NLP & Language Understanding (IBM AI Engineering Professional Certificate) Answers 2025