Graded Quiz: Word2Vec and Sequence-to-Sequence Models | Gen AI Foundational Models for NLP & Language Understanding (IBM AI Engineering Professional Certificate) | Answers 2025
1. CBOW context + target at position t = 2 (“she loves watching football”)
Sentence indexed (1-based, matching the question's t = 2):
t=1: she
t=2: loves
t=3: watching
t=4: football
Window size = 1 → context = immediate neighbors
❌ Context: loves, football; Target: watching
❌ Context: she, football; Target: loves, watching
✅ Context: she, watching; Target word: loves
❌ Context: loves, watching; Target: she, football
Explanation:
At position t = 2 (word = loves), the context words with window size 1 are the ones immediately before and after it: she and watching.
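A minimal sketch of how the context/target pairs fall out of a window of size 1 (plain Python; the sentence comes from the question, the helper name is ours):

```python
sentence = ["she", "loves", "watching", "football"]  # t = 1..4 (1-based)

def cbow_pairs(tokens, window=1):
    """Yield (context, target) pairs for a CBOW-style sliding window."""
    pairs = []
    for i, target in enumerate(tokens):
        # context = neighbors within `window` positions, excluding the target itself
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((context, target))
    return pairs

for context, target in cbow_pairs(sentence):
    print(context, "->", target)
# ['loves'] -> she
# ['she', 'watching'] -> loves      <-- the pair asked about (t = 2)
# ['loves', 'football'] -> watching
# ['watching'] -> football
```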
2. Difference between CBOW and Skip-gram
❌ Differ only in loss
❌ CBOW predicts target using target
✅ CBOW predicts the target word from context; skip-gram predicts context words from the target word
❌ Skip-gram creates one-hot vectors
Explanation:
CBOW = context → target
Skip-gram = target → context
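To make the direction of prediction concrete, here is an illustrative sketch (not code from the course) that builds training examples both ways from the same window:

```python
tokens = ["she", "loves", "watching", "football"]
window = 1

cbow_examples = []      # (context words) -> target word
skipgram_examples = []  # target word -> (one context word at a time)

for i, target in enumerate(tokens):
    context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
    cbow_examples.append((context, target))       # CBOW: predict target from context
    for c in context:
        skipgram_examples.append((target, c))     # Skip-gram: predict each context word from target

print(cbow_examples[1])        # (['she', 'watching'], 'loves')
print(skipgram_examples[1:3])  # [('loves', 'she'), ('loves', 'watching')]
```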
3. Which model uses time series and remembers past information?
❌ Feedforward neural network
❌ GANs
❌ Word2Vec
✅ Recurrent neural networks (RNNs)
Explanation:
RNNs are designed to handle sequential data and maintain memory through hidden states.
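A toy illustration (NumPy, with made-up dimensions and random weights) of the hidden state that lets an RNN "remember" earlier time steps:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4

# Randomly initialized weights, purely for illustration
W_xh = rng.normal(size=(hidden_dim, input_dim))
W_hh = rng.normal(size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    """One vanilla RNN step: the new hidden state depends on the input AND the previous state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

sequence = [rng.normal(size=input_dim) for _ in range(5)]  # a length-5 "time series"
h = np.zeros(hidden_dim)  # initial memory
for x_t in sequence:
    h = rnn_step(x_t, h)  # h accumulates information from all earlier steps

print(h)  # the final hidden state summarizes the whole sequence
```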
4. Purpose of BOS token in decoder training
❌ Replacement for unknown words
❌ Generate entire output at once
✅ Signals the decoder to start generating the output sequence
❌ Terminates the sequence
Explanation:
The BOS token tells the decoder where to begin generation.
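A hedged sketch of how a BOS token is typically used when preparing decoder inputs for teacher forcing (the token strings and example sentence are illustrative, not from the course):

```python
BOS, EOS = "<bos>", "<eos>"

target_sentence = ["elle", "aime", "le", "football"]

# Teacher forcing: the decoder input is the target shifted right by one,
# starting from BOS, and the model is trained to predict the next token at each step.
decoder_inputs  = [BOS] + target_sentence   # <bos> elle aime le football
decoder_targets = target_sentence + [EOS]   # elle aime le football <eos>

for step, (inp, tgt) in enumerate(zip(decoder_inputs, decoder_targets), start=1):
    print(f"step {step}: given '{inp}' -> predict '{tgt}'")
# step 1: given '<bos>' -> predict 'elle'   <- BOS signals "start generating now"
```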
5. Which component generates each translated word sequentially?
❌ Last token of input
❌ Encoder embedding layer
❌ Softmax alone
✅ The decoder module with RNN cells
Explanation:
The decoder RNN processes hidden states step-by-step to produce each output token.
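As a sketch of that step-by-step generation, here is a minimal greedy-decoding loop in PyTorch; the module sizes, token ids, and function name are our own assumptions, and the weights are untrained, so it only shows the mechanics:

```python
import torch
import torch.nn as nn

# Illustrative sizes and modules; not the course's implementation.
vocab_size, emb_dim, hidden_dim = 10, 8, 16
BOS_ID, EOS_ID = 1, 2

embedding = nn.Embedding(vocab_size, emb_dim)
rnn_cell = nn.GRUCell(emb_dim, hidden_dim)
output_proj = nn.Linear(hidden_dim, vocab_size)

def greedy_decode(encoder_final_state, max_len=10):
    """Generate one token at a time, feeding each prediction back as the next input."""
    h = encoder_final_state               # context handed over by the encoder
    token = torch.tensor([BOS_ID])        # start from BOS
    outputs = []
    for _ in range(max_len):
        x = embedding(token)              # (1, emb_dim)
        h = rnn_cell(x, h)                # one decoder RNN step
        logits = output_proj(h)           # (1, vocab_size)
        token = logits.argmax(dim=-1)     # pick the most probable next word
        if token.item() == EOS_ID:
            break
        outputs.append(token.item())
    return outputs

print(greedy_decode(torch.zeros(1, hidden_dim)))  # a list of generated token ids
```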
6. How is perplexity computed?
✅ Exponential of average cross-entropy loss
❌ Multiplies all predicted probabilities
❌ Averages squared error
❌ Vocabulary size divided by predicted tokens
Explanation:
Perplexity = exp(cross-entropy) and measures how well a language model predicts text.
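A quick numeric sketch of that formula, using made-up per-token probabilities from a hypothetical language model:

```python
import math

# Probabilities the model assigned to the actual next token at each position (hypothetical)
predicted_probs = [0.40, 0.25, 0.10, 0.50]

# Cross-entropy = average negative log-probability of the correct tokens
cross_entropy = -sum(math.log(p) for p in predicted_probs) / len(predicted_probs)

perplexity = math.exp(cross_entropy)
print(round(cross_entropy, 3), round(perplexity, 3))  # roughly 1.325 and 3.761
# Lower perplexity = the model finds the text less "surprising"
```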
🧾 Summary Table
| Q# | Correct Answer | Key Concept |
|---|---|---|
| 1 | she, watching → loves | CBOW context/target |
| 2 | CBOW: context→target, SG: target→context | Word2Vec models |
| 3 | RNNs | Sequential memory |
| 4 | BOS starts decoder generation | Seq2Seq training |
| 5 | Decoder RNN | Translation step-by-step |
| 6 | exp(cross-entropy) | Perplexity metric |