Graded Quiz: Word2Vec and Sequence-to-Sequence Models - Gen AI Foundational Models for NLP & Language Understanding (IBM AI Engineering Professional Certificate) Answers 2025

1. CBOW context + target at position t = 2 (“she loves watching football”)

Sentence indexed (1-based):
t=1: she
t=2: loves
t=3: watching
t=4: football

Window size = 1 → context = immediate neighbors

❌ Context: loves, football; Target: watching
❌ Context: she, football; Target: loves, watching
✅ Context: she, watching; Target word: loves
❌ Context: loves, watching; Target: she, football

Explanation:
At position t = 2 (word = loves), with window size 1 the context words are the ones immediately before and after the target: she and watching.
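
A minimal Python sketch (the helper name and layout are mine, not from the course) that enumerates the (context, target) pairs CBOW sees with window size 1:

```python
# Minimal sketch: list CBOW (context, target) pairs for one sentence.
def cbow_pairs(tokens, window=1):
    pairs = []
    for t, target in enumerate(tokens):
        # Context = words within `window` positions on either side of the target.
        context = tokens[max(0, t - window):t] + tokens[t + 1:t + 1 + window]
        pairs.append((context, target))
    return pairs

for context, target in cbow_pairs("she loves watching football".split()):
    print(context, "->", target)
# ['loves'] -> she
# ['she', 'watching'] -> loves
# ['loves', 'football'] -> watching
# ['watching'] -> football
```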


2. Difference between CBOW and Skip-gram

❌ They differ only in the loss function
❌ CBOW predicts the target word using the target word itself
✅ CBOW predicts the target word from context; skip-gram predicts context words from the target word
❌ Skip-gram creates one-hot vectors

Explanation:
CBOW = context → target
Skip-gram = target → context
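
To make the direction concrete, here is a small sketch (illustrative only; real Word2Vec training also uses negative sampling, subsampling, and so on) showing how the two models pair up the same text:

```python
# Illustrative sketch of the training pairs each model consumes.
def training_pairs(tokens, window=1, model="cbow"):
    pairs = []
    for t, word in enumerate(tokens):
        context = tokens[max(0, t - window):t] + tokens[t + 1:t + 1 + window]
        if model == "cbow":
            pairs.append((context, word))             # context -> target
        else:                                         # skip-gram
            pairs.extend((word, c) for c in context)  # target -> each context word
    return pairs

tokens = "she loves watching football".split()
print(training_pairs(tokens, model="cbow")[1])        # (['she', 'watching'], 'loves')
print(training_pairs(tokens, model="skipgram")[1:3])  # [('loves', 'she'), ('loves', 'watching')]
```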


3. Which model uses time series and remembers past information?

❌ Feedforward neural network
❌ GANs
❌ Word2Vec
✅ Recurrent neural networks (RNNs)

Explanation:
RNNs are designed to handle sequential data and maintain memory through hidden states.
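
A bare-bones NumPy sketch of the recurrence (sizes and random weights are illustrative); the point is that the hidden state h carries information forward from earlier time steps:

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3
W_xh = 0.1 * rng.normal(size=(hidden_size, input_size))   # input-to-hidden weights
W_hh = 0.1 * rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden (the "memory" path)
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # h_t depends on the current input AND the previous hidden state.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):  # toy sequence of 5 time steps
    h = rnn_step(x_t, h)
print(h)  # final state summarizes everything seen so far
```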


4. Purpose of BOS token in decoder training

❌ Replacement for unknown words
❌ Generate entire output at once
✅ Signals the decoder to start generating the output sequence
❌ Terminates the sequence

Explanation:
The BOS token tells the decoder where to begin generation.
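
A small sketch of teacher forcing (the token strings and the French target sentence are made up for illustration): prepending BOS shifts the decoder inputs so the model learns to predict each next word.

```python
# Illustrative teacher-forcing setup; tokens are made up for the example.
target_sentence = ["elle", "aime", "regarder", "le", "football", "<EOS>"]

decoder_inputs  = ["<BOS>"] + target_sentence[:-1]  # BOS signals "start generating"
decoder_targets = target_sentence                   # model must predict each next token

for inp, tgt in zip(decoder_inputs, decoder_targets):
    print(f"decoder sees {inp!r:12} -> should predict {tgt!r}")
# decoder sees '<BOS>'      -> should predict 'elle'
# decoder sees 'elle'       -> should predict 'aime'
# ...
```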


5. Which component generates each translated word sequentially?

❌ Last token of input
❌ Encoder embedding layer
❌ Softmax alone
✅ The decoder module with RNN cells

Explanation:
The decoder RNN processes hidden states step-by-step to produce each output token.
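
A sketch of that step-by-step loop at inference time (greedy decoding; the step function here is a stand-in for one decoder RNN step, not a real library call):

```python
# Illustrative greedy-decoding loop; `step_fn` stands in for one decoder RNN step.
def greedy_decode(encoder_state, step_fn, bos="<BOS>", eos="<EOS>", max_len=20):
    state, token, output = encoder_state, bos, []
    for _ in range(max_len):
        token, state = step_fn(token, state)  # one RNN step -> one output word
        if token == eos:
            break
        output.append(token)
    return output

# Toy step function that just replays a canned translation, to show the control flow.
canned = iter(["elle", "aime", "regarder", "le", "football", "<EOS>"])
toy_step = lambda token, state: (next(canned), state)
print(greedy_decode(encoder_state=None, step_fn=toy_step))
# ['elle', 'aime', 'regarder', 'le', 'football']
```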


6. How is perplexity computed?

✅ Exponential of average cross-entropy loss
❌ Multiplies all predicted probabilities
❌ Averages squared error
❌ Vocabulary size divided by predicted tokens

Explanation:
Perplexity = exp(average cross-entropy loss); the lower the perplexity, the better the language model predicts the text.
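
A quick worked example (the per-token probabilities are made up): perplexity is just the exponential of the average per-token cross-entropy.

```python
import math

# Made-up probabilities the model assigned to each true next token.
predicted_probs = [0.25, 0.10, 0.50, 0.05]

cross_entropy = -sum(math.log(p) for p in predicted_probs) / len(predicted_probs)
perplexity = math.exp(cross_entropy)
print(cross_entropy, perplexity)  # cross-entropy ≈ 1.84 nats, perplexity ≈ 6.3
```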


🧾 Summary Table

Q# | Correct Answer | Key Concept
1 | she, watching → loves | CBOW context/target
2 | CBOW: context → target; skip-gram: target → context | Word2Vec models
3 | RNNs | Sequential memory
4 | BOS starts decoder generation | Seq2Seq training
5 | Decoder RNN | Translation step-by-step
6 | exp(average cross-entropy) | Perplexity metric