Week 4 Quiz: Natural Language Processing in TensorFlow (DeepLearning.AI TensorFlow Developer Professional Certificate) Answers 2025
1. Question 1
What function creates one-hot encoded arrays of labels?
- ✅ tf.keras.utils.to_categorical
- ❌ tf.keras.utils.SequenceEnqueuer
- ❌ tf.keras.utils.img_to_array
- ❌ tf.keras.preprocessing.text.one_hot
Explanation: to_categorical() converts integer labels into one-hot encoded vectors, which is required for multi-class classification.
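As a quick illustration (the label values and num_classes below are made-up examples, not from the quiz), here is how to_categorical turns integer labels into one-hot rows:

```python
import numpy as np
import tensorflow as tf

labels = np.array([0, 2, 1, 3])   # example integer class/word indices
num_classes = 4                    # assumed total number of classes

# Each integer label becomes a row with a single 1 at that index.
one_hot = tf.keras.utils.to_categorical(labels, num_classes=num_classes)
print(one_hot)
# [[1. 0. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 0. 1.]]
```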
2. Question 2
What is the major drawback of word-based training compared with character-based training?
- ❌ Character generation is more accurate
- ✅ Because there are far more words in a typical corpus than characters, it is much more memory intensive
- ❌ No drawback
- ❌ Word-based is more accurate
Explanation:
Word vocabularies can reach tens of thousands of entries, which makes:
✔ memory usage heavier
✔ embedding layers larger
✔ output layers huge
✔ training slower
Character vocabularies, by contrast, typically contain only 40–100 symbols.
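To make the vocabulary-size impact concrete, here is a rough back-of-the-envelope sketch; the hidden size and vocabulary counts are assumed for illustration, not taken from the quiz:

```python
# Compare the parameter count of the final softmax (Dense) layer
# for a word-level vs a character-level language model.
hidden_units = 100      # assumed size of the recurrent layer's output

word_vocab = 20_000     # a typical word-level vocabulary
char_vocab = 80         # a typical character-level vocabulary

word_output_params = hidden_units * word_vocab + word_vocab   # weights + biases
char_output_params = hidden_units * char_vocab + char_vocab

print(f"Word-level output layer:      {word_output_params:,} parameters")   # ~2,020,000
print(f"Character-level output layer: {char_output_params:,} parameters")   # ~8,080
```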
3. Question 3
What are the critical steps in preparing input sequences for training?
- ✅ Generating subphrases using n_gram_sequences
- ❌ Converting the seed text with texts_to_sequences
- ❌ Splitting into training/testing sets (not part of core sequence preparation)
- ✅ Pre-padding the subphrase sequences
Correct Steps:
- Create n-gram subphrases
- Pre-pad the sequences so they are all the same length
Explanation:
Both steps are essential for building the training data for a next-word prediction model.
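A minimal sketch of both steps, assuming the legacy Tokenizer/pad_sequences workflow used in the course labs (the tiny corpus and variable names are placeholders):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

corpus = ["in the town of athy one jeremy lanigan",
          "battered away til he hadnt a pound"]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(corpus)
total_words = len(tokenizer.word_index) + 1

# Step 1: generate n-gram subphrases - every prefix of each line with length >= 2.
input_sequences = []
for line in corpus:
    token_list = tokenizer.texts_to_sequences([line])[0]
    for i in range(1, len(token_list)):
        input_sequences.append(token_list[:i + 1])

# Step 2: pre-pad so every subphrase has the same length.
max_sequence_len = max(len(seq) for seq in input_sequences)
input_sequences = np.array(
    pad_sequences(input_sequences, maxlen=max_sequence_len, padding='pre'))

# Predictors are all tokens but the last; the label is the last token, one-hot encoded.
xs, labels = input_sequences[:, :-1], input_sequences[:, -1]
ys = tf.keras.utils.to_categorical(labels, num_classes=total_words)
```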
4. Question 4
Why does predicting more and more words eventually produce gibberish?
- ❌ Probability compounds
- ❌ Matching fewer known phrases
- ❌ Likelihood doesn’t change
- ✅ Because you are more likely to hit words not in the training set
Explanation:
As generation proceeds, each predicted word becomes context for the next prediction, so the model grows less confident the further it goes: the chance of selecting an uncommon or unseen word rises, small errors accumulate, and the output drifts into gibberish.
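The drift is easiest to see in the generation loop itself. This sketch assumes model, tokenizer, and max_sequence_len already exist (for example, from the preparation sketch above); each predicted word is appended to the seed and becomes context for the next prediction:

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

seed_text = "Laurence went to Dublin"
next_words = 20   # the more words we ask for, the more errors compound

for _ in range(next_words):
    token_list = tokenizer.texts_to_sequences([seed_text])[0]
    token_list = pad_sequences([token_list], maxlen=max_sequence_len - 1, padding='pre')
    # Each prediction is fed back in as context for the next one,
    # so any low-confidence or wrong word pollutes all later predictions.
    predicted_id = int(np.argmax(model.predict(token_list, verbose=0), axis=-1)[0])
    predicted_word = tokenizer.index_word.get(predicted_id, "")
    seed_text += " " + predicted_word

print(seed_text)
```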
5. Question 5
Do we use a sigmoid output layer with one neuron per word?
- ❌ True
- ✅ False
Explanation:
For next-word prediction:
✔ Use Softmax
✔ One neuron per word
✔ Softmax gives a proper probability distribution across all words.
Sigmoid is for binary classification, not multi-class.
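A minimal model sketch (the vocabulary size, sequence length, and layer sizes are assumed for illustration) showing a softmax output layer with one neuron per word:

```python
import tensorflow as tf

total_words = 5000        # assumed vocabulary size
max_sequence_len = 11     # assumed padded sequence length

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_sequence_len - 1,)),
    tf.keras.layers.Embedding(total_words, 64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    # One neuron per word; softmax yields a probability distribution over the vocabulary.
    tf.keras.layers.Dense(total_words, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
```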
🧾 Summary Table
| Q# | Correct Answer | Key Concept |
|---|---|---|
| 1 | to_categorical | One-hot encoding |
| 2 | Word-based is more memory-intensive | Vocabulary size impact |
| 3 | n-gram + pre-padding | Preparing sequences |
| 4 | More chance of unseen words | Prediction drift |
| 5 | False — use softmax | NLP multi-class output layer |