
The Basics of ConvNets: Convolutional Neural Networks (Deep Learning Specialization) Answers: 2025

Question 1

What do you think applying this filter to a grayscale image will do?
Kernel: $\begin{bmatrix}-1 & -1 & 2\\-1 & 2 & 1\\2 & 1 & 1\end{bmatrix}$

  • ❌ Detect vertical edges.

  • ❌ Detect image contrast.

  • ❌ Detect horizontal edges.

  • ✅ Detect 45-degree edges.

Explanation: The pattern of positive weights concentrated toward one corner and negative weights toward the opposite corner makes the kernel respond strongly to diagonal intensity changes (i.e., edges at about 45°). This kernel accentuates intensity differences along a diagonal orientation rather than purely vertical or horizontal changes.
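For intuition, here is a minimal NumPy sketch (names illustrative) of sliding this kernel over a small image using "valid" cross-correlation, which is the operation deep-learning frameworks call convolution:

```python
import numpy as np

# Kernel from Question 1, as printed in the quiz.
K = np.array([[-1, -1, 2],
              [-1,  2, 1],
              [ 2,  1, 1]])

def cross_correlate(img, k):
    """'Valid' cross-correlation: slide k over img, sum elementwise products."""
    f = k.shape[0]
    h, w = img.shape
    out = np.zeros((h - f + 1, w - f + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + f, j:j + f] * k)
    return out

# A tiny image with a 45-degree edge: ones on and above the main diagonal.
diag = np.triu(np.ones((5, 5)))
out = cross_correlate(diag, K)
print(out.shape)  # (3, 3)
```

Each output value measures how well the local 3×3 patch matches the kernel's pattern of positive and negative weights.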


Question 2

Input: 128 × 128 grayscale. First hidden layer: 256 fully-connected neurons. How many parameters in that hidden layer (including biases)?

  • ❌ 12,582,912

  • ✅ 4,194,560

  • ❌ 4,194,304

  • ❌ 12,583,168

Explanation: Input size = $128\times128=16{,}384$. Each neuron has 16,384 weights + 1 bias = 16,385 parameters. Total = $256 \times 16{,}385 = 4{,}194{,}560$.
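The arithmetic can be checked in a couple of lines:

```python
# Question 2: fully-connected layer on a flattened 128x128 grayscale input.
n_in = 128 * 128                 # 16,384 input pixels
n_hidden = 256
params = n_hidden * (n_in + 1)   # each neuron: n_in weights + 1 bias
print(params)  # 4194560
```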


Question 3

Input: 300×300 color (RGB). Convolutional layer: 100 filters, each 5×5. How many parameters (including biases)?

  • ✅ 7,600

  • ❌ 2,600

  • ❌ 7,500

  • ❌ 2,501

Explanation: Each filter has $5\times5\times3 = 75$ weights plus 1 bias = 76 parameters per filter. For 100 filters: $100\times76 = 7{,}600$.
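The same count in code; note that, unlike the fully-connected case in Question 2, the input's 300×300 spatial size never enters the formula:

```python
# Question 3: conv-layer parameters depend only on filter shape and count.
n_filters, f, in_channels = 100, 5, 3
params = n_filters * (f * f * in_channels + 1)  # 75 weights + 1 bias per filter
print(params)  # 7600
```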


Question 4

Input volume: 121×121×16. Use 32 filters of 4×4, stride 3, no padding. Output volume?

  • ❌ 118×118×16

  • ✅ 40×40×32

  • ❌ 40×40×16

  • ❌ 118×118×32

Explanation: Spatial output size = $\left\lfloor\frac{121-4}{3}\right\rfloor + 1 = 40$. Depth = number of filters = 32. So output = 40×40×32.
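The general formula $\lfloor (n + 2p - f)/s \rfloor + 1$ is easy to keep around as a helper (a small sketch, parameter names are illustrative):

```python
# Output spatial size of a convolution: floor((n + 2p - f) / s) + 1.
def conv_output_size(n, f, stride=1, pad=0):
    return (n + 2 * pad - f) // stride + 1

print(conv_output_size(121, f=4, stride=3))  # 40
```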


Question 5

Input: 15×15×8, pad = 2. Dimension after padding?

  • ❌ 17×17×10

  • ✅ 19×19×8

  • ❌ 17×17×8

  • ❌ 19×19×12

Explanation: Padding 2 adds 2 pixels on each side: new width = $15 + 2\times2 = 19$, same for height. Depth (channels) unchanged = 8. So 19×19×8.
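This is exactly what `np.pad` does when you pad only the spatial axes:

```python
import numpy as np

# Question 5: pad = 2 on height and width; the channel axis is never padded.
x = np.zeros((15, 15, 8))
padded = np.pad(x, ((2, 2), (2, 2), (0, 0)))
print(padded.shape)  # (19, 19, 8)
```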


Question 6

Input: 63×63×16, convolve with 7×7 filters, stride 1, want “same” convolution. What is padding?

  • ✅ 3

  • ❌ 7

  • ❌ 1

  • ❌ 2

Explanation: For “same” convolution with odd filter size $f$, use $p = (f-1)/2$. Here $p=(7-1)/2=3$.
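A quick check that this padding really preserves the spatial size at stride 1:

```python
# "Same" convolution, stride 1: choose p so that n + 2p - f + 1 == n.
f, n = 7, 63
p = (f - 1) // 2                 # 3
out = (n + 2 * p - f) // 1 + 1   # output spatial size
print(p, out)  # 3 63
```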


Question 7

Input: 32×32×16, apply max pooling with stride 2 and filter size 2. Output volume?

  • ✅ 16×16×16

  • ❌ 15×15×16

  • ❌ 32×32×8

  • ❌ 16×16×8

Explanation: Pooling with filter size 2 and stride 2 halves each spatial dimension: $32/2 = 16$. Depth unchanged = 16.
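A minimal max-pooling sketch (illustrative, not a framework implementation) that makes the shape arithmetic concrete:

```python
import numpy as np

# Max pooling: take the max over each f x f window, channel by channel.
def max_pool(x, f=2, stride=2):
    h, w, c = x.shape
    oh, ow = (h - f) // stride + 1, (w - f) // stride + 1
    out = np.zeros((oh, ow, c))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[i * stride:i * stride + f,
                          j * stride:j * stride + f].max(axis=(0, 1))
    return out

x = np.random.rand(32, 32, 16)
print(max_pool(x).shape)  # (16, 16, 16)
```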


Question 8

Which of the following are hyperparameters of the pooling layers? (Choose all that apply)

  • ❌ $b^{[l]}$ bias.

  • ❌ $W^{[l]}$ weights.

  • ✅ Stride.

  • ✅ Whether it is max or average.

Explanation: Pooling layers have no learnable weights or biases. Their hyperparameters include the pooling window size (not listed), the stride, and the type (max vs. average).


Question 9

Which statements about parameter sharing in ConvNets are true? (Check all that apply)

  • ✅ It allows parameters learned for one task to be shared even for a different task (transfer learning).

  • ✅ It reduces the total number of parameters, thus reducing overfitting.

  • ✅ It allows a feature detector to be used in multiple locations throughout the whole input image/input volume.

  • ❌ It allows gradient descent to set many of the parameters to zero, thus making the connections sparse.

Explanation:

  • Sharing convolutional filters makes it possible to reuse learned features for other tasks (transfer learning).

  • Weight sharing greatly reduces parameters vs. fully connected layers, helping generalization.

  • The same filter is applied across spatial locations, so a feature detector is used everywhere.

  • However, parameter sharing does not mean gradient descent sets many parameters to zero to create sparsity — that is a different concept (sparsity via regularization), so the last statement is false.
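To put the parameter-reduction claim in numbers, here is an illustrative comparison (not from the quiz itself) between the conv layer of Question 3 and a fully-connected layer producing the same number of activations from the same 300×300×3 input:

```python
# Parameter sharing in numbers: conv vs. fully-connected on a 300x300x3 input.
conv_params = 100 * (5 * 5 * 3 + 1)   # Question 3: 7,600 parameters

n_in = 300 * 300 * 3                  # flattened input size
n_out = 300 * 300 * 100               # one output per "same"-conv activation
fc_params = n_out * (n_in + 1)        # dense layer producing the same outputs
print(conv_params, fc_params)  # 7600 2430009000000
```

The fully-connected alternative needs roughly 2.4 trillion parameters — about 300 million times more — which is why weight sharing matters so much for generalization.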


Question 10

The sparsity of connections and weight sharing are mechanisms that allow us to use fewer parameters in a convolutional layer, making it possible to train a network with smaller training sets. True/False?

  • ❌ False

  • ✅ True

Explanation: Convolutional layers use sparse connectivity (filters are small) and weight sharing (same filter at all locations). Together these reduce the number of parameters dramatically compared to fully-connected layers, which helps training with smaller datasets.


🧾 Summary Table

| Q # | Correct Answer(s) | Key concept |
|---|---|---|
| 1 | ✅ Detect 45-degree edges | Kernel emphasizes diagonal contrast → detects diagonal edges. |
| 2 | ✅ 4,194,560 | Params = neurons × (input size + 1 bias). |
| 3 | ✅ 7,600 | Conv params = filters × (f × f × depth + 1). |
| 4 | ✅ 40×40×32 | Output spatial size = ⌊(121−4)/3⌋ + 1 = 40; depth = 32. |
| 5 | ✅ 19×19×8 | Padding adds 2 on each side → 15 + 4 = 19. |
| 6 | ✅ 3 | “Same” conv padding p = (f−1)/2 for odd f. |
| 7 | ✅ 16×16×16 | Pooling halves spatial dims with stride = 2; depth unchanged. |
| 8 | ✅ Stride; ✅ Max vs. average | Pooling has no weights/biases; hyperparameters are stride, type, window size. |
| 9 | ✅ (1, 2, 3 true; 4 false) | Parameter sharing reduces params and enables reuse across locations and tasks. |
| 10 | ✅ True | Sparse connections + weight sharing → fewer params → easier training with less data. |