Homework Assignment
HW 37 — CNN on DigitDataset
Background
In HW 36 you tuned a fully-connected network on the DigitDataset and found that even a well-tuned FC network is limited by its inability to exploit spatial structure. In this assignment you will build a CNN on the same dataset and systematically compare its performance, explore the effect of depth and filter count, and visualize what the network has learned.
Use the same CNN structure from the Lesson 37 example (imageInputLayer, convolution2dLayer, batchNormalizationLayer, maxPooling2dLayer, fullyConnectedLayer, trainNetwork) and the same dataset loading pattern. Use 200 images per class (2,000 total) and an 80/20 train/test split unless otherwise specified. Set rng(356) at the top of your script.
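As a starting point, the setup described above can be sketched as follows. This is a minimal sketch, not the official Lesson 37 script: the `'DigitDataset'` folder name, the 28×28 grayscale input size, and the exact layer ordering are assumptions — adapt them to your local copy of the example.

```matlab
% Sketch of the Lesson 37 setup (folder name and image size are assumptions).
rng(356);                                   % reproducibility, as required
n_per_class = 200;                          % 200 images per class, 2,000 total

imds = imageDatastore('DigitDataset', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
imds = splitEachLabel(imds, n_per_class, 'randomize');
[imdsTrain, imdsTest] = splitEachLabel(imds, 0.8, 'randomize');  % 80/20 split

n_filters = 8;
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3, n_filters, 'Padding', 'same')    % conv block 1
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    convolution2dLayer(3, 2*n_filters, 'Padding', 'same')  % conv block 2
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2, 'Stride', 2)
    fullyConnectedLayer(10)                                % one class per digit
    softmaxLayer
    classificationLayer];
```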
Problem 1 — CNN Training Behavior
- Run the Lesson 37 example with the default parameters (n_per_class = 200, n_filters = 8, learn_rate = 0.01). Examine the Training Progress plot. Which of the four diagnostic patterns from Lesson 36 (underfitting, good generalization, overfitting onset, severe overfitting) does it most closely resemble? How does this compare to what you saw for the FC network under similar conditions?
Problem 2 — Effect of Filter Count and Network Depth
- Keeping all other parameters fixed (n_per_class = 200, learn_rate = 0.01, 20 epochs), train CNNs with four different filter counts: n_filters = 4, 8, 16, and 32 (recall that the second conv block always uses 2*n_filters). Record the test accuracy and note the Training Progress pattern for each.
- Now add a third conv block to your best-performing architecture from part 1. Insert it between Pool2 and the FC layer:

  % Third conv block: 7×7×(2*n_filters) → 7×7×(4*n_filters) → 3×3×(4*n_filters)
  convolution2dLayer(3, 4*n_filters, 'Padding', 'same', 'Name', 'conv3')
  batchNormalizationLayer('Name', 'bn3')
  reluLayer('Name', 'relu3')
  maxPooling2dLayer(2, 'Stride', 2, 'Name', 'pool3')

  Does adding the extra block improve accuracy? Examine the Training Progress plot and comment on whether the additional depth is helping or introducing overfitting.
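One way to organize the filter-count sweep in part 1 is a simple loop. This is a sketch, not required code: `buildLayers` is an assumed helper that returns the two-block Lesson 37 architecture for a given n_filters, and the 'sgdm' solver is an assumption — use whatever the Lesson 37 example uses.

```matlab
% Hypothetical sweep over filter counts for Problem 2, part 1.
% buildLayers is an assumed helper returning the two-block architecture.
filter_counts = [4 8 16 32];
acc = zeros(size(filter_counts));
opts = trainingOptions('sgdm', 'InitialLearnRate', 0.01, ...
    'MaxEpochs', 20, 'Plots', 'training-progress', 'Verbose', false);
for k = 1:numel(filter_counts)
    rng(356);                                % same seed for every run
    layers = buildLayers(filter_counts(k));  % 2nd block uses 2*n_filters
    net = trainNetwork(imdsTrain, layers, opts);
    pred = classify(net, imdsTest);
    acc(k) = mean(pred == imdsTest.Labels);
    fprintf('n_filters = %2d: test accuracy %.3f\n', filter_counts(k), acc(k));
end
```

Resetting the seed inside the loop keeps the train/test shuffling and weight initialization comparable across runs, so accuracy differences reflect the filter count rather than random variation.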
Problem 3 — Data Efficiency
- One of the key benefits of CNNs is that parameter sharing makes them more data-efficient than FC networks. To test this, train your best CNN architecture on progressively smaller datasets. Use n_per_class = 20, 50, 100, and 200 (with learn_rate = 0.001 and 30 epochs). Record test accuracy for each and present the results in a table.
- Using the same four dataset sizes, train the best FC architecture from HW 36 (use learn_rate = 0.001, same epochs). Plot or tabulate both sets of results side by side. At which dataset size does the accuracy gap between the CNN and FC network become most pronounced? Explain why in 2–3 sentences.
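The data-efficiency sweep above can be scripted along these lines. This is a sketch under stated assumptions: `loadDigits` and `buildLayers` are hypothetical helpers (a loader that draws n images per class and returns an 80/20 split, and your best architecture builder), and the 'sgdm' solver is an assumption.

```matlab
% Sketch for Problem 3's CNN data-efficiency sweep.
% loadDigits and buildLayers are assumed helpers -- substitute your own code.
sizes = [20 50 100 200];
cnn_acc = zeros(size(sizes));
opts = trainingOptions('sgdm', 'InitialLearnRate', 0.001, ...
    'MaxEpochs', 30, 'Verbose', false);
for k = 1:numel(sizes)
    rng(356);                                      % comparable runs
    [imdsTrain, imdsTest] = loadDigits(sizes(k));  % 80/20 split inside helper
    net = trainNetwork(imdsTrain, buildLayers(16), opts);
    cnn_acc(k) = mean(classify(net, imdsTest) == imdsTest.Labels);
end
table(sizes', cnn_acc', 'VariableNames', {'n_per_class', 'CNN_accuracy'})
```

Running the same loop with your HW 36 FC architecture (same seed, same splits) gives the second column for the side-by-side comparison.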