Updates

2020-05-18 14:18:08 -07:00
parent a3599602ce
commit d8d39c560a
14 changed files with 390 additions and 415 deletions
--- a/clean/08_collab.ipynb
+++ b/clean/08_collab.ipynb
@@ -1444,7 +1444,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Boot Strapping a Collaborative Filtering Model"
+    "## Bootstrapping a Collaborative Filtering Model"
   ]
  },
  {
--- a/clean/11_midlevel_data.ipynb
+++ b/clean/11_midlevel_data.ipynb
@@ -843,7 +843,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "1. Use the mid-level API to prepare the data in `DataLoaders` on the pets dataset. On the adult dataset (used in chapter 1).\n",
+    "1. Use the mid-level API to prepare the data in `DataLoaders` on your own datasets. Try this with the Pet dataset and the Adult dataset from Chapter 1.\n",
    "1. Look at the Siamese tutorial in the fastai documentation to learn how to customize the behavior of `show_batch` and `show_results` for new types of items. Implement it in your own project."
   ]
  },
--- a/clean/13_convolutions.ipynb
+++ b/clean/13_convolutions.ipynb
@@ -1815,14 +1815,14 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "### A Note about Twitter"
+    "### A Note About Twitter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Colour Images"
+    "## Color Images"
   ]
  },
  {
@@ -2590,27 +2590,27 @@
   "source": [
    "1. What is a \"feature\"?\n",
    "1. Write out the convolutional kernel matrix for a top edge detector.\n",
-    "1. Write out the mathematical operation applied by a 3 x 3 kernel to a single pixel in an image.\n",
-    "1. What is the value of a convolutional kernel apply to a 3 x 3 matrix of zeros?\n",
-    "1. What is padding?\n",
-    "1. What is stride?\n",
+    "1. Write out the mathematical operation applied by a 3\\*3 kernel to a single pixel in an image.\n",
+    "1. What is the value of a convolutional kernel applied to a 3\\*3 matrix of zeros?\n",
+    "1. What is \"padding\"?\n",
+    "1. What is \"stride\"?\n",
    "1. Create a nested list comprehension to complete any task that you choose.\n",
-    "1. What are the shapes of the input and weight parameters to PyTorch's 2D convolution?\n",
-    "1. What is a channel?\n",
+    "1. What are the shapes of the `input` and `weight` parameters to PyTorch's 2D convolution?\n",
+    "1. What is a \"channel\"?\n",
    "1. What is the relationship between a convolution and a matrix multiplication?\n",
-    "1. What is a convolutional neural network?\n",
+    "1. What is a \"convolutional neural network\"?\n",
    "1. What is the benefit of refactoring parts of your neural network definition?\n",
    "1. What is `Flatten`? Where does it need to be included in the MNIST CNN? Why?\n",
    "1. What does \"NCHW\" mean?\n",
    "1. Why does the third layer of the MNIST CNN have `7*7*(1168-16)` multiplications?\n",
-    "1. What is a receptive field?\n",
+    "1. What is a \"receptive field\"?\n",
    "1. What is the size of the receptive field of an activation after two stride 2 convolutions? Why?\n",
-    "1. Run conv-example.xlsx yourself and experiment with \"trace precedents\".\n",
+    "1. Run *conv-example.xlsx* yourself and experiment with *trace precedents*.\n",
    "1. Have a look at Jeremy or Sylvain's list of recent Twitter \"like\"s, and see if you find any interesting resources or ideas there.\n",
    "1. How is a color image represented as a tensor?\n",
    "1. How does a convolution work with a color input?\n",
-    "1. What method can we use to see that data in DataLoaders?\n",
-    "1. Why do we double the number of filters after each stride 2 conv?\n",
+    "1. What method can we use to see that data in `DataLoaders`?\n",
+    "1. Why do we double the number of filters after each stride-2 conv?\n",
    "1. Why do we use a larger kernel in the first conv with MNIST (with `simple_cnn`)?\n",
    "1. What information does `ActivationStats` save for each layer?\n",
    "1. How can we access a learner's callback after training?\n",
@@ -2621,7 +2621,7 @@
    "1. What is 1cycle training?\n",
    "1. What are the benefits of training with a high learning rate?\n",
    "1. Why do we want to use a low learning rate at the end of training?\n",
-    "1. What is cyclical momentum?\n",
+    "1. What is \"cyclical momentum\"?\n",
    "1. What callback tracks hyperparameter values during training (along with other information)?\n",
    "1. What does one column of pixels in the `color_dim` plot represent?\n",
    "1. What does \"bad training\" look like in `color_dim`? Why?\n",
@@ -2643,8 +2643,7 @@
   "source": [
    "1. What features other than edge detectors have been used in computer vision (especially before deep learning became popular)?\n",
    "1. There are other normalization layers available in PyTorch. Try them out and see what works best. Learn about why other normalization layers have been developed, and how they differ from batch normalization.\n",
-    "1. Try moving the activation function after the batch normalization layer in `conv`. Does it make a difference? See what you can find out about what order is recommended, and why.\n",
-    "1. Batch normalization isn't defined for a batch size of one, since the standard deviation isn't defined for a single item. "
+    "1. Try moving the activation function after the batch normalization layer in `conv`. Does it make a difference? See what you can find out about what order is recommended, and why."
   ]
  },
  {
--- a/clean/14_resnet.ipynb
+++ b/clean/14_resnet.ipynb
@@ -16,7 +16,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# Resnets"
+    "# ResNets"
   ]
  },
  {
@@ -237,7 +237,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "### Skip-Connections"
+    "### Skip Connections"
   ]
  },
  {
@@ -829,27 +829,28 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "1. How did we get to a single vector of activations in the convnets used for MNIST in previous chapters? Why isn't that suitable for Imagenette?\n",
+    "1. How did we get to a single vector of activations in the CNNs used for MNIST in previous chapters? Why isn't that suitable for Imagenette?\n",
    "1. What do we do for Imagenette instead?\n",
-    "1. What is adaptive pooling?\n",
-    "1. What is average pooling?\n",
+    "1. What is \"adaptive pooling\"?\n",
+    "1. What is \"average pooling\"?\n",
    "1. Why do we need `Flatten` after an adaptive average pooling layer?\n",
-    "1. What is a skip connection?\n",
+    "1. What is a \"skip connection\"?\n",
    "1. Why do skip connections allow us to train deeper models?\n",
    "1. What does <<resnet_depth>> show? How did that lead to the idea of skip connections?\n",
-    "1. What is an identity mapping?\n",
-    "1. What is the basic equation for a ResNet block (ignoring batchnorm and relu layers)?\n",
-    "1. What do ResNets have to do with \"residuals\"?\n",
-    "1. How do we deal with the skip connection when there is a stride 2 convolution? How about when the number of filters changes?\n",
-    "1. How can we express a 1x1 convolution in terms of a vector dot product?\n",
+    "1. What is \"identity mapping\"?\n",
+    "1. What is the basic equation for a ResNet block (ignoring batchnorm and ReLU layers)?\n",
+    "1. What do ResNets have to do with residuals?\n",
+    "1. How do we deal with the skip connection when there is a stride-2 convolution? How about when the number of filters changes?\n",
+    "1. How can we express a 1\\*1 convolution in terms of a vector dot product?\n",
+    "1. Create a `1x1 convolution` with `F.conv2d` or `nn.Conv2d` and apply it to an image. What happens to the `shape` of the image?\n",
    "1. What does the `noop` function return?\n",
    "1. Explain what is shown in <<resnet_surface>>.\n",
    "1. When is top-5 accuracy a better metric than top-1 accuracy?\n",
-    "1. What is the stem of a CNN?\n",
-    "1. Why use plain convs in the CNN stem, instead of ResNet blocks?\n",
+    "1. What is the \"stem\" of a CNN?\n",
+    "1. Why do we use plain convolutions in the CNN stem, instead of ResNet blocks?\n",
    "1. How does a bottleneck block differ from a plain ResNet block?\n",
    "1. Why is a bottleneck block faster?\n",
-    "1. How do fully convolution nets (and nets with adaptive pooling in general) allow for progressive resizing?"
+    "1. How do fully convolutional nets (and nets with adaptive pooling in general) allow for progressive resizing?"
   ]
  },
  {
@@ -863,9 +864,9 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "1. Try creating a fully convolutional net with adaptive average pooling for MNIST (note that you'll need fewer stride 2 layers). How does it compare to a network without such a pooling layer?\n",
-    "1. In <<chapter_foundations>> we introduce *Einstein summation notation*. Skip ahead to see how this works, and then write an implementation of the 1x1 convolution operation using `torch.einsum`. Compare it to the same operation using `torch.conv2d`.\n",
-    "1. Write a \"top 5 accuracy\" function using plain PyTorch or plain Python.\n",
+    "1. Try creating a fully convolutional net with adaptive average pooling for MNIST (note that you'll need fewer stride-2 layers). How does it compare to a network without such a pooling layer?\n",
+    "1. In <<chapter_foundations>> we introduce *Einstein summation notation*. Skip ahead to see how this works, and then write an implementation of the 1\\*1 convolution operation using `torch.einsum`. Compare it to the same operation using `torch.conv2d`.\n",
+    "1. Write a \"top-5 accuracy\" function using plain PyTorch or plain Python.\n",
    "1. Train a model on Imagenette for more epochs, with and without label smoothing. Take a look at the Imagenette leaderboards and see how close you can get to the best results shown. Read the linked pages describing the leading approaches."
   ]
  },
--- a/clean/15_arch_details.ipynb
+++ b/clean/15_arch_details.ipynb
@@ -367,7 +367,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Wrapping up Architectures"
+    "## Wrapping Up Architectures"
   ]
  },
  {
@@ -381,24 +381,24 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "1. What is the head of a neural net?\n",
-    "1. What is the body of a neural net?\n",
+    "1. What is the \"head\" of a neural net?\n",
+    "1. What is the \"body\" of a neural net?\n",
    "1. What is \"cutting\" a neural net? Why do we need to do this for transfer learning?\n",
-    "1. What is \"model_meta\"? Try printing it to see what's inside.\n",
+    "1. What is `model_meta`? Try printing it to see what's inside.\n",
    "1. Read the source code for `create_head` and make sure you understand what each line does.\n",
-    "1. Look at the output of create_head and make sure you understand why each layer is there, and how the create_head source created it.\n",
-    "1. Figure out how to change the dropout, layer size, and number of layers created by create_cnn, and see if you can find values that result in better accuracy from the pet recognizer.\n",
-    "1. What does AdaptiveConcatPool2d do?\n",
-    "1. What is nearest neighbor interpolation? How can it be used to upsample convolutional activations?\n",
-    "1. What is a transposed convolution? What is another name for it?\n",
+    "1. Look at the output of `create_head` and make sure you understand why each layer is there, and how the `create_head` source created it.\n",
+    "1. Figure out how to change the dropout, layer size, and number of layers created by `create_cnn`, and see if you can find values that result in better accuracy from the pet recognizer.\n",
+    "1. What does `AdaptiveConcatPool2d` do?\n",
+    "1. What is \"nearest neighbor interpolation\"? How can it be used to upsample convolutional activations?\n",
+    "1. What is a \"transposed convolution\"? What is another name for it?\n",
    "1. Create a conv layer with `transpose=True` and apply it to an image. Check the output shape.\n",
-    "1. Draw the u-net architecture.\n",
-    "1. What is BPTT for Text Classification (BPT3C)?\n",
+    "1. Draw the U-Net architecture.\n",
+    "1. What is \"BPTT for Text Classification\" (BPT3C)?\n",
    "1. How do we handle different length sequences in BPT3C?\n",
    "1. Try to run each line of `TabularModel.forward` separately, one line per cell, in a notebook, and look at the input and output shapes at each step.\n",
    "1. How is `self.layers` defined in `TabularModel`?\n",
    "1. What are the five steps for preventing over-fitting?\n",
-    "1. Why don't we reduce architecture complexity before trying other approaches to preventing over-fitting?"
+    "1. Why don't we reduce architecture complexity before trying other approaches to preventing overfitting?"
   ]
  },
  {
@@ -413,10 +413,10 @@
   "metadata": {},
   "source": [
    "1. Write your own custom head and try training the pet recognizer with it. See if you can get a better result than fastai's default.\n",
-    "1. Try switching between AdaptiveConcatPool2d and AdaptiveAvgPool2d in a CNN head and see what difference it makes.\n",
-    "1. Write your own custom splitter to create a separate parameter group for every resnet block, and a separate group for the stem. Try training with it, and see if it improves the pet recognizer.\n",
-    "1. Read the online chapter about generative image models, and create your own colorizer, super resolution model, or style transfer model.\n",
-    "1. Create a custom head using nearest neighbor interpolation and use it to do segmentation on Camvid."
+    "1. Try switching between `AdaptiveConcatPool2d` and `AdaptiveAvgPool2d` in a CNN head and see what difference it makes.\n",
+    "1. Write your own custom splitter to create a separate parameter group for every ResNet block, and a separate group for the stem. Try training with it, and see if it improves the pet recognizer.\n",
+    "1. Read the online chapter about generative image models, and create your own colorizer, super-resolution model, or style transfer model.\n",
+    "1. Create a custom head using nearest neighbor interpolation and use it to do segmentation on CamVid."
   ]
  },
  {