This commit is contained in:
Sylvain Gugger
2020-05-19 16:56:41 -07:00
parent 4b1345a068
commit cd1aa1f758
20 changed files with 497 additions and 510 deletions

@@ -1513,7 +1513,7 @@
"1. What is the name of the theorem that shows that a neural network can solve any mathematical problem to any level of accuracy?\n",
"1. What do you need in order to train a model?\n",
"1. How could a feedback loop impact the rollout of a predictive policing model?\n",
"1. Do we always have to use 224\\*224-pixel images with the cat recognition model?\n",
"1. Do we always have to use 224×224-pixel images with the cat recognition model?\n",
"1. What is the difference between classification and regression?\n",
"1. What is a validation set? What is a test set? Why do we need them?\n",
"1. What will fastai do if you don't provide a validation set?\n",

@@ -4232,7 +4232,7 @@
"1. What is the difference between tensor rank and shape? How do you get the rank from the shape?\n",
"1. What are RMSE and L1 norm?\n",
"1. How can you apply a calculation on thousands of numbers at once, many thousands of times faster than a Python loop?\n",
"1. Create a 3\\*3 tensor or array containing the numbers from 1 to 9. Double it. Select the bottom-right four numbers.\n",
"1. Create a 3×3 tensor or array containing the numbers from 1 to 9. Double it. Select the bottom-right four numbers.\n",
"1. What is broadcasting?\n",
"1. Are metrics generally calculated using the training set, or the validation set? Why?\n",
"1. What is SGD?\n",

@@ -2590,8 +2590,8 @@
"source": [
"1. What is a \"feature\"?\n",
"1. Write out the convolutional kernel matrix for a top edge detector.\n",
"1. Write out the mathematical operation applied by a 3\\*3 kernel to a single pixel in an image.\n",
"1. What is the value of a convolutional kernel apply to a 3\\*3 matrix of zeros?\n",
"1. Write out the mathematical operation applied by a 3×3 kernel to a single pixel in an image.\n",
"1. What is the value of a convolutional kernel applied to a 3×3 matrix of zeros?\n",
"1. What is \"padding\"?\n",
"1. What is \"stride\"?\n",
"1. Create a nested list comprehension to complete any task that you choose.\n",

@@ -841,7 +841,7 @@
"1. What is the basic equation for a ResNet block (ignoring batchnorm and ReLU layers)?\n",
"1. What do ResNets have to do with residuals?\n",
"1. How do we deal with the skip connection when there is a stride-2 convolution? How about when the number of filters changes?\n",
"1. How can we express a 1\\*1 convolution in terms of a vector dot product?\n",
"1. How can we express a 1×1 convolution in terms of a vector dot product?\n",
"1. Create a 1×1 convolution with `F.conv2d` or `nn.Conv2d` and apply it to an image. What happens to the `shape` of the image?\n",
"1. What does the `noop` function return?\n",
"1. Explain what is shown in <<resnet_surface>>.\n",
@@ -865,7 +865,7 @@
"metadata": {},
"source": [
"1. Try creating a fully convolutional net with adaptive average pooling for MNIST (note that you'll need fewer stride-2 layers). How does it compare to a network without such a pooling layer?\n",
"1. In <<chapter_foundations>> we introduce *Einstein summation notation*. Skip ahead to see how this works, and then write an implementation of the 1\\*1 convolution operation using `torch.einsum`. Compare it to the same operation using `torch.conv2d`.\n",
"1. In <<chapter_foundations>> we introduce *Einstein summation notation*. Skip ahead to see how this works, and then write an implementation of the 1×1 convolution operation using `torch.einsum`. Compare it to the same operation using `torch.conv2d`.\n",
"1. Write a \"top-5 accuracy\" function using plain PyTorch or plain Python.\n",
"1. Train a model on Imagenette for more epochs, with and without label smoothing. Take a look at the Imagenette leaderboards and see how close you can get to the best results shown. Read the linked pages describing the leading approaches."
]
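
Note on the 1×1 convolution question in the hunk above: the following is a minimal sketch, not taken from the notebook, of how a 1×1 convolution reduces to one dot product per pixel. Plain Python lists stand in for tensors, and the names `dot` and `conv1x1` are hypothetical helpers introduced only for illustration.

```python
# Hypothetical helpers; nested lists stand in for tensors.
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def conv1x1(image, weights):
    """Apply a 1x1 convolution with no bias.

    image:   nested list indexed [channel][row][col]
    weights: nested list indexed [out_channel][in_channel]
    returns: nested list indexed [out_channel][row][col]
    """
    n_rows, n_cols = len(image[0]), len(image[0][0])
    return [
        [
            # Each output pixel is the dot product of one filter's weights
            # with the vector of input-channel values at that pixel.
            [dot(w, [chan[r][c] for chan in image]) for c in range(n_cols)]
            for r in range(n_rows)
        ]
        for w in weights
    ]

image = [
    [[1, 2], [3, 4]],  # channel 0
    [[5, 6], [7, 8]],  # channel 1
]
weights = [[1, 1], [2, -1]]  # two output filters over two input channels
out = conv1x1(image, weights)
```

The spatial size is unchanged and only the channel dimension changes, which is what the follow-up question about `shape` is probing.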

@@ -23,7 +23,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## A Neural Net Layer from Scratch"
"## Building a Neural Net Layer from Scratch"
]
},
{
@@ -710,7 +710,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Broadcasting Rules"
"#### Broadcasting rules"
]
},
{
@@ -1156,7 +1156,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Gradients and Backward Pass"
"### Gradients and the Backward Pass"
]
},
{
@@ -1258,7 +1258,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Refactor the Model"
"### Refactoring the Model"
]
},
{
@@ -1537,41 +1537,41 @@
"1. Write the Python code to implement a single neuron.\n",
"1. Write the Python code to implement ReLU.\n",
"1. Write the Python code for a dense layer in terms of matrix multiplication.\n",
"1. Write the Python code for a dense layer in plain Python (that is with list comprehensions and functionality built into Python).\n",
"1. What is the hidden size of a layer?\n",
"1. What does the `t` method to in PyTorch?\n",
"1. Write the Python code for a dense layer in plain Python (that is, with list comprehensions and functionality built into Python).\n",
"1. What is the \"hidden size\" of a layer?\n",
"1. What does the `t` method do in PyTorch?\n",
"1. Why is matrix multiplication written in plain Python very slow?\n",
"1. In matmul, why is `ac==br`?\n",
"1. In Jupyter notebook, how do you measure the time taken for a single cell to execute?\n",
"1. What is elementwise arithmetic?\n",
"1. In `matmul`, why is `ac==br`?\n",
"1. In Jupyter Notebook, how do you measure the time taken for a single cell to execute?\n",
"1. What is \"elementwise arithmetic\"?\n",
"1. Write the PyTorch code to test whether every element of `a` is greater than the corresponding element of `b`.\n",
"1. What is a rank-0 tensor? How do you convert it to a plain Python data type?\n",
"1. What does this return, and why?: `tensor([1,2]) + tensor([1])`\n",
"1. What does this return, and why?: `tensor([1,2]) + tensor([1,2,3])`\n",
"1. How does elementwise arithmetic help us speed up matmul?\n",
"1. What does this return, and why? `tensor([1,2]) + tensor([1])`\n",
"1. What does this return, and why? `tensor([1,2]) + tensor([1,2,3])`\n",
"1. How does elementwise arithmetic help us speed up `matmul`?\n",
"1. What are the broadcasting rules?\n",
"1. What is `expand_as`? Show an example of how it can be used to match the results of broadcasting.\n",
"1. How does `unsqueeze` help us to solve certain broadcasting problems?\n",
"1. How can you use indexing to do the same operation as `unsqueeze`?\n",
"1. How can we use indexing to do the same operation as `unsqueeze`?\n",
"1. How do we show the actual contents of the memory used for a tensor?\n",
"1. When adding a vector of size 3 to a matrix of size 3 x 3, are the elements of the vector added to each row, or each column of the matrix? (Be sure to check your answer by running this code in a notebook.)\n",
"1. When adding a vector of size 3 to a matrix of size 3×3, are the elements of the vector added to each row or each column of the matrix? (Be sure to check your answer by running this code in a notebook.)\n",
"1. Do broadcasting and `expand_as` result in increased memory use? Why or why not?\n",
"1. Implement matmul using Einstein summation.\n",
"1. Implement `matmul` using Einstein summation.\n",
"1. What does a repeated index letter represent on the left-hand side of einsum?\n",
"1. What are the three rules of Einstein summation notation? Why?\n",
"1. What is the forward pass, and the backward pass, of a neural network?\n",
"1. What are the forward pass and backward pass of a neural network?\n",
"1. Why do we need to store some of the activations calculated for intermediate layers in the forward pass?\n",
"1. What is the downside of having activations with a standard deviation too far away from one?\n",
"1. How can weight initialisation help avoid this problem?\n",
"1. What is the formula to initialise weights such that we get a standard deviation of one, for a plain linear layer; for a linear layer followed by ReLU?\n",
"1. What is the downside of having activations with a standard deviation too far away from 1?\n",
"1. How can weight initialization help avoid this problem?\n",
"1. What is the formula to initialize weights such that we get a standard deviation of 1 for a plain linear layer, and for a linear layer followed by ReLU?\n",
"1. Why do we sometimes have to use the `squeeze` method in loss functions?\n",
"1. What does the argument to the squeeze method do? Why might it be important to include this argument, even though PyTorch does not require it?\n",
"1. What is the chain rule? Show the equation in either of the two forms shown in this chapter.\n",
"1. What does the argument to the `squeeze` method do? Why might it be important to include this argument, even though PyTorch does not require it?\n",
"1. What is the \"chain rule\"? Show the equation in either of the two forms presented in this chapter.\n",
"1. Show how to calculate the gradients of `mse(lin(l2, w2, b2), y)` using the chain rule.\n",
"1. What is the gradient of relu? Show in math or code. (You shouldn't need to commit this to memory—try to figure it using your knowledge of the shape of the function.)\n",
"1. What is the gradient of ReLU? Show it in math or code. (You shouldn't need to commit this to memory; try to figure it out using your knowledge of the shape of the function.)\n",
"1. In what order do we need to call the `*_grad` functions in the backward pass? Why?\n",
"1. What is `__call__`?\n",
"1. What methods do we need to implement when writing a `torch.autograd.Function`?\n",
"1. What methods must we implement when writing a `torch.autograd.Function`?\n",
"1. Write `nn.Linear` from scratch, and test that it works.\n",
"1. What is the difference between `nn.Module` and fastai's `Module`?"
]
@@ -1587,10 +1587,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Implement relu as a `torch.autograd.Function` and train a model with it.\n",
"1. If you are mathematically inclined, find out what the gradients of a linear layer are in maths notation. Map that to the implementation we saw in this chapter.\n",
"1. Learn about the `unfold` method in PyTorch, and use it along with matrix multiplication to implement your own 2d convolution function, and train a CNN that uses it.\n",
"1. Implement all what is in this chapter using numpy instead of PyTorch. "
"1. Implement ReLU as a `torch.autograd.Function` and train a model with it.\n",
"1. If you are mathematically inclined, find out what the gradients of a linear layer are in mathematical notation. Map that to the implementation we saw in this chapter.\n",
"1. Learn about the `unfold` method in PyTorch, and use it along with matrix multiplication to implement your own 2D convolution function. Then train a CNN that uses it.\n",
"1. Implement everything in this chapter using NumPy instead of PyTorch. "
]
},
{
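
Note on the "dense layer in plain Python" question in the hunk above: one possible sketch, assuming inputs are a batch of row vectors and weights are stored as `[n_in][n_out]`. The function names `dense` and `relu` and the data layout are assumptions made for illustration, not the notebook's own code.

```python
def relu(x):
    """ReLU on a single number: negative values are replaced by zero."""
    return max(x, 0.0)

def dense(inputs, weights, biases):
    """Dense layer using only list comprehensions and built-ins.

    inputs:  [batch][n_in]
    weights: [n_in][n_out]  (assumed layout)
    biases:  [n_out]
    returns: [batch][n_out]
    """
    return [
        [
            # Weighted sum over the input features, plus the bias for unit j.
            sum(x_i * w_row[j] for x_i, w_row in zip(x, weights)) + biases[j]
            for j in range(len(biases))
        ]
        for x in inputs
    ]

x = [[1.0, 2.0], [0.0, -1.0]]
w = [[1.0, 0.0, -1.0], [0.5, 1.0, 1.0]]
b = [0.0, 1.0, 0.0]
out = dense(x, w, b)
act = [[relu(v) for v in row] for row in out]
```

Every multiplication here runs in the Python interpreter one element at a time, which is why this form is so much slower than a vectorized matrix multiplication.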

File diff suppressed because one or more lines are too long

@@ -14,7 +14,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# fastai Learner from Scratch"
"# A fastai Learner from Scratch"
]
},
{
@@ -1288,37 +1288,37 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"For the questions here that ask you to explain what some function or class is, you should also complete your own code experiments."
"> tip: Experiments: For the questions here that ask you to explain what some function or class is, you should also complete your own code experiments."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. What is glob?\n",
"1. What is `glob`?\n",
"1. How do you open an image with the Python Imaging Library (PIL)?\n",
"1. What does L.map do?\n",
"1. What does Self do?\n",
"1. What is L.val2idx?\n",
"1. What methods do you need to implement to create your own Dataset?\n",
"1. What does `L.map` do?\n",
"1. What does `Self` do?\n",
"1. What is `L.val2idx`?\n",
"1. What methods do you need to implement to create your own `Dataset`?\n",
"1. Why do we call `convert` when we open an image from Imagenette?\n",
"1. What does `~` do? How is it useful for splitting training and validation sets?\n",
"1. Which of these classes does `~` work with: `L`, `Tensor`, numpy array, Python `list`, pandas `DataFrame`?\n",
"1. What is ProcessPoolExecutor?\n",
"1. Does `~` work with the `L` or `Tensor` classes? What about NumPy arrays, Python lists, or pandas DataFrames?\n",
"1. What is `ProcessPoolExecutor`?\n",
"1. How does `L.range(self.ds)` work?\n",
"1. What is `__iter__`?\n",
"1. What is `first`?\n",
"1. What is `permute`? Why is it needed?\n",
"1. What is a recursive function? How does it help us define the `parameters` method?\n",
"1. Write a recursive function which returns the first 20 items of the Fibonacci sequence.\n",
"1. Write a recursive function that returns the first 20 items of the Fibonacci sequence.\n",
"1. What is `super`?\n",
"1. Why do subclasses of Module need to override `forward` instead of defining `__call__`?\n",
"1. In `ConvLayer` why does `init` depend on `act`?\n",
"1. Why do subclasses of `Module` need to override `forward` instead of defining `__call__`?\n",
"1. In `ConvLayer`, why does `init` depend on `act`?\n",
"1. Why does `Sequential` need to call `register_modules`?\n",
"1. Write a hook that prints the shape of every layers activations.\n",
"1. What is LogSumExp?\n",
"1. Why is log_softmax useful?\n",
"1. What is GetAttr? How is it helpful for callbacks?\n",
"1. Write a hook that prints the shape of every layer's activations.\n",
"1. What is \"LogSumExp\"?\n",
"1. Why is `log_softmax` useful?\n",
"1. What is `GetAttr`? How is it helpful for callbacks?\n",
"1. Reimplement one of the callbacks in this chapter without inheriting from `Callback` or `GetAttr`.\n",
"1. What does `Learner.__call__` do?\n",
"1. What is `getattr`? (Note the case difference to `GetAttr`!)\n",
@@ -1326,7 +1326,7 @@
"1. Why do we check for `model.training` in `one_batch`?\n",
"1. What is `store_attr`?\n",
"1. What is the purpose of `TrackResults.before_epoch`?\n",
"1. What does `model.cuda()` do? How does it work?\n",
"1. What does `model.cuda` do? How does it work?\n",
"1. Why do we need to check `model.training` in `LRFinder` and `OneCycle`?\n",
"1. Use cosine annealing in `OneCycle`."
]
@@ -1342,15 +1342,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Write `resnet18` from scratch (refer to <<chapter_resnet>> as needed), and train it with the Learner in this chapter.\n",
"1. Implement a batchnorm layer from scratch and use it in your resnet18.\n",
"1. Write a mixup callback for use in this chapter.\n",
"1. Add momentum to `SGD`.\n",
"1. Write `resnet18` from scratch (refer to <<chapter_resnet>> as needed), and train it with the `Learner` in this chapter.\n",
"1. Implement a batchnorm layer from scratch and use it in your `resnet18`.\n",
"1. Write a Mixup callback for use in this chapter.\n",
"1. Add momentum to SGD.\n",
"1. Pick a few features that you're interested in from fastai (or any other library) and implement them in this chapter.\n",
"1. Pick a research paper that's not yet implemented in fastai or PyTorch and implement it in this chapter.\n",
" - Port it over to fastai.\n",
" - Submit a PR to fastai, or create your own extension module and release it. \n",
" - Hint: you may find it helpful to use [nbdev](https://nbdev.fast.ai/) to create and deploy your package."
" - Submit a pull request to fastai, or create your own extension module and release it. \n",
" - Hint: you may find it helpful to use [`nbdev`](https://nbdev.fast.ai/) to create and deploy your package."
]
},
{
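
Note on the recursive Fibonacci question in the hunk above: a minimal sketch, assuming the sequence starts at 0 (some texts start it at 1). The name `fib` is illustrative, and the `lru_cache` decorator is an optional addition that avoids recomputing the same values.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Return the n-th Fibonacci number, with fib(0) == 0 and fib(1) == 1."""
    if n < 2:
        return n  # base cases stop the recursion
    return fib(n - 1) + fib(n - 2)  # recursive case

first_20 = [fib(i) for i in range(20)]
```

The same shape, a base case plus a call to itself on smaller inputs, is what lets a `parameters` method walk nested submodules.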

@@ -36,7 +36,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setting up Your Homepage"
"### Setting Up Your Home Page"
]
},
{
@@ -57,7 +57,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Jupyter for Blogging"
"## Jupyter for Blogging"
]
},
{