typos fixed in ch17 and ch18

Rubens 2020-03-23 10:36:11 -03:00
parent dc1bf74f26
commit 4e1d76f64d
2 changed files with 2 additions and 2 deletions

View File

@@ -1586,7 +1586,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"So our initialization wasn't right. This is because at the same the previous article was written, the popular activation in a neural net was the hyperbolic tangent (which is the one they use) and that initialization doesn't account for our ReLU. Fortunately someone else has done the math for us and computed the right scale we should use. Kaiming He et al. in [Delving Deep into Rectifiers: Surpassing Human-Level Performance](https://arxiv.org/abs/1502.01852) (which we've seen before--it's the article that introduced the ResNet) show we should use the following scale instead: $\\sqrt{2 / n_{in}}$ where $n_{in}$ is the number of inputs of our model."
"So our initialization wasn't right. This is because at the time the previous article was written, the popular activation in a neural net was the hyperbolic tangent (which is the one they use) and that initialization doesn't account for our ReLU. Fortunately someone else has done the math for us and computed the right scale we should use. Kaiming He et al. in [Delving Deep into Rectifiers: Surpassing Human-Level Performance](https://arxiv.org/abs/1502.01852) (which we've seen before--it's the article that introduced the ResNet) show we should use the following scale instead: $\\sqrt{2 / n_{in}}$ where $n_{in}$ is the number of inputs of our model."
]
},
{
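A minimal sketch, not the notebook's code, of what that $\sqrt{2 / n_{in}}$ scale buys you: the layer sizes and batch size below are made up, and plain PyTorch is used to check that activations keep a healthy spread after a ReLU layer initialized this way.

```python
import torch

# Sketch only (made-up sizes, not the notebook's code): scale random weights by
# sqrt(2 / n_in), i.e. He/Kaiming initialization for ReLU, and check that the
# activations keep a healthy spread after a matmul followed by ReLU.
n_in, n_hidden = 784, 50
x = torch.randn(200, n_in)                      # fake batch of normalized inputs
w = torch.randn(n_in, n_hidden) * (2 / n_in) ** 0.5
a = torch.relu(x @ w)
print(a.mean().item(), a.std().item())          # std stays on the order of 1 instead of collapsing toward 0
```

The same scale is also available as `torch.nn.init.kaiming_normal_` in PyTorch if you prefer not to write it by hand.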

View File

@@ -48,7 +48,7 @@
"source": [
"Class Activation Mapping (or CAM) was introduced by Zhou et al. in [Learning Deep Features for Discriminative Localization](https://arxiv.org/abs/1512.04150). It uses the output of the last convolutional layer (just before our average pooling) together with the predictions to give us some heatmap visulaization of why the model made its decision. This is a useful tool for intepretation.\n",
"\n",
"More precisely, at each position of our final convolutional layer we have has many filters as the last linear layer. We can then compute the dot product of those activations by the final weights to have, for each location on our feature map, the score of the feature that was used to make a decision.\n",
"More precisely, at each position of our final convolutional layer we have as many filters as the last linear layer. We can then compute the dot product of those activations by the final weights to have, for each location on our feature map, the score of the feature that was used to make a decision.\n",
"\n",
"We're going to need a way to get access to the activations inside the model while it's training. In PyTorch this can be done with a *hook*. Hooks are PyTorch's equivalent of fastai's *callbacks*. However rather than allowing you to inject code to the training loop like a fastai Learner callback, hooks allow you to inject code into the forward and backward calculations themselves. We can attach a hook to any layer of the model, and it will be executed when we compute the outputs (forward hook) or during backpropagation (backward hook). A forward hook has to be a function that takes three things: a module, its input and its output, and it can perform any behavior you want. (fastai also provides a handy `HookCallback` that we won't cover here, so take a look at the fastai docs; it makes working with hooks a little easier.)\n",
"\n",