This commit is contained in:
Jeremy Howard 2020-03-04 12:07:53 -08:00
parent ee85ecc89f
commit 6e9584fa04
6 changed files with 178 additions and 21 deletions

View File

@ -629,6 +629,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"A `DataLoaders` object (i.e. the plural) stores multiple `DataLoader` objects, normally a `train` and a `valid`, although it's possible to have as many as you like. (Later in the book we'll also learn about the `Dataset` and `Datasets` classes, which have the same relationship).\n",
"\n",
"To turn our downloaded data into `DataLoaders` we need to tell fastai at least four things:\n",
"\n",
"- what kinds of data we are working with ;\n",
@ -830,14 +832,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"> note: The `get_idx` assignment in this code is a little bit magic, and you absolutely don't have to understand it at this point. So feel free to ignore the entirety of this paragraph! This is just if you're curious… Showing different randomly varied versions of the same image is not something we normally have to do in deep learning, so it's not something that fastai provides directly. Therefore to draw the picture of data augmentation on the same image, we had to take advantage of fastai's sophisticated customisation features. DataLoader has a method called `get_idx`, which is called to decide which items should be selected next. Normally when we are training, this returns a random permutation of all of the indexes in the dataset. But pretty much everything in fastai can be changed, including how the `get_idx` method is defined, which means we can change how we sample data. So in this case, we are replacing it with a version which always returns the number one. That way, our DataLoader shows the same image again and again! This is a great example of the flexibility that fastai provides. "
"> note: The `get_idx` assignment in this code is a little bit magic, and you absolutely don't have to understand it at this point. So feel free to ignore the entirety of this paragraph! This is just if you're curious… Showing different randomly varied versions of the same image is not something we normally have to do in deep learning, so it's not something that fastai provides directly. Therefore to draw the picture of data augmentation on the same image, we had to take advantage of fastai's sophisticated customisation features. DataLoader has a method called `get_idx`, which is called to decide which items should be selected next. Normally when we are training, this returns a random permutation of all of the indexes in the dataset. But pretty much everything in fastai can be changed, including how the `get_idx` method is defined, which means we can change how we sample data. So in this case, we are replacing it with a version which always returns the number one. That way, our DataLoader shows the same image again and again! This is a great example of the flexibility that fastai provides. "
]
},
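For the curious, here is a sketch of the trick the note describes, assuming the `dls` object built earlier and fastcore's `Inf.ones` (an endless stream of 1s):

```python
# Replace get_idx so the DataLoader always picks item number one; show_batch
# then draws several randomly augmented versions of that same image.
dls.train.get_idx = lambda: Inf.ones
dls.train.show_batch(max_n=8, nrows=2)
```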
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In fact, an entirely untrained neural network knows nothing whatsoever about how images behave. It doesn't even recognise that when an object is moved one pixel to the left, then it still is a picture of the same thing! So actually training the neural network with examples of images that are in slightly different places, and slightly different sizes, helps it to understand the basic concept of what a *object* is, and how it can be represented in an image.\n",
"In fact, an entirely untrained neural network knows nothing whatsoever about how images behave. It doesn't even recognise that when an object is rotated by one degree, then it still is a picture of the same thing! So actually training the neural network with examples of images that are in slightly different places, and slightly different sizes, helps it to understand the basic concept of what a *object* is, and how it can be represented in an image.\n",
"\n",
"This is a specific example of a more general technique, called *data augmentation*."
]
@ -853,7 +855,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Data augmentation refers to creating random variations of our input data, such that they appear different, but are not expected to change the meaning of the data. Examples of common data augmentation for images are rotation, flipping, perspective warping, brightness changes, contrast changes, and much more. For natural photo images such as the ones we are using here, there is a standard set of augmentations which we have found work pretty well, and are provided with the get transforms function. Because the images are now all the same size, we can apply these augmentations to an entire batch of them using the GPU, which will save a lot of time. To tell fastai we want to use these transforms to a batch, we use the `batch_tfms` parameter. (Note that's we're not using `RandomResizedCrop` in this example, so you can see the differences more clearly; we're also using double the amount of augmentation compared to the default, for the same reason)."
"Data augmentation refers to creating random variations of our input data, such that they appear different, but are not expected to change the meaning of the data. Examples of common data augmentation for images are rotation, flipping, perspective warping, brightness changes, contrast changes, and much more. For natural photo images such as the ones we are using here, there is a standard set of augmentations which we have found work pretty well, and are provided with the `aug_transforms` function. Because the images are now all the same size, we can apply these augmentations to an entire batch of them using the GPU, which will save a lot of time. To tell fastai we want to use these transforms to a batch, we use the `batch_tfms` parameter. (Note that's we're not using `RandomResizedCrop` in this example, so you can see the differences more clearly; we're also using double the amount of augmentation compared to the default, for the same reason)."
]
},
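A sketch of what this might look like, reusing the hypothetical `bears` DataBlock from above and doubling the default augmentation with `mult=2`:

```python
# `new` returns a copy of the DataBlock with the given transforms swapped in;
# aug_transforms(mult=2) doubles the default amount of augmentation.
bears = bears.new(item_tfms=Resize(128), batch_tfms=aug_transforms(mult=2))
dls = bears.dataloaders(path)
dls.train.show_batch(max_n=8, nrows=2)
```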
{
@ -1216,7 +1218,9 @@
"source": [
"Once you've got a model you're happy with, you need to save it, so that you can then copy it over to a server where you'll use it in production. Remember that a model consists of two parts: the *architecture*, and the trained *parameters*. The easiest way to save a model is to save both of these, because that way when you load a model you can be sure that you have the matching architecture and parameters. To save both parts, use the `export` method.\n",
"\n",
"This method even saves the definition of how to create your `DataLoaders`. This is important, because otherwise you would have to redefine how to transform your data in order to use your model in production. When you call export, fastai will save a file called `export.pkl`."
"This method even saves the definition of how to create your `DataLoaders`. This is important, because otherwise you would have to redefine how to transform your data in order to use your model in production. fastai automatically uses your validation set `DataLoader` for inference by default, so your data augmentation will not be applied, which is generally what you want.\n",
"\n",
"When you call export, fastai will save a file called `export.pkl`."
]
},
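A sketch of the call, assuming your trained `Learner` is named `learn`:

```python
# Saves the architecture, the trained parameters, and the definition of how
# to create the DataLoaders, all in a single file: export.pkl.
learn.export()
```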
{
@ -1232,7 +1236,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's check that file exists:"
"Let's check that file exists, by using the `Path.ls` method that fastai adds to Python's `Path` class:"
]
},
{
@ -1253,7 +1257,7 @@
],
"source": [
"path = Path()\n",
"Path().ls(file_exts='.pkl')"
"path.ls(file_exts='.pkl')"
]
},
{
@ -1654,7 +1658,7 @@
"source": [
"Now that we have everything working in this Jupyter notebook, we can create our application. To do this, create a notebook which contains only the code needed to create and show the widgets that you need, and markdown for any text that you want to appear. Have a look at the *bear_classifier* notebook in the book repo to see the simple notebook application we created.\n",
"\n",
"Next, install Voila if you have not already, by copying these lines into a Notebook cell, and executing it:\n",
"Next, install Voila if you have not already, by copying these lines into a Notebook cell, and executing it (if you're comfortable using the command line, you can also execute these two lines in your terminal, without the `!` prefix):\n",
"\n",
" !pip install voila\n",
" !jupyter serverextension enable voila --sys-prefix\n",
@ -1682,7 +1686,7 @@
"As we now know, you need a GPU to train nearly any useful deep learning model. So, do you need a GPU to use that model in production? No! You almost certainly **do not need a GPU to serve your model in production**. There's a few reasons for this:\n",
"\n",
"- As we've seen, GPUs are only useful when they do lots of identical work in parallel. If you're doing (say) image classification, then you'll normally be classifying just one user's image at a time, and there isn't normally enough work to do in a single image to keep a GPU busy for long enough for it to be very efficient. So a CPU will often be more cost effective.\n",
"- An alternative could be to wait for a few users to submit their images, and then batch them up, and do them all at once on a GPU. But then you're asking your users to wait, rather than getting answers straight away! And you need a high volume site for this to be workable.\n",
"- An alternative could be to wait for a few users to submit their images, and then batch them up, and do them all at once on a GPU. But then you're asking your users to wait, rather than getting answers straight away! And you need a high volume site for this to be workable. If you do need this functionality, you can use a tool such as Microsoft's [ONNX Runtime](https://github.com/microsoft/onnxruntime), or [AWS Sagemaker](https://aws.amazon.com/sagemaker/)\n",
"- The complexities of dealing with GPU inference are significant. In particular, the GPU's memory will need careful manual management, and you'll need some careful queueing system to ensure you only do one batch at a time\n",
"- There's a lot more market competition in CPU servers than GPU, as a result of which there's much cheaper options available for CPU servers.\n",
"\n",
@ -1703,6 +1707,20 @@
"6. Click \"Launch\"."
]
},
{
"cell_type": "markdown",
"metadata": {},
@ -1791,7 +1809,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Where possible, the first step is to use an entirely manual process, with your deep learning model approach running in parallel, but not being used directly to drive any actions. The humans involved in the manual process should look at the deep learning outputs and check whether they make sense. For instance, with our bear classifier a park ranger could have a screen displaying any time a possible bear sighting occurred in any camera, and simply highlight them in red on the screen. The park ranger would still be expected to be just as alert as before the model was deployed; they are simply helping to check for problems at this point.\n",
"Where possible, the first step is to use an entirely manual process, with your deep learning model approach running in parallel, but not being used directly to drive any actions. The humans involved in the manual process should look at the deep learning outputs and check whether they make sense. For instance, with our bear classifier a park ranger could have a screen displaying any time a possible bear sighting occurred in any camera, and simply highlight them in red on the screen. The park ranger would still be expected to be just as alert as before the model was deployed; the model is simply helping to check for problems at this point.\n",
"\n",
"The second step is to try to limit the scope of the model, and have it carefully supervised by people. For instance, do a small geographically and time constrained trial of the model-driven approach. Rather than rolling your bear classifier out in every national park throughout the country, pick a single observation post, for a one-week period, and have a park ranger check each alert before it goes out.\n",
"\n",
@ -1820,6 +1838,10 @@
"\n",
"However, human beings tend to be drawn towards controversial content. This meant that videos about things like conspiracy theories started to get recommended more and more by the recommendation system. Furthermore, it turns out that the kinds of people that are interested in conspiracy theories are also people that watch a lot of online videos! So, they started to get drawn more and more towards YouTube. The increasing number of conspiracy theorists watching YouTube resulted in the algorithm recommending more and more conspiracy theories and other extremist content, which resulted in more extremists watching videos on YouTube, and more people watching YouTube developing extremist views, which led to the algorithm recommending more extremist content... The system became so out of control that in February 2019 it led the New York Times to run the headline \"YouTube Unleashed a Conspiracy Theory Boom. Can It Be Contained?\"footnote:[https://www.nytimes.com/2019/02/19/technology/youtube-conspiracy-stars.html]\n",
"\n",
"One of our reviewers for this book, Aurélien Géron, led YouTube's video classification team from 2013 to 2016. He pointed out that it's not just feedback loops involving humans that are a problem. There can also be feedback loops without humans! He told us about an example from YouTube:\n",
"\n",
"> \"One important signal to classify the main topic of a video is the channel it comes from. For example, a video uploaded to a cooking channel is very likely to be a cooking video. But how do we know what topic a channel is about? Well… in part by looking at the topics of the videos it contains! Do you see the loop? For example, many videos have a description which indicates what camera was used to shoot the video. As a result, some of these videos might get classified as videos about “photography”. If a channel has such as misclassified video, it might be classified as a “photography” channel, making it even more likely for future videos on this channel to be wrongly classified as “photography”. This could even lead to runaway virus-like classifications! One way to break this feedback loop is to classify videos with and without the channel signal. Then when classifying the channels, you can only use the classes obtained without the channel signal. This way, the feedback loop is broken.\"\n",
"\n",
"A helpful exercise prior to rolling out a significant machine learning system is to consider this question: \"what would happen if it went really, really well?\" In other words, what if the predictive power was extremely high, and its ability to influence behaviour was extremely significant? In that case, who would be most impacted? What would the most extreme results potentially look like? How would you know what was really going on?\n",
"\n",
"Such a thought exercise might help you to construct a more careful rollout plan, ongoing monitoring systems, and human oversight. Of course, human oversight isn't useful if it isn't listened to; so make sure that there are reliable and resilient communication channels so that the right people will be aware of issues, and will have the power to fix them."

View File

@ -5406,6 +5406,31 @@
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,

View File

@ -2567,7 +2567,20 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To learn more about `Transform`s and how you can use them to have different behavior depending on the type of the input, be sure to check our tutorial in the docs online."
"Note that the method called and the method implemented are different, for each of these methods:\n",
"\n",
"```asciidoc\n",
"[options=\"header\"]\n",
"|======\n",
"| Class | To call | To implement\n",
"| `nn.Module` (PyTorch) | `()` (i.e. call as function) | `forward`\n",
"| `Transform` | `()` | `encodes`\n",
"| `Transform` | `decode()` | `decodes`\n",
"| `Transform` | `setup()` | `setups`\n",
"|======\n",
"```\n",
"\n",
"So, for instance, you would never call `setups` directly, but instead would call `setups`. The reason for this is that `setup` does some work before and after calling `setups` for you. To learn more about `Transform`s and how you can use them to have different behavior depending on the type of the input, be sure to check the tutorials in the fastai docs."
]
},
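To see the convention in action, here's a toy `Transform` invented for illustration:

```python
from fastcore.transform import Transform

class SubtractMean(Transform):
    def setups(self, items): self.mean = sum(items)/len(items)  # reached via setup()
    def encodes(self, x): return x - self.mean                  # reached via ()
    def decodes(self, x): return x + self.mean                  # reached via decode()

tfm = SubtractMean()
tfm.setup([1.0, 2.0, 3.0])  # calls our setups, with some housekeeping around it
y = tfm(2.0)                # calls encodes -> 0.0
x = tfm.decode(y)           # calls decodes -> 2.0
```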
{
@ -3262,6 +3275,31 @@
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,

View File

@ -42,7 +42,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Whenever we start working on a new problem, we always first try to think of the simplest dataset we can which would allow us to try out methods quickly and easily, and interpret the results. When we started working on language modelling a few years ago, we didn't find any datasets that would allow for quick prototyping, so we made one. We call it *human numbers*, and it simply contains the first 10,000 words written out in English."
"Whenever we start working on a new problem, we always first try to think of the simplest dataset we can which would allow us to try out methods quickly and easily, and interpret the results. When we started working on language modelling a few years ago, we didn't find any datasets that would allow for quick prototyping, so we made one. We call it *human numbers*, and it simply contains the first 10,000 numbers written out in English."
]
},
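The dataset ships with fastai; a sketch of how you might peek at it, assuming it's available via `URLs.HUMAN_NUMBERS`:

```python
from fastai.text.all import *

path = untar_data(URLs.HUMAN_NUMBERS)  # downloads and extracts the dataset
with open(path/'train.txt') as f: lines = L(*f.readlines())
lines[:3]  # something like: 'one \n', 'two \n', 'three \n'
```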
{
@ -674,6 +674,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In `LMModel2` we only have one weight matrix, `h_h`, to calculate the next hidden state from the previous hidden state. Therefore the hidden state isn't able to easily calculate anything much more complex than a linear relationship. In next chapter we'll see how to create truely deep RNNs.\n",
"\n",
"A neural network which is defined using a loop like this is called a *recurrent neural network*, also known as an RNN. It is important to realise that an RNN is not a complicated new architecture, but is simply a refactoring of a multilayer neural network using a for loop.\n",
"\n",
"> A: My true opinion: if they were called \"looping neural networks\", or LNNs, they would seem 50% less daunting!"
@ -1281,6 +1283,31 @@
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,

View File

@ -79,7 +79,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The obvious way to get a better model is to go deeper: we only have one linear layer between the hidden state and the output activations in our basic RNN, so maybe we would get better results with more."
"The obvious way to get a better model is to go deeper: as we discussed in the last chapter, we only have one linear layer between the hidden state and the output activations in our basic RNN, so maybe we would get better results with more."
]
},
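As a taste of where we're heading, PyTorch's built-in `nn.RNN` can stack layers directly; a minimal sketch (the sizes here are arbitrary):

```python
import torch.nn as nn

# Two stacked RNN layers: the output sequence of the first layer becomes
# the input sequence of the second.
rnn = nn.RNN(input_size=64, hidden_size=64, num_layers=2, batch_first=True)
```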
{
@ -388,9 +388,9 @@
"\n",
"First, the arrows for input and old hidden state are joined together. In the RNN we wrote in the past chapter, we were adding them together. In the LSTM, we stack them in one big tensor. This means the dimension of our embeddings (which is the dimension of $x_{t}$) can be different than the dimension of our hidden state. If we call those `n_in` and `n_hid`, the arrow at the bottom is of size `n_in + n_hid`, thus all the neural nets (orange boxes) are linear layers with `n_in + n_hid` inputs and `n_hid` outputs.\n",
"\n",
"The first gate (looking from the left to right) is called the *forget gate*. Since it's a linear layer followed by a sigmoid, its output will have scalars between 0 and 1. We multiply this result y the cell gate, so for all the values close to 0, we will forget what was inside that cell state (and for the values close to 1 it doesn't do anything). This gives the ability to the LSTM to forget things about its longterm state. For instance, when crossing a period or an `xxbos` token, we would expect to it to (have learned to) reset its cell state.\n",
"The first gate (looking from the left to right) is called the *forget gate*. Since it's a linear layer followed by a sigmoid, its output will have scalars between 0 and 1. We multiply this result by the cell gate, so for all the values close to 0, we will forget what was inside that cell state (and for the values close to 1 it doesn't do anything). This gives the ability to the LSTM to forget things about its longterm state. For instance, when crossing a period or an `xxbos` token, we would expect to it to (have learned to) reset its cell state.\n",
"\n",
"The second gate works is called the *input gate*. It works with the third gate (which doesn't really have a name but is sometimes called the *cell gate*) to update the cell state. For instance we may see a new gender pronoun, so we must replace the information about gender that the forget gate removed by the new one. Like the forget gate, the input gate ends up on a product, so it jsut decides which element of the cell state to update (valeus close to 1) or not (values close to 0). The third gate will then fill those values with things between -1 and 1 (thanks to the tanh). The result is then added to the cell state.\n",
"The second gate is called the *input gate*. It works with the third gate (which doesn't really have a name but is sometimes called the *cell gate*) to update the cell state. For instance we may see a new gender pronoun, so we must replace the information about gender that the forget gate removed by the new one. Like the forget gate, the input gate ends up on a product, so it jsut decides which element of the cell state to update (valeus close to 1) or not (values close to 0). The third gate will then fill those values with things between -1 and 1 (thanks to the tanh). The result is then added to the cell state.\n",
"\n",
"The last gate is the *output gate*. It will decides which information take in the cell state to generate the output. The cell state goes through a tanh before this and the output gate combined with the sigmoid decides which values to take inside it.\n",
"\n",
@ -1122,6 +1122,31 @@
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,

View File

@ -310,13 +310,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Sometimes, callbacks need to be called in a particular order. In the case of `TerminateOnNaNCallback`, it's important that `Recorder` runs its `after_batch` after this callback, to avoid registering an NaN loss. You can specify `run_before` (this callback must run before ...) or `run_after` (this callback must run after ...) in your callback to ensure the ordering that you need."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sometimes, callbacks need to be called in a particular order. In the case of `TerminateOnNaNCallback`, it's important that `Recorder` runs its `after_batch` after this callback, to avoid registering an NaN loss. You can specify `run_before` (this callback must run before ...) or `run_after` (this callback must run after ...) in your callback to ensure the ordering that you need.\n",
"\n",
"Now that we have seen how to tweak the training loop of fastai to do anything we need, let's take a step back and dig a little bit deeper in the foundations of that training loop."
]
},
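A sketch of how this looks in a callback definition, modelled on fastai's `TerminateOnNaNCallback` (assuming the usual fastai imports so that `Callback`, `Recorder`, and `CancelFitException` are in scope):

```python
class TerminateOnNaNCallback(Callback):
    run_before = Recorder   # guarantees Recorder's after_batch runs after ours
    def after_batch(self):
        # cancel training before a NaN loss can be recorded
        if torch.isinf(self.loss) or torch.isnan(self.loss):
            raise CancelFitException()
```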
@ -392,6 +387,31 @@
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,