Merge pull request #136 from joe-bender/ch2

Writing fixes, 02_production.ipynb
Sylvain Gugger 2020-04-23 08:38:39 -04:00 committed by GitHub
commit a3cc8df637


@ -47,7 +47,7 @@
"source": [
"We've seen that deep learning can solve a lot of challenging problems quickly and with little code. As a beginner there's a sweet spot of problems that are similar enough to our example problems that you can very quickly get extremely useful results. However, deep learning isn't magic! The same 5 lines of code won't work on every problem anyone can think of today. Underestimating the constraints and overestimating the capabilities of deep learning may lead to frustratingly poor results. At least until you gain some experience to solve the problems that arise. Overestimating the constraints and underestimating the capabilities of deep learning may mean you do not attempt a solvable problem because you talk yourself out of it. \n",
"\n",
"We often talk to people who underestimate both the constraints, and the capabilities of deep learning. Both of these can be problems: underestimating the capabilities means that you might not even try things which could be very beneficial; underestimating the constraints might mean that you fail to consider and react to important issues.\n",
"We often talk to people who underestimate both the constraints and the capabilities of deep learning. Both of these can be problems: underestimating the capabilities means that you might not even try things which could be very beneficial; underestimating the constraints might mean that you fail to consider and react to important issues.\n",
"\n",
"The best thing to do is to keep an open mind. If you remain open to the possibility that deep learning might solve part of your problem with less data or complexity than you expect, then it is possible to design a process where you can find the specific capabilities and constraints related to your particular problem as you work through the process. This doesn't mean making any risky bets — we will show you how you can gradually roll out models so that they don't create significant risks, and can even backtest them prior to putting them in production.\n",
"\n",
@ -65,7 +65,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"So where should you start your deep learning journey? The most important thing is to ensure that you have some project that you are working on — it is only through working on your own projects that you will get real experience of building and using models. When selecting a project, the most important consideration is data availability. Regardless of whether you are doing a project just for your own learning, or for practical application in your organization, you want something where you can get started quickly. We have seen many students, researchers, and industry practitioners waste months or years while they attempt to find their perfect dataset. The goal is not to find the perfect dataset, or the perfect project, but just to get started, and iterate from there.\n",
"So where should you start your deep learning journey? The most important thing is to ensure that you have some project that you are working on — it is only through working on your own projects that you will get the real experience of building and using models. When selecting a project, the most important consideration is data availability. Regardless of whether you are doing a project just for your own learning or for practical application in your organization, you want something where you can get started quickly. We have seen many students, researchers, and industry practitioners waste months or years while they attempt to find their perfect dataset. The goal is not to find the perfect dataset or the perfect project, but just to get started and iterate from there.\n",
"\n",
"If you take this approach, then you will be on your third iteration of learning and improving whilst the perfectionists are still in the planning stages!\n",
"\n",
@ -82,7 +82,7 @@
"\n",
"By using the end to end iteration approach you will also get a better understanding of how much data you really need. For instance, you may find you can only easily get 200 labelled data items, and you can't really know until you try whether that's enough to get the performance you need for your application to work well in practice.\n",
"\n",
"In an organizational context you will be able to show your colleagues that your idea can really work, by showing them a real working prototype. We have repeatedly observed that this is the secret to getting good organizational buy in for a project."
"In an organizational context you will be able to show your colleagues that your idea can really work, by showing them a real working prototype. We have repeatedly observed that this is the secret to getting good organizational buy-in for a project."
]
},
{
@ -93,7 +93,7 @@
"\n",
"Sometimes, you have to get a bit creative. Maybe you can find some previous machine learning project, such as a Kaggle competition, that is related to your field of interest. Sometimes, you have to compromise. Maybe you can't find the exact data you need for the precise project you have in mind; but you might be able to find something from a similar domain, or measured in a different way, tackling a slightly different problem. Working on these kinds of similar projects will still give you a good understanding of the overall process, and may help you identify other shortcuts, data sources, and so forth.\n",
"\n",
"Especially when you are just starting out with deep learning it's not a good idea to branch out into very different areas to places that deep learning has not been applied to before. That's because if your model does not work at first, you will not know whether it is because you have made a mistake, or if the very problem you are trying to solve is simply not solvable with deep learning. And you won't know where to look to get help. Therefore, it is best at first to start with something where you can find an example online of somebody who has had good results with something that is at least somewhat similar to what you are trying to achieve, or where you can convert your data into a format similar what someone else has used before (such as creating an image from your data). Let's have a look at the state of deep learning, just so you know what kinds of things deep learning is good at right now."
"Especially when you are just starting out with deep learning, it's not a good idea to branch out into very different areas to places that deep learning has not been applied to before. That's because if your model does not work at first, you will not know whether it is because you have made a mistake, or if the very problem you are trying to solve is simply not solvable with deep learning. And you won't know where to look to get help. Therefore, it is best at first to start with something where you can find an example online of somebody who has had good results with something that is at least somewhat similar to what you are trying to achieve, or where you can convert your data into a format similar to what someone else has used before (such as creating an image from your data). Let's have a look at the state of deep learning, just so you know what kinds of things deep learning is good at right now."
]
},
{
@ -107,7 +107,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's start by considering whether deep learning can be any good at the problem you are looking to work on. In general, here is a summary of the state of deep learning at the start of 2020. However, things move very fast, and by the time you read this some of these constraints may no longer exist. We will try to keep the book website up-to-date; in addition, a Google search for \"what can AI do now\" there is likely to provide some up-to-date information."
"Let's start by considering whether deep learning can be any good at the problem you are looking to work on. In general, here is a summary of the state of deep learning at the start of 2020. However, things move very fast, and by the time you read this some of these constraints may no longer exist. We will try to keep the book website up-to-date; in addition, a Google search for \"what can AI do now\" is likely to provide some up-to-date information."
]
},
{
@ -123,7 +123,7 @@
"source": [
"There are many domains in which deep learning has not been used to analyse images yet, but those where it has been tried have nearly universally shown that computers can recognise what items are in an image at least as well as people can — even specially trained people, such as radiologists. This is known as *object recognition*. Deep learning is also good at recognizing whereabouts objects in an image are, and can highlight their location and name each found object. This is known as *object detection* (there is also a variant of this we saw in <<chapter_intro>>, where every pixel is categorized based on what kind of object it is part of--this is called *segmentation*). Deep learning algorithms are generally not good at recognizing images that are significantly different in structure or style to those used to train the model. For instance, if there were no black-and-white images in the training data, the model may do poorly on black-and-white images. If the training data did not contain hand-drawn images then the model will probably do poorly on hand-drawn images. There is no general way to check what types of images are missing in your training set, but we will show in this chapter some ways to try to recognize when unexpected image types arise in the data when the model is being used in production (this is known as checking for *out of domain* data).\n",
"\n",
"One major challenge for object detection systems is that image labelling can be slow and expensive. There is a lot of work at the moment going into tools to try to make this labelling faster and easier, and require less handcrafted labels to train accurate object detection models. One approach which is particularly helpful is to synthetically generate variations of input images, such as by rotating them, or changing their brightness and contrast; this is called *data augmentation* and also works well for text and other types of model. We will be discussing it in detail in this chapter.\n",
"One major challenge for object detection systems is that image labelling can be slow and expensive. There is a lot of work at the moment going into tools to try to make this labelling faster and easier, and require fewer handcrafted labels to train accurate object detection models. One approach which is particularly helpful is to synthetically generate variations of input images, such as by rotating them or changing their brightness and contrast; this is called *data augmentation* and also works well for text and other types of model. We will be discussing it in detail in this chapter.\n",
"\n",
"Another point to consider is that although your problem might not look like a computer vision problem, it might be possible with a little imagination to turn it into one. For instance, if what you are trying to classify are sounds, you might try converting the sounds into images of their acoustic waveforms and then training a model on those images."
]
@ -139,7 +139,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Just like in computer vision, computers are very good at categorising both short and long documents based on categories such as spam, sentiment (e.g. is the review positive or negative), author, source website, and so forth. We are not aware of any rigorous work done in this area to compare to human performance, but anecdotally it seems to us that deep learning performance is similar to human performance here. Deep learning is also very good at generating context-appropriate text, such as generating replies to social media posts, and imitating a particular author's style. It is also good at making this content compelling to humans, and has been shown to be even more compelling than human-generated text. However, deep learning is currently not good at generating *correct* responses! We don't currently have a reliable way to, for instance, combine a knowledge base of medical information, along with a deep learning model for generating medically correct natural language responses. This is very dangerous, because it is so easy to create content which appears to a layman to be compelling, but actually is entirely incorrect.\n",
"Just like in computer vision, computers are very good at categorising both short and long documents based on categories such as spam, sentiment (e.g. is the review positive or negative), author, source website, and so forth. We are not aware of any rigorous work done in this area to compare to human performance, but anecdotally it seems to us that deep learning performance is similar to human performance here. Deep learning is also very good at generating context-appropriate text, such as replies to social media posts, and imitating a particular author's style. It is also good at making this content compelling to humans, and has been shown to be even more compelling than human-generated text. However, deep learning is currently not good at generating *correct* responses! We don't currently have a reliable way to, for instance, combine a knowledge base of medical information, along with a deep learning model for generating medically correct natural language responses. This is very dangerous, because it is so easy to create content which appears to a layman to be compelling, but actually is entirely incorrect.\n",
"\n",
"Another concern is that context-appropriate, highly compelling responses on social media can be used at massive scale — thousands of times greater than any troll farm previously seen — to spread disinformation, create unrest, and encourage conflict. As a rule of thumb, text generation will always be technologically a bit ahead of the ability of models to recognize automatically generated text. For instance, it is possible to use a model that can recognize artificially generated content to actually improve the generator that creates that content, until the classification model is no longer able to complete its task.\n",
"\n",
@ -157,9 +157,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The ability of deep learning to combine text and images into a single model is, generally, far better than most people intuitively expect. For example, a deep learning model can be trained on input images, and output captions written in English, and can learn to generate surprisingly appropriate captions automatically for new images! But again, we have the same warning that we discussed in the previous section: there is no guarantee that these captions will actually be correct.\n",
"The ability of deep learning to combine text and images into a single model is, generally, far better than most people intuitively expect. For example, a deep learning model can be trained on input images with output captions written in English, and can learn to generate surprisingly appropriate captions automatically for new images! But again, we have the same warning that we discussed in the previous section: there is no guarantee that these captions will actually be correct.\n",
"\n",
"Because of this serious issue we generally recommend that deep learning be used not as an entirely automated process, but as part of a process in which the model and a human user interact closely. This can potentially make humans orders of magnitude more productive than they would be with entirely manual methods, and actually result in more accurate processes than using a human alone. For instance, an automatic system can be used to identify potential strokes directly from CT scans, and send a high priority alert to have those scans looked at quickly. There is only a three-hour window to treat strokes, so this fast feedback loop could save lives. At the same time, however, all scans could continue to be sent to radiologists in the usual way, so there would be no reduction in human input. Other deep learning models could automatically measure items seen on the scan, and insert those measurements into reports, warning the radiologist about findings that they may have missed, and tell the radiologist about other cases which might be relevant."
"Because of this serious issue, we generally recommend that deep learning be used not as an entirely automated process, but as part of a process in which the model and a human user interact closely. This can potentially make humans orders of magnitude more productive than they would be with entirely manual methods, and actually result in more accurate processes than using a human alone. For instance, an automatic system can be used to identify potential strokes directly from CT scans, and send a high priority alert to have those scans looked at quickly. There is only a three-hour window to treat strokes, so this fast feedback loop could save lives. At the same time, however, all scans could continue to be sent to radiologists in the usual way, so there would be no reduction in human input. Other deep learning models could automatically measure items seen on the scan, and insert those measurements into reports, warning the radiologist about findings that they may have missed, and tell the radiologist about other cases which might be relevant."
]
},
{
@ -187,9 +187,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Recommendation systems are really just a special type of tabular data. In particular, they generally have a high cardinality categorical variable representing users, and another one representing products (or something similar). A company like Amazon represents every purchase that has ever been made as a giant sparse matrix, with customers as the rows and products as the columns. Once they have the data in this format, data scientists apply some form of collaborative filtering to *fill in the matrix*. For example, if customer A buys products 1 and 10, and customer B buys products 1, 2, 4, and 10, the engine will recommend that A buy 2 and 4. Because deep learning models are good at handling high cardinality categorical variables they are quite good at handling recommendation systems. They particularly come into their own, just like for tabular data, when combining these variables with other kinds of data, such as natural language, or images. They can also do a good job of combining all of these types of information with additional meta data represented as tables, such as user information, previous transactions, and so forth.\n",
"Recommendation systems are really just a special type of tabular data. In particular, they generally have a high cardinality categorical variable representing users, and another one representing products (or something similar). A company like Amazon represents every purchase that has ever been made as a giant sparse matrix, with customers as the rows and products as the columns. Once they have the data in this format, data scientists apply some form of collaborative filtering to *fill in the matrix*. For example, if customer A buys products 1 and 10, and customer B buys products 1, 2, 4, and 10, the engine will recommend that A buy 2 and 4. Because deep learning models are good at handling high cardinality categorical variables, they are quite good at handling recommendation systems. They particularly come into their own, just like for tabular data, when combining these variables with other kinds of data, such as natural language or images. They can also do a good job of combining all of these types of information with additional meta data represented as tables, such as user information, previous transactions, and so forth.\n",
"\n",
"However, nearly all machine learning approaches have the downside that they only tell you what products a particular user might like, rather than what recommendations would be helpful for a user. Many kinds of recommendations for products a user might like may not be at all helpful, for instance, if the user is already familiar with its products, or if they are simply different packagings of products they have already purchased (such as a boxed set of novels, where they already have each of the items in that set). Jeremy likes reading books by Terry Pratchett, and for a while Amazon was recommending nothing but Terry Pratchett books to him (see <<pratchett>>), which really wasn't helpful because he already was aware of these books!"
"However, nearly all machine learning approaches have the downside that they only tell you what products a particular user might like, rather than what recommendations would be helpful for a user. Many kinds of recommendations for products a user might like may not be at all helpful, for instance, if the user is already familiar with the products, or if they are simply different packagings of products they have already purchased (such as a boxed set of novels, where they already have each of the items in that set). Jeremy likes reading books by Terry Pratchett, and for a while Amazon was recommending nothing but Terry Pratchett books to him (see <<pratchett>>), which really wasn't helpful because he already was aware of these books!"
]
},
{
@ -226,7 +226,7 @@
"source": [
"The Drivetrain approach, illustrated in <<drivetrain>>, was described in detail in [Designing Great Data Products](https://www.oreilly.com/radar/drivetrain-approach-data-products/). The basic idea is to start with considering your objective, then think about what you can actually do to change that objective (\"levers\"), what data you have that might help you connect potential changes to levers to changes in your objective, and then to build a model of that. You can then use that model to find the best actions (that is, changes to levers) to get the best results in terms of your objective.\n",
"\n",
"Consider a model in an autonomous vehicle, you want to help a car drive safely from point A to point B without human intervention. Great predictive modeling is an important part of the solution, but it doesn't stand on its own; as products become more sophisticated, it disappears into the plumbing. Someone using a self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work. But as data scientists build increasingly sophisticated products, they need a systematic design approach.\n",
"Consider a model in an autonomous vehicle: you want to help a car drive safely from point A to point B without human intervention. Great predictive modeling is an important part of the solution, but it doesn't stand on its own; as products become more sophisticated, it disappears into the plumbing. Someone using a self-driving car is completely unaware of the hundreds (if not thousands) of models and the petabytes of data that make it work. But as data scientists build increasingly sophisticated products, they need a systematic design approach.\n",
"\n",
"We use data not just to generate more data (in the form of predictions), but to produce *actionable outcomes*. That is the goal of the Drivetrain Approach. Start by defining a clear **objective**. For instance, Google, when creating their first search engine, considered \"What is the users main objective in typing in a search query?\", and their answer was \"show the most relevant search result\". The next step is to consider what **levers** you can pull (i.e. what actions could you take) to better achieve that objective. In Google's case, that was the ranking of the search results. The third step was to consider what new **data** they would need to produce such a ranking; they realized that the implicit information regarding which pages linked to which other pages could be used for this purpose. Only after these first three steps do we begin thinking about building the predictive **models**. Our objective and available levers, what data we already have and what additional data we will need to collect, determine the models we can build. The models will take both the levers and any uncontrollable variables as their inputs; the outputs from the models can be combined to predict the final state for our objective."
]
@ -260,28 +260,21 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"For many types of projects, you may be able to find all the data you need online. The project we'll be completing in this chapter is a *bear detector*. It will discriminate between three types of bear: grizzly, black, and teddy bear. There are many images on the Internet of each type of bear we can use. We just need a way to find them and download them. We've provided a tool you can use for this purpose, so you can follow along with this chapter, creating your own image recognition application for whatever kinds of object you're interested in. In the fast.ai course, thousands of students have presented their work on the course forums, displaying everything from Trinidad hummingbird varieties, to Panama bus types, and even an application that helped one student let his fiancée recognize his sixteen cousins during Christmas vacation!"
"For many types of projects, you may be able to find all the data you need online. The project we'll be completing in this chapter is a *bear detector*. It will discriminate between three types of bear: grizzly, black, and teddy bear. There are many images on the Internet of each type of bear we can use. We just need a way to find them and download them. We've provided a tool you can use for this purpose, so you can follow along with this chapter, creating your own image recognition application for whatever kinds of object you're interested in. In the fast.ai course, thousands of students have presented their work on the course forums, displaying everything from Trinidad hummingbird varieties to Panama bus types, and even an application that helped one student let his fiancée recognize his sixteen cousins during Christmas vacation!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As at the time of writing, Bing Image Search is the best option we know of for finding and downloading images. It's free for up to 1000 queries per month, and each query can download up to 150 images. However, something better might have come along between when we wrote this and when you're reading the book, so be sure to check out [book.fast.ai](https://book.fast.ai) where we'll let you know our current recommendation."
"At the time of writing, Bing Image Search is the best option we know of for finding and downloading images. It's free for up to 1000 queries per month, and each query can download up to 150 images. However, something better might have come along between when we wrote this and when you're reading the book, so be sure to check out [book.fast.ai](https://book.fast.ai) where we'll let you know our current recommendation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> important: Services that can be used for creating datasets come and go all the time, and their features, interfaces, and pricing change regularly too. In this section, we'll show how to use one particular provider, _Bing Image Search_, using the service they have as this book was written. We'll be providing more options and more up to date information on the http://book.fast.ai[book website], so be sure to have a look there now to get the most current information on how to download images from the web to create a dataset for deep learning."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To download images with Bing Image Search, you should sign up at Microsoft for *Bing Image Search*. You will be given a key, which you can either paste here, replacing \"XXX\":"
 important:">
"> important: Services that can be used for creating datasets come and go all the time, and their features, interfaces, and pricing change regularly too. In this section, we'll show how to use one particular provider, _Bing Image Search_, using the service as it was when this book was written. We'll be providing more options and more up-to-date information on the [book website](https://book.fast.ai), so be sure to have a look there now to get the most current information on how to download images from the web to create a dataset for deep learning."
]
},
{
@ -309,7 +302,7 @@
"\n",
" export AZURE_SEARCH_KEY=your_key_here\n",
"\n",
"and then restart jupyter notebooks, and finally execute in this notebook:\n",
"and then restart Jupyter notebooks, and finally execute in this notebook:\n",
"\n",
"```python\n",
"key = os.environ['AZURE_SEARCH_KEY']\n",
@ -536,7 +529,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sidebar: Getting help in jupyter notebooks"
"### Sidebar: Getting help in Jupyter notebooks"
]
},
{
@ -563,11 +556,11 @@
"Here are the commands that are very useful in Jupyter notebooks:\n",
"\n",
"- at any point, if you don't remember the exact spelling of a function or argument name, you can press \"tab\" to get suggestions of auto-completion.\n",
"- when inside the parenthesis of a function, pressing \"shift\" and \"tab\" simultaneously will display a window with the signature of the function and a short documentation. Pressing it twice will expand the documentation and pressing it three times will open a full window with the same information at the bottom of your screen.\n",
"- when inside the parentheses of a function, pressing \"shift\" and \"tab\" simultaneously will display a window with the signature of the function and a short documentation. Pressing it twice will expand the documentation and pressing it three times will open a full window with the same information at the bottom of your screen.\n",
"- in a cell, typing `?func_name` and executing will open a window with the signature of the function and a short documentation.\n",
"- in a cell, typing `??func_name` and executing will open a window with the signature of the function, a short documentation and the source code.\n",
"- if you are using the fastai library, we added a `doc` function for you, executing `doc(func_name)` in a cell will open a window with the signature of the function, a short documentation and links to the source code on GitHub and the full documentation of the function in the [documentation of the library](https://docs.fast.ai).\n",
"- unrelated to the documentation but still very useful to get help, at any point, if you get an error, type `%debug` in the next cell and execute to open the [python debugger](https://docs.python.org/3/library/pdb.html) that will let you inspect the content of every variable."
"- if you are using the fastai library, we added a `doc` function for you: executing `doc(func_name)` in a cell will open a window with the signature of the function, a short documentation and links to the source code on GitHub and the full documentation of the function in the [documentation of the library](https://docs.fast.ai).\n",
"- unrelated to the documentation but still very useful: to get help at any point if you get an error, type `%debug` in the next cell and execute to open the [python debugger](https://docs.python.org/3/library/pdb.html) that will let you inspect the content of every variable."
]
},
{
@ -646,7 +639,7 @@
"- how to label these items ;\n",
"- how to create the validation set.\n",
"\n",
"So far we have seen a number of *factory methods* for particular combinations of these things, which are convenient when you have an application and data structure which happens to fit into those predefined methods. For when you don't, fastai has an extremely flexible system called the *data block API*. With this API you can fully customize every stage of the creation of your DataLoaders. Here is what we need to create a DataLoaders for the dataset that we just downloaded:"
"So far we have seen a number of *factory methods* for particular combinations of these things, which are convenient when you have an application and data structure which happen to fit into those predefined methods. For when you don't, fastai has an extremely flexible system called the *data block API*. With this API you can fully customize every stage of the creation of your DataLoaders. Here is what we need to create a DataLoaders for the dataset that we just downloaded:"
]
},
{
@ -806,11 +799,11 @@
"source": [
"All of these approaches seem somewhat wasteful, or problematic. If we squished or stretched the images then they end up as unrealistic shapes, leading to a model that learns that things look different to how they actually are, which we would expect to result in lower accuracy. If we crop the images then we remove some of the features that allow us to recognize them. For instance, if we were trying to recognise the breed of dog or cat, we may end up cropping out a key part of the body or the face necessary to distinguish between similar breeds. If we pad the images then we have a whole lot of empty space, which is just wasted computation for our model, and results in a lower effective resolution for the part of the image we actually use.\n",
"\n",
"Instead, what we normally do in practice is to randomly select part of the image, and crop to just that part. On each epoch (which is one complete pass through all of our images in the dataset) we randomly select a different part of each image. This means that our model can learn to focus on, and recognize, different features in our images. It also reflects how images work in the real world; different photos of the same thing may be framed in slightly different ways.\n",
"Instead, what we normally do in practice is to randomly select part of the image, and crop to just that part. On each epoch (which is one complete pass through all of our images in the dataset) we randomly select a different part of each image. This means that our model can learn to focus on, and recognize, different features in our images. It also reflects how images work in the real world: different photos of the same thing may be framed in slightly different ways.\n",
"\n",
"In fact, an entirely untrained neural network knows nothing whatsoever about how images behave. It doesn't even recognise that when an object is rotated by one degree, then it still is a picture of the same thing! So actually training the neural network with examples of images that are in slightly different places, and slightly different sizes, helps it to understand the basic concept of what a *object* is, and how it can be represented in an image.\n",
"In fact, an entirely untrained neural network knows nothing whatsoever about how images behave. It doesn't even recognise that when an object is rotated by one degree, then it still is a picture of the same thing! So actually training the neural network with examples of images that are in slightly different places, and slightly different sizes, helps it to understand the basic concept of what an *object* is, and how it can be represented in an image.\n",
"\n",
"Here is another copy of the previous examples, but this time we are replacing `Resize` with `RandomResizedCrop`, which is the transform that provides the behaviour described above. The most important parameter to pass in is the `min_scale` parameter, which determines how much of the image to select at minimum each time."
"Here is another copy of the previous examples, but this time we are replacing `Resize` with `RandomResizedCrop`, which is the transform that provides the behaviour described above. The most important parameter to pass in is `min_scale`, which determines how much of the image to select at minimum each time."
]
},
{
@ -1022,7 +1015,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's see whether the mistakes the model is making is mainly thinking that grizzlies are teddies (that would be bad for safety!), or that grizzlies are black bears, or something else. We can create a *confusion matrix*:"
"Now let's see whether the mistakes the model is making are mainly thinking that grizzlies are teddies (that would be bad for safety!), or that grizzlies are black bears, or something else. We can create a *confusion matrix*:"
]
},
{
@ -1064,7 +1057,7 @@
"source": [
"Each row here represents all the black, grizzly, and teddy bears in our dataset, respectively. Each column represents the images which the model predicted as black, grizzly, and teddy bears, respectively. Therefore, the diagonal of the matrix shows the images which were classified correctly, and the other, off diagonal, cells represent those which were classified incorrectly. This is called a *confusion matrix* and is one of the many ways that fastai allows you to view the results of your model. It is (of course!) calculated using the validation set. With the color coding, the goal is to have white everywhere, except the diagonal where we want dark blue. Our bear classifier isn't making many mistakes!\n",
"\n",
"It's helpful to see where exactly our errors are occurring, to see whether it's due to a dataset problem (e.g. images that aren't bears at all, or are labelled incorrectly, etc.), or a model problem (e.g. perhaps it isn't handling images taken with unusual lighting, or from a different angle, etc.). To do this, we can sort out images by their *loss*.\n",
"It's helpful to see where exactly our errors are occurring, to see whether it's due to a dataset problem (e.g. images that aren't bears at all, or are labelled incorrectly, etc.), or a model problem (e.g. perhaps it isn't handling images taken with unusual lighting, or from a different angle, etc.). To do this, we can sort our images by their *loss*.\n",
"\n",
"The *loss* is a number that is higher if the model is incorrect (and especially if it's also confident of its incorrect answer), or if it's correct, but not confident of its correct answer. In a couple chapters we'll learn in depth how loss is calculated and used in the training process. For now, `plot_top_losses` shows us the images with the highest loss in our dataset. As the title of the output says, each image is labeled with four things: prediction, actual (target label), loss, and probability. The *probability* here is the confidence level, from zero to one, that the model has assigned to its prediction."
]
@ -1099,7 +1092,7 @@
"\n",
"The intuitive approach to doing data cleaning is to do it *before* you train a model. But as you've seen in this case, a model can actually help you find data issues more quickly and easily. So we normally prefer to train a quick and simple model first, and then use it to help us with data cleaning.\n",
"\n",
"fastai includes a handy GUI for data cleaning called `ImageClassifierCleaner`, which allows you to choose a category, and training vs validation set, and view the highest-loss images (in order), along with menus to allow any images to be selected for removal, or relabeling."
"fastai includes a handy GUI for data cleaning called `ImageClassifierCleaner` which allows you to choose a category, and training vs validation set, and view the highest-loss images (in order), along with menus to allow any images to be selected for removal or relabeling."
]
},
{
@ -1170,7 +1163,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that amongst our *black bears* is an image that contain two bears, one grizzly, one black. So we should choose `<Delete>` in the menu under this image. `ImageClassifierCleaner` doesn't actually do the deleting or changing of labels for you; it just returns the indices of items to change. So, for instance, to delete (`unlink`) all images selected for deletion, we would run:\n",
"We can see that amongst our *black bears* is an image that contains two bears: one grizzly, one black. So we should choose `<Delete>` in the menu under this image. `ImageClassifierCleaner` doesn't actually do the deleting or changing of labels for you; it just returns the indices of items to change. So, for instance, to delete (`unlink`) all images selected for deletion, we would run:\n",
"\n",
"```python\n",
"for idx in cleaner.delete(): cleaner.fns[idx].unlink()\n",
@ -1182,7 +1175,7 @@
"for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)\n",
"```\n",
"\n",
"> s: Cleaning the data or getting it ready for your model are two of the biggest challenges for data scientists; they say it takes 90% of their time. The fastai library aims at providing tools to make it as easy as possible.\n",
 s: Cleaning">
"> s: Cleaning the data and getting it ready for your model are two of the biggest challenges for data scientists; they say it takes 90% of their time. The fastai library aims to provide tools that make it as easy as possible.\n",
"\n",
"We'll be seeing more examples of model-driven data cleaning throughout this book. Once we've cleaned up our data, we can retrain our model. Try it yourself, and see if your accuracy improves!"
]
@ -1192,7 +1185,7 @@
"metadata": {},
"source": [
"\n",
"> note: After cleaning the dataset using the above steps, we generally are seeing 100% accuracy on this task. We even see that result when we download a lot less images than the 150 per class we're using here. As you can see, the common complaint _you need massive amounts of data to do deep learning_ can be a very long way from the truth!"
 note: After">
"> note: After cleaning the dataset using the above steps, we generally see 100% accuracy on this task. We even see that result when we download a lot fewer images than the 150 per class we're using here. As you can see, the common complaint that _you need massive amounts of data to do deep learning_ can be a very long way from the truth!"
]
},
{
@ -1213,7 +1206,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We are now going to look at what it takes to take this model and turn it into a working online application. We will just go as far as creating a basic working prototype; we do not have the scope in this book to teach you all the details of web application development generally."
"We are now going to look at what it takes to turn this model into a working online application. We will just go as far as creating a basic working prototype; we do not have the scope in this book to teach you all the details of web application development generally."
]
},
{
@ -1227,7 +1220,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Once you've got a model you're happy with, you need to save it, so that you can then copy it over to a server where you'll use it in production. Remember that a model consists of two parts: the *architecture*, and the trained *parameters*. The easiest way to save a model is to save both of these, because that way when you load a model you can be sure that you have the matching architecture and parameters. To save both parts, use the `export` method.\n",
"Once you've got a model you're happy with, you need to save it, so that you can then copy it over to a server where you'll use it in production. Remember that a model consists of two parts: the *architecture* and the trained *parameters*. The easiest way to save a model is to save both of these, because that way when you load a model you can be sure that you have the matching architecture and parameters. To save both parts, use the `export` method.\n",
"\n",
"This method even saves the definition of how to create your `DataLoaders`. This is important, because otherwise you would have to redefine how to transform your data in order to use your model in production. fastai automatically uses your validation set `DataLoader` for inference by default, so your data augmentation will not be applied, which is generally what you want.\n",
"\n",
@ -1247,7 +1240,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's check that file exists, by using the `Path.ls` method that fastai adds to Python's `Path` class:"
"Let's check that the file exists, by using the `Path.ls` method that fastai adds to Python's `Path` class:"
]
},
{
@ -1330,7 +1323,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This has returned three things: the predicted category in the same format you originally provided, in this case that's a string), the index of the predicted category, and the probabilities of each category. The last two are based on the order of categories in the *vocab* of the `DataLoaders`; that is, the stored list of all possible categories. At inference time, you can access the `DataLoaders` as an attribute of the `Learner`:"
"This has returned three things: the predicted category in the same format you originally provided (in this case that's a string), the index of the predicted category, and the probabilities of each category. The last two are based on the order of categories in the *vocab* of the `DataLoaders`; that is, the stored list of all possible categories. At inference time, you can access the `DataLoaders` as an attribute of the `Learner`:"
]
},
{
@ -1378,16 +1371,16 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To use our model in an application we can simply treat the `predict` method as a regular function. Therefore, creating an app from the model can be done using any of the myriad of frameworks and techniques available to application developers.\n",
"To use our model in an application, we can simply treat the `predict` method as a regular function. Therefore, creating an app from the model can be done using any of the myriad of frameworks and techniques available to application developers.\n",
"\n",
"However, most data scientists are not familiar with the world of web application development. So let's try using something that you do, at this point, know: Jupyter notebooks. It turns out that we can create a complete working web application using nothing but Jupyter notebooks! The two things we need to make this happen are:\n",
"\n",
"- IPython widgets (ipywidgets)\n",
"- Voilà\n",
"\n",
"*IPython widgets* are GUI components that bring together JavaScript and Python functionality in a web browser, and can be created and used within a Jupyter notebook. For instance, the image cleaner that we saw earlier in this chapter is entirely written with IPython widgets. However, we don't want to require users of our application to have to run Jupyter themselves.\n",
"*IPython widgets* are GUI components that bring together JavaScript and Python functionality in a web browser, and can be created and used within a Jupyter notebook. For instance, the image cleaner that we saw earlier in this chapter is entirely written with IPython widgets. However, we don't want to require users of our application to run Jupyter themselves.\n",
"\n",
"That is why *Voilà* exists. It is a system for making applications consisting of IPython widgets available to end-users, without them having to use Jupyter at all. Voilà is taking advantage of the fact that a notebook _already is_ a kind of web application, just a rather complex one that depends on another web application, Jupyter itself. Essentially, it helps us automatically convert the complex web application which we've already implicitly made (the notebook) into a simpler, easier-to-deploy web application, which functions like a normal web application rather than like a notebook.\n",
"That is why *Voilà* exists. It is a system for making applications consisting of IPython widgets available to end-users, without them having to use Jupyter at all. Voilà is taking advantage of the fact that a notebook _already is_ a kind of web application, just a rather complex one that depends on another web application: Jupyter itself. Essentially, it helps us automatically convert the complex web application which we've already implicitly made (the notebook) into a simpler, easier-to-deploy web application, which functions like a normal web application rather than like a notebook.\n",
"\n",
"But we still have the advantage of developing in a notebook. So with ipywidgets, we can build up our GUI step by step. We will use this approach to create a simple image classifier. First, we need a file upload widget:"
]
@ -1545,7 +1538,7 @@
"source": [
"`Prediction: grizzly; Probability: 1.0000`\n",
"\n",
"We'll need a button to do the classification, it looks exactly like the upload button."
"We'll need a button to do the classification; it looks exactly like the upload button."
]
},
{
@ -1688,7 +1681,7 @@
"\n",
"Cells which begin with a `!` do not contain Python code, but instead contain code which is passed to your shell, such as bash, power shell in windows, or so forth. If you are comfortable using the command line (which we'll be learning about later in this book), you can of course simply type these two lines (without the `!` prefix) directly into your terminal. In this case, the first line installs the voila library and application, and the second connects it to your existing Jupyter notebook.\n",
"\n",
"Voilà runs Jupyter notebooks, just like the Jupyter notebook server you are using now does, except that it does something very important: it removes all of the cell inputs, and only shows output (including ipywidgets), along with your markdown cells. So what's left is a web application! To view your notebook as a voila web application replace the word \"notebooks\" in your browser's URL with: \"voila/render\". You will see the same content as your notebook, but without any of the code cells.\n",
"Voilà runs Jupyter notebooks, just like the Jupyter notebook server you are using now does, except that it does something very important: it removes all of the cell inputs, and only shows output (including ipywidgets), along with your markdown cells. So what's left is a web application! To view your notebook as a voila web application, replace the word \"notebooks\" in your browser's URL with: \"voila/render\". You will see the same content as your notebook, but without any of the code cells.\n",
"\n",
"Of course, you don't need to use Voilà or ipywidgets. Your model is just a function you can call: `pred,pred_idx,probs = learn.predict(img)` . So you can use it with any framework, hosted on any platform. And you can take something you've prototyped in ipywidgets and Voilà and later convert it into a regular web application. We're showing you this approach in the book because we think it's a great way for data scientists and other folks that aren't web development experts to create applications from their models.\n",
"\n",
@ -1710,8 +1703,8 @@
"\n",
"- As we've seen, GPUs are only useful when they do lots of identical work in parallel. If you're doing (say) image classification, then you'll normally be classifying just one user's image at a time, and there isn't normally enough work to do in a single image to keep a GPU busy for long enough for it to be very efficient. So a CPU will often be more cost effective.\n",
"- An alternative could be to wait for a few users to submit their images, and then batch them up, and do them all at once on a GPU. But then you're asking your users to wait, rather than getting answers straight away! And you need a high volume site for this to be workable. If you do need this functionality, you can use a tool such as Microsoft's [ONNX Runtime](https://github.com/microsoft/onnxruntime), or [AWS Sagemaker](https://aws.amazon.com/sagemaker/)\n",
"- The complexities of dealing with GPU inference are significant. In particular, the GPU's memory will need careful manual management, and you'll need some careful queueing system to ensure you only do one batch at a time\n",
"- There's a lot more market competition in CPU servers than GPU, as a result of which there's much cheaper options available for CPU servers.\n",
"- The complexities of dealing with GPU inference are significant. In particular, the GPU's memory will need careful manual management, and you'll need some careful queueing system to ensure you only do one batch at a time.\n",
"- There's a lot more market competition in CPU servers than GPU, as a result of which there are much cheaper options available for CPU servers.\n",
"\n",
"Because of the complexity of GPU serving, many systems have sprung up to try to automate this. However, managing and running these systems is also complex, and generally requires compiling your model into a different form that's specialized for that system. It doesn't make sense to deal with this complexity until/unless your app gets popular enough that it makes clear financial sense for you to do so."
]
@ -1741,7 +1734,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The first time you do this Binder will take around 5 minutes to build your site. In other words, it is finding a virtual machine which can run your app, allocating storage, collecting the files needed for Jupyter, for your notebook, and for presenting your notebook as a web application. It's doing all of this behind the scenes.\n",
"The first time you do this, Binder will take around 5 minutes to build your site. In other words, it is finding a virtual machine which can run your app, allocating storage, collecting the files needed for Jupyter, for your notebook, and for presenting your notebook as a web application. It's doing all of this behind the scenes.\n",
"\n",
"Finally, once it has started the app running, it will navigate your browser to your new web app. You can share the URL you copied to allow others to access your app as well.\n",
"\n",
@ -1752,7 +1745,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You may well want to deploy your application onto mobile devices, or edge devices such as a Raspberry Pi. There are a lot of libraries and frameworks to allow you to integrate a model directly into a mobile application. However these approaches tend to require a lot of extra steps and boilerplate, and do not always support all the PyTorch and fastai layers that your model might use. In addition, the work you do will depend on what kind of mobile devices you are targeting for deployment. So you might need to do some work to run on iOS devices, different work to run on newer Android devices, different work for older Android devices, etc. Instead, we recommend wherever possible that you deploy the model itself to a server, and have your mobile or edge application connect to it as a web service.\n",
"You may well want to deploy your application onto mobile devices, or edge devices such as a Raspberry Pi. There are a lot of libraries and frameworks to allow you to integrate a model directly into a mobile application. However, these approaches tend to require a lot of extra steps and boilerplate, and do not always support all the PyTorch and fastai layers that your model might use. In addition, the work you do will depend on what kind of mobile devices you are targeting for deployment. So you might need to do some work to run on iOS devices, different work to run on newer Android devices, different work for older Android devices, etc. Instead, we recommend wherever possible that you deploy the model itself to a server, and have your mobile or edge application connect to it as a web service.\n",
"\n",
"There are quite a few upsides to this approach. The initial installation is easier, because you only have to deploy a small GUI application, which connects to the server to do all the heavy lifting. More importantly perhaps, upgrades of that core logic can happen on your server, rather than needing to be distributed to all of your users. Your server can have a lot more memory and processing capacity than most edge devices, and it is far easier to scale those resources if your model becomes more demanding. The hardware that you will have on a server is going to be more standard and more easily supported by fastai and PyTorch, so you don't have to compile your model into a different form.\n",
"\n",
@ -1781,9 +1774,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In practice, a deep learning model will be just one piece of a much bigger system. As we discussed at the start of this chapter, a *data product* requires thinking about the entire end to end process within which our model lives. In this book, we can't hope to cover all the complexity of managing deployed data products, such as managing multiple versions of models, A/B testing, canarying, refreshing the data (should we just grow and grow our datasets all the time, or should we regularly remove some of the old data), handling data labelling, monitoring all this, detecting model rot, and so forth. However, there is an excellent book that covers many deployment issues, which is [Building Machine Learning Powered Applications](https://www.amazon.com/Building-Machine-Learning-Powered-Applications/dp/149204511X), by Emmanuel Ameisen. In this section, we will give an overview of some of the most important issues to consider.\n",
"In practice, a deep learning model will be just one piece of a much bigger system. As we discussed at the start of this chapter, a *data product* requires thinking about the entire end-to-end process within which our model lives. In this book, we can't hope to cover all the complexity of managing deployed data products, such as managing multiple versions of models, A/B testing, canarying, refreshing the data (should we just grow and grow our datasets all the time, or should we regularly remove some of the old data), handling data labelling, monitoring all this, detecting model rot, and so forth. However, there is an excellent book that covers many deployment issues, which is [Building Machine Learning Powered Applications](https://www.amazon.com/Building-Machine-Learning-Powered-Applications/dp/149204511X), by Emmanuel Ameisen. In this section, we will give an overview of some of the most important issues to consider.\n",
"\n",
"One of the biggest issues with this is that understanding and testing the behavior of a deep learning model is much more difficult than most code that you would write. With normal software development you can analyse the exact steps that the software is taking, and carefully study which of these steps match the desired behaviour that you are trying to create. But with a neural network the behavior emerges from the models attempt to match the training data, rather than being exactly defined.\n",
"One of the biggest issues to consider is that understanding and testing the behavior of a deep learning model is much more difficult than most code that you would write. With normal software development you can analyse the exact steps that the software is taking, and carefully study which of these steps match the desired behaviour that you are trying to create. But with a neural network the behavior emerges from the model's attempt to match the training data, rather than being exactly defined.\n",
"\n",
"This can result in disaster! For instance, let's say you really were rolling out a bear detection system which will be attached to video cameras around the campsite, and will warn campers of incoming bears. If we used a model trained with the dataset we downloaded, there are going to be all kinds of problems in practice, such as:\n",
"\n",
@ -1829,7 +1822,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"> J: I started a company 20 years ago called _Optimal Decisions_ which used machine learning and optimisation to help giant insurance companies set their pricing, impacting tens of billions of dollars of risks. We used the approaches described above to manage the potential downsides of something that might go wrong. Also, before we worked with our clients to put anything in production, we tried to simulate the impact by testing the end to end system on their previous year's data. It was always quite a nerve-wracking process, putting these new algorithms in production, but every rollout was successful."
"> J: I started a company 20 years ago called _Optimal Decisions_ which used machine learning and optimisation to help giant insurance companies set their pricing, impacting tens of billions of dollars of risks. We used the approaches described above to manage the potential downsides of something that might go wrong. Also, before we worked with our clients to put anything in production, we tried to simulate the impact by testing the end-to-end system on their previous year's data. It was always quite a nerve-wracking process, putting these new algorithms in production, but every rollout was successful."
]
},
{