Merge pull request #116 from joe-bender/patch-3
Writing and formatting fixes, 01_intro.ipynb
commit 17c1ba2c33
@@ -1575,7 +1575,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "> important: When you train a model, you must **always** have both a training set, and a validation set, and must measure the accuracy of your model only on the validation set. If you train for too long, with not enough data, you will see the accuracy of your model start to get worse; this is called **over-fitting**. fastai defaults `valid_pct` to `0.2`, so even if you forget, fastai will create a validation set for you!"
+ "> important: When you train a model, you must **always** have both a training set and a validation set, and must measure the accuracy of your model only on the validation set. If you train for too long, with not enough data, you will see the accuracy of your model start to get worse; this is called **over-fitting**. fastai defaults `valid_pct` to `0.2`, so even if you forget, fastai will create a validation set for you!"
]
},
{
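For context on the `valid_pct` point this hunk touches: in the chapter, the 20% validation split comes from the `valid_pct` argument of the `DataLoaders` factory used in the first training cell. A minimal sketch along the lines of that pet-classifier cell (the dataset path, seed, and label function follow the chapter's example; treat it as illustrative rather than the exact cell being edited):

```python
from fastai.vision.all import *

# Download the Oxford-IIIT Pets images used in the chapter's first example.
path = untar_data(URLs.PETS)/'images'

def is_cat(x):
    # In this dataset, cat breeds have filenames starting with an uppercase letter.
    return x[0].isupper()

# valid_pct=0.2 holds out 20% of the images as a validation set; this is also
# the default, so the split happens even if the argument is omitted.
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))
```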
@@ -1715,7 +1715,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "As you can see by looking at the right-hand side of this picture, the features are now able to identify and match with higher levels semantic components, such as car wheels, text, and flower petals. Using these components, layers four and five can identify even higher-level concepts, as shown in <<img_layer4>>."
+ "As you can see by looking at the right-hand side of this picture, the features are now able to identify and match with higher-level semantic components, such as car wheels, text, and flower petals. Using these components, layers four and five can identify even higher-level concepts, as shown in <<img_layer4>>."
]
},
{
@@ -1775,7 +1775,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Another interesting fast.ai student project example comes from Gleb Esman. He was working on fraud detection at Splunk, and was working with a dataset of users' mouse movements and mouse clicks. He turned these into pictures by drawing an image where the position, speed and acceleration of the mouse was displayed using coloured lines, and the clicks were displayed using [small coloured circles](https://www.splunk.com/en_us/blog/security/deep-learning-with-splunk-and-tensorflow-for-security-catching-the-fraudster-in-neural-networks-with-behavioral-biometrics.html) as shown in <<splunk>>. He then fed this into an image recognition model just like the one we've shown in this chapter, and it worked so well that had led to a patent for this approach to fraud analytics!"
+ "Another interesting fast.ai student project example comes from Gleb Esman. He was working on fraud detection at Splunk, and was working with a dataset of users' mouse movements and mouse clicks. He turned these into pictures by drawing an image where the position, speed and acceleration of the mouse was displayed using coloured lines, and the clicks were displayed using [small coloured circles](https://www.splunk.com/en_us/blog/security/deep-learning-with-splunk-and-tensorflow-for-security-catching-the-fraudster-in-neural-networks-with-behavioral-biometrics.html) as shown in <<splunk>>. He then fed this into an image recognition model just like the one we've shown in this chapter, and it worked so well that it led to a patent for this approach to fraud analytics!"
]
},
{
@@ -1817,7 +1817,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "As you can see, the different types of malware look very distinctive to the human eye. The model they train based on this image representation was more accurate at malware classification than any previous approach shown in the academic literature. This suggests a good rule of thumb for converting a dataset into an image representation: if the human eye can recognize categories from the images, then a deep learning model should be able to do so too.\n",
+ "As you can see, the different types of malware look very distinctive to the human eye. The model they trained based on this image representation was more accurate at malware classification than any previous approach shown in the academic literature. This suggests a good rule of thumb for converting a dataset into an image representation: if the human eye can recognize categories from the images, then a deep learning model should be able to do so too.\n",
"\n",
"In general, you'll find that a small number of general approaches in deep learning can go a long way, if you're a bit creative in how you represent your data! You shouldn't think of approaches like the above as \"hacky workarounds\", since actually they often (as here) beat previously state of the art results. These really are the right way to think about these problem domains."
]
@@ -1872,7 +1872,7 @@
"\n",
"In order to make the training process go faster, we might start with a *pretrained model*, a model which has already been trained on someone else's data. We then adapt it to our data by training it a bit more on our data, a process called *fine tuning*.\n",
"\n",
- "When we train a model, a key concern is to ensure that our model *generalizes* -- that is, that it learns general lessons from our data which also apply to new items it will encounter, so that it can make good predictions on those items. The risk is that if we train our model badly, instead of learning general lessons it effectively memorizes what it has already seen, and then it will make poor predictions about new images. Such a failure is called *overfitting*. In order to avoid this, we always divide our data into two parts, the *training set* and the *validation set*. We train the model by showing it only the *training set* and then we evaluate how well the model is doing by seeing how well it predicts on items from the *validation set* . In this way, we check if the lessons the model learns from the training set are lessons that generalize to the validation set. In order for a person to assess how well the model is doing on the validation set overall, we define a *metric* . During the training process, when the model has seen every item in the training set, we call that an *epoch*.\n",
+ "When we train a model, a key concern is to ensure that our model *generalizes* -- that is, that it learns general lessons from our data which also apply to new items it will encounter, so that it can make good predictions on those items. The risk is that if we train our model badly, instead of learning general lessons it effectively memorizes what it has already seen, and then it will make poor predictions about new images. Such a failure is called *overfitting*. In order to avoid this, we always divide our data into two parts, the *training set* and the *validation set*. We train the model by showing it only the *training set* and then we evaluate how well the model is doing by seeing how well it predicts on items from the *validation set*. In this way, we check if the lessons the model learns from the training set are lessons that generalize to the validation set. In order for a person to assess how well the model is doing on the validation set overall, we define a *metric*. During the training process, when the model has seen every item in the training set, we call that an *epoch*.\n",
"\n",
"All these concepts apply to machine learning in general. That is, they apply to all sorts of schemes for defining a model by training it with data. What makes deep learning distinctive is a particular class of architectures, the architectures based on *neural networks*. In particular, tasks like image classification rely heavily on *convolutional neural networks*, which we will discuss shortly."
]
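The recap paragraph in this hunk maps directly onto the chapter's training call. A hedged sketch of how those terms show up in fastai code, assuming the `dls` object from the earlier sketch (the epoch count is illustrative, and `cnn_learner` is the chapter-era name for the vision learner factory):

```python
# A *pretrained* ResNet-34 is adapted to our data; error_rate is the *metric*
# reported on the *validation set* after each *epoch*.
learn = cnn_learner(dls, resnet34, metrics=error_rate)

# fine_tune(1) performs the *fine tuning*: it first fits the newly added head
# for one epoch, then trains the whole model for one more epoch.
learn.fine_tune(1)
```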
@@ -2181,14 +2181,14 @@
"learn.fine_tune(4, 1e-2)\n",
"```\n",
"\n",
- "This reduces the batch size to 32 (we will explain this later). If you keep hitting the same error, change 32 by 16."
+ "This reduces the batch size to 32 (we will explain this later). If you keep hitting the same error, change 32 to 16."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "This model is using the IMDb dataset from the paper [Learning Word Vectors for Sentiment Analysis]((https://ai.stanford.edu/~amaas/data/sentiment/)). It works well with movie reviews of many thousands of words. But let's test it out on a very short one, to see it does its thing:"
+ "This model is using the IMDb dataset from the paper [Learning Word Vectors for Sentiment Analysis](https://ai.stanford.edu/~amaas/data/sentiment/). It works well with movie reviews of many thousands of words. But let's test it out on a very short one, to see it do its thing:"
]
},
{
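For readers wondering where the 32 comes from: the memory workaround being edited passes a smaller batch size to the text `DataLoaders`. A sketch of the surrounding IMDb classifier cell under the chapter's setup (the short review string is just an illustrative input):

```python
from fastai.text.all import *

# bs=32 halves the default batch size so the model fits in less GPU memory;
# if the out-of-memory error persists, try bs=16.
dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test', bs=32)
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
learn.fine_tune(4, 1e-2)

# A very short review, to see the model do its thing:
learn.predict("I really liked that movie!")
```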
@@ -2234,7 +2234,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Sidebar: The order matter"
+ "### Sidebar: The order matters"
]
},
{
@@ -2276,7 +2276,7 @@
"doc(learn.predict)\n",
"```\n",
"\n",
- "This will make a small window pop with content like this:\n",
+ "This will make a small window pop up with content like this:\n",
"\n",
"<img src=\"images/doc_ex.png\" width=\"600\">"
]
@@ -2656,7 +2656,7 @@
"\n",
"Most datasets used in this book took the creators a lot of work to build. For instance, later in the book we’ll be showing you how to create a model that can translate between French and English. The key input to this is a French/English parallel text corpus prepared back in 2009 by Professor Chris Callison-Burch of the University of Pennsylvania. This dataset contains over 20 million sentence pairs in French and English. He built the dataset in a really clever way: by crawling millions of Canadian web pages (which are often multi-lingual) and then using a set of simple heuristics to transform French URLs onto English URLs.\n",
"\n",
- "As you look at datasets throughout this book, think about where they might have come from, and how they might have been curated. Then, think about what kinds of interesting dataset you could create for your own projects. (We’ll even take you step by step through the process of creating your own image dataset soon.)\n",
+ "As you look at datasets throughout this book, think about where they might have come from, and how they might have been curated. Then, think about what kinds of interesting datasets you could create for your own projects. (We’ll even take you step by step through the process of creating your own image dataset soon.)\n",
"\n",
"fast.ai has spent a lot of time creating cutdown versions of popular datasets that are specially designed to support rapid prototyping and experimentation, and to be easier to learn with. In this book we will often start by using one of the cutdown versions, and we later on scale up to the full-size version (just as we're doing in this chapter!) In fact, this is how the world’s top practitioners do their modelling projects in practice; they do most of their experimentation and prototyping with subsets of their data, and only use the full dataset when they have a good understanding of what they have to do."
]
@@ -2672,7 +2672,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "Each of the models we trained showed a training and validation loss. A good validation set is one of the most important pieces of your training, let's see why and learn how to create one."
+ "Each of the models we trained showed a training and validation loss. A good validation set is one of the most important pieces of your training. Let's see why and learn how to create one."
]
},
{
@@ -2690,7 +2690,7 @@
"\n",
"It is in order to avoid this that our first step was to split our dataset into two sets, the *training set* (which our model sees in training) and the *validation set*, also known as the *development set* (which is used only for evaluation). This lets us test that the model learns lessons from the training data which generalize to new data, the validation data.\n",
"\n",
- "One way to understand this situation is that, in a sense, we don't want our model to get good results by \"cheating\". If it predicts well on a data item, that should be because it has learned principles that govern that kind of item, and not because the model has been shaped by *actually having seeing that particular item*.\n",
+ "One way to understand this situation is that, in a sense, we don't want our model to get good results by \"cheating\". If it predicts well on a data item, that should be because it has learned principles that govern that kind of item, and not because the model has been shaped by *actually having seen that particular item*.\n",
"\n",
"Splitting off our validation data means our model never sees it in training, and so is completely untainted by it, and is not cheating in any way. Right?\n",
"\n",