Rework and clean
This commit is contained in:
parent
b8184baaf0
commit
e55b399c5d
@ -28,9 +28,9 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Hello, and thank you for letting us join you on your deep learning journey, however far along that you may be! If you are a complete beginner to deep learning and machine learning, then you are most welcome here. Our only expectation is that you already know how to code, preferably in Python.\n",
|
||||
"Hello, and thank you for letting us join you on your deep learning journey, however far along it you may be! If you are a complete beginner to deep learning and machine learning, then you are most welcome here. Our only expectation is that you already know how to code, preferably in Python.\n",
|
||||
"\n",
|
||||
"> note: If you don't have any experience coding, that's OK too! The first three chapters have been explicitly written in a way that will allow executives, product managers, etc to understand the most important things they'll need to know about deep learning. When you see bits of code in the text, try to look over them to get an intuitive sense of what they're doing. We'll explain them line by line. The details of the syntax are not nearly as important as the high level understanding of what's going on.\n",
|
||||
"> note: If you don't have any experience coding, that's OK too! The first three chapters have been explicitly written in a way that will allow executives, product managers, etc. to understand the most important things they'll need to know about deep learning. When you see bits of code in the text, try to look over them to get an intuitive sense of what they're doing. We'll explain them line by line. The details of the syntax are not nearly as important as the high level understanding of what's going on.\n",
|
||||
"\n",
|
||||
"If you are already a confident deep learning practitioner, then you will also find a lot here. In this book we will be showing you how to achieve world-class results, including techniques from the latest research. As we will show, this doesn't require advanced mathematical training, or years of study. It just requires a bit of common sense and tenacity."
|
||||
]
|
||||
@ -60,7 +60,7 @@
|
||||
"|======\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"Deep learning is a computer technique to extract and transform data – with use cases ranging from human speech recognition to animal imagery classification – by using multiple layers of neural networks. Each of these layers takes the inputs from previous layers and progressively refines them. The algorithms involved can train the layers by learning to minimize errors and improve their own accuracy (we will discuss those in details in the next section)."
|
||||
"Deep learning is a computer technique to extract and transform data – with use cases ranging from human speech recognition to animal imagery classification – by using multiple layers of neural networks. Each of these layers takes its inputs from previous layers and progressively refines them. The layers are trained by algorithms that minimize their errors and improve their accuracy. In this way, the network learns to perform a specified task. We will discuss training algorithms in detail in the next section."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -73,11 +73,11 @@
|
||||
"\n",
|
||||
"- Natural Language Processing (NLP):: answering questions; speech recognition; summarizing documents; classifying documents; finding names, dates, etc. in documents; searching for articles mentioning a concept\n",
|
||||
"- Computer vision:: satellite and drone imagery interpretation (e.g. for disaster resilience); face recognition; image captioning; reading traffic signs; locating pedestrians and vehicles in autonomous vehicles\n",
|
||||
"- Medicine:: Finding anomalies in radiology images, including CT, MRI, and x-ray; counting features in pathology slides; measuring features in ultrasounds; diagnosing diabetic retinopathy\n",
|
||||
"- Medicine:: Finding anomalies in radiology images, including CT, MRI, and X-ray; counting features in pathology slides; measuring features in ultrasounds; diagnosing diabetic retinopathy\n",
|
||||
"- Biology:: folding proteins; classifying proteins; many genomics tasks, such as tumor-normal sequencing and classifying clinically actionable genetic mutations; cell classification; analyzing protein/protein interactions\n",
|
||||
"- Image generation:: Colorizing images; increasing image resolution; removing noise from images; converting images to art in the style of famous artists\n",
|
||||
"- Recommendation systems:: web search; product recommendations; home page layout\n",
|
||||
"- Playing games:: Better than humans and better than any other computer algorithm at Chess, Go, most Atari videogames, many real-time strategy games\n",
|
||||
"- Playing games:: Better than humans and better than any other computer algorithm at Chess, Go, most Atari videogames, and many real-time strategy games\n",
|
||||
"- Robotics:: handling objects that are challenging to locate (e.g. transparent, shiny, lack of texture) or hard to pick up\n",
|
||||
"- Other applications:: financial and logistical forecasting; text to speech; much much more..."
|
||||
]
|
||||
@ -111,7 +111,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"They realised that a simplified model of a real neuron could be represented using simple addition and thresholding as shown in <<neuron>>. Pitts was self-taught, and, by age 12, had received an offer to study at Cambridge with the great Bertrand Russell. He did not take up this invitation, and indeed throughout his life did not accept any offers of advanced degrees or positions of authority. Most of his famous work was done whilst he was homeless. Despite his lack of an officially recognized position, and increasing social isolation, his work with McCulloch was influential, and was picked up by a psychologist named Frank Rosenblatt."
|
||||
"They realised that a simplified model of a real neuron could be represented using simple addition and thresholding as shown in <<neuron>>. Pitts was self-taught, and, by age 12, had received an offer to study at Cambridge with the great Bertrand Russell. He did not take up this invitation, and indeed throughout his life did not accept any offers of advanced degrees or positions of authority. Most of his famous work was done whilst he was homeless. Despite his lack of an officially recognized position and increasing social isolation, his work with McCulloch was influential, and was taken up by a psychologist named Frank Rosenblatt."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -125,9 +125,9 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Rosenblatt further developed the artificial neuron to give it the ability to learn. Even more importantly, he worked on building the first device that actually used these principles: The Mark I Perceptron. Rosenblatt wrote about this work: \"we are about to witness the birth of such a machine – a machine capable of perceiving, recognizing and identifying its surroundings without any human training or control\". The perceptron was built, and was able to successfully recognize simple shapes.\n",
|
||||
"Rosenblatt further developed the artificial neuron to give it the ability to learn. Even more importantly, he worked on building the first device that actually used these principles, The Mark I Perceptron. Rosenblatt wrote about this work: \"We are about to witness the birth of such a machine – a machine capable of perceiving, recognizing and identifying its surroundings without any human training or control\". The perceptron was built, and was able to successfully recognize simple shapes.\n",
|
||||
"\n",
|
||||
"An MIT professor named Marvin Minsky (who was a grade behind Rosenblatt at the same high school!) along with Seymour Papert wrote a book, called \"Perceptrons\", about Rosenblatt's invention. They showed that a single layer of these devices was unable to learn some simple, critical mathematical functions (such as XOR). In the same book, they also showed that using multiple layers of the devices would allow these limitations to be addressed. Unfortunately, only the first of these insights was widely recognized, as a result of which the global academic community nearly entirely gave up on neural networks for the next two decades."
|
||||
"An MIT professor named Marvin Minsky (who was a grade behind Rosenblatt at the same high school!) along with Seymour Papert wrote a book, called \"Perceptrons\", about Rosenblatt's invention. They showed that a single layer of these devices was unable to learn some simple, critical mathematical functions (such as XOR). In the same book, they also showed that using multiple layers of the devices would allow these limitations to be addressed. Unfortunately, only the first of these insights was widely recognized. As a result, the global academic community nearly entirely gave up on neural networks for the next two decades."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -138,7 +138,8 @@
|
||||
"\n",
|
||||
"> : _…people are smarter than today's computers because the brain employs a basic computational architecture that is more suited to deal with a central aspect of the natural information processing tasks that people are so good at. …we will introduce a computational framework for modeling cognitive processes that seems… closer than other frameworks to the style of computation as it might be done by the brain._ (PDP, chapter 1)\n",
|
||||
"\n",
|
||||
"The premise that PDP is using here is that traditional computer programs work very differently to brains, and that might be why computer programs had (at that point) been so bad at doing things that brains find easy (such as recognizing objects in pictures). The authors claim that the PDP approach is \"closer than other frameworks\" to how the brain works, and therefore it might be better able to handle these kinds of tasks.\n",
|
||||
"The premise that PDP is using here is that traditional computer programs work very differently to brains, and that might be why computer programs had benn (at that point) so bad at doing things that brains find easy (such as recognizing objects in pictures). The authors claim that the PDP approach is \"closer \n",
|
||||
"than other frameworks\" to how the brain works, and therefore it might be better able to handle these kinds of tasks.\n",
|
||||
"\n",
|
||||
"In fact, the approach laid out in PDP is very similar to the approach used in today's neural networks. The book defined \"Parallel Distributed Processing\" as requiring:\n",
|
||||
"\n",
|
||||
@ -147,15 +148,15 @@
|
||||
"1. An *output function* for each unit \n",
|
||||
"1. A *pattern of connectivity* among units \n",
|
||||
"1. A *propagation rule* for propagating patterns of activities through the network of connectivities \n",
|
||||
"1. An *activation rule* for combining the inputs impinging on a unit with the current state of that unit to produce a new level of activation for the unit\n",
|
||||
"1. An *activation rule* for combining the inputs impinging on a unit with the current state of that unit to produce an output for the unit\n",
|
||||
"1. A *learning rule* whereby patterns of connectivity are modified by experience \n",
|
||||
"1. An *environment* within which the system must operate\n",
|
||||
"\n",
|
||||
"We will see in this book that modern neural networks handle each of these requirements.\n",
|
||||
"\n",
|
||||
"In the 1980's most models were built with a second layer of neurons, thus avoiding the problem that had been identified by Minsky (this was their \"pattern of connectivity among units\", to use the framework above). And indeed, neural networks were widely used during the 80s and 90s for real, practical projects. However, again a misunderstanding of the theoretical issues held back the field. In theory, adding just one extra layer of neurons was enough to allow any mathematical model to be approximated with these neural networks, but in practice such networks were often too big and slow to be useful.\n",
|
||||
"In the 1980's most models were built with a second layer of neurons, thus avoiding the problem that had been identified by Minsky (this was their \"pattern of connectivity among units\", to use the framework above). And indeed, neural networks were widely used during the 80s and 90s for real, practical projects. However, again a misunderstanding of the theoretical issues held back the field. In theory, adding just one extra layer of neurons was enough to allow any mathematical function to be approximated with these neural networks, but in practice such networks were often too big and too slow to be useful.\n",
|
||||
"\n",
|
||||
"Although researchers showed 30 years ago that to get practical good performance you need to use even more layers of neurons, it is only in the last decade that this has been more widely appreciated. Neural networks are now finally living up to their potential, thanks to the understanding to use more layers as well as improved ability to do so thanks to improvements in computer hardware, increases in data availability, and algorithmic tweaks that allow neural networks to be trained faster and more easily. We now have what Rosenblatt had promised: \"a machine capable of perceiving, recognizing and identifying its surroundings without any human training or control\".\n",
|
||||
"Although researchers showed 30 years ago that to get practical good performance you need to use even more layers of neurons, it is only in the last decade that this principle has been more widely appreciatedand applied. Neural networks are now finally living up to their potential, thanks to the use of more layers, coupled with the capacity to do so due to improvements in computer hardware, increases in data availability, and algorithmic tweaks that allow neural networks to be trained faster and more easily. We now have what Rosenblatt had promised: \"a machine capable of perceiving, recognizing and identifying its surroundings without any human training or control\".\n",
|
||||
"\n",
|
||||
"This is what you will learn how to build in this book."
|
||||
]
|
||||
@ -1575,7 +1576,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"> important: When you train a model, you must **always** have both a training set, and a validation set, and must measure the accuracy of your model only on the validation set. If you train for too long, with not enough data, you will see the accuracy of your model start to get worse; this is called **over-fitting**. fastai defaults `valid_pct` to `0.2`, so even if you forget, fastai will create a validation set for you!"
|
||||
"> important: When you train a model, you must **always** have both a training set and a validation set, and must measure the accuracy of your model only on the validation set. If you train for too long, with not enough data, you will see the accuracy of your model start to get worse; this is called **over-fitting**. fastai defaults `valid_pct` to `0.2`, so even if you forget, fastai will create a validation set for you!"
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -1715,7 +1716,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"As you can see by looking at the right-hand side of this picture, the features are now able to identify and match with higher levels semantic components, such as car wheels, text, and flower petals. Using these components, layers four and five can identify even higher-level concepts, as shown in <<img_layer4>>."
|
||||
"As you can see by looking at the right-hand side of this picture, the features are now able to identify and match with higher-level semantic components, such as car wheels, text, and flower petals. Using these components, layers four and five can identify even higher-level concepts, as shown in <<img_layer4>>."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -1775,7 +1776,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Another interesting fast.ai student project example comes from Gleb Esman. He was working on fraud detection at Splunk, and was working with a dataset of users' mouse movements and mouse clicks. He turned these into pictures by drawing an image where the position, speed and acceleration of the mouse was displayed using coloured lines, and the clicks were displayed using [small coloured circles](https://www.splunk.com/en_us/blog/security/deep-learning-with-splunk-and-tensorflow-for-security-catching-the-fraudster-in-neural-networks-with-behavioral-biometrics.html) as shown in <<splunk>>. He then fed this into an image recognition model just like the one we've shown in this chapter, and it worked so well that had led to a patent for this approach to fraud analytics!"
|
||||
"Another interesting fast.ai student project example comes from Gleb Esman. He was working on fraud detection at Splunk, and was working with a dataset of users' mouse movements and mouse clicks. He turned these into pictures by drawing an image where the position, speed and acceleration of the mouse was displayed using coloured lines, and the clicks were displayed using [small coloured circles](https://www.splunk.com/en_us/blog/security/deep-learning-with-splunk-and-tensorflow-for-security-catching-the-fraudster-in-neural-networks-with-behavioral-biometrics.html) as shown in <<splunk>>. He then fed this into an image recognition model just like the one we've shown in this chapter, and it worked so well that it led to a patent for this approach to fraud analytics!"
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -1789,7 +1790,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Another examples comes from the paper [Malware Classification with Deep Convolutional Neural Networks](https://ieeexplore.ieee.org/abstract/document/8328749) which explains that \"the malware binary file is divided into 8-bit sequences which are then converted to equivalent decimal values. This decimal vector is reshaped and gray-scale image is generated that represent the malware sample\", like in <<malware_proc>>."
|
||||
"Another example comes from the paper [Malware Classification with Deep Convolutional Neural Networks](https://ieeexplore.ieee.org/abstract/document/8328749) which explains that \"the malware binary file is divided into 8-bit sequences which are then converted to equivalent decimal values. This decimal vector is reshaped and gray-scale image is generated that represent the malware sample\", like in <<malware_proc>>."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -1817,7 +1818,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"As you can see, the different types of malware look very distinctive to the human eye. The model they train based on this image representation was more accurate at malware classification than any previous approach shown in the academic literature. This suggests a good rule of thumb for converting a dataset into an image representation: if the human eye can recognize categories from the images, then a deep learning model should be able to do so too.\n",
|
||||
"As you can see, the different types of malware look very distinctive to the human eye. The model they trained based on this image representation was more accurate at malware classification than any previous approach shown in the academic literature. This suggests a good rule of thumb for converting a dataset into an image representation: if the human eye can recognize categories from the images, then a deep learning model should be able to do so too.\n",
|
||||
"\n",
|
||||
"In general, you'll find that a small number of general approaches in deep learning can go a long way, if you're a bit creative in how you represent your data! You shouldn't think of approaches like the above as \"hacky workarounds\", since actually they often (as here) beat previously state of the art results. These really are the right way to think about these problem domains."
|
||||
]
|
||||
@ -2181,14 +2182,14 @@
|
||||
"learn.fine_tune(4, 1e-2)\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"This reduces the batch size to 32 (we will explain this later). If you keep hitting the same error, change 32 by 16."
|
||||
"This reduces the batch size to 32 (we will explain this later). If you keep hitting the same error, change 32 to 16."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"This model is using the IMDb dataset from the paper [Learning Word Vectors for Sentiment Analysis]((https://ai.stanford.edu/~amaas/data/sentiment/)). It works well with movie reviews of many thousands of words. But let's test it out on a very short one, to see it does its thing:"
|
||||
"This model is using the IMDb dataset from the paper [Learning Word Vectors for Sentiment Analysis](https://ai.stanford.edu/~amaas/data/sentiment/). It works well with movie reviews of many thousands of words. But let's test it out on a very short one, to see it does its thing:"
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -2234,7 +2235,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Sidebar: The order matter"
|
||||
"### Sidebar: The order matters"
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -2276,7 +2277,7 @@
|
||||
"doc(learn.predict)\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"This will make a small window pop with content like this:\n",
|
||||
"This will make a small window pop up with content like this:\n",
|
||||
"\n",
|
||||
"<img src=\"images/doc_ex.png\" width=\"600\">"
|
||||
]
|
||||
@ -2656,7 +2657,7 @@
|
||||
"\n",
|
||||
"Most datasets used in this book took the creators a lot of work to build. For instance, later in the book we’ll be showing you how to create a model that can translate between French and English. The key input to this is a French/English parallel text corpus prepared back in 2009 by Professor Chris Callison-Burch of the University of Pennsylvania. This dataset contains over 20 million sentence pairs in French and English. He built the dataset in a really clever way: by crawling millions of Canadian web pages (which are often multi-lingual) and then using a set of simple heuristics to transform French URLs onto English URLs.\n",
|
||||
"\n",
|
||||
"As you look at datasets throughout this book, think about where they might have come from, and how they might have been curated. Then, think about what kinds of interesting dataset you could create for your own projects. (We’ll even take you step by step through the process of creating your own image dataset soon.)\n",
|
||||
"As you look at datasets throughout this book, think about where they might have come from, and how they might have been curated. Then, think about what kinds of interesting datasets you could create for your own projects. (We’ll even take you step by step through the process of creating your own image dataset soon.)\n",
|
||||
"\n",
|
||||
"fast.ai has spent a lot of time creating cutdown versions of popular datasets that are specially designed to support rapid prototyping and experimentation, and to be easier to learn with. In this book we will often start by using one of the cutdown versions, and we later on scale up to the full-size version (just as we're doing in this chapter!) In fact, this is how the world’s top practitioners do their modelling projects in practice; they do most of their experimentation and prototyping with subsets of their data, and only use the full dataset when they have a good understanding of what they have to do."
|
||||
]
|
||||
@ -2672,7 +2673,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Each of the models we trained showed a training and validation loss. A good validation set is one of the most important pieces of your training, let's see why and learn how to create one."
|
||||
"Each of the models we trained showed a training and validation loss. A good validation set is one of the most important pieces of your training. Let's see why and learn how to create one."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -2690,7 +2691,7 @@
|
||||
"\n",
|
||||
"It is in order to avoid this that our first step was to split our dataset into two sets, the *training set* (which our model sees in training) and the *validation set*, also known as the *development set* (which is used only for evaluation). This lets us test that the model learns lessons from the training data which generalize to new data, the validation data.\n",
|
||||
"\n",
|
||||
"One way to understand this situation is that, in a sense, we don't want our model to get good results by \"cheating\". If it predicts well on a data item, that should be because it has learned principles that govern that kind of item, and not because the model has been shaped by *actually having seeing that particular item*.\n",
|
||||
"One way to understand this situation is that, in a sense, we don't want our model to get good results by \"cheating\". If it predicts well on a data item, that should be because it has learned principles that govern that kind of item, and not because the model has been shaped by *actually having seen that particular item*.\n",
|
||||
"\n",
|
||||
"Splitting off our validation data means our model never sees it in training, and so is completely untainted by it, and is not cheating in any way. Right?\n",
|
||||
"\n",
|
||||
|
@ -53,7 +53,7 @@
|
||||
"source": [
|
||||
"In our very first model we learnt how to classify dogs versus cats. Just a few years ago this was considered a very challenging task. But today, it is far too easy! We will not be able to show you the nuances of training models with this problem, because we get the nearly perfect result without worrying about any of the details. But it turns out that the same dataset also allows us to work on a much more challenging problem: figuring out what breed of pet is shown in each image.\n",
|
||||
"\n",
|
||||
"In the first chapter we presented the applications as already solved problems. But this is not how things work in real life. We start with some dataset which we know nothing about. We have to understand how it is put together, how to extract the data we need from it, and what that data looks like. For the rest of this book we will be showing you how to solve these problems in practice, including all of these intermediate steps necessary to understand the data that we working with and test our modelling as we go.\n",
|
||||
"In the first chapter we presented the applications as already solved problems. But this is not how things work in real life. We start with some dataset which we know nothing about. We have to understand how it is put together, how to extract the data we need from it, and what that data looks like. For the rest of this book we will be showing you how to solve these problems in practice, including all of these intermediate steps necessary to understand the data that we are working with and test our modelling as we go.\n",
|
||||
"\n",
|
||||
"We have already downloaded the pets dataset. We can get a path to this dataset using the same code we saw in <<chapter_intro>>:"
|
||||
]
|
||||
@ -77,7 +77,7 @@
|
||||
"- Individual files representing items of data, such as text documents or images, possibly organised into folders or with filenames representing information about those items, or\n",
|
||||
"- A table of data, such as in CSV format, where each row is an item, each row which may include filenames providing a connection between the data in the table and data in other formats such as text documents and images.\n",
|
||||
"\n",
|
||||
"There are exceptions to these rules, particularly in domains such as genomics, where there can be binary database formats or even network streams, but overall the vast majority of the datasets your work with use some combination of the above two formats.\n",
|
||||
"There are exceptions to these rules, particularly in domains such as genomics, where there can be binary database formats or even network streams, but overall the vast majority of the datasets you work with use some combination of the above two formats.\n",
|
||||
"\n",
|
||||
"To see what is in our dataset we can use the ls method:"
|
||||
]
|
||||
@ -145,7 +145,7 @@
|
||||
"source": [
|
||||
"Most functions and methods in fastai which return a collection use a class called `L`. `L` can be thought of as an enhanced version of the ordinary Python `list` type, with added conveniences for common operations. For instance, when we display an object of this class in a notebook it appears in the format you see above. The first thing that is shown is the number of items in the collection, prefixed with a `#`. You'll also see in the above output that the list is suffixed with a \"…\". This means that only the first few items are displayed — which is a good thing, because we would not want more than 7000 filenames on our screen!\n",
|
||||
"\n",
|
||||
"By examining these filenames, we see how they appear to be structured. Each file name contains the pet breed, and then an_character, a number, and finally the file extension. We need to create a piece of code that extracts the breed from a single `Path`. Jupyter notebook makes this easy, because we can gradually build up something that works, and then use it for the entire dataset. We do have to be careful to not make too many assumptions at this point. For instance, if you look carefully you may notice that some of the pet breeds contain multiple words, so we cannot simply break at the first `_` character that we find. To allow us to test our code, let's pick out one of these filenames:"
|
||||
"By examining these filenames, we see how they appear to be structured. Each file name contains the pet breed, and then an _ character, a number, and finally the file extension. We need to create a piece of code that extracts the breed from a single `Path`. Jupyter notebook makes this easy, because we can gradually build up something that works, and then use it for the entire dataset. We do have to be careful to not make too many assumptions at this point. For instance, if you look carefully you may notice that some of the pet breeds contain multiple words, so we cannot simply break at the first `_` character that we find. To allow us to test our code, let's pick out one of these filenames:"
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -167,7 +167,7 @@
|
||||
"\n",
|
||||
"We do not have the space to give you a complete regular expression tutorial here, particularly because there are so many excellent ones online. And we know that many of you will already be familiar with this wonderful tool. If you're not, that is totally fine — this is a great opportunity for you to rectify that! We find that regular expressions are one of the most useful tools in our programming toolkit, and many of our students tell us that it is one of the things they are most excited to learn about. So head over to Google and search for *regular expressions tutorial* now, and then come back here after you've had a good look around. The book website also provides a list of our favorites.\n",
|
||||
"\n",
|
||||
"> a: Not only are regular expressions dead handy, they also have interesting roots. They are \"regular\" because they were originally examples of a \"regular\" language, the lowest rung within the \"Chomsky hierarchy\", a grammar classification due to the same linguist Noam Chomskey who wrote _Syntactic Structures_, the pioneering work searching for the formal grammar underlying human language. This is one of the charms of computing: it may be that the hammer you reach for every day in fact came from a space ship.\n",
|
||||
"> a: Not only are regular expressions dead handy, they also have interesting roots. They are \"regular\" because they were originally examples of a \"regular\" language, the lowest rung within the \"Chomsky hierarchy\", a grammar classification due to the same linguist Noam Chomsky who wrote _Syntactic Structures_, the pioneering work searching for the formal grammar underlying human language. This is one of the charms of computing: it may be that the hammer you reach for every day in fact came from a space ship.\n",
|
||||
"\n",
|
||||
"When you are writing a regular expression, the best way to start is just to try it against one example at first. Let's use the `findall` method to try a regular expression against the filename of the `fname` object:"
|
||||
]
|
||||
@ -1264,7 +1264,7 @@
|
||||
"\n",
|
||||
" log(a*b) = log(a)+log(b)\n",
|
||||
"\n",
|
||||
"When we see it in that format looks a bit boring; but have a think about what this really means. It means that logarithms increase linearly when the underlying signal increases exponentially or multiplicatively. This is used for instance in the Richter scale of earthquake severity, and the dB scale of noise levels. It's also often used on financial charts, where we want to show compound growth rates more clearly. Computer scientists love using logarithms, because it means that modification, which can create really really large and really really small numbers, can be replaced by addition, which is much less likely to result in scales which are difficult for our computer to handle."
|
||||
"When we see it in that format, it looks a bit boring; but have a think about what this really means. It means that logarithms increase linearly when the underlying signal increases exponentially or multiplicatively. This is used for instance in the Richter scale of earthquake severity, and the dB scale of noise levels. It's also often used on financial charts, where we want to show compound growth rates more clearly. Computer scientists love using logarithms, because it means that modification, which can create really really large and really really small numbers, can be replaced by addition, which is much less likely to result in scales which are difficult for our computer to handle."
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@ -61,7 +61,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The factory method `TextDataLoaders.from_folder` is very convenient when your data is arranged the exact same way as the IMDb dataset, but in practice, that often won't be the case. The data block API offers more flexibility. As we saw in the last chapter, we can ge the same result with:"
|
||||
"The factory method `TextDataLoaders.from_folder` is very convenient when your data is arranged the exact same way as the IMDb dataset, but in practice, that often won't be the case. The data block API offers more flexibility. As we saw in the last chapter, we can get the same result with:"
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -1079,7 +1079,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here the resize transform is applied to each of the two images, but not the boolean flag. Even if we have a custom type, we can thus benefit form all the data augmentation transforms inside the library.\n",
|
||||
"Here the resize transform is applied to each of the two images, but not the boolean flag. Even if we have a custom type, we can thus benefit from all the data augmentation transforms inside the library.\n",
|
||||
"\n",
|
||||
"We are now ready to build the `Transform` that we will use to get our data ready for a Siamese model. First, we will need a function to determine the class of all our images:"
|
||||
]
|
||||
@ -1098,7 +1098,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Then here is our main transform. For each image, il will, with a probability of 0.5, draw an image from the same class and return a `SiameseImage` with a true label, or draw an image from another class and a return a `SiameseImage` with a false label. This is all done in the private `_draw` function. There is one difference between the training and validation set, which is why the transform needs to be initialized with the splits: on the training set, we will make that random pick each time we read an image, whereas on the validation set, we make this random pick once and for all at initialization. This way, we get more varied samples during training, but always the same validation set."
|
||||
"Then here is our main transform. For each image, il will, with a probability of 0.5, draw an image from the same class and return a `SiameseImage` with a true label, or draw an image from another class and return a `SiameseImage` with a false label. This is all done in the private `_draw` function. There is one difference between the training and validation set, which is why the transform needs to be initialized with the splits: on the training set, we will make that random pick each time we read an image, whereas on the validation set, we make this random pick once and for all at initialization. This way, we get more varied samples during training, but always the same validation set."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -1220,7 +1220,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"And we have can now train a model using those `DataLoaders`. It needs a bit more customization than the usual model provided by `cnn_learner` since it has to take two images instead of one. We will see how to create such a model and train it in <<chapter_arch_dtails>>."
|
||||
"And we can now train a model using those `DataLoaders`. It needs a bit more customization than the usual model provided by `cnn_learner` since it has to take two images instead of one. We will see how to create such a model and train it in <<chapter_arch_dtails>>."
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@ -1084,7 +1084,7 @@
|
||||
"learn.fine_tune(4, 1e-2)\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"This reduces the batch size to 32 (we will explain this later). If you keep hitting the same error, change 32 by 16."
|
||||
"This reduces the batch size to 32 (we will explain this later). If you keep hitting the same error, change 32 to 16."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -1121,7 +1121,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Sidebar: The order matter"
|
||||
"### Sidebar: The order matters"
|
||||
]
|
||||
},
|
||||
{
|
||||
|