commit 1d2b47d41f
Merge branch 'master' into patch-2
@@ -591,7 +591,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"> important: Trianing Time: Depending on your network speed, it might take a few minutes to download the pretrained model and dataset. Running `fine_tune` might take a minute or so. Often models in this book take a few minutes to train, as will your own models, so it's a good idea to come up with good techniques to make the most of this time. For instance, keep reading the next section while your model trains, or open up another notebook and use it for some coding experiments."
+"> important: Training Time: Depending on your network speed, it might take a few minutes to download the pretrained model and dataset. Running `fine_tune` might take a minute or so. Often models in this book take a few minutes to train, as will your own models, so it's a good idea to come up with good techniques to make the most of this time. For instance, keep reading the next section while your model trains, or open up another notebook and use it for some coding experiments."
 ]
 },
 {
@@ -1010,7 +1010,7 @@
 "source": [
 "We've changed the name of our box from *program* to *model*. This is to follow modern terminology and to reflect that the *model* is a special kind of program: it's one that can do *many different things*, depending on the *weights*. It can be implemented in many different ways. For instance, in Samuel's checkers program, different values of the weights would result in different checkers-playing strategies. \n",
 "\n",
-"(By the way, what Samuel called \"weights\" are most generally refered to as model *parameters* these days, in case you have encountered that term. The term *weights* is reserved for a particular type of model parameter.)\n",
+"(By the way, what Samuel called \"weights\" are most generally referred to as model *parameters* these days, in case you have encountered that term. The term *weights* is reserved for a particular type of model parameter.)\n",
 "\n",
 "Next, Samuel said we need an *automatic means of testing the effectiveness of any current weight assignment in terms of actual performance*. In the case of his checkers program, the \"actual performance\" of a model would be how well it plays. And you could automatically test the performance of two models by setting them to play against each other, and seeing which one usually wins.\n",
 "\n",
@@ -1541,7 +1541,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The Pet dataset contains 7,390 pictures of dogs and cats, consisting of 37 different breeds. Each image is labeled using its filename: for instance the file *great\_pyrenees\_173.jpg* is the 173rd example of an image of a Great Pyrenees breed dog in the dataset. The filenames start with an uppercase letter if the image is a cat, and a lowercase letter otherwise. We have to tell fastai how to get labels from the filenames, which we do by calling `from_name_func` (which means that filenames can be extracted using a function applied to the filename), and passing `x[0].isupper()`, which evaluates to `True` if the first letter is uppercase (i.e., it's a cat).\n",
+"The Pet dataset contains 7,390 pictures of dogs and cats, consisting of 37 different breeds. Each image is labeled using its filename: for instance the file *great\_pyrenees\_173.jpg* is the 173rd example of an image of a Great Pyrenees breed dog in the dataset. The filenames start with an uppercase letter if the image is a cat, and a lowercase letter otherwise. We have to tell fastai how to get labels from the filenames, which we do by calling `from_name_func` (which means that labels can be extracted using a function applied to the filename), and passing `x[0].isupper()`, which evaluates to `True` if the first letter is uppercase (i.e., it's a cat).\n",
 "\n",
 "The most important parameter to mention here is `valid_pct=0.2`. This tells fastai to hold out 20% of the data and *not use it for training the model at all*. This 20% of the data is called the *validation set*; the remaining 80% is called the *training set*. The validation set is used to measure the accuracy of the model. By default, the 20% that is held out is selected randomly. The parameter `seed=42` sets the *random seed* to the same value every time we run this code, which means we get the same validation set every time we run it—this way, if we change our model and retrain it, we know that any differences are due to the changes to the model, not due to having a different random validation set.\n",
 "\n",
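Editor's note: for context, the `from_name_func`, `valid_pct`, and `seed` parameters discussed in the hunk above fit together roughly as in this sketch, reconstructed from the fastai pets example (the `path` setup and `Resize(224)` transform are assumptions, not part of this diff):

```python
from fastai.vision.all import *

# Assumed setup: `path` points at an image folder whose filenames encode the label.
path = untar_data(URLs.PETS)/'images'

def is_cat(x):
    # First letter uppercase => cat, per the naming convention described above.
    return x[0].isupper()

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path),
    valid_pct=0.2,        # hold out 20% of the data as the validation set
    seed=42,              # fix the random split so reruns are comparable
    label_func=is_cat,
    item_tfms=Resize(224))
```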
@@ -135,21 +135,21 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Bias: Professor Lantanya Sweeney \"Arrested\""
+"### Bias: Professor Latanya Sweeney \"Arrested\""
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Dr. Latanya Sweeney is a professor at Harvard and director of the university's data privacy lab. In the paper [\"Discrimination in Online Ad Delivery\"](https://arxiv.org/abs/1301.6822) (see <<lantanya_arrested>>) she describes her discovery that Googling her name resulted in advertisements saying \"Latanya Sweeney, arrested?\" even though she is the only known Latanya Sweeney and has never been arrested. However when she Googled other names, such as \"Kirsten Lindquist,\" she got more neutral ads, even though Kirsten Lindquist has been arrested three times."
+"Dr. Latanya Sweeney is a professor at Harvard and director of the university's data privacy lab. In the paper [\"Discrimination in Online Ad Delivery\"](https://arxiv.org/abs/1301.6822) (see <<latanya_arrested>>) she describes her discovery that Googling her name resulted in advertisements saying \"Latanya Sweeney, arrested?\" even though she is the only known Latanya Sweeney and has never been arrested. However when she Googled other names, such as \"Kirsten Lindquist,\" she got more neutral ads, even though Kirsten Lindquist has been arrested three times."
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"<img src=\"images/ethics/image1.png\" id=\"lantanya_arrested\" caption=\"Google search showing ads about Professor Lantanya Sweeney's arrest record\" alt=\"Screenshot of google search showing ads about Professor Lantanya Sweeney's arrest record\" width=\"400\">"
+"<img src=\"images/ethics/image1.png\" id=\"latanya_arrested\" caption=\"Google search showing ads about Professor Latanya Sweeney's arrest record\" alt=\"Screenshot of google search showing ads about Professor Latanya Sweeney's arrest record\" width=\"400\">"
 ]
 },
 {
@@ -243,7 +243,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"<img src=\"images/ethics/image4.png\" id=\"congressmen\" caption=\"Congresspeople matched to criminal mugshots by Amazon software\" alt=\"Picture of the congresspeople matched to criminal mugshots by Amazon software, they are disproportionatedly people of color\" width=\"500\">"
+"<img src=\"images/ethics/image4.png\" id=\"congressmen\" caption=\"Congresspeople matched to criminal mugshots by Amazon software\" alt=\"Picture of the congresspeople matched to criminal mugshots by Amazon software, they are disproportionately people of color\" width=\"500\">"
 ]
 },
 {
@@ -312,7 +312,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We explained in <<chapter_intro>> how an algorithm can interact with its enviromnent to create a feedback loop, making predictions that reinforce actions taken in the real world, which lead to predictions even more pronounced in the same direction. \n",
+"We explained in <<chapter_intro>> how an algorithm can interact with its environment to create a feedback loop, making predictions that reinforce actions taken in the real world, which lead to predictions even more pronounced in the same direction. \n",
 "As an example, let's again consider YouTube's recommendation system. A couple of years ago the Google team talked about how they had introduced reinforcement learning (closely related to deep learning, but where your loss function represents a result potentially a long time after an action occurs) to improve YouTube's recommendation system. They described how they used an algorithm that made recommendations such that watch time would be optimized.\n",
 "\n",
 "However, human beings tend to be drawn to controversial content. This meant that videos about things like conspiracy theories started to get recommended more and more by the recommendation system. Furthermore, it turns out that the kinds of people that are interested in conspiracy theories are also people that watch a lot of online videos! So, they started to get drawn more and more toward YouTube. The increasing number of conspiracy theorists watching videos on YouTube resulted in the algorithm recommending more and more conspiracy theory and other extremist content, which resulted in more extremists watching videos on YouTube, and more people watching YouTube developing extremist views, which led to the algorithm recommending more extremist content... The system was spiraling out of control.\n",
@@ -826,7 +826,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The hiring process is particularly broken in tech. One study indicative of the disfunction comes from Triplebyte, a company that helps place software engineers in companies, conducting a standardized technical interview as part of this process. They have a fascinating dataset: the results of how over 300 engineers did on their exam, coupled with the results of how those engineers did during the interview process for a variety of companies. The number one finding from [Triplebyte’s research](https://triplebyte.com/blog/who-y-combinator-companies-want) is that “the types of programmers that each company looks for often have little to do with what the company needs or does. Rather, they reflect company culture and the backgrounds of the founders.”\n",
+"The hiring process is particularly broken in tech. One study indicative of the dysfunction comes from Triplebyte, a company that helps place software engineers in companies, conducting a standardized technical interview as part of this process. They have a fascinating dataset: the results of how over 300 engineers did on their exam, coupled with the results of how those engineers did during the interview process for a variety of companies. The number one finding from [Triplebyte’s research](https://triplebyte.com/blog/who-y-combinator-companies-want) is that “the types of programmers that each company looks for often have little to do with what the company needs or does. Rather, they reflect company culture and the backgrounds of the founders.”\n",
 "\n",
 "This is a challenge for those trying to break into the world of deep learning, since most companies' deep learning groups today were founded by academics. These groups tend to look for people \"like them\"—that is, people that can solve complex math problems and understand dense jargon. They don't always know how to spot people who are actually good at solving real problems using deep learning.\n",
 "\n",
@@ -1367,7 +1367,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"You can see that the background white pixels are stored as the number 0, black is the number 255, and shades of gray are between the two. The entire image contains 28 pixels across and 28 pixels down, for a total of 768 pixels. (This is much smaller than an image that you would get from a phone camera, which has millions of pixels, but is a convenient size for our initial learning and experiments. We will build up to bigger, full-color images soon.)\n",
+"You can see that the background white pixels are stored as the number 0, black is the number 255, and shades of gray are between the two. The entire image contains 28 pixels across and 28 pixels down, for a total of 784 pixels. (This is much smaller than an image that you would get from a phone camera, which has millions of pixels, but is a convenient size for our initial learning and experiments. We will build up to bigger, full-color images soon.)\n",
 "\n",
 "So, now you've seen what an image looks like to a computer, let's recall our goal: create a model that can recognize 3s and 7s. How might you go about getting a computer to do that?\n",
 "\n",
@@ -3943,7 +3943,7 @@
 "\n",
 "Let's write such a function now. What form does it take?\n",
 "\n",
-"The loss function receives not the images themseles, but the predictions from the model. Let's make one argument, `prds`, of values between 0 and 1, where each value is the prediction that an image is a 3. It is a vector (i.e., a rank-1 tensor), indexed over the images.\n",
+"The loss function receives not the images themselves, but the predictions from the model. Let's make one argument, `prds`, of values between 0 and 1, where each value is the prediction that an image is a 3. It is a vector (i.e., a rank-1 tensor), indexed over the images.\n",
 "\n",
 "The purpose of the loss function is to measure the difference between predicted values and the true values — that is, the targets (aka labels). Let's make another argument, `trgts`, with values of 0 or 1 which tells whether an image actually is a 3 or not. It is also a vector (i.e., another rank-1 tensor), indexed over the images.\n",
 "\n",
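Editor's note: as a concrete illustration of the `prds`/`trgts` interface the hunk above describes, one way such a loss might be written is the following sketch (not necessarily the book's exact definition; the example values are made up):

```python
import torch

def mnist_loss(prds, trgts):
    # prds:  rank-1 tensor of predictions between 0 and 1 ("is this a 3?")
    # trgts: rank-1 tensor of 0s and 1s (1 means the image really is a 3)
    # Measure the distance from the correct answer, averaged over the images.
    return torch.where(trgts == 1, 1 - prds, prds).mean()

prds  = torch.tensor([0.9, 0.4, 0.2])
trgts = torch.tensor([1, 1, 0])
print(mnist_loss(prds, trgts))  # tensor(0.3000)
```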
@@ -4767,7 +4767,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Because this is such a general foundation, PyTorch provides some useful classes to make it easier to implement. The first thing we can do is replace our `linear` function with PyTorch's `nn.Linear` module. A *module* is an object of a class that inherits from the PyTorch `nn.Module` class. Objects of this class behave identically to standard Python functions, in that you can call them using parentheses and they will return the activations of a model.\n",
+"Because this is such a general foundation, PyTorch provides some useful classes to make it easier to implement. The first thing we can do is replace our `linear1` function with PyTorch's `nn.Linear` module. A *module* is an object of a class that inherits from the PyTorch `nn.Module` class. Objects of this class behave identically to standard Python functions, in that you can call them using parentheses and they will return the activations of a model.\n",
 "\n",
 "`nn.Linear` does the same thing as our `init_params` and `linear` together. It contains both the *weights* and *biases* in a single class. Here's how we replicate our model from the previous section:"
 ]
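Editor's note: the replacement this hunk describes might look like the sketch below (the `28*28` input size follows from the MNIST images discussed earlier in the same notebook):

```python
import torch.nn as nn

# nn.Linear bundles init_params and linear: it holds the weights and bias itself.
linear_model = nn.Linear(28*28, 1)

w, b = linear_model.parameters()
print(w.shape, b.shape)  # torch.Size([1, 784]) torch.Size([1])
```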
@@ -5885,4 +5885,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 4
-}
+}
@@ -177,7 +177,7 @@
 "\n",
 "In this case, we need a regular expression that extracts the pet breed from the filename.\n",
 "\n",
-"We do not have the space to give you a complete regular expression tutorial here,but there are many excellent ones online and we know that many of you will already be familiar with this wonderful tool. If you're not, that is totally fine—this is a great opportunity for you to rectify that! We find that regular expressions are one of the most useful tools in our programming toolkit, and many of our students tell us that this is one of the things they are most excited to learn about. So head over to Google and search for \"regular expressions tutorial\" now, and then come back here after you've had a good look around. The [book's website](https://book.fast.ai/) also provides a list of our favorites.\n",
+"We do not have the space to give you a complete regular expression tutorial here, but there are many excellent ones online and we know that many of you will already be familiar with this wonderful tool. If you're not, that is totally fine—this is a great opportunity for you to rectify that! We find that regular expressions are one of the most useful tools in our programming toolkit, and many of our students tell us that this is one of the things they are most excited to learn about. So head over to Google and search for \"regular expressions tutorial\" now, and then come back here after you've had a good look around. The [book's website](https://book.fast.ai/) also provides a list of our favorites.\n",
 "\n",
 "> a: Not only are regular expressions dead handy, but they also have interesting roots. They are \"regular\" because they were originally examples of a \"regular\" language, the lowest rung within the Chomsky hierarchy, a grammar classification developed by linguist Noam Chomsky, who also wrote _Syntactic Structures_, the pioneering work searching for the formal grammar underlying human language. This is one of the charms of computing: it may be that the hammer you reach for every day in fact came from a spaceship.\n",
 "\n",
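Editor's note: for a concrete taste of the breed-extracting regex this hunk refers to, a pattern for the filename scheme described earlier (e.g. *great_pyrenees_173.jpg*) might look like this sketch; `fname` is a hypothetical example, not from the diff:

```python
import re

fname = 'great_pyrenees_173.jpg'  # hypothetical example filename
# Capture everything before the trailing _<number>.jpg as the breed label.
match = re.findall(r'(.+)_\d+\.jpg$', fname)
print(match)  # ['great_pyrenees']
```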
@@ -255,7 +255,7 @@
 "source": [
 "We need our images to have the same dimensions, so that they can collate into tensors to be passed to the GPU. We also want to minimize the number of distinct augmentation computations we perform. The performance requirement suggests that we should, where possible, compose our augmentation transforms into fewer transforms (to reduce the number of computations and the number of lossy operations) and transform the images into uniform sizes (for more efficient processing on the GPU).\n",
 "\n",
-"The challenge is that, if performed after resizing down to the augmented size, various common data augmentation transforms might introduce spurious empty zones, degrade data, or both. For instance, rotating an image by 45 degrees fills corner regions of the new bounds with emptyness, which will not teach the model anything. Many rotation and zooming operations will require interpolating to create pixels. These interpolated pixels are derived from the original image data but are still of lower quality.\n",
+"The challenge is that, if performed after resizing down to the augmented size, various common data augmentation transforms might introduce spurious empty zones, degrade data, or both. For instance, rotating an image by 45 degrees fills corner regions of the new bounds with emptiness, which will not teach the model anything. Many rotation and zooming operations will require interpolating to create pixels. These interpolated pixels are derived from the original image data but are still of lower quality.\n",
 "\n",
 "To work around these challenges, presizing adopts two strategies that are shown in <<presizing>>:\n",
 "\n",
@@ -385,7 +385,7 @@
 "source": [
 "Take a look at each image, and check that each one seems to have the correct label for that breed of pet. Often, data scientists work with data with which they are not as familiar as domain experts may be: for instance, I actually don't know what a lot of these pet breeds are. Since I am not an expert on pet breeds, I would use Google images at this point to search for a few of these breeds, and make sure the images look similar to what I see in this output.\n",
 "\n",
-"If you made a mistake while building your `DataBlock`, it is very likely you won't see it before this step. To debug this, we encourage you to use the `summary` method. It will attempt to create a batch from the source you give it, with a lot of details. Also, if it fails, you will see exactly at which point the error happens, and the library will try to give you some help. For instance, one common mistake is to forget to use a `Resize` transform, so you en up with pictures of different sizes and are not able to batch them. Here is what the summary would look like in that case (note that the exact text may have changed since the time of writing, but it will give you an idea):"
+"If you made a mistake while building your `DataBlock`, it is very likely you won't see it before this step. To debug this, we encourage you to use the `summary` method. It will attempt to create a batch from the source you give it, with a lot of details. Also, if it fails, you will see exactly at which point the error happens, and the library will try to give you some help. For instance, one common mistake is to forget to use a `Resize` transform, so you end up with pictures of different sizes and are not able to batch them. Here is what the summary would look like in that case (note that the exact text may have changed since the time of writing, but it will give you an idea):"
 ]
 },
 {
@@ -40,7 +40,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In the previous chapter you learned some important practical techniques for training models in practice. COnsiderations like selecting learning rates and the number of epochs are very important to getting good results.\n",
+"In the previous chapter you learned some important practical techniques for training models in practice. Considerations like selecting learning rates and the number of epochs are very important to getting good results.\n",
 "\n",
 "In this chapter we are going to look at two other types of computer vision problems: multi-label classification and regression. The first one is when you want to predict more than one label per image (or sometimes none at all), and the second is when your labels are one or several numbers—a quantity instead of a category.\n",
 "\n",
@@ -472,7 +472,7 @@
 "As we have seen, PyTorch and fastai have two main classes for representing and accessing a training set or validation set:\n",
 "\n",
 "- `Dataset`:: A collection that returns a tuple of your independent and dependent variable for a single item\n",
-"- `DataLoader`:: An iterator that provides a stream of mini-batches, where each mini-batch is a couple of a batch of independent variables and a batch of dependent variables"
+"- `DataLoader`:: An iterator that provides a stream of mini-batches, where each mini-batch is a tuple of a batch of independent variables and a batch of dependent variables"
 ]
 },
 {
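Editor's note: a tiny sketch of the two behaviors this hunk describes, using a plain Python collection as the `Dataset` (values are illustrative only):

```python
from torch.utils.data import DataLoader

# A Dataset can be any collection that returns (independent, dependent) tuples.
dset = list(zip([10, 20, 30, 40], ['a', 'b', 'c', 'd']))
print(dset[0])  # (10, 'a') -- one item

# A DataLoader streams mini-batches: each batch is a tuple of
# (batch of independent variables, batch of dependent variables).
dl = DataLoader(dset, batch_size=2)
for xb, yb in dl:
    print(xb, yb)  # e.g. tensor([10, 20]) ['a', 'b']
```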
@@ -2193,4 +2193,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 2
-}
+}
@@ -68,7 +68,7 @@
 "\n",
 "We thought that seemed very unlikely to be true. We had never actually seen a study that showed that ImageNet happen to be exactly the right size, and that other datasets could not be developed which would provide useful insights. So we thought we would try to create a new dataset that researchers could test their algorithms on quickly and cheaply, but which would also provide insights likely to work on the full ImageNet dataset.\n",
 "\n",
-"About three hours later we had created Imagenette. We selected 10 classes from the full ImageNet that looked very different from one another. As we had hopep, we were able to quickly and cheaply create a classifier capable of recognizing these classes. We then tried out a few algorithmic tweaks to see how they impacted Imagenette. We found some that worked pretty well, and tested them on ImageNet as well—and we were very pleased to find that our tweaks worked well on ImageNet too!\n",
+"About three hours later we had created Imagenette. We selected 10 classes from the full ImageNet that looked very different from one another. As we had hoped, we were able to quickly and cheaply create a classifier capable of recognizing these classes. We then tried out a few algorithmic tweaks to see how they impacted Imagenette. We found some that worked pretty well, and tested them on ImageNet as well—and we were very pleased to find that our tweaks worked well on ImageNet too!\n",
 "\n",
 "There is an important message here: the dataset you get given is not necessarily the dataset you want. It's particularly unlikely to be the dataset that you want to do your development and prototyping in. You should aim to have an iteration speed of no more than a couple of minutes—that is, when you come up with a new idea you want to try out, you should be able to train a model and see how it goes within a couple of minutes. If it's taking longer to do an experiment, think about how you could cut down your dataset, or simplify your model, to improve your experimentation speed. The more experiments you can do, the better!\n",
 "\n",
@@ -289,7 +289,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Let's check how what effet this had on training our model:"
+"Let's check what effect this had on training our model:"
 ]
 },
 {
@@ -914,7 +914,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Let's practice our paper-reading skills to try to interpret this. \"This maximum\" is refering to the previous part of the paragraph, which talked about the fact that 1 is the value of the label for the positive class. So it's not possible for any value (except infinity) to result in 1 after sigmoid or softmax. In a paper, you won't normally see \"any value\" written; instead it will get a symbol, which in this case is $z_k$. This shorthand is helpful in a paper, because it can be refered to again later and the reader will know what value is being discussed.\n",
+"Let's practice our paper-reading skills to try to interpret this. \"This maximum\" is refering to the previous part of the paragraph, which talked about the fact that 1 is the value of the label for the positive class. So it's not possible for any value (except infinity) to result in 1 after sigmoid or softmax. In a paper, you won't normally see \"any value\" written; instead it will get a symbol, which in this case is $z_k$. This shorthand is helpful in a paper, because it can be referred to again later and the reader will know what value is being discussed.\n",
 "\n",
 "Then it says \"if $z_y\gg z_k$ for all $k\neq y$.\" In this case, the paper immediately follows the math with an English description, which is handy because you can just read that. In the math, the $y$ is refering to the target ($y$ is defined earlier in the paper; sometimes it's hard to find where symbols are defined, but nearly all papers will define all their symbols somewhere), and $z_y$ is the activation corresponding to the target. So to get close to 1, this activation needs to be much higher than all the others for that prediction.\n",
 "\n",
@@ -793,7 +793,7 @@
 "source": [
 "In computer vision, we have a very easy way to get all the information of a pixel through its RGB values: each pixel in a colored image is represented by three numbers. Those three numbers give us the redness, the greenness and the blueness, which is enough to get our model to work afterward.\n",
 "\n",
-"For the problem at hand, we don't have the same easy way to characterize a user or a movie. There are probably relations with genres: if a given user likes romance, they are likely to give higher scores to romance movies. Other factors might be wether the movie is more action-oriented versus heavy on dialogue, or the presence of a specific actor that a user might particularly like. \n",
+"For the problem at hand, we don't have the same easy way to characterize a user or a movie. There are probably relations with genres: if a given user likes romance, they are likely to give higher scores to romance movies. Other factors might be whether the movie is more action-oriented versus heavy on dialogue, or the presence of a specific actor that a user might particularly like. \n",
 "\n",
 "How do we determine numbers to characterize those? The answer is, we don't. We will let our model *learn* them. By analyzing the existing relations between users and movies, our model can figure out itself the features that seem important or not.\n",
 "\n",
@@ -1006,7 +1006,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The first thing we can do to make this model a little bit better is to force those predictions to be between 0 and 5. For this, we just need to use `sigmoid_range`, like in <<chapter_mulitcat>>. One thing we discovered empirically is that it's better to have the range go a little bit over 5, so we use `(0, 5.5)`:"
+"The first thing we can do to make this model a little bit better is to force those predictions to be between 0 and 5. For this, we just need to use `sigmoid_range`, like in <<chapter_multicat>>. One thing we discovered empirically is that it's better to have the range go a little bit over 5, so we use `(0, 5.5)`:"
 ]
 },
 {
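Editor's note: the `sigmoid_range` mentioned in this hunk scales a sigmoid to an arbitrary interval; fastai ships its own implementation, but the idea is roughly this sketch:

```python
import torch

def sigmoid_range(x, lo, hi):
    # Squash x to (0, 1) with a sigmoid, then rescale it to (lo, hi).
    return torch.sigmoid(x) * (hi - lo) + lo

x = torch.tensor([-5.0, 0.0, 5.0])
print(sigmoid_range(x, 0, 5.5))  # all values squeezed into (0, 5.5)
```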
@@ -1262,7 +1262,7 @@
 "loss_with_wd = loss + wd * (parameters**2).sum()\n",
 "```\n",
 "\n",
-"In practice, though, it would be very inefficient (and maybe numerically unstable) to compute that big sum and add it to the loss. If you remember a little bit of high schoool math, you might recall that the derivative of `p**2` with respect to `p` is `2*p`, so adding that big sum to our loss is exactly the same as doing:\n",
+"In practice, though, it would be very inefficient (and maybe numerically unstable) to compute that big sum and add it to the loss. If you remember a little bit of high school math, you might recall that the derivative of `p**2` with respect to `p` is `2*p`, so adding that big sum to our loss is exactly the same as doing:\n",
 "\n",
 "``` python\n",
 "parameters.grad += wd * 2 * parameters\n",
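Editor's note: the equivalence this hunk describes is easy to verify numerically; here is a sketch (the names `wd` and `parameters` follow the snippets quoted above):

```python
import torch

wd = 0.1
parameters = torch.randn(5, requires_grad=True)

# Version 1: add the penalty to the loss and let autograd differentiate it.
loss = wd * (parameters ** 2).sum()
loss.backward()
grad_via_loss = parameters.grad.clone()

# Version 2: skip the big sum and add the derivative of the penalty directly.
grad_direct = wd * 2 * parameters.detach()

print(torch.allclose(grad_via_loss, grad_direct))  # True
```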
@@ -92,7 +92,7 @@
 "source": [
 "At the end of 2015, the [Rossmann sales competition](https://www.kaggle.com/c/rossmann-store-sales) ran on Kaggle. Competitors were given a wide range of information about various stores in Germany, and were tasked with trying to predict sales on a number of days. The goal was to help the company to manage stock properly and be able to satisfy demand without holding unnecessary inventory. The official training set provided a lot of information about the stores. It was also permitted for competitors to use additional data, as long as that data was made public and available to all participants.\n",
 "\n",
-"One of the gold medalists used deep learning, in one of the earliest known examples of a state-of-the-art deep learning tabular model. Their method involved far less feature engineering, based on domain knowledge, than those of the other gold medalists. The paper, [\"Entity Embeddings of Categorical Variables\"](https://arxiv.org/abs/1604.06737) describes their approach. In an online-only chapter on the [book's website](https://book.fast.ai/) we show how to replicate it from scratch and attain the same accuracy shown in the paper. In the abstract of the paper the authors (Cheng Guo and Felix Bekhahn) say:"
+"One of the gold medalists used deep learning, in one of the earliest known examples of a state-of-the-art deep learning tabular model. Their method involved far less feature engineering, based on domain knowledge, than those of the other gold medalists. The paper, [\"Entity Embeddings of Categorical Variables\"](https://arxiv.org/abs/1604.06737) describes their approach. In an online-only chapter on the [book's website](https://book.fast.ai/) we show how to replicate it from scratch and attain the same accuracy shown in the paper. In the abstract of the paper the authors (Cheng Guo and Felix Berkhahn) say:"
 ]
 },
 {
@@ -174,7 +174,7 @@
 "\n",
 "In addition, it is valuable in its own right that embeddings are continuous, because models are better at understanding continuous variables. This is unsurprising considering models are built of many continuous parameter weights and continuous activation values, which are updated via gradient descent (a learning algorithm for finding the minimums of continuous functions).\n",
 "\n",
-"Another benefit is that we can combine our continuous embedding values with truly continuous input data in a straightforward manner: we just concatenate the variables, and feed the concatenation into our first dense layer. In other words, the raw categorical data is transformed by an embedding layer before it interacts with the raw continuous input data. This is how fastai and Guo and Berkham handle tabular models containing continuous and categorical variables.\n",
+"Another benefit is that we can combine our continuous embedding values with truly continuous input data in a straightforward manner: we just concatenate the variables, and feed the concatenation into our first dense layer. In other words, the raw categorical data is transformed by an embedding layer before it interacts with the raw continuous input data. This is how fastai and Guo and Berkhahn handle tabular models containing continuous and categorical variables.\n",
 "\n",
 "An example using this concatenation approach is how Google does it recommendations on Google Play, as explained in the paper [\"Wide & Deep Learning for Recommender Systems\"](https://arxiv.org/abs/1606.07792). <<google_recsys>> illustrates."
 ]
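Editor's note: a minimal sketch of the concatenation approach this hunk describes (the layer sizes and the single categorical column are illustrative assumptions, not from the paper):

```python
import torch
import torch.nn as nn

class TabularSketch(nn.Module):
    def __init__(self, n_users=100, emb_dim=8, n_cont=3):
        super().__init__()
        self.emb = nn.Embedding(n_users, emb_dim)  # categorical -> continuous
        self.lin = nn.Linear(emb_dim + n_cont, 1)  # first dense layer

    def forward(self, cat_idx, cont):
        # Transform the categorical data with the embedding layer, then
        # concatenate with the raw continuous inputs and feed the dense layer.
        x = torch.cat([self.emb(cat_idx), cont], dim=1)
        return self.lin(x)

model = TabularSketch()
out = model(torch.tensor([0, 1]), torch.randn(2, 3))
print(out.shape)  # torch.Size([2, 1])
```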
@@ -598,7 +598,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"> A: Here's a productive question to ponder. If you consider that the procedure for defining a decision tree essentially chooses one _sequence of splitting questions about variables_, you might ask yourself, how do we know this procedure chooses the _correct sequence_? The rule is to choose the splitting question that produces the best split (i.e., that most accurately separates the itmes into two distinct categories), and then to apply the same rule to the groups that split produces, and so on. This is known in computer science as a \"greedy\" approach. Can you imagine a scenario in which asking a “less powerful” splitting question would enable a better split down the road (or should I say down the trunk!) and lead to a better result overall?"
+"> A: Here's a productive question to ponder. If you consider that the procedure for defining a decision tree essentially chooses one _sequence of splitting questions about variables_, you might ask yourself, how do we know this procedure chooses the _correct sequence_? The rule is to choose the splitting question that produces the best split (i.e., that most accurately separates the items into two distinct categories), and then to apply the same rule to the groups that split produces, and so on. This is known in computer science as a \"greedy\" approach. Can you imagine a scenario in which asking a “less powerful” splitting question would enable a better split down the road (or should I say down the trunk!) and lead to a better result overall?"
 ]
 },
 {
@@ -612,7 +612,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The first piece of data preparation we need to do is to enrich our representation of dates. The fundamental basis of the decision tree that we just described is *bisection*— dividing a group into two. We look at the ordinal variables and divide up the dataset based on whether the variable's value is greater (or lower) than a threshhold, and we look at the categorical variables and divide up the dataset based on whether the variable's level is a particular level. So this algorithm has a way of dividing up the dataset based on both ordinal and categorical data.\n",
+"The first piece of data preparation we need to do is to enrich our representation of dates. The fundamental basis of the decision tree that we just described is *bisection*— dividing a group into two. We look at the ordinal variables and divide up the dataset based on whether the variable's value is greater (or lower) than a threshold, and we look at the categorical variables and divide up the dataset based on whether the variable's level is a particular level. So this algorithm has a way of dividing up the dataset based on both ordinal and categorical data.\n",
 "\n",
 "But how does this apply to a common data type, the date? You might want to treat a date as an ordinal value, because it is meaningful to say that one date is greater than another. However, dates are a bit different from most ordinal values in that some dates are qualitatively different from others in a way that that is often relevant to the systems we are modeling.\n",
 "\n",
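Editor's note: one way to enrich a date column along the lines this hunk describes, sketched with pandas (fastai offers a similar `add_datepart` helper, whose exact output columns may differ; the dates here are made up):

```python
import pandas as pd

df = pd.DataFrame({'saledate': pd.to_datetime(['2011-11-14', '2012-03-26'])})

# Expand the raw date into ordinal/categorical parts a tree can split on.
d = df['saledate'].dt
df['Year'], df['Month'], df['Day'] = d.year, d.month, d.day
df['Dayofweek'] = d.dayofweek
df['Is_month_end'] = d.is_month_end
print(df)
```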
@@ -903,7 +903,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In a perfect world, we could then give this one batch to our model. But that approach doesn't scale, because outside of this toy example it's unlikely that a signle batch containing all the texts would fit in our GPU memory (here we have 90 tokens, but all the IMDb reviews together give several million).\n",
+"In a perfect world, we could then give this one batch to our model. But that approach doesn't scale, because outside of this toy example it's unlikely that a single batch containing all the texts would fit in our GPU memory (here we have 90 tokens, but all the IMDb reviews together give several million).\n",
 "\n",
 "So, we need to divide this array more finely into subarrays of a fixed sequence length. It is important to maintain order within and across these subarrays, because we will use a model that maintains a state so that it remembers what it read previously when predicting what comes next. \n",
 "\n",
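Editor's note: the subdivision into fixed-length subarrays described in this hunk might be sketched as follows (the stand-in token stream and sequence length are illustrative):

```python
import torch

stream = torch.arange(90)  # stand-in for a long stream of token ids
seq_len = 15

# Cut the stream into contiguous fixed-length subarrays, preserving order.
chunks = stream[:len(stream)//seq_len * seq_len].view(-1, seq_len)
print(chunks.shape)  # torch.Size([6, 15])
```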
@@ -1532,7 +1532,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"This inaccuracy means that often the gradients calculated for updating the weights end up as zero or infinity for deep networks. This is commonly refered to as the *vanishing gradients* or *exploding gradients* problem. It means that in SGD, the weights are either not updated at all or jump to infinity. Either way, they won't improve with training.\n",
+"This inaccuracy means that often the gradients calculated for updating the weights end up as zero or infinity for deep networks. This is commonly referred to as the *vanishing gradients* or *exploding gradients* problem. It means that in SGD, the weights are either not updated at all or jump to infinity. Either way, they won't improve with training.\n",
 "\n",
 "Researchers have developed a number of ways to tackle this problem, which we will be discussing later in the book. One option is to change the definition of a layer in a way that makes it less likely to have exploding activations. We'll look at the details of how this is done in <<chapter_convolutions>>, when we discuss batch normalization, and <<chapter_resnet>>, when we discuss ResNets, although these details don't generally matter in practice (unless you are a researcher that is creating new approaches to solving this problem). Another strategy for dealing with this is by being careful about initialization, which is a topic we'll investigate in <<chapter_foundations>>.\n",
 "\n",
@@ -1343,7 +1343,7 @@
 "\n",
 "it will return $a1+a2+a3-a7-a8-a9$. If we are in a part of the image where $a1$, $a2$, and $a3$ add up to the same as $a7$, $a8$, and $a9$, then the terms will cancel each other out and we will get 0. However, if $a1$ is greater than $a7$, $a2$ is greater than $a8$, and $a3$ is greater than $a9$, we will get a bigger number as a result. So this filter detects horizontal edges—more precisely, edges where we go from bright parts of the image at the top to darker parts at the bottom.\n",
 "\n",
-"Changing our filter to have the row of `1`s at the top and the `-1`s at the bottom would detect horizonal edges that go from dark to light. Putting the `1`s and `-1`s in columns versus rows would give us filters that detect vertical edges. Each set of weights will produce a different kind of outcome.\n",
+"Changing our filter to have the row of `1`s at the top and the `-1`s at the bottom would detect horizontal edges that go from dark to light. Putting the `1`s and `-1`s in columns versus rows would give us filters that detect vertical edges. Each set of weights will produce a different kind of outcome.\n",
 "\n",
 "Let's create a function to do this for one location, and check it matches our result from before:"
 ]
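Editor's note: a single-location version of the function this hunk leads into might look like the sketch below (the toy image and the kernel are made up to match the $a1+a2+a3-a7-a8-a9$ description, not copied from the book):

```python
import torch

# Kernel matching the text: a row of 1s on top and -1s on the bottom,
# so the result at a location is a1+a2+a3 - a7-a8-a9.
kernel = torch.tensor([[ 1.,  1.,  1.],
                       [ 0.,  0.,  0.],
                       [-1., -1., -1.]])

# Toy image: bright (0.9) upper half, dark (0.1) lower half.
img = torch.cat([torch.full((3, 5), 0.9), torch.full((3, 5), 0.1)])

def apply_kernel(row, col, k):
    # Element-wise multiply the 3x3 patch centred at (row, col), then sum.
    return (img[row-1:row+2, col-1:col+2] * k).sum()

print(apply_kernel(2, 2, kernel))  # large value: a bright-to-dark edge here
print(apply_kernel(1, 2, kernel))  # 0: uniform region, terms cancel
```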
@@ -1826,7 +1826,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"To explain the math behing convolutions, fast.ai student Matt Kleinsmith came up with the very clever idea of showing [CNNs from different viewpoints](https://medium.com/impactai/cnns-from-different-viewpoints-fab7f52d159c). In fact, it's so clever, and so helpful, we're going to show it here too!\n",
+"To explain the math behind convolutions, fast.ai student Matt Kleinsmith came up with the very clever idea of showing [CNNs from different viewpoints](https://medium.com/impactai/cnns-from-different-viewpoints-fab7f52d159c). In fact, it's so clever, and so helpful, we're going to show it here too!\n",
 "\n",
 "Here's our 3×3 pixel image, with each pixel labeled with a letter:"
 ]
@@ -2093,7 +2093,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"> jargon: channels and features: These two terms are largely used interchangably, and refer to the size of the second axis of a weight matrix, which is, the number of activations per grid cell after a convolution. _Features_ is never used to refer to the input data, but _channels_ can refer to either the input data (generally channels are colors) or activations inside the network."
+"> jargon: channels and features: These two terms are largely used interchangeably, and refer to the size of the second axis of a weight matrix, which is, the number of activations per grid cell after a convolution. _Features_ is never used to refer to the input data, but _channels_ can refer to either the input data (generally channels are colors) or activations inside the network."
 ]
 },
 {
@@ -2416,7 +2416,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The *receptive field* is the area of an image that is involved in the calculation of a layer. On the [book's website](https://book.fast.ai/), you'll find an Excel spreadsheet called *conv-example.xlsx* that shows the calculation of two stride-2 convolutional layers using an MNIST digit. Each layer has a single kernel. <<preced1>> shows what we see if we click on one of the cells in the *conv2* section, which shows the output of the second convolutional layer, and click *trace precendents*."
+"The *receptive field* is the area of an image that is involved in the calculation of a layer. On the [book's website](https://book.fast.ai/), you'll find an Excel spreadsheet called *conv-example.xlsx* that shows the calculation of two stride-2 convolutional layers using an MNIST digit. Each layer has a single kernel. <<preced1>> shows what we see if we click on one of the cells in the *conv2* section, which shows the output of the second convolutional layer, and click *trace precedents*."
 ]
 },
 {
@@ -2446,7 +2446,7 @@
 "source": [
 "In this example, we have just two convolutional layers, each of stride 2, so this is now tracing right back to the input image. We can see that a 7×7 area of cells in the input layer is used to calculate the single green cell in the Conv2 layer. This 7×7 area is the *receptive field* in the input of the green activation in Conv2. We can also see that a second filter kernel is needed now, since we have two layers.\n",
 "\n",
-"As you see from this example, the deeper we are in the network (specifically, the more stride-2 convs we have before a layer), the larger the receptive field for an activation in that layer. A large receptive field means that a large amount of the input image is used to calculate each activation in that layer is. We now know that in the deeper layers of the network we have semantically rich features, corresponding to larger receptive fields. Therefore, we'd expect that we'd need more weights for each of our features to handle this increasing complexity. This is another way of saying the same thing we mentionedin the previous section: when we introduce a stride-2 conv in our network, we should also increase the number of channels."
+"As you see from this example, the deeper we are in the network (specifically, the more stride-2 convs we have before a layer), the larger the receptive field for an activation in that layer. A large receptive field means that a large amount of the input image is used to calculate each activation in that layer is. We now know that in the deeper layers of the network we have semantically rich features, corresponding to larger receptive fields. Therefore, we'd expect that we'd need more weights for each of our features to handle this increasing complexity. This is another way of saying the same thing we mentioned in the previous section: when we introduce a stride-2 conv in our network, we should also increase the number of channels."
 ]
 },
 {
@@ -2469,7 +2469,7 @@
 "source": [
 "We are not, to say the least, big users of social networks in general. But our goal in writing this book is to help you become the best deep learning practitioner you can, and we would be remiss not to mention how important Twitter has been in our own deep learning journeys.\n",
 "\n",
-"You see, there's another part of Twitter, far away from Donald Trump and the Kardashians, which is the part of Twitter where deep learning researchers and practitioners talk shop every day. As we were writing this section, Jeremy wanted to double-checkthat what we were saying about stride-2 convolutions was accurate, so he asked on Twitter:"
+"You see, there's another part of Twitter, far away from Donald Trump and the Kardashians, which is the part of Twitter where deep learning researchers and practitioners talk shop every day. As we were writing this section, Jeremy wanted to double-check that what we were saying about stride-2 convolutions was accurate, so he asked on Twitter:"
 ]
 },
 {
@@ -502,7 +502,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Broadcasting with a scalar is the easiest type of broadcating. When we have a tensor `a` and a scalar, we just imagine a tensor of the same shape as `a` filled with that scalar and perform the operation:"
+"Broadcasting with a scalar is the easiest type of broadcasting. When we have a tensor `a` and a scalar, we just imagine a tensor of the same shape as `a` filled with that scalar and perform the operation:"
 ]
 },
 {
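Editor's note: the scalar broadcasting this hunk describes, in a two-line sketch (the tensor values are made up):

```python
import torch

a = torch.tensor([10., 6., -4.])

# The scalar 1 behaves as if expanded to tensor([1., 1., 1.]).
print(a + 1)  # tensor([11.,  7., -3.])
print(a > 0)  # comparisons broadcast too: tensor([ True,  True, False])
```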
@@ -1984,7 +1984,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We have sucessfuly defined our model—now let's make it a bit more like a PyTorch module."
+"We have successfully defined our model—now let's make it a bit more like a PyTorch module."
 ]
 },
 {
@@ -2252,7 +2252,7 @@
 "\n",
 "To implement an `nn.Module` you just need to:\n",
 "\n",
-"- Make sure the superclass `__init__` is called first when you initiliaze it.\n",
+"- Make sure the superclass `__init__` is called first when you initialize it.\n",
 "- Define any parameters of the model as attributes with `nn.Parameter`.\n",
 "- Define a `forward` function that returns the output of your model.\n",
 "\n",
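Editor's note: following the three steps listed in the hunk above, a minimal module might look like this sketch (the single linear layer is illustrative, not the book's exact example):

```python
import torch
import torch.nn as nn

class MyLinear(nn.Module):
    def __init__(self, n_in, n_out):
        super().__init__()                                # step 1: superclass __init__ first
        self.w = nn.Parameter(torch.randn(n_in, n_out))   # step 2: parameters as nn.Parameter
        self.b = nn.Parameter(torch.zeros(n_out))

    def forward(self, x):                                 # step 3: define forward
        return x @ self.w + self.b

m = MyLinear(4, 2)
print(m(torch.randn(3, 4)).shape)  # torch.Size([3, 2])
```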
@@ -58,7 +58,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The class activation map (CAM) was introduced by Bolei Zhou et al. in [\"Learning Deep Features for Discriminative Localization\"](https://arxiv.org/abs/1512.04150). It uses the output of the last convolutional layer (just before the average pooling layer) together with the predictions to give us a heatmap visualization of why the model made its decision. This is a useful tool for intepretation.\n",
+"The class activation map (CAM) was introduced by Bolei Zhou et al. in [\"Learning Deep Features for Discriminative Localization\"](https://arxiv.org/abs/1512.04150). It uses the output of the last convolutional layer (just before the average pooling layer) together with the predictions to give us a heatmap visualization of why the model made its decision. This is a useful tool for interpretation.\n",
 "\n",
 "More precisely, at each position of our final convolutional layer, we have as many filters as in the last linear layer. We can therefore compute the dot product of those activations with the final weights to get, for each location on our feature map, the score of the feature that was used to make a decision.\n",
 "\n",
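Editor's note: the per-location dot product this hunk describes can be written in one `einsum`; a sketch with made-up shapes (`acts` stands for one image's final conv activations, `w` for the final linear layer's weights — both hypothetical here):

```python
import torch

acts = torch.randn(512, 7, 7)  # hypothetical final conv activations (channels, h, w)
w = torch.randn(2, 512)        # hypothetical final linear weights (classes, channels)

# For every class and every spatial location, dot the activations with that
# class's weights -- giving one heatmap per class.
cam = torch.einsum('ck,kij->cij', w, acts)
print(cam.shape)  # torch.Size([2, 7, 7])
```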
@@ -440,7 +440,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"This method is useful, but only works for the last layer. *Gradient CAM* is a variant that addreses this problem."
+"This method is useful, but only works for the last layer. *Gradient CAM* is a variant that addresses this problem."
 ]
 },
 {
@@ -35,7 +35,7 @@
 "source": [
 "This final chapter (other than the conclusion and the online chapters) is going to look a bit different. It contains far more code and far less prose than the previous chapters. We will introduce new Python keywords and libraries without discussing them. This chapter is meant to be the start of a significant research project for you. You see, we are going to implement many of the key pieces of the fastai and PyTorch APIs from scratch, building on nothing other than the components that we developed in <<chapter_foundations>>! The key goal here is to end up with your own `Learner` class, and some callbacks—enough to be able to train a model on Imagenette, including examples of each of the key techniques we've studied. On the way to building `Learner`, we will create our own version of `Module`, `Parameter`, and parallel `DataLoader` so you have a very good idea of what those PyTorch classes do.\n",
 "\n",
-"The end-of-chapter questionnaire is particularly important for this chapter. This is where we will be pointing you in the many interesting directions that you could take, using this chapter as your starting point. We suggest that you follow t along with this chapter on your computer, and do lots of experiments, web searches, and whatever else you need to understand what's going on. You've built up the skills and expertise to do this in the rest of this book, so we think you are going to do great!"
+"The end-of-chapter questionnaire is particularly important for this chapter. This is where we will be pointing you in the many interesting directions that you could take, using this chapter as your starting point. We suggest that you follow along with this chapter on your computer, and do lots of experiments, web searches, and whatever else you need to understand what's going on. You've built up the skills and expertise to do this in the rest of this book, so we think you are going to do great!"
 ]
 },
 {
@@ -1827,7 +1827,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We have explored the key concepts of the fastai library are implemented by re-implementing them in this chapter. Since it's mostly full of code, you should definitely try to experiment with it by looking at the corresponding notebook on the book's website. Now that you know how it's built, as a next step be sure to check out the intermediate and advanced tutorials in the fastai documentation to learn how to customize every bit of the libraryt."
+"We have explored the key concepts of the fastai library are implemented by re-implementing them in this chapter. Since it's mostly full of code, you should definitely try to experiment with it by looking at the corresponding notebook on the book's website. Now that you know how it's built, as a next step be sure to check out the intermediate and advanced tutorials in the fastai documentation to learn how to customize every bit of the library."
 ]
 },
 {
@@ -62,7 +62,7 @@
 "\n",
 "Also, you may want to take a look at the fast.ai free online course that covers the same material as this book. Sometimes, seeing the same material in two different ways can really help to crystallize the ideas. In fact, human learning researchers have found that one of the best ways to learn material is to see the same thing from different angles, described in different ways.\n",
 "\n",
-"Your final mission, should you choose to accept it, is to take this book and give it to somebody that you know—and get somebody else starte on their own deep learning journey!"
+"Your final mission, should you choose to accept it, is to take this book and give it to somebody that you know—and get somebody else started on their own deep learning journey!"
 ]
 },
 {
README.md
@ -1,17 +1,17 @@
|
||||
[](https://mybinder.org/v2/gh/fastai/fastbook/master)
|
||||
|
||||
# The fastai book - draft
|
||||
# The fastai book
|
||||
|
||||
These draft notebooks cover an introduction to deep learning, [fastai](https://docs.fast.ai/), and [PyTorch](https://pytorch.org/). fastai is a layered API for deep learning; for more information, see [the fastai paper](https://www.mdpi.com/2078-2489/11/2/108). Everything in this repo is copyright Jeremy Howard and Sylvain Gugger, 2020 onwards.
|
||||
These notebooks cover an introduction to deep learning, [fastai](https://docs.fast.ai/), and [PyTorch](https://pytorch.org/). fastai is a layered API for deep learning; for more information, see [the fastai paper](https://www.mdpi.com/2078-2489/11/2/108). Everything in this repo is copyright Jeremy Howard and Sylvain Gugger, 2020 onwards.
|
||||
|
||||
These notebooks will be used for [a course we're teaching](https://www.usfca.edu/data-institute/certificates/deep-learning-part-one) in San Francisco from March 2020, and will be available as a MOOC from around July 2020. In addition, our plan is that these notebooks will form the basis of [this book](https://www.amazon.com/Deep-Learning-Coders-fastai-PyTorch/dp/1492045527), which is currently available for purchase. It will not have the same GPL restrictions that are on this draft.
|
||||
These notebooks are used for [a MOOC](https://course.fast.ai) and form the basis of [this book](https://www.amazon.com/Deep-Learning-Coders-fastai-PyTorch/dp/1492045527), which is currently available for purchase. It does not have the same GPL restrictions that are on this draft.
|
||||
|
||||
The code in the notebooks and python `.py` files is covered by the GPL v3 license; see the LICENSE file for details.
|
||||
|
||||
The remainder (including all markdown cells in the notebooks and other prose) is not licensed for any redistribution or change of format or medium, other than making copies of the notebooks or forking this repo for your own private use. No commercial or broadcast use is allowed. We are making these materials freely available to help you learn deep learning, so please respect our copyright and these restrictions.
|
||||
|
||||
If you see someone hosting a copy of these materials somewhere else, please let them know that their actions are not allowed, and may lead to legal action. Moreover, they would be hurting the community, because we're not likely to release additional materials in this way if people ignore our copyright.
|
||||
If you see someone hosting a copy of these materials somewhere else, please let them know that their actions are not allowed and may lead to legal action. Moreover, they would be hurting the community because we're not likely to release additional materials in this way if people ignore our copyright.
|
||||
|
||||
This is an early draft. If you get stuck running notebooks, please search the [fastai-v2 forum](https://forums.fast.ai/c/fastai-users/fastai-v2) for answers, and ask for help there if needed. Please don't use GitHub issues for problems running the notebooks.
This is an early draft. If you get stuck running notebooks, please search the [fastai-dev forum](https://forums.fast.ai/c/fastai-users/fastai-dev/) for answers, and ask for help there if needed. Please don't use GitHub issues for problems running the notebooks.
If you make any pull requests to this repo, then you are assigning copyright of that work to Jeremy Howard and Sylvain Gugger. (Additionally, if you are making small edits to spelling or text, please specify the name of the file and very brief description of what you're fixing. It's becoming increasingly difficult for reviewers to know which corrections have already been made. Thank you.)
If you make any pull requests to this repo, then you are assigning copyright of that work to Jeremy Howard and Sylvain Gugger. (Additionally, if you are making small edits to spelling or text, please specify the name of the file and a very brief description of what you're fixing. It's becoming increasingly difficult for reviewers to know which corrections have already been made. Thank you.)
@ -61,7 +61,7 @@
"source": [
"A great solution is to host your blog on a platform called [GitHub Pages](https://pages.github.com/), which is free, has no ads or pay wall, and makes your data available in a standard way such that you can at any time move your blog to another host. But all the approaches I’ve seen to using GitHub Pages have required knowledge of the command line and arcane tools that only software developers are likely to be familiar with. For instance, GitHub's [own documentation](https://help.github.com/en/github/working-with-github-pages/creating-a-github-pages-site-with-jekyll) on setting up a blog includes a long list of instructions that involve installing the Ruby programming language, using the `git` command-line tool, copying over version numbers, and more—17 steps in total!\n",
"\n",
"To cut down the hassle, weve created an easy approach that allows you to use an *entirely browser-based interface* for all your blogging needs. You will be up and running with your new blog within about five minutes. It doesn’t cost anything, and you can easily add your own custom domain to it if you wish to. In this section, we'll explain how to do it, using a template we've created called *fast\\_template*. (NB: be sure to check the [book's website](https://book.fast.ai) for the latest blog recommendations, since new tools are always coming out)."
"To cut down the hassle, we’ve created an easy approach that allows you to use an *entirely browser-based interface* for all your blogging needs. You will be up and running with your new blog within about five minutes. It doesn’t cost anything, and you can easily add your own custom domain to it if you wish to. In this section, we'll explain how to do it, using a template we've created called *fast\\_template*. (NB: be sure to check the [book's website](https://book.fast.ai) for the latest blog recommendations, since new tools are always coming out)."
]
},
{
@ -76,10 +76,10 @@
"metadata": {},
"source": [
"You’ll need an account on GitHub, so head over there now and create an account if you don’t have one already. Make sure that you are logged in. Normally, GitHub is used by software developers for writing code, and they use a sophisticated command-line tool to work with it—but we're going to show you an approach that doesn't use the command line at all!\n",
"\n",_re
"To get started, point your browser to [https://github.com/fastai/fast_template/generate](https://github.com/fastai/fast_template/generate) (you need to be logged in to GitHub for the link to work). This will allow you to create a place to store your blog, called a *repository*. You will a screen like the one in <<github_repo>>. Note that you have to enter your repository name using the *exact* format shown here—that is, your GitHub username followed by `.github.io`.\n",
"\n",
"To get started, point your browser to [https://github.com/fastai/fast_template/generate](https://github.com/fastai/fast_template/generate) (you need to be logged in to GitHub for the link to work). This will allow you to create a place to store your blog, called a *repository*. You will a screen like the one in <<githup_repo>>. Note that you have to enter your repository name using the *exact* format shown here—that is, your GitHub username followed by `.github.io`.\n",
"\n",
"<img width=\"440\" src=\"images/fast_template/image1.png\" id=\"githup_repo\" caption=\"Creating your repository\" alt=\"Screebshot of the GitHub page for creating a new repository\">\n",
"<img width=\"440\" src=\"images/fast_template/image1.png\" id=\"github_repo\" caption=\"Creating your repository\" alt=\"Screenshot of the GitHub page for creating a new repository\">\n",
"\n",
"Once you’ve entered that, and any description you like, click \"Create repository from template.\" You have the choice to make the repository \"private,\" but since you are creating a blog that you want other people to read, having the underlying files publicly available hopefully won't be a problem for you.\n",
"\n",
@ -141,7 +141,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now you’re ready to create your first post. All your posts will go in the *\\_posts* folder. Click on that now, and then click the \"Create file\" button. You need to be careful to name your file using the format *<year>-<month>-<day>-<name>.md*, as shwon in <<fastpage_name>>, where *<year>* is a four-digit number, and *<month>* and *<day>* are two-digit numbers. *<name>* can be anything you want that will help you remember what this post was about. The *.md* extension is for markdown documents.\n",
"Now you’re ready to create your first post. All your posts will go in the *\\_posts* folder. Click on that now, and then click the \"Create file\" button. You need to be careful to name your file using the format *<year>-<month>-<day>-<name>.md*, as shown in <<fastpage_name>>, where *<year>* is a four-digit number, and *<month>* and *<day>* are two-digit numbers. *<name>* can be anything you want that will help you remember what this post was about. The *.md* extension is for markdown documents.\n",
"\n",
"<img width=\"440\" src=\"images/fast_template/image8.png\" alt=\"Screenshot showing the right syntax to create a new blog post\" id=\"fastpage_name\" caption=\"Naming your posts\">\n",
"\n",
@ -156,7 +156,7 @@
"source": [
"As before, you can click the \"Preview\" button to see how your markdown formatting will look (<<fastpage_preview1>>).\n",
"\n",
"<img width=\"400\" src=\"images/fast_template/image10.png\" alt=\"Screenshot showing the same blog post interpreted in HTML\" id=\"fastpage_preview1\" caption=\"What the previous mardown syntax will look like on your blog\">\n",
"<img width=\"400\" src=\"images/fast_template/image10.png\" alt=\"Screenshot showing the same blog post interpreted in HTML\" id=\"fastpage_preview1\" caption=\"What the previous markdown syntax will look like on your blog\">\n",
"\n",
"And you will need to click the \"Commit new file\" button to save it to GitHub, as shown in <<fastpage_commit1>>.\n",
"\n",
@ -46,7 +46,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Bias: Professor Lantanya Sweeney \"Arrested\""
"### Bias: Professor Latanya Sweeney \"Arrested\""
]
},
{