{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"[[chapter_ethics]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data Ethics"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Acknowledgement: Dr Rachel Thomas"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This chapter was co-authored by Dr Rachel Thomas, the co-founder of fast.ai, and founding director of the Center for Applied Data Ethics at the University of San Francisco. It largely follows a subset of her syllabus for the \"Introduction to Data Ethics\" course that she developed."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction to data ethics"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As we discussed in chapters 1 and 2, sometimes, machine learning models can go wrong. They can have bugs. They can be presented with data that they haven't seen before, and behave in ways we don't expect. Or, they could work exactly as designed, but be used for something that you would much prefer they were never ever used for.\n",
"\n",
"Because deep learning is such a powerful tool and can be used for so many things, it becomes particularly important that we consider the consequences of our choices. The philosophical study of *ethics* is the study of right and wrong, including how we can define those terms, recognise right and wrong actions, and understand the connection between actions and consequences. The field of *data ethics* has been around for a long time, and there are many academics focused on this field. It is being used to help define policy in many jurisdictions; it is being used in companies big and small to consider how best to ensure good societal outcomes from product development; and it is being used by researchers who want to make sure that the work they are doing is used for good, and not for bad.\n",
"\n",
"As a deep learning practitioner, therefore, it is likely that at some point you are going to be put in a situation where you need to consider data ethics. So what is data ethics? It's a subfield of ethics, so let's start there."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> j: At university, philosophy of ethics was my main thing (it would have been the topic of my thesis, if I'd finished it, instead of dropping out to join the real-world). Based on the years I spent studying ethics, I can tell you this: no one really agrees on what right and wrong are, whether they exist, how to spot them, which people are good, and which bad, or pretty much anything else. So don't expect too much from the theory! We're going to focus on examples and thoughts starters here, not theory."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In answering the question [What is Ethics](https://www.scu.edu/ethics/ethics-resources/ethical-decision-making/what-is-ethics/), The Markkula Center for Applied Ethics says that *ethics* refers to:\n",
"\n",
"- Well-founded standards of right and wrong that prescribe what humans ought to do, and\n",
"- The study and development of one's ethical standards.\n",
"\n",
"There is no list of right answers for ethics. There is no list of dos and don'ts. Ethics is complicated, and context-dependent. It involves the perspectives of many stakeholders. Ethics is a muscle that you have to develop and practice. In this chapter, our goal is to provide some signposts to help you on that journey.\n",
"\n",
"Spotting ethical issues is best to do as part of a collaborative team. This is the only way you can really incorporate different perspectives. Different people's backgrounds will help them to see things which may not be obvious to you. Working with a team is helpful for many \"muscle building\" activities, including this one.\n",
"\n",
"This chapter is certainly not the only part of the book where we talk about data ethics, but it's good to have a place where we focus on it for a while. To get oriented, it's perhaps easiest to look at a few examples. So we picked out three that we think illustrate effectively some of the key topics."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Getting started with some examples"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are going to start with three specific examples that illustrate three common ethical issues in tech:\n",
"\n",
"1. **Recourse processes**: Arkansas's buggy healthcare algorithms left patients stranded\n",
"2. **Feedback loops**: YouTube's recommendation system helped unleash a conspiracy theory boom\n",
"3. **Bias**: When a traditionally African-American name is searched for on Google, it displays ads for criminal background checks.\n",
"\n",
"In fact, for every concept that we introduce in this chapter, we are going to provide at least one specific example. For each one, have a think about what you could have done in this situation, and think about what kinds of obstructions there might have been to you getting that done. How would you deal with them? What would you look out for?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Bugs and recourse: Buggy algorithm used for healthcare benefits"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Verge investigated software used in over half of U.S. states to determine how much healthcare people receive, and documented their findings in an article [What Happens When an Algorithm Cuts Your Healthcare](https://www.theverge.com/2018/3/21/17144260/healthcare-medicaid-algorithm-arkansas-cerebral-palsy). After implementation of the algorithm in Arkansas, people (many with severe disabilities) drastically had their healthcare cut. For instance, Tammy Dobbs, a woman with cerebral palsy who needs an aid to help her to get out of bed, to go to the bathroom, to get food, and more, had her hours of help suddenly reduced by 20 hours a week. She couldnt get any explanation for why her healthcare was cut. Eventually, a court case revealed that there were mistakes in the software implementation of the algorithm, negatively impacting people with diabetes or cerebral palsy. However, Dobbs and many other people reliant on these health care benefits live in fear that their benefits could again be cut suddenly and inexplicably."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feedback loops: YouTube's recommendation system"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Feedback loops can occur when your model is controlling the next round of data you get. The data that is returned quickly becomes flawed by the software itself.\n",
"\n",
"For instance, in <<chapter_production>> we briefly mentioned the reinforcement learning algorithm which Google introduced for YouTube's recommendation system. YouTube has 1.9bn users, who watch over 1 billion hours of YouTube videos a day. Their algorithm, which was designed to optimise watch time, is responsible for around 70% of the content that is watched. It led to out-of-control feedback loops, leading the New York Times to run the headline \"YouTube Unleashed a Conspiracy Theory Boom. Can It Be Contained?\". Ostensibly recommendation systems are predicting what content people will like, but they also have a lot of power in determining what content people even see."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Bias: Professor Lantanya Sweeney \"arrested\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dr. Latanya Sweeney is a professor at Harvard and director of their data privacy lab. In the paper [Discrimination in Online Ad Delivery](https://arxiv.org/abs/1301.6822) she describes her discovery that googling her name resulted in advertisements saying \"Latanya Sweeney arrested\" even although she is the only Latanya Sweeney and has never been arrested. However when she googled other names, such as Kirsten Lindquist, she got more neutral ads, even though Kirsten Lindquist has been arrested three times."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image1.png\" id=\"lantanya_arrested\" caption=\"Google search showing Professor Lantanya Sweeney 'arrested'\" alt=\"Screenshot of google search showing Professor Lantanya Sweeney 'arrested'\" width=\"400\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Being a computer scientist, she studied this systematically, and looked at over 2000 names. She found that this pattern held where historically black names received advertisements suggesting that the person had a criminal record. Whereas, white names had more neutral advertisements.\n",
"\n",
"This is an example of bias. It can make a big difference to people's lives — for instance, if a job applicant is googled that it may appear that they have a criminal record when they do not."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## So what?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One very natural reaction to considering these issues is: \"So what? What's that got to do with me? I'm a data scientist, not a politician. I'm not the senior executive at my company who make the decisions about what we do. I'm just trying to build the most predictive model I can.\"\n",
"\n",
"These are very reasonable questions. But we're going to try to convince you that the answer is: everybody who is training models absolutely needs to consider how their model will be used. And to consider how to best ensure that it is used as positively as possible. There are things you can do. And if you don't do these things, then things can go pretty bad.\n",
"\n",
"One particularly hideous example of what happens when technologists focus on technology at all costs is the story of IBM and Nazi Germany. A Swiss judge ruled \"It does not thus seem unreasonable to deduce that IBM's technical assistance facilitated the tasks of the Nazis in the commission of their crimes against humanity, acts also involving accountancy and classification by IBM machines and utilized in the concentration camps themselves.\"\n",
"\n",
"IBM, you see, supplied the Nazis with data tabulation products necessary to track the extermination of Jews and other groups on a massive scale. This was driven from the top of the company, with marketing to Hitler and his leadership team. Company President Thomas Watson personally approved the 1939 release of special IBM alphabetizing machines to help organize the deportation of Polish Jews. Pictured here is Adolf Hitler (far left) meeting with IBM CEO Tom Watson Sr. (2nd from left), shortly before Hitler awarded Watson a special “Service to the Reich” medal in 1937:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image2.png\" id=\"meeting\" caption=\"IBM CEO Tom Watson Sr. metting with Adolf Hitler\" alt=\"A picture of IBM CEO Tom Watson Sr. metting with Adolf Hitler\" width=\"400\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But it also happened throughout the organization. IBM and its subsidiaries provided regular training and maintenance on-site at the concentration camps: printing off cards, configuring machines, and repairing them as they broke frequently. IBM set up categorizations on their punch card system for the way that each person was killed, which group they were assigned to, and the logistical information necessary to track them through the vast Holocaust system. IBM's code for Jews in the concentration camps was 8, where around 6,000,000 were killed. Its code for Romanis was 12 (they were labeled by the Nazis as \"asocials\", with over 300,000 killed in the *Zigeunerlager*, or “Gypsy camp”). General executions were coded as 4, death in the gas chambers as 6."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image3.jpeg\" id=\"punch_card\" caption=\"A punch card used by IBM in concentration camps\" alt=\"Picture of a punch card used by IBM in concentration camps\" width=\"600\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Of course, the project managers and engineers and technicians involved were just living their ordinary lives. Caring for their families, going to the church on Sunday, doing their jobs as best as they could. Following orders. The marketers were just doing what they could to meet their business development goals. Edwin Black, author of \"IBM and the Holocaust\", said: \"To the blind technocrat, the means were more important than the ends. The destruction of the Jewish people became even less important because the invigorating nature of IBM's technical achievement was only heightened by the fantastical profits to be made at a time when bread lines stretched across the world.\"\n",
"\n",
"Step back for a moment and consider: how would you feel if you discovered that you had been part of a system that ending up hurting society? Would you even know? Would you be open to finding out? How can you help make sure this doesn't happen? We have described the most extreme situation here in Nazi Germany, but there are many negative societal consequences happening due to AI and machine learning right now, some of which we'll describe in this chapter.\n",
"\n",
"It's not just a moral burden either. Sometimes, technologists very directly pay for their actions. For instance, the first person who was jailed as a result of the Volkswagen scandal, where the car company cheated on their diesel emissions tests, was not the manager that oversaw the project, or an executive at the helm of the company. It was one of the engineers, James Liang, who just did what he was told.\n",
"\n",
"On the other hand, if a project you are involved in turns out to make a huge positive impact on even one person, this is going to make you feel pretty great!\n",
"\n",
"Okay, so hopefully we have convinced you that you ought to care. Now the question is: can you actually do anything can you make an impact beyond just maximising the predictive power of your models? Consider the pipeline are steps that occurs between the development of a model or an algorithm by a researcher or practitioner, and the point at which this work is actually used to make some decision. Normally there is a very long chain from one end to the other. This is especially true if you are a researcher where you don't even know if your research will ever get used for anything. It's especially tricky if you're involved in data collection, which is even earlier in the pipeline.\n",
"\n",
"Data often ends up being used for different purposes than why it was originally collected. IBM began selling to Nazi Germany well before the Holocaust, including helping with Germanys 1933 census conducted by Adolf Hitler, which was effective at identifying far more Jewish people than had previously been recognized in Germany. US census data was used to round up Japanese-Americans (who were US citizens) for internment during World War II. It is important to recognize how data and images collected can be weaponized later. Columbia professor [Tim Wu wrote](https://www.nytimes.com/2019/04/10/opinion/sunday/privacy-capitalism.html) that “You must assume that any personal data that Facebook or Android keeps are data that governments around the world will try to get or that thieves will try to steal.”"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Integrating machine learning with product design"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Presumably the reason you're doing this work is because you hope it will be used for something. Otherwise, you're just wasting your time. So, let's start with the assumption that your work will end up somewhere. Now, as you are collecting your data and developing your model, you are making lots of decisions. What level of aggregation will you store your data at? What loss function should you use? What validation and training sets should you use? Should you focus on simplicity of implementation, speed of inference, or accuracy of the model? How will your model handle out of domain data items? Can it be fine-tuned, or must it be retrained from scratch over time?\n",
"\n",
"These are not just algorithm questions. They are data product design questions. But the product managers, executives, judges, journalists, doctors… whoever ends up developing and using the system of which your model is a part will not be well-placed to understand the decisions that you made, let alone change them.\n",
"\n",
"For instance, two studies found that Amazons facial recognition software produced [inaccurate](https://www.nytimes.com/2018/07/26/technology/amazon-aclu-facial-recognition-congress.html) and [racially biased results](https://www.theverge.com/2019/1/25/18197137/amazon-rekognition-facial-recognition-bias-race-gender). Amazon claimed that the researchers should have changed the default parameters. However, it turned out that [Amazon was not instructing police departments](https://gizmodo.com/defense-of-amazons-face-recognition-tool-undermined-by-1832238149) that use its software to do this either. There was, presumably, a big distance between the researchers that developed these algorithms, and the Amazon documentation staff that wrote the guidelines provided to the police. A lack of tight integration led to serious problems for society, the police, and Amazon themselves. It turned out that their system erroneously *matched* 28 members of congress to criminal mugshots! (And these members of congress wrongly matched to criminal mugshots disproportionately included people of color.)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image4.png\" id=\"congressmen\" caption=\"Congressmen matched to criminal mugshots by Amazon software\" alt=\"Picture of the congressmen matched to criminal mugshots by Amazon software, they are disproportionatedly people of color\" width=\"500\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Data scientists need to be part of a cross disciplinary team. And researchers need to work closely with the kinds of people who will end up using their research. Better still is if the domain experts themselves have learnt enough to be able to train and debug some models themselves — hopefully there's a few of you reading this book right now!\n",
"\n",
"The modern workplace is a very specialised place. Everybody tends to have very well-defined jobs to perform. Especially in large companies, it can be very hard to know what all the pieces of the puzzle are. Sometimes companies even intentionally obscure the overall project goals that are being worked on, if they know that their employees are not going to like the answers. This is sometimes done by compartmentalising every piece as much as possible\n",
"\n",
"In other words, we're not saying that any of this is easy. It's hard. It's really hard. We all have to do our best. And with often seen that the people who do get involved in the higher-level context of these projects, and attempt to develop cross disciplinary capabilities and teams, become some of the most important and well rewarded parts of their organisations. It's the kind of work that tends to be highly appreciated by senior executives, even if it is considered, sometimes, rather uncomfortable by middle management."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Topics in Data Ethics"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Data ethics is a big field, and we can't cover everything. Instead, we're going to pick a few topics which we think are particularly relevant:\n",
"- need for recourse and accountability\n",
"- feedback loops\n",
"- bias\n",
"- disinformation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Errors and recourse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In a complex system it is easy for no one person to feel responsible for outcomes. While this is understandable, it does not lead to good results. In the example above of the Arkansas healthcare system in which a bug led to people with cerebral palsy losing access to needed care, the creator of the algorithm blamed government officials, and government officials could blame those who implemented the software. NYU professor danah boyd described this phenomenon: \"bureaucracy has often been used to evade responsibility, and today's algorithmic systems are extending bureaucracy.\"\n",
"\n",
"An additional reason why recourse is so necessary, is because data often contains errors. Mechanisms for audits and error-correction are crucial. A database of suspected gang members maintained by California law enforcement officials was found to be full of errors, including 42 babies who had been added to the database when they were less than 1 year old (28 of whom were marked as “admitting to being gang members”). In this case, there was no process in place for correcting mistakes or removing people once theyve been added. Another example is the US credit report system; in a large-scale study of credit reports by the FTC in 2012, it was found that 26% of consumers had at least one mistake in their files, and 5% had errors that could be devastating. Yet, the process of getting such errors corrected is incredibly slow and opaque. When public-radio reporter Bobby Allyn discovered that he was erroneously listed as having a firearms conviction, it took him \"more than a dozen phone calls, the handiwork of a county court clerk and six weeks to solve the problem. And that was only after I contacted the companys communications department as a journalist.\" (as covered in the article [How the careless errors of credit reporting agencies are ruining peoples lives](https://www.washingtonpost.com/posteverything/wp/2016/09/08/how-the-careless-errors-of-credit-reporting-agencies-are-ruining-peoples-lives/))\n",
"\n",
"As machine learning practitioners, we do not always think of it as our responsibility to understand how our algorithms and up being implemented in practice. But we need to."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Feedback loops"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The New York Times published another article on YouTube's recommendation system, titled [On YouTubes Digital Playground, an Open Gate for Pedophiles](https://www.nytimes.com/2019/06/03/world/americas/youtube-pedophiles.html). The article started with this chilling story:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> : Christiane C. didnt think anything of it when her 10-year-old daughter and a friend uploaded a video of themselves playing in a backyard pool… A few days later… the video had thousands of views. Before long, it had ticked up to 400,000... “I saw the video again and I got scared by the number of views,” Christiane said. She had reason to be. YouTubes automated recommendation system… had begun showing the video to users who watched other videos of prepubescent, partially clothed children, a team of researchers has found.\n",
"\n",
"> : On its own, each video might be perfectly innocent, a home movie, say, made by a child. Any revealing frames are fleeting and appear accidental. But, grouped together, their shared features become unmistakable."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"YouTube's recommendation algorithm had begun curating playlists for pedophiles, picking out innocent home videos that happened to contain prepubescent, partially clothed children. \n",
"\n",
"No one at Google planned to create a system that turned family videos into porn for pedophiles. So what happened?\n",
"\n",
"Part of the problem here is the centrality of metrics in driving a financially important system. When an algorithm has a metric to optimise, as you have seen, it will do everything it can to optimise that number. This tends to lead to all kinds of edge cases, and humans interacting with a system will search for, find, and exploit these edge cases and feedback loops for their advantage.\n",
"\n",
"There are signs that this is exactly what has happened with YouTube's recommendation system. The Guardian ran an article [How an ex-YouTube insider investigated its secret algorithm](https://www.theguardian.com/technology/2018/feb/02/youtube-algorithm-election-clinton-trump-guillaume-chaslot) about Guillaume Chaslot, an ex-YouTube engineer who created AlgoTransparency, which tracks these issues. Chaslot published this chart, following the release of Robert Mueller's \"Report on the Investigation Into Russian Interference in the 2016 Presidential Election\":"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image18.jpeg\" id=\"ethics_yt_rt\" caption=\"Coverage of the Mueller report\" alt=\"Coverage of the Mueller report\" width=\"500\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Russia Today's coverage of the Mueller report was an extreme outlier in how many channels were recommending it. This suggests the possibility that Russia Today, a state-owned Russia media outlet, has been successful in gaming YouTube's recommendation algorithm. The lack of transparency of systems like this make it hard to uncover the kinds of problems that we're discussing.\n",
"\n",
"Another example of a feedbakc loop is a predictive policing algorithm that predicts more crime in certain neighborhoods, causing more police officers to be sent to those neighborhoods, which can result in more crime being recorded in those neighborhoods, and so on. University of Utah computer science processor Suresh Venkatasubramanian says about this: \"Predictive policing is aptly named: it is predicting future policing, not future crime.”\n",
"\n",
"There are positive examples of people and organizations attempting to combat these problems. Evan Estola, lead machine learning engineer at Meetup, [discussed the example](https://www.youtube.com/watch?v=MqoRzNhrTnQ) of men expressing more interest than women in tech meetups. Meetups algorithm could recommend fewer tech meetups to women, and as a result, fewer women would find out about and attend tech meetups, which could cause the algorithm to suggest even fewer tech meetups to women, and so on in a self-reinforcing feedback loop. Evan and his team made the ethical decision for their recommendation algorithm to not create such a feedback loop, but explicitly not using gender for that part of their model. It is encouraging to see a company not just unthinkingly optimize a metric, but to consider their impact. \"You need to decide which feature not to use in your algorithm… the most optimal algorithm is perhaps not the best one to launch into production\", he said.\n",
"\n",
"While Meetup chose to avoid such an outcome, Facebook provides an example of allowing a runaway feedback loop to run wild. Facebook radicalizes users interested in one conspiracy theory by introducing them to more. As [Renee DiResta, a researcher on proliferation of disinformation, writes](https://www.fastcompany.com/3059742/social-network-algorithms-are-distorting-reality-by-boosting-conspiracy-theories):"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> : \"once people join a single conspiracy-minded \\[Facebook\\] group, they are algorithmically routed to a plethora of others. Join an anti-vaccine group, and your suggestions will include anti-GMO, chemtrail watch, flat Earther (yes, really), and curing cancer naturally groups. Rather than pulling a user out of the rabbit hole, the recommendation engine pushes them further in.\""
]
},
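{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make this kind of runaway dynamic concrete, here is a minimal, hypothetical simulation of the predictive policing loop described above (our own sketch, not data from any real system): two precincts have identical true crime rates, but patrols are sent wherever recorded crime is highest, and crime can only be recorded where patrols go."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(42)\n",
"true_rate = [0.5, 0.5]  # both precincts have the same true crime rate\n",
"recorded = [1, 0]       # one early arrest, by chance, in precinct 0\n",
"\n",
"for day in range(1000):\n",
"    # The \"predictive\" model sends the patrol to the precinct with\n",
"    # the most recorded crime so far.\n",
"    target = 0 if recorded[0] >= recorded[1] else 1\n",
"    # Crime can only be recorded where the patrol actually is.\n",
"    if rng.random() < true_rate[target]:\n",
"        recorded[target] += 1\n",
"\n",
"# All new records pile up in precinct 0, despite identical true rates:\n",
"print(recorded)"
]
},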
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Bias"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Discussions of bias online tend to get pretty confusing pretty fast. The word bias mean so many different things. Statisticians often think that when data ethicists are talking about bias that they're talking about the statistical definition of the term bias. But they're not. And they're certainly not talking about the bias is that appear in the weights and bias is which are the parameters of your model!\n",
"\n",
"What they're talking about is the social science concept of bias. In [A Framework for Understanding Unintended Consequences of Machine Learning](https://arxiv.org/abs/1901.10002) MIT's Suresh and Guttag describe six types of bias in machine learning, summarized in this figure from their paper:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image5.png\" id=\"bias\" caption=\"Bias in machine learning can come from multiple sources\" alt=\"A diagram showing all sources where bias can appear in machine learning\" width=\"650\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll discuss four of these types of bias here (see the paper for details on the others)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Historical bias"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Historical bias* comes from the fact that people are biased, processes are biased, and society is biased. Suresh and Guttag say: \"Historical bias is a fundamental, structural issue with the first step of the data generation process and can exist even given perfect sampling and feature selection\".\n",
"\n",
"For instance, here's a few examples of historical *race bias* in the US, from the NY Times article [Racial Bias, Even When We Have Good Intentions](https://www.nytimes.com/2015/01/04/upshot/the-measuring-sticks-of-racial-bias-.html), by the University of Chicago's Sendhil Mullainathan:\n",
"\n",
" - When doctors were shown identical files, they were much less likely to recommend cardiac catheterization (a helpful procedure) to Black patients\n",
" - When bargaining for a used car, Black people were offered initial prices $700 higher and received far smaller concessions\n",
" - Responding to apartment-rental ads on Craigslist with a Black name elicited fewer responses than with a white name\n",
" - An all-white jury was 16 points more likely to convict a Black defendant than a white one, but when a jury had 1 Black member, it convicted both at same rate.\n",
"\n",
"The COMPAS algorithm, widely used for sentencing and bail decisions in the US, is an example of an important algorithm which, when tested by ProPublica, showed clear racial bias in practice:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image6.png\" id=\"bail_algorithm\" caption=\"Results of the COMPAS algorithm\" alt=\"Table showing the COMPAS algorithm is more likely to give bail to white people, even if they re-offend more\" width=\"700\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Any dataset involving humans can have this kind of bias, such as medical data, sales data housing data, political data, and so on. Because underlying bias is so pervasive, bias in datasets is very pervasive. Racial bias even turns up in computer vision, as shown in this example of auto-categorized photos shared on Twitter by a Google Photos user:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image7.png\" id=\"google_photos\" caption=\"One of these labels is very wrong...\" alt=\"Screenshot of the use of Google photos labeling a black user and her friend as gorillas\" width=\"450\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Yes, that is showing what you think it is: Google Photos classified a Black user's photo with their friend as \"gorrilas\"! This algorithmic mis-step got a lot of attention in the media. “Were appalled and genuinely sorry that this happened,” a company spokeswoman said. “There is still clearly a lot of work to do with automatic image labeling, and were looking at how we can prevent these types of mistakes from happening in the future.”\n",
"\n",
"Unfortunately, fixing problems in machine learning systems when the input data has problems is hard. Google's first attempt didn't inspire confidence, as covered by The Guardian:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image8.png\" id=\"gorilla-ban\" caption=\"Google first response to the problem\" alt=\"Pictures of a headlines form the Guardian, whoing Google removed gorialls and other moneys fomr the possible labels of its algorithm\" width=\"500\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These kinds of problem are certainly not limited to just Google. MIT researchers studied the most popular online computer vision APIs to see how accurate they were. But they didn't just calculate a single accuracy number—instead, they looked at the accuracy across four different groups:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image9.jpeg\" id=\"face_recognition\" caption=\"Error rate per gender and race for various facial recognition systems\" alt=\"Table showing how various facial recognition systems perform way worse on darker shades of skin and females\" width=\"600\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"IBM's system, for instance, had a 34.7% error rate for darker females, vs 0.3% for lighter males—over 100 times more errors! Some people incorrectly reacted to these experiments by claiming that the difference was simply because darker skin is harder for computers to recognise. However, what actually happened, is after the negative publicity that this result created, all of the companies in question dramatically improved their models for darker skin, such that one year later they were nearly as good as for lighter skin. So what this actually showed is that the developers failed to utilise datasets containing enough darker faces, or test their product with darker faces.\n",
"\n",
"One of the MIT researchers, Joy Buolamwini, warned, \"We have entered the age of automation overconfident yet underprepared. If we fail to make ethical and inclusive artificial intelligence, we risk losing gains made in civil rights and gender equity under the guise of machine neutrality\".\n",
"\n",
"Part of the issue appears to be a systematic imbalance in the make up of popular datasets used for training models. The abstract to the paper [No Classification without Representation: Assessing Geodiversity Issues in Open Data Sets for the Developing World](https://arxiv.org/abs/1711.08536) states, \"We analyze two large, publicly available image data sets to assess geo-diversity and find that these data sets appear to exhibit an observable amerocentric and eurocentric representation bias. Further, we analyze classifiers trained on these data sets to assess the impact of these training distributions and find strong differences in the relative performance on images from different locales\". Here is one of the charts from the paper, showing the geographic make up of what was, at the time (and still, as this book is being written), the two most important image datasets for training models:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image10.png\" id=\"image_provenance\" caption=\"Image provenance in popular training sets\" alt=\"Grpahs showing how the vast majority of images in porpular training dataset come from the US or Western Europe\" width=\"800\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The vast majority of the images are from the United States and other Western countries, leading to models trained on ImageNet performing worse on scenes from other countries and cultures. For instance, [research](https://arxiv.org/pdf/1906.02659.pdf) found that such models are worse at identifying household items (such as soap, spices, sofas, or beds) from lower-income countries. Below is an image from the paper, [Does Object Recognition Work for Everyone?](https://arxiv.org/pdf/1906.02659.pdf)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image17.png\" id=\"object_detect\" caption=\"Object detection in action\" alt=\"Figure showing an object detection algorithm performing better on western products\" width=\"500\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As we will discuss shortly, in addition, the vast majority of AI researchers and developers are young white men. Most projects that we have seen do most user testing using friends and families of the immediate product development group. Given this, the kinds of problems we saw above should not be surprising.\n",
"\n",
"Similar historical bias is found in the texts used as data for natural language processing models. This crops up in downstream machine learning tasks in many ways. For instance, until last year Google Translate showed systematic bias in how it translated the Turkish gender-neutral pronoun \"bir\" into English. For instance, when applied to jobs which are often associated with males, it used \"he\", and when applied to jobs which are often associated with females, it used \"she\":"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image11.png\" width=\"600\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We also see this kind of bias in online advertisements. For instance, a study in 2019 found that even when the person placing the ad does not intentionally discriminate, Facebook will show the ad to very different audiences used on race and gender. Housing ads with the same text, but changing the picture, between a white family and a black family, were shown to racially different audiences."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Measurement bias"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the paper [Does Machine Learning Automate Moral Hazard and Error](https://scholar.harvard.edu/files/sendhil/files/aer.p20171084.pdf) in *American Economic Review*, the authors look at a model that tries to answer the question: using historical EHR data, what factors are most predictive of stroke? This are the top predictors from the model:\n",
"\n",
" - Prior Stroke\n",
" - Cardiovascular disease\n",
" - Accidental injury\n",
" - Benign breast lump\n",
" - Colonoscopy\n",
" - Sinusitis\n",
"\n",
"However, only the top two have anything to do with a stroke! Based on what we've studied so far, you can probably guess why. We havent really measured *stroke*, which occurs when a region of the brain is denied oxygen due to an interruption in the blood supply. What weve measured is who: had symptoms, went to a doctor, got the appropriate tests, AND received a diagnosis of stroke. Actually having a stroke is not the only thing correlated with this complete list — it's also correlated with being the kind of person who actually goes to the doctor (which is influenced by who has access to healthcare, can afford their co-pay, doesn't experience racial or gender-based medical discrimination, and more)! If you are likely to go to the doctor for an *accidental injury*, then you are likely to also go the doctor when you are having a stroke.\n",
"\n",
"This is an example of *measurement bias*. It occurs when our models make mistakes because we are measuring the wrong thing, or measuring it in the wrong way, or incorporating that measurement into our model inappropriately."
]
},
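{
"cell_type": "markdown",
"metadata": {},
"source": [
"A tiny simulation can make this concrete. In this hypothetical sketch (invented probabilities, not the paper's data), the label we can actually observe is not *had a stroke* but *had a stroke and saw a doctor*, so a pure doctor-going signal such as an accidental injury visit looks predictive:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"n = 100_000\n",
"\n",
"stroke = rng.random(n) < 0.05       # what we actually want to measure\n",
"sees_doctor = rng.random(n) < 0.30  # healthcare access and utilization\n",
"# Injury visits only happen to people who go to the doctor at all:\n",
"injury_visit = sees_doctor & (rng.random(n) < 0.20)\n",
"\n",
"# What the EHR records is a *diagnosed* stroke:\n",
"diagnosed = stroke & sees_doctor\n",
"\n",
"# \"Accidental injury\" predicts the proxy label (~0.050 vs ~0.013 here),\n",
"# despite being independent of stroke itself:\n",
"print(diagnosed[injury_visit].mean(), diagnosed[~injury_visit].mean())"
]
},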
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Aggregation Bias"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Aggregation bias* occurs when models do not aggregate data in a way that incorporates all of the appropriate factors, or when a model does not include the necessary interaction terms, nonlinearities, or so forth. This can particularly occur in medical settings. For instance, the way diabetes is treated is often based on simple univariate statistics and studies involving small groups of heterogeneous people. Analysis of results is often done in a way that does not take account of different ethnicities or genders. However it turns out that diabetes patients have [different complications across ethnicities](https://www.ncbi.nlm.nih.gov/pubmed/24037313), and HbA1c levels (widely used to diagnose and monitor diabetes) [differ in complex ways across ethnicities and genders](https://www.ncbi.nlm.nih.gov/pubmed/22238408). This can result in people being misdiagnosed or incorrectly treated because medical decisions are based on a model which does not include these important variables and interactions."
]
},
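{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a hedged illustration of what a missing interaction term looks like in practice (synthetic data and made-up coefficients, not real medicine), compare a pooled linear model with one that includes an explicit group interaction:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.linear_model import LinearRegression\n",
"\n",
"rng = np.random.default_rng(1)\n",
"n = 10_000\n",
"group = rng.integers(0, 2, n)  # e.g. two patient populations\n",
"hba1c = rng.normal(6.0, 1.0, n)\n",
"# Synthetic ground truth: the HbA1c/outcome relationship differs by group.\n",
"risk = np.where(group == 0, 1.0 * hba1c, 0.4 * hba1c + 3.0)\n",
"risk += rng.normal(0, 0.1, n)\n",
"\n",
"# Pooled model: a single slope for everyone, wrong for both groups.\n",
"pooled = LinearRegression().fit(hba1c[:, None], risk)\n",
"\n",
"# Adding a group indicator and an interaction term recovers the\n",
"# group-specific slopes (about 1.0, and 1.0 - 0.6 = 0.4 for group 1).\n",
"X = np.column_stack([hba1c, group, hba1c * group])\n",
"interacted = LinearRegression().fit(X, risk)\n",
"print(pooled.coef_, interacted.coef_)"
]
},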
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Representation Bias"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The abstract of the paper [Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting](https://arxiv.org/abs/1901.09451) notes that there is gender imbalance in occupations (e.g. females are more likely to be nurses, and males are more likely to be pastors), and says that: \"differences in true positive rates between genders are correlated with existing gender imbalances in occupations, which may compound these imbalances\".\n",
"\n",
"What this is saying is that the researchers noticed that models predicting occupation did not only reflect the actual gender imbalance in the underlying population, but actually amplified it! This is quite common, particularly for simple models. When there is some clear, easy to see underlying relationship, a simple model will often simply assume that that relationship holds all the time. As the show with the paper, for occupations which had a higher percentage of females, the model tended to overestimate the prevalence of that occupation:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image12.png\" id=\"representation_bias\" caption=\"Model error in predicting occupation plotted against percentage of women in said occupation\" alt=\"Graph showing how model predictions overamplify existing bias\" width=\"500\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, in the training dataset, 14.6% of surgeons were women, yet in the model predictions, only 11.6% of the true positives were women."
]
},
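{
"cell_type": "markdown",
"metadata": {},
"source": [
"Checking for this kind of amplification in your own model takes only a few lines once you have predictions and a protected attribute. Here is a minimal sketch with hypothetical arrays (not the paper's data):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# One entry per person: true occupation, model prediction, and gender.\n",
"is_surgeon = np.array([1, 1, 1, 0, 1, 0, 1, 0])\n",
"pred_surgeon = np.array([1, 0, 1, 0, 1, 0, 1, 1])\n",
"is_female = np.array([1, 1, 0, 1, 0, 0, 0, 1])\n",
"\n",
"base_rate = is_female[is_surgeon == 1].mean()\n",
"true_pos = (is_surgeon == 1) & (pred_surgeon == 1)\n",
"tp_rate = is_female[true_pos].mean()\n",
"\n",
"# If tp_rate is below base_rate, the model finds female surgeons at a\n",
"# lower rate than they appear in the data, amplifying the imbalance.\n",
"print(f\"women among surgeons: {base_rate:.0%}, among true positives: {tp_rate:.0%}\")"
]
},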
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Addressing different types of bias"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Different types of bias require different approaches for mitigation. While gathering a more diverse dataset can address representation bias, this would not help with historical bias or measurement bias. All datasets contain bias. There is no such thing as a completely de-biased dataset. Many researchers in the field have been converging on a set of proposals towards better documenting the decisions, context, and specifics about how and why a particular dataset was created, what scenarios it is appropriate to use in, and what the limitations are. This way, those using the dataset will not be caught off-guard by its biases and limitations."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Humans are biased, so does algorithmic bias matter?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We often hear this question — \"humans are biased, so does algorithmic bias even matter?\" This comes up so often, there must be some reasoning that makes sense to the people that ask it, but it doesn't seem very logically sound to us! Independently of whether this is logically sound, it's important to realise that algorithms and people are different. Machine learning, particularly so. Consider these points about machine learning algorithms:\n",
"\n",
" - *Machine learning can create feedback loops*: small amounts of bias can very rapidly, exponentially increase due to feedback loops\n",
" - *Machine learning can amplify bias*: human bias can lead to larger amounts of machine learning bias\n",
" - *Algorithms & humans are used differently*: human decision makers and algorithmic decision makers are not used in a plug-and-play interchangeable way in practice. For instance, algorithmic decisions are more likely to be implemented at scale and without a process for recourse. Furthermore, people are more likely to mistakenly believe that the result of an algorithm is objective and error-free.\n",
" - *Technology is power*. And with that comes responsibility.\n",
"\n",
"As the Arkansas healthcare example showed, machine learning is often implemented in practice not because it leads to better outcomes, but because it is cheaper and more efficient. Cathy O'Neill, in her book *Weapons of Math Destruction*, described the pattern of how the privileged are processed by people, the poor are processed by algorithms. This is just one of a number of ways that algorithms are used differently than human decision makers. Others include:\n",
"\n",
" - People are more likely to assume algorithms are objective or error-free (even if theyre given the option of a human override)\n",
" - Algorithms are more likely to be implemented with no appeals process in place\n",
" - Algorithms are often used at scale\n",
" - Algorithmic systems are cheap."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Disinformation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Disinformation* has a history stretching back hundreds or even thousands of years. It is not necessarily about getting someone to believe something false, but rather, often to sow disharmony and uncertainty, and to get people to give up on seeking the truth. Receiving conflicting accounts can lead people to assume that they can never know what to trust.\n",
"\n",
"Some people think disinformation is primarily about false information or *fake news*, but in reality, disinformation can often contain seeds of truth, or involve half-truths taken out of context. Ladislav Bittman, who was an intelligence officer in the USSR who later defected to the United States and wrote some books in the 1970s and 1980s on the role of disinformation in Soviet propaganda operations. He said, \"Most campaigns are a carefully designed mixture of facts, half-truths, exaggerations, & deliberate lies.\"\n",
"\n",
"In the United States this has hit close to home in recent years, with the FBI detailing a massive disinformation campaign linked to Russia in the 2016 US election. Understanding the disinformation that was used in this campaign is very educational. For instance, the FBI found that the Russian disinformation campaign often organized two separate fake *grass roots* protests, one for each side of an issue, and got them to protest at the same time! The Houston Chronicle reported on one of these odd events:\n",
"\n",
"> : A group that called itself the \"Heart of Texas\" had organized it on social media — a protest, they said, against the \"Islamization\" of Texas. On one side of Travis Street, I found about 10 protesters. On the other side, I found around 50 counterprotesters. But I couldn't find the rally organizers. No \"Heart of Texas.\" I thought that was odd, and mentioned it in the article: What kind of group is a no-show at its own event? Now I know why. Apparently, the rally's organizers were in Saint Petersburg, Russia, at the time. \"Heart of Texas\" is one of the internet troll groups cited in Special Prosecutor Robert Mueller's recent indictment of Russians attempting to tamper with the U.S. presidential election."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"images/ethics/image13.png\" id=\"teax\" caption=\"Event organized by the group Heart of Texas\" alt=\"Screenshot of an event organized by the group Heart of Texas\" width=\"300\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Disinformation often involves coordinated campaigns of inauthentic behavior. For instance, fraudulent accounts may try to make it seem like many people hold a particular viewpoint. While most of us like to think of ourselves as independent-minded, in reality we evolved to be influenced by others in our in-group, and in opposition to those in our out-group. Online discussions can influence our viewpoints, or alter the range of what we consider acceptable viewpoints. Humans are social animals, and as social animals we are extremely influenced by the people around us. Increasingly, radicalisation occurs in online environments. So influence is coming from people in the virtual space of online forums and social networks.\n",
"\n",
"Disinformation through auto-generated text is a particularly significant issue, due to the greatly increased capability provided by deep learning. We discuss this issue in depth when we learn to create language models, in <<chapter_nlp>>.\n",
"\n",
"One proposed approach is to develop some form of digital signature, implement it in a seamless way, and to create norms that we should only trust content which has been verified. Head of the Allen Institute on AI, Oren Etzioni, wrote such a proposal in an article titled [How Will We Prevent AI-Based Forgery?](https://hbr.org/2019/03/how-will-we-prevent-ai-based-forgery), \"AI is poised to make high-fidelity forgery inexpensive and automated, leading to potentially disastrous consequences for democracy, security, and society. The specter of AI forgery means that we need to act to make digital signatures de rigueur as a means of authentication of digital content.\""
]
},
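{
"cell_type": "markdown",
"metadata": {},
"source": [
"Mechanically, such signatures are well understood; the hard parts are adoption and key management. Here is a minimal sketch of signing and verifying content with the `cryptography` package (our illustration of the general idea, not Etzioni's specific proposal):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey\n",
"from cryptography.exceptions import InvalidSignature\n",
"\n",
"# A publisher signs their content with a private key...\n",
"private_key = Ed25519PrivateKey.generate()\n",
"public_key = private_key.public_key()\n",
"\n",
"content = b\"video uploaded by newsroom X\"\n",
"signature = private_key.sign(content)\n",
"\n",
"# ...and anyone holding the public key can check it is unaltered.\n",
"public_key.verify(signature, content)  # raises InvalidSignature on failure\n",
"\n",
"# A single changed byte breaks verification:\n",
"try:\n",
"    public_key.verify(signature, content.replace(b\"X\", b\"Y\"))\n",
"except InvalidSignature:\n",
"    print(\"tampered copy rejected\")"
]
},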
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What to do"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Mistakes happen. Finding out about them, and dealing with them, needs to be part of the design of any system that includes machine learning (and many other systems too). The issues raised within data ethics are often complex and interdisciplinary, but it is crucial that we work to address them.\n",
"\n",
"So what can we do? This is a big topic, but a few steps towards addressing ethical issues are:\n",
"\n",
"- analyze a project you are working on\n",
"- implement processes at your company to find and address ethical risks\n",
"- support good policy\n",
"- increase diversity"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Analyze a project you are working on"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's easy to miss important issues when considered ethical implications of your work. One thing that helps enormously is simply asking the right questions. Rachel Thomas recommends considering the following questions throughout the development of a data project:\n",
"\n",
" - Should we even be doing this?\n",
" - What bias is in the data?\n",
" - Can the code and data be audited?\n",
" - What are error rates for different sub-groups?\n",
" - What is the accuracy of a simple rule-based alternative?\n",
" - What processes are in place to handle appeals or mistakes?\n",
" - How diverse is the team that built it?"
]
},
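{
"cell_type": "markdown",
"metadata": {},
"source": [
"The sub-group error-rate question is particularly cheap to answer once you have predictions. Here is a minimal sketch with pandas (hypothetical column names and values):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical evaluation results: one row per example.\n",
"df = pd.DataFrame({\n",
"    \"group\": [\"A\", \"A\", \"A\", \"B\", \"B\", \"B\", \"B\", \"B\"],\n",
"    \"label\": [1, 0, 1, 1, 1, 0, 1, 0],\n",
"    \"pred\":  [1, 0, 0, 0, 1, 1, 0, 0],\n",
"})\n",
"\n",
"df[\"error\"] = df[\"pred\"] != df[\"label\"]\n",
"# A single overall number can hide large gaps between sub-groups:\n",
"print(df[\"error\"].mean())                   # overall error rate: 0.5\n",
"print(df.groupby(\"group\")[\"error\"].mean())  # per group: A ~0.33, B 0.60"
]
},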
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Processes to implement"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Markkula Center has released [An Ethical Toolkit for Engineering/Design Practice](https://www.scu.edu/ethics-in-technology-practice/ethical-toolkit/), which includes some concrete practices to implement at your company, including regularly scheduled ethical risk sweeps to proactively search for ethical risks (in a manner similar to cybersecurity penetration testing), expanding the ethical circle to include the perspectives of a variety of stakeholders, and considering the terrible people (how could bad actors abuse, steal, misinterpret, hack, destroy, or weaponize what you are building?). \n",
"\n",
"Even if you don't have a diverse team, you can still try to pro-actively include the perspectives of a wider group, considering questions such as these (provided by the Markkula Center):\n",
"\n",
" - Whose interests, desires, skills, experiences and values have we simply assumed, rather than actually consulted?\n",
" - Who are all the stakeholders who will be directly affected by our product? How have their interests been protected? How do we know what their interests really are—have we asked?\n",
" - Who/which groups and individuals will be indirectly affected in significant ways?\n",
" - Who might use this product that we didnt expect to use it, or for purposes we didnt initially intend?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Ethical Lenses"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another useful resource from the Markkula Center is [Conceptual Frameworks in Technology and Engineering Practice](https://www.scu.edu/ethics-in-technology-practice/conceptual-frameworks/). This considers how different foundational ethical lenses can help identify concrete issues, and lays out the following approaches and key questions:\n",
"\n",
" - The Rights Approach: Which option best respects the rights of all who have a stake?\n",
" - The Justice Approach: Which option treats people equally or proportionately?\n",
" - The Utilitarian Approach: Which option will produce the most good and do the least harm?\n",
" - The Common Good Approach: Which option best serves the community as a whole, not just some members?\n",
" - The Virtue Approach: Which option leads me to act as the sort of person I want to be?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Markkula's recommendations include a deeper dive into each of these perspectives, including looking at a project based on a focus on its *consequences*:\n",
"\n",
" - Who will be directly affected by this project? Who will be indirectly affected?\n",
" - Will the effects in aggregate likely create more good than harm, and what types of good and harm?\n",
" - Are we thinking about all relevant types of harm/benefit (psychological, political, environmental, moral, cognitive, emotional, institutional, cultural)?\n",
" - How might future generations be affected by this project?\n",
" - Do the risks of harm from this project fall disproportionately on the least powerful in society? Will the benefits go disproportionately the well-off?\n",
" - Have we adequately considered dual-use?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The alternative lens to this is the *deontological* perspective, which focuses on basic *right* and *wrong*:\n",
"\n",
" - What rights of others & duties to others must we respect?\n",
" - How might the dignity & autonomy of each stakeholder be impacted by this project?\n",
" - What considerations of trust & of justice are relevant to this design/project?\n",
" - Does this project involve any conflicting moral duties to others, or conflicting stakeholder rights? How can we prioritize these?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Fairness, accountability, and transparency"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The professional society for computer scientists, the ACM, runs a conference on data ethics called the \"Conference on Fairness, Accountability, and Transparency\". \"Fairness, Accountability, and Transparency\" sometimes goes under the acronym *FAT*, although nowadays it's changing to *FAccT*. Microsoft has a group focused on \"Fairness, Accountability, Transparency, and Ethics\" (FATE). The various versions of this lens have resulted in the acronym \"FAT*\" seeing wide usage. In this section, we'll use \"FAccT\" to refer to the concepts of *Fairness, Accountability, and Transparency*.\n",
"\n",
"FAccT is another lens that you may find useful in considering ethical issues. One useful resource for this is the free online book [Fairness and machine learning; Limitations and Opportunities](https://fairmlbook.org/), which \"gives a perspective on machine learning that treats fairness as a central concern rather than an afterthought.\" It also warns, however, that it \"is intentionally narrow in scope... A narrow framing of machine learning ethics might be tempting to technologists and businesses as a way to focus on technical interventions while sidestepping deeper questions about power and accountability. We caution against this temptation.\" Rather than provide an overview of the FAccT approach to ethics (which is better done in books such as the one linked above), our focus here will be on the limitations of this kind of narrow framing.\n",
"\n",
"One great way to consider whether an ethical lens is complete, is to try to come up with an example where the lens and our own ethical intuitions give diverging results. Os Keyes explored this in a graphic way in their paper [A Mulching Proposal\n",
"Analysing and Improving an Algorithmic System for Turning the Elderly into High-Nutrient Slurry](https://arxiv.org/abs/1908.06166). The paper's abstract says:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> : The ethical implications of algorithmic systems have been much discussed in both HCI and the broader community of those interested in technology design, development and policy. In this paper, we explore the application of one prominent ethical framework - Fairness, Accountability, and Transparency - to a proposed algorithm that resolves various societal issues around food security and population ageing. Using various standardised forms of algorithmic audit and evaluation, we drastically increase the algorithm's adherence to the FAT framework, resulting in a more ethical and beneficent system. We discuss how this might serve as a guide to other researchers or practitioners looking to ensure better ethical outcomes from algorithmic systems in their line of work."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this paper, the rather contraversial proposal (\"Turning the Elderly into High-Nutrient Slurry\") and the results (\"drastically increase the algorithm's adherence to the FAT framework, resulting in a more ethical and beneficent system\") are at odds... to say the least!\n",
"\n",
"In philosophy, and especially philosophy of ethics, this is one of the most effective tools: first, come up with a process, definition, set of questions, etc, which is designed to resolve some problem. Then try to come up with an example where that apparent solution results in a proposal that no-one would consider acceptable. This can then lead to a further refinement of the solution."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Role of Policy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ethical issues that arise in the use of automated decision systems, such as machine learning, can be complex and far-reaching. To better address them, we will need thoughtful policy, in addition to the ethical efforts of those in industry. Neither is sufficient on its own.\n",
"\n",
"Policy is the appropriate tool for addressing:\n",
"- Negative externalities\n",
"- Misaligned economic incentives\n",
"- “Race to the bottom” situations\n",
"- Enforcing accountability.\n",
"\n",
"Ethical behavior in industry is necessary as well, since:\n",
"- Law will not always keep up\n",
"- Edge cases will arise in which pracitioners must use their best judgement."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### The power of diversity"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Currently, less than 12% of AI researchers are women, according to a study from element AI. The statistics are similarly dire when it comes to race and age. When everybody on a team has similar backgrounds, there are likely to have similar blindspots around ethical risks. The Harvard Business Review (HBR) has published a number of studies showing many benefits of diverse teams, including:\n",
"\n",
"- [How Diversity Can Drive Innovation](https://hbr.org/2013/12/how-diversity-can-drive-innovation)\n",
"- [Teams Solve Problems Faster When Theyre More Cognitively Diverse](https://hbr.org/2017/03/teams-solve-problems-faster-when-theyre-more-cognitively-diverse)\n",
"- [Why Diverse Teams Are Smarter](https://hbr.org/2016/11/why-diverse-teams-are-smarter), and\n",
"- [What Makes a Team Smarter? More Women](https://hbr.org/2011/06/defend-your-research-what-makes-a-team-smarter-more-women).\n",
"\n",
"Diversity can lead to problems being identified earlier, and a wider range of solutions being considered. For instance, Tracy Chou was an early engineer at Quora. She [wrote of her experiences](https://qz.com/1016900/tracy-chou-leading-silicon-valley-engineer-explains-why-every-tech-worker-needs-a-humanities-education/), describing how she advocated internally for adding a feature that would allow trolls and other bad actors to be blocked. Chou recounts, “I was eager to work on the feature because I personally felt antagonized and abused on the site (gender isnt an unlikely reason as to why)... But if I hadnt had that personal perspective, its possible that the Quora team wouldnt have prioritized building a block button so early in its existence.” Harassment often drives people from marginalised groups off online platforms, so this functionality has been important for maintaining the health of Quora's community.\n",
"\n",
"A crucial aspect to understand is that women leave the tech industry at over twice the rate that men do, according to the Harvard business review (41% of women working in tech leave, compared to 17% of men). An analysis of over 200 books, white papers, and articles found that the reason they leave is that “theyre treated unfairly; underpaid, less likely to be fast-tracked than their male colleagues, and unable to advance.” \n",
"\n",
"Studies have confirmed a number of the factors that make it harder for women to advance in the workplace. Women receive more vague feedback and personality criticism in performance evaluations, whereas men receive actionable advice tied to business outcomes (which is more useful). Women frequently experience being excluded from more creative and innovative roles, and not receiving high visibility “stretch” assignments that are helpful in getting promoted. One study found that mens voices are perceived as more persuasive, fact-based, and logical than womens voices, even when reading identical scripts.\n",
"\n",
"Receiving mentorship has been statistically shown to help men advance, but not women. The reason behind this is that when women receive mentorship, its advice on how they should change and gain more self-knowledge. When men receive mentorship, its public endorsement of their authority. Guess which is more useful in getting promoted?\n",
"\n",
"As long as qualified women keep dropping out of tech, teaching more girls to code will not solve the diversity issues plaguing the field. Diversity initiatives often end up focusing primarily on white women, even although women of colour face many additional barriers. In interviews with 60 women of color who work in STEM research, 100% had experienced discrimination."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The hiring process is particularly broken in tech. One study indicative of the disfunction comes from Triplebyte, a company that helps place software engineers in companies. They conduct a standardised technical interview as part of this process. They have a fascinating dataset: the results of how over 300 engineers did on their exam, and then the results of how those engineers did during the interview process for a variety of companies. The number one finding from [Triplebytes research](https://triplebyte.com/blog/who-y-combinator-companies-want) is that “the types of programmers that each company looks for often have little to do with what the company needs or does. Rather, they reflect company culture and the backgrounds of the founders.”\n",
"\n",
"This is a challenge for those trying to break into the world of deep learning, since most companies' deep learning groups today were founded by academics. These groups tend to look for people \"like them\"--that is, people that can solve complex math problems and understand dense jargon. They don't always know how to spot people who are actually good at solving real problems using deep learning.\n",
"\n",
"This leaves a big opportunity for companies that are ready to look beyond status and pedigree, and focus on results!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Coming from a background of working with binary logic, the lack of clear answers in ethics can be frustrating at first. Yet, the implications of how our work impacts the work, including unintended consequences and weaponization by bad actors, are some of the most important questions we can (and should!) consider. Even though there aren't any easy answers, there are definite pitfalls to avoid and practices to move towards more ethical behavior.\n",
"\n",
"One of our reviewers for this book, Fred Monroe, used to work in hedge fund trading. He told us, after reading this chapter, that many of the issues discussed here (distribution of data being dramatically different than what was trained on, impact of model and feedback loops once deployed and at scale, and so forth) were also key issues for building profitable trading models. The kinds of things you need to do to consider societal consequences are going to have a lot of overlap with things you need to do to consider organizational, market, and customer consequences too--so thinking carefully about ethics can also help you think carefully about how to make your data product successful more generally!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Questionnaire"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Does ethics provide a list of \"right answers\"?\n",
"1. How can working with people of different backgrounds help when considering ethical questions?\n",
"1. What was the role of IBM in Nazi Germany? Why did the company participate as they did? Why did the workers participate?\n",
"1. What was the role of the first person jailed in the VW diesel scandal?\n",
"1. What was the problem with a database of suspected gang members maintained by California law enforcement officials?\n",
"1. Why did YouTube's recommendation algorithm recommend videos of partially clothed children to pedophiles, even although no employee at Google programmed this feature?\n",
"1. What are the problems with the centrality of metrics?\n",
"1. Why did Meetup.com not include gender in their recommendation system for tech meetups?\n",
"1. What are the six types of bias in machine learning, according to Suresh and Guttag?\n",
"1. Give two examples of historical race bias in the US\n",
"1. Where are most images in Imagenet from?\n",
"1. In the paper \"Does Machine Learning Automate Moral Hazard and Error\" why is sinusitis found to be predictive of a stroke?\n",
"1. What is representation bias?\n",
"1. How are machines and people different, in terms of their use for making decisions?\n",
"1. Is disinformation the same as \"fake news\"?\n",
"1. Why is disinformation through auto-generated text a particularly significant issue?\n",
"1. What are the five ethical lenses described by the Markkula Center?\n",
"1. Where is policy an appropriate tool for addressing data ethics issues?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Further research:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Read the article \"What Happens When an Algorithm Cuts Your Healthcare\". How could problems like this be avoided in the future?\n",
"1. Research to find out more about YouTube's recommendation system and its societal impacts. Do you think recommendation systems must always have feedback loops with negative results? What approaches could Google take? What about the government?\n",
"1. Read the paper \"Discrimination in Online Ad Delivery\". Do you think Google should be considered responsible for what happened to Dr Sweeney? What would be an appropriate response?\n",
"1. How can a cross-disciplinary team help avoid negative consequences?\n",
"1. Read the paper \"Does Machine Learning Automate Moral Hazard and Error\" in American Economic Review. What actions do you think should be taken to deal with the issues identified in this paper?\n",
"1. Read the article \"How Will We Prevent AI-Based Forgery?\" Do you think Etzioni's proposed approach could work? Why?\n",
"1. Complete the section \"Analyze a project you are working on\" in this chapter.\n",
"1. Consider whether your team could be more diverse. If so, what approaches might help?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Section 1: that's a wrap!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Congratulations! You've made it to the end of the first section of the book. In this section we've tried to show you what deep learning can do, and how you can use it to create real applications and products. At this point, you will get a lot more out of the book if you spend some time trying out what you've learnt. Perhaps you have already been doing this as you go along — in which case, great! But if not, that's no problem either… Now is a great time to start experimenting yourself.\n",
"\n",
"If you haven't been to the book website yet, head over there now. Remember, you can find it here: [book.fast.ai](https;//book.fast.ai). It's really important that you have got yourself set up to run the notebooks. Becoming an effective deep learning practitioner is all about practice. So you need to be training models. So please go get the notebooks running now if you haven't already! And also have a look on the website for any important updates or notices; deep learning changes fast, and we can't change the words that in this book, so the website is where you need to look to ensure you have the most up-to-date information.\n",
"\n",
"Make sure that you have completed the following steps:\n",
"\n",
"- Connected to one of the GPU Jupyter servers recommended on the book website\n",
"- Run the first notebook yourself\n",
"- Uploaded an image that you find in the first notebook; then try a few different images of different kinds to see what happens\n",
"- Run the second notebook, collecting your own dataset based on image search queries that you come up with\n",
"- Thought about how you can use deep learning to help you with your own projects, including what kinds of data you could use, what kinds of problems may come up, and how you might be able to mitigate these issues in practice.\n",
"\n",
"In the next section of the book we will learn about how and why deep learning works, instead of just seeing how we can use it in practice. Understanding the how and why is important for both practitioners and researchers, because in this fairly new field nearly every project requires some level of customisation and debugging. The better you understand the foundations of deep learning, the better your models will be. These foundations are less important for executives, product managers, and so forth (although still useful, so feel free to keep reading!), but they are critical for anybody who is actually training and deploying models themselves."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"jupytext": {
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

4881
04_mnist_basics.ipynb Normal file

File diff suppressed because one or more lines are too long

2473
05_pet_breeds.ipynb Normal file

File diff suppressed because one or more lines are too long

1905
06_multicat.ipynb Normal file

File diff suppressed because one or more lines are too long

1018
07_sizing_and_tta.ipynb Normal file

File diff suppressed because one or more lines are too long

2259
08_collab.ipynb Normal file

File diff suppressed because one or more lines are too long

9638
09_tabular.ipynb Normal file

File diff suppressed because one or more lines are too long

2360
10_nlp.ipynb Normal file

File diff suppressed because it is too large Load Diff

1313
11_nlp_dive.ipynb Normal file

File diff suppressed because it is too large Load Diff

1157
12_better_rnn.ipynb Normal file

File diff suppressed because it is too large Load Diff

2726
13_convolutions.ipynb Normal file

File diff suppressed because one or more lines are too long

1069
14_deep_conv.ipynb Normal file

File diff suppressed because one or more lines are too long

1275
15_resnet.ipynb Normal file

File diff suppressed because one or more lines are too long

476
16_arch_details.ipynb Normal file
View File

@ -0,0 +1,476 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
"from utils import *"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"[[chapter_arch_details]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Application architectures deep dive"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are now in the exciting position that we can fully understand the entire architectures that we have been using for our state-of-the-art models for computer vision, natural language processing, and tabular analysis. In this chapter, we're going to fill in all the missing details on how fastai's application models work."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Computer vision"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### cnn_learner"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's take a look at what happens when we use the `cnn_learner` function. We pass it an architecture to use for the *body* of the network. Most of the time we use a resnet, which we already know how to create, so we don't need to delve into that any further. Pretrained weights are downloaded as required and loaded into the resnet.\n",
"\n",
"Then, for transfer learning, the network needs to be *cut*. This refers to slicing off the final layer, which is only responsible for ImageNet-specific categorisation. In fact, we do not only slice off this layer, but everything from the adaptive average pooling layer onwards. The reason for this will become clear in just a moment. Since different architectures might use different types of pooling layers, or even completely different kinds of *heads*, we don't just search for the adaptive pooling layer to decide where to cut the pretrained model. Instead, we have a dictionary of information that is used for each model to know where its body ends, and its head starts. We call this `model_meta` — here it is for resnet 50:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'cut': -2,\n",
" 'split': <function fastai2.vision.learner._resnet_split(m)>,\n",
" 'stats': ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])}"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model_meta[resnet50]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> jargon: Body and Head: The \"head\" of a neural net is the part that is specialized for a particular task. For a convnet, it's generally the part after the adaptive average pooling layer. The \"body\" is everything else, and includes the \"stem\" (which we learned about in <<chapter_resnet>>)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we take all of the layers prior to the cutpoint of `-2`, we get the part of the model which fastai will keep for transfer learning. Now, we put on our new head. This is created using the function create_head:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Sequential(\n",
" (0): AdaptiveConcatPool2d(\n",
" (ap): AdaptiveAvgPool2d(output_size=1)\n",
" (mp): AdaptiveMaxPool2d(output_size=1)\n",
" )\n",
" (1): full: False\n",
" (2): BatchNorm1d(20, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (3): Dropout(p=0.25, inplace=False)\n",
" (4): Linear(in_features=20, out_features=512, bias=False)\n",
" (5): ReLU(inplace=True)\n",
" (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (7): Dropout(p=0.5, inplace=False)\n",
" (8): Linear(in_features=512, out_features=2, bias=False)\n",
")"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#hide_output\n",
"create_head(20,2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
"Sequential(\n",
" (0): AdaptiveConcatPool2d(\n",
" (ap): AdaptiveAvgPool2d(output_size=1)\n",
" (mp): AdaptiveMaxPool2d(output_size=1)\n",
" )\n",
" (1): Flatten()\n",
" (2): BatchNorm1d(20, eps=1e-05, momentum=0.1, affine=True)\n",
" (3): Dropout(p=0.25, inplace=False)\n",
" (4): Linear(in_features=20, out_features=512, bias=False)\n",
" (5): ReLU(inplace=True)\n",
" (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True)\n",
" (7): Dropout(p=0.5, inplace=False)\n",
" (8): Linear(in_features=512, out_features=2, bias=False)\n",
")\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With this function you can choose how many additional linear layers are added to the end, how much dropout to use after each one, and what kind of pooling to use. By default, fastai will apply both average pooling, and max pooling, and will concatenate the two together (this is the `AdaptiveConcatPool2d` layer). This is not a particularly common approach, but it was developed independently at fastai and at other research labs in recent years, and tends to provide some small improvement over using just average pooling.\n",
"\n",
"Fastai is also a bit different to most libraries in adding two linear layers, rather than one, by default in the CNN head. The reason for this is that transfer learning can still be useful even, as we have seen, and transferring two very different domains to the pretrained model. However, just using a single linear layer is unlikely to be enough. So we have found that using two linear layers can allow transfer learning to be used more quickly and easily, in more situations."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> note: One parameter to create_head that is worth looking at is bn_final. Setting this to true will cause a batchnorm layer to be added as your final layer. This can be useful in helping your model to more easily ensure that it is scaled appropriately for your output activations. We haven't seen this approach published anywhere, as yet, but we have found that it works well in practice, wherever we have used it."
]
},
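{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, here's one way you might customize this head. This is a sketch only: we're assuming that the `lin_ftrs` parameter sets the sizes of the extra linear layers and `ps` sets the dropout probabilities, so check the `create_head` source if your version differs:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"create_head(20, 2, lin_ftrs=[256], ps=0.1)"
]
},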
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### unet_learner"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One of the most interesting architectures in deep learning is the one that we used for segmentation in <<chapter_intro>>. Segmentation is a challenging task, because the output required is really an image, or a pixel grid, containing the predicted label for every pixel. There are other tasks which share a similar basic design, such as increasing the resolution of an image (*super resolution*), adding colour to a black-and-white image (*colorization*), or converting a photo into a synthetic painting (*style transfer*)--these tasks are covered by an online chapter of this book, so be sure to check it out after you've read this chapter. In each case, we are starting with an image, and converting it to some other image of the same dimensions or aspect ratio, but with the pixels converted in some way. We refer to these as *generative vision models*.\n",
"\n",
"The way we do this is to start with the exact same approach to developing a CNN head as we saw above. We start with a ResNet, for instance, and cut off the adaptive pooling layer and everything after that. And then we replace that with our custom head which does the generative task.\n",
"\n",
"There was a lot of handwaving in that last sentence! How on earth do we create a CNN head which generates an image? If we start with, say, a 224 pixel input image, then at the end of the resnet body we will have a 7x7 grid of convolutional activations. How can we convert that into a 224 pixel segmentation mask?\n",
"\n",
"We will (naturally) do this with a neural network! So we need some kind of layer which can increase the grid size in a CNN. One very simple approach to this is to replace every pixel in the 7x7 grid with four pixels in a 2x2 square. Each of those four pixels would have the same value — this is known as nearest neighbour interpolation. PyTorch provides a layer which does this for us, so we could create a head which contains stride one convolutional layers (along with batchnorm and ReLU as usual) interspersed with 2x2 nearest neighbour interpolation layers. In fact, you could try this now! See if you can create a custom head designed like this, and see if it can complete the CamVid segmentation task. You should find that you get some reasonable results, although it won't be as good as our <<chapter_intro>> results.\n",
"\n",
"Another approach is to replace the nearest neighbour and convolution combination with a *transposed convolution* otherwise known as a *stride half convolution*. This is identical to a regular convolution, but first zero padding is inserted between every pixel in the input. This is easiest to see with a picture — here's a diagram from the excellent convolutional arithmetic paper we have seen before, showing a 3x3 transposed convolution applied to a 3x3 image:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img alt=\"A transposed convolution\" width=\"815\" caption=\"A transposed convolution\" id=\"transp_conv\" src=\"images/att_00051.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you see, the result of this is to increase the size of the input. You can try this out now, by using fastai's ConvLayer class; pass the parameter `transpose=True` to create a transposed convolution, instead of a regular one, in your custom head.\n",
"\n",
"Neither of these approaches, however, works really well. The problem is that our 7x7 grid simply doesn't have enough information to create a 224x224 pixel output. It's asking an awful lot of the activations of each of those grid cells to have enough information to fully regenerate every pixel in the output. The solution to this problem is to use skip connections, like in a resnet, but skipping from the activations in the body of the resnet all the way over to the activations of the transposed convolution on the opposite side of the architecture. This is known as a U-Net, and it was developed in the 2015 paper [U-Net: Convolutional Networks for Biomedical Image Segmentation](https://arxiv.org/abs/1505.04597). Although the paper focussed on medical applications, the U-Net has revolutionized all kinds of generation vision models.\n",
"\n",
"The U-Net paper shows the architecture like this:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img alt=\"The U-net architecture\" width=\"630\" caption=\"The U-net architecture\" id=\"unet\" src=\"images/att_00052.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This picture shows the CNN body on the left (in this case, it's a regular CNN, not a ResNet, and they're using 2x2 max pooling instead of stride 2 convolutions, since this paper was written before ResNets came along) and it shows the transposed convolutional layers on the right (they're called \"up-conv\" in this picture). Then then extra skip connections are shown as grey arrows crossing from left to right (these are sometimes called *cross connections*). You can see why it's called a \"U-net\" when you see this picture!\n",
"\n",
"With this architecture, the input to the transposed convolutions is not just the lower resolution grid in the preceding layer, but also the higher resolution grid in the resnet head. This allows the U-Net to use all of the information of the original image, as it is needed. One challenge with U-Nets is that the exact architecture depends on the image size. fastai has a unique `DynamicUnet` class which auto-generates an archicture of the right size based on the data provided."
]
},
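{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is the minimal sketch promised above of a generative head built from nearest neighbour upsampling and stride-1 convolutions. This is illustrative only, and is not what `unet_learner` actually does (that's the `DynamicUnet` class); the channel sizes, and the 32-class output for CamVid, are our own assumptions:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch.nn as nn\n",
"\n",
"def upsample_block(ni, nf):\n",
"    # double the grid size, then mix channels with a stride-1 conv\n",
"    return nn.Sequential(\n",
"        nn.Upsample(scale_factor=2, mode='nearest'),\n",
"        nn.Conv2d(ni, nf, kernel_size=3, padding=1),\n",
"        nn.BatchNorm2d(nf),\n",
"        nn.ReLU(inplace=True))\n",
"\n",
"# five doublings take a 512x7x7 body output up to 16x224x224, then a\n",
"# 1x1 conv maps each pixel to (say) 32 CamVid class scores\n",
"simple_seg_head = nn.Sequential(\n",
"    upsample_block(512, 256), upsample_block(256, 128),\n",
"    upsample_block(128, 64), upsample_block(64, 32),\n",
"    upsample_block(32, 16), nn.Conv2d(16, 32, kernel_size=1))"
]
},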
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Natural language processing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we've seen how to create complete state of the art computer vision models, let's move on to NLP.\n",
"\n",
"Converting an AWD-LSTM language model into a transfer learning classifier follows a very similar process to what we saw for `cnn_learner` in the first section of this chapter. We do not need a \"meta\" dictionary in this case, because we do not have such a variety of architectures to support in the body. All we need to do is to select the stacked RNN for the encoder in the language model, which is a single PyTorch module. This encoder will provide an activation for every word of the input, because a language model needs to output a prediction for every next word.\n",
"\n",
"To create a classifier from this we use an approach described in the ULMFiT paper as \"BPTT for Text Classification (BPT3C)\". The paper describes this:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> In order to make fine-tuning a classifier for large documents feasible, we propose BPTT for Text Classification (BPT3C): We divide the document into fixed-length batches of size `b`. At the beginning of each batch, the model is initialized with the final state of the previous batch; we keep track of the hidden states for mean and max-pooling; gradients are back-propagated to the batches whose hidden states contributed to the final prediction. In practice, we use variable length backpropagation sequences."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In practice, what this is saying is that the classifier contains a for loop, which loops over each batch of a sequence. The state is maintained across batches, and the activations of each batch are stored. At the end, we use the same average and max concatenated pooling trick that we use for computer vision models — but this time, we do not pool over CNN grid cells, but over RNN sequences.\n",
"\n",
"For this for loop we need to gather our data in batches, but each text needs to be treated separatedly, as they each have their own label. However, it's very likely that those texts won't have the good taste of being all of the same length, which means we won't be able to put them all in the same array, like we did with the language model.\n",
"\n",
"That's where padding is going to help: when grabbing a bunch of texts, we determine the one with the greater length, then we fill the ones that are shorter with a special token called `xxpad`. To avoid having an extreme case where we have a text with 2,000 tokens in the same batch as a text with 10 tokens (so a lot of padding, and a lot of wasted computation) we alter the randomness by making sure texts of comparable size are put together. It will still be in a somewhat random order for the training set (for the validation set we can simply sort them by order of length), but not completely random.\n",
"\n",
"This is done automatically behind the scenes by the fastai library when creating our `DataLoaders`."
]
},
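{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the padding idea concrete, here is a toy sketch; the `pad_idx` value and the `pad_batch` function are our own illustrations, not fastai internals:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"\n",
"pad_idx = 1  # assume the special xxpad token has id 1\n",
"\n",
"def pad_batch(texts):\n",
"    # texts: a list of 1-d tensors of token ids, of varying lengths\n",
"    max_len = max(t.size(0) for t in texts)\n",
"    batch = torch.full((len(texts), max_len), pad_idx, dtype=torch.long)\n",
"    for i,t in enumerate(texts): batch[i, :t.size(0)] = t\n",
"    return batch\n",
"\n",
"pad_batch([torch.tensor([5,8,2]), torch.tensor([7,3,9,4,6])])\n",
"# tensor([[5, 8, 2, 1, 1],\n",
"#         [7, 3, 9, 4, 6]])"
]
},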
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tabular"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we can look at `fastai.tabular` models. (We don't need to look at collaborative filtering separately, since we've already seen that these models are just tabular models, or use dot product, which we've implemented earlier from scratch.\n",
"\n",
"Here is the forward method for `TabularModel`:\n",
"\n",
"```python\n",
"if self.n_emb != 0:\n",
" x = [e(x_cat[:,i]) for i,e in enumerate(self.embeds)]\n",
" x = torch.cat(x, 1)\n",
" x = self.emb_drop(x)\n",
"if self.n_cont != 0:\n",
" x_cont = self.bn_cont(x_cont)\n",
" x = torch.cat([x, x_cont], 1) if self.n_emb != 0 else x_cont\n",
"return self.layers(x)\n",
"```\n",
"\n",
"We won't show `__init__` here, since it's not that interesting, but will look at each line of code in turn in `forward`:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```python\n",
"if self.n_emb != 0:\n",
"```\n",
"\n",
"This is just testing whether there are any embeddings to deal with — we can skip this section if we only have continuous variables.\n",
"\n",
"```python\n",
" x = [e(x_cat[:,i]) for i,e in enumerate(self.embeds)]\n",
"```\n",
"\n",
"`self.embeds` contains the embedding matrices, so this gets the activations of each…\n",
"\n",
"```python\n",
" x = torch.cat(x, 1)\n",
"```\n",
"\n",
"…and concatenates them into a single tensor.\n",
"\n",
"```python\n",
" x = self.emb_drop(x)\n",
"```\n",
"\n",
"Then dropout is applied. You can pass `emb_drop` to `__init__` to change this value.\n",
"\n",
"```python\n",
"if self.n_cont != 0:\n",
"```\n",
"\n",
"Now we test whether there are any continuous variables to deal with.\n",
"\n",
"```python\n",
" x_cont = self.bn_cont(x_cont)\n",
"```\n",
"\n",
"They are passed through a batchnorm layer…\n",
"\n",
"```python\n",
" x = torch.cat([x, x_cont], 1) if self.n_emb != 0 else x_cont\n",
"```\n",
"\n",
"…and concatenated with the embedding activations, if there were any.\n",
"\n",
"```python\n",
"return self.layers(x)\n",
"\n",
"```\n",
"\n",
"Finally, this is passed through the linear layers (each of which includes batchnorm, if `use_bn` is True, and dropout, if `ps` is set to some value or list of values)."
]
},
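{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you'd like to see these lines in action, here is a toy walk-through of the same logic with made-up sizes (the embedding sizes and batch size here are arbitrary):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"import torch.nn as nn\n",
"\n",
"bs = 4                                   # batch size\n",
"emb_szs = [(10,3), (8,2)]                # (vocab size, embedding size) per categorical variable\n",
"embeds = nn.ModuleList([nn.Embedding(ni,nf) for ni,nf in emb_szs])\n",
"bn_cont = nn.BatchNorm1d(2)              # two continuous variables\n",
"\n",
"x_cat  = torch.randint(0, 8, (bs, len(emb_szs)))  # categorical codes\n",
"x_cont = torch.randn(bs, 2)\n",
"\n",
"x = [e(x_cat[:,i]) for i,e in enumerate(embeds)]  # one activation per embedding\n",
"x = torch.cat(x, 1)                      # shape (4, 3+2)\n",
"x = torch.cat([x, bn_cont(x_cont)], 1)   # shape (4, 7), ready for self.layers\n",
"x.shape"
]
},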
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Wrapping up architectures"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see, the details of deep learning architectures need not scare you now. You can look inside the code of fastai and PyTorch and see just what is going on. More importantly, try to understand why that is going on. Take a look at the papers that are being implemented in the code, and try to see how the code matches up to the algorithms that are described.\n",
"\n",
"Now that we have investigated all of the pieces of a model and the data that is passed into it, we can consider what this means for practical deep learning. If you have unlimited data, unlimited memory, and unlimited time, then the advice is easy: train a huge model on all of your data for a really long time. The reason that deep learning is not straightforward is because your data, memory, and time is limited. If you are running out of memory or time, then the solution is to train a smaller model. If you are not able to train for long enough to overfit, then you are not taking advantage of the capacity of your model.\n",
"\n",
"So step one is to get to the point that you can overfit. Then, the question is how to reduce that overfitting. Here is how we recommend prioritising the steps from there:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img alt=\"Steps to reducing over-fitting\" width=\"400\" caption=\"Steps to reducing over-fitting\" id=\"reduce_overfit\" src=\"images/att_00047.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Many practitioners when faced with an overfitting model start at exactly the wrong end of this diagram. Their starting point is to use a smaller model, or more regularisation. Using a smaller model should be absolutely the last step you take, unless your model is taking up too much time or memory. Reducing the size of your model as reducing the ability of your model to learn subtle relationships in your data.\n",
"\n",
"Instead, your first step should be to seek to create more data. That could involve adding more labels to data that you already have in your organisation, finding additional tasks that your model could be asked to solve (or to think of it another way, identifying different kinds of labels that you could model), or creating additional synthetic data via using more or different data augmentation. Thanks to the development of mixup and similar approaches, effective data augmentation is now available for nearly all kinds of data.\n",
"\n",
"Once you've got as much data as you think you can reasonably get a hold of, and are using it as effectively as possible by taking advantage of all of the labels that you can find, and all of the augmentation that make sense, if you are still overfitting and you should think about using more generalisable architectures. For instance, adding batch normalisation may improve generalisation.\n",
"\n",
"If you are still overfitting after doing the best you can at using your data and tuning your architecture, then you can take a look at regularisation. Generally speaking, adding dropout to the last layer or two will do a good job of regularising your model. However, as we learnt from the story of the development of AWD-LSTM, it is often the case that adding dropout of different types throughout your model can help regularise even better. Generally speaking, a larger model with more regularisation is more flexible, and can therefore be more accurate, and a smaller model with less regularisation.\n",
"\n",
"Only after considering all of these options would be recommend that you try using smaller versions of your architectures."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Questionnaire"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. What is the head of a neural net?\n",
"1. What is the body of a neural net?\n",
"1. What is \"cutting\" a neural net? Why do we need to do this for transfer learning?\n",
"1. What is \"model_meta\"? Try printing it to see what's inside.\n",
"1. Read the source code for `create_head` and make sure you understand what each line does.\n",
"1. Look at the output of create_head and make sure you understand why each layer is there, and how the create_head source created it.\n",
"1. Figure out how to change the dropout, layer size, and number of layers created by create_cnn, and see if you can find values that result in better accuracy from the pet recognizer.\n",
"1. What does AdaptiveConcatPool2d do?\n",
"1. What is nearest neighbor interpolation? How can it be used to upsample convolutional activations?\n",
"1. What is a transposed convolution? What is another name for it?\n",
"1. Create a conv layer with `transpose=True` and apply it to an image. Check the output shape.\n",
"1. Draw the u-net architecture.\n",
"1. What is BPTT for Text Classification (BPT3C)?\n",
"1. How do we handle different length sequences in BPT3C?\n",
"1. Try to run each line of `TabularModel.forward` separately, one line per cell, in a notebook, and look at the input and output shapes at each step.\n",
"1. How is `self.layers` defined in `TabularModel`?\n",
"1. What are the five steps for preventing over-fitting?\n",
"1. Why don't we reduce architecture complexity before trying other approaches to preventing over-fitting?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Further research"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Write your own custom head and try training the pet recognizer with it. See if you can get a better result than fastai's default.\n",
"1. Try switching between AdaptiveConcatPool2d and AdaptiveAvgPool2d in a CNN head and see what difference it makes.\n",
"1. Write your own custom splitter to create a separate parameter group for every resnet block, and a separate group for the stem. Try training with it, and see if it improves the pet recognizer.\n",
"1. Read the online chapter about generative image models, and create your own colorizer, super resolution model, or style transfer model.\n",
"1. Create a custom head using nearest neighbor interpolation and use it to do segmentation on Camvid."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"jupytext": {
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}

985
17_accel_sgd.ipynb Normal file

File diff suppressed because one or more lines are too long

424
18_callbacks.ipynb Normal file
View File

@ -0,0 +1,424 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
"from utils import *"
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"[[chapter_callbacks]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Callbacks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction to callbacks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since we now know how to create state-of-the-art architectures for computer vision, natural image processing, tabular analysis, and collaborative filtering, and we know how to train them quickly with accelerated optimisers, and we know how to regularise them effectively, we're done, right?\n",
"\n",
"Well… Yes, sort of. But other things come up. Sometimes you need to change how things work a little bit. In fact, we have already seen examples of this: mixup, FP16 training, resetting the model after each epoch for training RNNs, and so forth. How do we go about making these kinds of tweaks to the training process?\n",
"\n",
"We've seen the basic training loop, which, with the help of the `Optimizer` class, looks like this for a single epoch:\n",
"\n",
"```python\n",
"for xb,yb in dl:\n",
" loss = loss_func(model(xb), yb)\n",
" loss.backward()\n",
" opt.step()\n",
" opt.zero_grad()\n",
"```\n",
"\n",
"Here's one way to picture that:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img alt=\"Basic training loop\" width=\"300\" caption=\"Basic training loop\" id=\"basic_loop\" src=\"images/att_00048.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The usual way for deep learning practitioners to customise the training loop is to make a copy of an existing training loop, and then insert their code necessary for their particular changes into it. This is how nearly all code that you find online will look. But it has some very serious problems.\n",
"\n",
"It's not very likely that some particular tweaked training loop is going to meet your particular needs. There are hundreds of changes that can be made to a training loop, which means there are billions and billions of possible permutations. You can't just copy one tweak from a training loop here, another from a training loop there, and expect them all to work together. Each will be based on different assumptions about the environment that it's working in, use different naming conventions, and expect the data to be in different formats.\n",
"\n",
"We need a way to allow users to insert their own code at any part of the training loop, but in a consistent and well-defined way. Computer scientists have already come up with an answer to this question: the callback. A callback is a piece of code that you write, and inject into another piece of code at some predefined point. In fact, callbacks have been used with deep learning training loops for years. The problem is that only a small subset of places that may require code injection have been available in previous libraries, and, more importantly, callbacks were not able to do all the things they needed to do.\n",
"\n",
"In order to be just as flexible as manually copying and pasting a training loop and directly inserting code into it, a callback must be able to read every possible piece of information available in the training loop, modify all of it as needed, and fully control when a batch, epoch, or even all the whole training loop should be terminated. fastai is the first library to provide all of this functionality. It modifies the training loop so it looks like this:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img alt=\"Training loop with callbacks\" width=\"550\" caption=\"Training loop with callbacks\" id=\"cb_loop\" src=\"images/att_00049.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The real test of whether this works has been borne out over the last couple of years — it has turned out that every single new paper implemented, or use a request fulfilled, for modifying the training loop has successfully been achieved entirely by using the fastai callback system. The training loop itself has not required modifications. Here are just a few of the callbacks that have been added:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img alt=\"Some fastai callbacks\" width=\"500\" caption=\"Some fastai callbacks\" id=\"some_cbs\" src=\"images/att_00050.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reason that this is important for all of us is that it means that whatever idea we have in our head, we can implement it. We need never dig into the source code of PyTorch or fastai and act together some one-off system to try out our ideas. And when we do implement our own callbacks to develop our own ideas, we know that they will work together with all of the other functionality provided by fastai so we will get progress bars, mixed precision training, hyperparameter annealing, and so forth.\n",
"\n",
"Another advantage is that it makes it easy to gradually remove or add functionality and perform ablation studies. You just need to adjust the list of callbacks you pass along to your fit function."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an example, here is the fastai source code that is run for each batch of the training loop:\n",
"\n",
"```python\n",
"try:\n",
" self._split(b); self('begin_batch')\n",
" self.pred = self.model(*self.xb); self('after_pred')\n",
" self.loss = self.loss_func(self.pred, *self.yb); self('after_loss')\n",
" if not self.training: return\n",
" self.loss.backward(); self('after_backward')\n",
" self.opt.step(); self('after_step')\n",
" self.opt.zero_grad()\n",
"except CancelBatchException: self('after_cancel_batch')\n",
"finally: self('after_batch')\n",
"```\n",
"\n",
"The calls of the form `self('...')` are where the callbacks are called. As you see, after every step a callback is called. The callback will receive the entire state of training, and can also modify it. For instance, as you see above, the input data and target labels are in `self.xb` and `self.yb` respectively. A callback can modify these to modify the data the training loop sees. It can also modify `self.loss`, or even modify the gradients."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creating a callback"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The full list of available callback events is:\n",
"\n",
"- `begin_fit`: called before doing anything, ideal for initial setup.\n",
"- `begin_epoch`: called at the beginning of each epoch, useful for any behavior you need to reset at each epoch.\n",
"- `begin_train`: called at the beginning of the training part of an epoch.\n",
"- `begin_batch`: called at the beginning of each batch, just after drawing said batch. It can be used to do any setup necessary for the batch (like hyper-parameter scheduling) or to change the input/target before it goes in the model (change of the input with techniques like mixup for instance).\n",
"- `after_pred`: called after computing the output of the model on the batch. It can be used to change that output before it's fed to the loss.\n",
"- `after_loss`: called after the loss has been computed, but before the backward pass. It can be used to add any penalty to the loss (AR or TAR in RNN training for instance).\n",
"- `after_backward`: called after the backward pass, but before the update of the parameters. It can be used to do any change to the gradients before said update (gradient clipping for instance).\n",
"- `after_step`: called after the step and before the gradients are zeroed.\n",
"- `after_batch`: called at the end of a batch, for any clean-up before the next one.\n",
"- `after_train`: called at the end of the training phase of an epoch.\n",
"- `begin_validate`: called at the beginning of the validation phase of an epoch, useful for any setup needed specifically for validation.\n",
"- `after_validate`: called at the end of the validation part of an epoch.\n",
"- `after_epoch`: called at the end of an epoch, for any clean-up before the next one.\n",
"- `after_fit`: called at the end of training, for final clean-up.\n",
"\n",
"This list is available as attributes of the special variable `event`; so just type `event.` and hit `Tab` in your notebook to see a list of all the options"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's take a look at an example. Do you recall how in <<chapter_nlp_dive>> we needed to ensure that our special `reset` method was called at the start of training and validation for each epoch? We used the `ModelReseter` callback provided by fastai to do this for us. But how did `ModelReseter` do that exactly? Here's the full actual source code to that class:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class ModelReseter(Callback):\n",
" def begin_train(self): self.model.reset()\n",
" def begin_validate(self): self.model.reset()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Yes, that's actually it! It just does what we said in the paragraph above: after completing training and epoch or validation for an epoch, call a method named `reset`.\n",
"\n",
"Callbacks are often \"short and sweet\" like this one. In fact, let's look at one more. Here's the fastai source for the callback that add RNN regularization (*AR* and *TAR*):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class RNNRegularizer(Callback):\n",
" def __init__(self, alpha=0., beta=0.): self.alpha,self.beta = alpha,beta\n",
"\n",
" def after_pred(self):\n",
" self.raw_out,self.out = self.pred[1],self.pred[2]\n",
" self.learn.pred = self.pred[0]\n",
"\n",
" def after_loss(self):\n",
" if not self.training: return\n",
" if self.alpha != 0.:\n",
" self.learn.loss += self.alpha * self.out[-1].float().pow(2).mean()\n",
" if self.beta != 0.:\n",
" h = self.raw_out[-1]\n",
" if len(h)>1:\n",
" self.learn.loss += self.beta * (h[:,1:] - h[:,:-1]\n",
" ).float().pow(2).mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> stop: Go back to where we discussed TAR and AR regularization, and compare to the code here. Made sure you understand what it's doing, and why."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In both of these examples, notice how we can access attributes of the training loop by directly checking `self.model` or `self.pred`. That's because a `Callback` will always try to get an attribute it doesn't have inside the `Learner` associated to it. This is a shortcut for `self.learn.model` or `self.learn.pred`. Note that this shortcut works for reading attributes, but not for writing them, which is why when `RNNRegularizer` changes the loss or the predictions, you see `self.learn.loss = ` or `self.learn.pred = `. "
]
},
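{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, here is a tiny illustrative callback of our own (it does not ship with fastai) that uses this shortcut to record the training loss after every batch:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class LossTracker(Callback):\n",
"    def begin_fit(self): self.losses = []\n",
"    def after_batch(self):\n",
"        # self.training and self.loss are read through the Learner shortcut\n",
"        if self.training: self.losses.append(self.loss.item())"
]
},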
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When writing a callback, the following attributes of `Learner` are available:\n",
"\n",
"- `model`: the model used for training/validation\n",
"- `data`: the underlying `DataLoaders`\n",
"- `loss_func`: the loss function used\n",
"- `opt`: the optimizer used to udpate the model parameters\n",
"- `opt_func`: the function used to create the optimizer\n",
"- `cbs`: the list containing all `Callback`s\n",
"- `dl`: current `DataLoader` used for iteration\n",
"- `x`/`xb`: last input drawn from `self.dl` (potentially modified by callbacks). `xb` is always a tuple (potentially with one element) and `x` is detuplified. You can only assign to `xb`.\n",
"- `y`/`yb`: last target drawn from `self.dl` (potentially modified by callbacks). `yb` is always a tuple (potentially with one element) and `y` is detuplified. You can only assign to `yb`.\n",
"- `pred`: last predictions from `self.model` (potentially modified by callbacks)\n",
"- `loss`: last computed loss (potentially modified by callbacks)\n",
"- `n_epoch`: the number of epochs in this training\n",
"- `n_iter`: the number of iterations in the current `self.dl`\n",
"- `epoch`: the current epoch index (from 0 to `n_epoch-1`)\n",
"- `iter`: the current iteration index in `self.dl` (from 0 to `n_iter-1`)\n",
"\n",
"The following attributes are added by `TrainEvalCallback` and should be available unless you went out of your way to remove that callback:\n",
"\n",
"- `train_iter`: the number of training iterations done since the beginning of this training\n",
"- `pct_train`: from 0. to 1., the percentage of training iterations completed\n",
"- `training`: flag to indicate if we're in training mode or not\n",
"\n",
"The following attribute is added by `Recorder` and should be available unless you went out of your way to remove that callback:\n",
"\n",
"- `smooth_loss`: an exponentially-averaged version of the training loss"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Callback ordering and exceptions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sometimes, callbacks need to be able to tell fastai to skip over a batch, or an epoch, or stop training altogether. For instance, consider `TerminateOnNaNCallback`. This handy callback will automatically stop training any time the loss becomes infinite or `NaN` (*not a number*). Here's the fastai source for this callback:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class TerminateOnNaNCallback(Callback):\n",
" run_before=Recorder\n",
" def after_batch(self):\n",
" if torch.isinf(self.loss) or torch.isnan(self.loss):\n",
" raise CancelFitException"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The way it tells the training loop to interrupt training at this point is to `raise CancelFitException`. The training loop catches this exception and does not run any further training or validation. The callback control flow exceptions available are:\n",
"\n",
"- `CancelFitException`: Skip the rest of this batch and go to `after_batch\n",
"- `CancelEpochException`: Skip the rest of the training part of the epoch and go to `after_train\n",
"- `CancelTrainException`: Skip the rest of the validation part of the epoch and go to `after_validate\n",
"- `CancelValidException`: Skip the rest of this epoch and go to `after_epoch\n",
"- `CancelBatchException`: Interrupts training and go to `after_fit"
]
},
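{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, here is a sketch of our own (not a built-in fastai callback) that uses one of these exceptions to cut the training part of each epoch short after a fixed number of batches:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class ShortEpochCallback(Callback):\n",
"    def __init__(self, n_batches=10): self.n_batches = n_batches\n",
"    def after_batch(self):\n",
"        # self.iter counts batches in the current DataLoader, from 0\n",
"        if self.training and self.iter + 1 >= self.n_batches:\n",
"            raise CancelTrainException"
]
},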
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can detect one of those exceptions occurred and add code that executes right after with the following events:\n",
"\n",
"- `after_cancel_batch`: reached imediately after a `CancelBatchException` before proceeding to `after_batch`\n",
"- `after_cancel_train`: reached imediately after a `CancelTrainException` before proceeding to `after_epoch`\n",
"- `after_cancel_valid`: reached imediately after a `CancelValidException` before proceeding to `after_epoch`\n",
"- `after_cancel_epoch`: reached imediately after a `CancelEpochException` before proceeding to `after_epoch`\n",
"- `after_cancel_fit`: reached imediately after a `CancelFitException` before proceeding to `after_fit`"
]
},
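{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sketch of how these exceptions and events fit together, here is a hypothetical callback that cuts a whole training run short after a fixed number of training batches, which can be handy for debugging. The class name and the default limit are made up for this example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class StopAfterNBatches(Callback):\n",
"    def __init__(self, n=10): self.n = n\n",
"    def after_batch(self):\n",
"        # `train_iter` is added by `TrainEvalCallback` (see above); raising\n",
"        # `CancelFitException` triggers `after_cancel_fit`, then `after_fit`\n",
"        if self.train_iter >= self.n: raise CancelFitException"
]
},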
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sometimes, callbacks need to be called in a particular order. In the case of `TerminateOnNaNCallback`, it's important that `Recorder` runs its `after_batch` after this callback, to avoid registering an NaN loss. You can specify `run_before` (this callback must run before ...) or `run_after` (this callback must run after ...) in your callback to ensure the ordering that you need."
]
},
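{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, here is a hypothetical sketch of a callback that asks to run after `Recorder`, so that `smooth_loss` has already been updated for the current batch by the time it is read:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class SmoothLossLogger(Callback):\n",
"    run_after=Recorder\n",
"    def after_batch(self):\n",
"        # Because of `run_after`, `Recorder.after_batch` has already run,\n",
"        # so `smooth_loss` reflects the batch we just processed\n",
"        if self.training: print(self.smooth_loss)"
]
},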
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have seen how to tweak the training loop of fastai to do anything we need, let's take a step back and dig a little bit deeper in the foundations of that training loop."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Questionnaire"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. What are the four steps of a training loop?\n",
"1. Why is the use of callbacks better than writing a new training loop for each tweak you want to add?\n",
"1. What are the necessary points in the design of the fastai's callback system that make it as flexible as copying and pasting bits of code?\n",
"1. How can you get the list of events available to you when writing a callback?\n",
"1. Write the `ModelResetter` callback (without peeking).\n",
"1. How can you access the necessary attributes of the training loop inside a callback? When can you use or not use the shortcut that goes with it?\n",
"1. How can a callback influence the control flow of the training loop.\n",
"1. Write the `TerminateOnNaN` callback (without peeking if possible).\n",
"1. How do you make sure your callback runs after or before another callback?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Further research"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Look at the mixed precision callback with the documentation. Try to understand what each event and line of code does.\n",
"1. Implement your own version of ther learning rate finder from scratch. Compare it with fastai's version.\n",
"1. Look at the source code of the callbacks that ship with fastai. See if you can find one that's similar to what you're looking to do, to get some inspiration."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Foundations of Deep Learning: Wrap up"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Congratulations, you have made it to the end of the \"foundations of deep learning\" section. You now understand how all of fastai's applications and most important architectures are built, and the recommended ways to train them, and have all the information you need to build these from scratch. Whilst you probably won't need to create your own training loop, or batchnorm layer, for instance, knowing what is going on behind the scenes is very helpful for debugging, profiling, and deploying your solutions.\n",
"\n",
"Since you understand all of the foundations of fastai's applications now, be sure to spend some time digging through fastai's source notebooks, and running and experimenting with parts of them, since you can and see exactly how everything in fastai is developed.\n",
"\n",
"In the next section, we will be looking even further under the covers, to see how the actual forward and backward passes of a neural network are done, and we will see what tools are at our disposal to get better performance. We will then finish up with a project that brings together everything we have learned throughout the book, which we will use to build a method for interpreting convolutional neural networks."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"jupytext": {
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}

2415
19_foundations.ipynb Normal file

File diff suppressed because it is too large

669
20_CAM.ipynb Normal file

File diff suppressed because one or more lines are too long

1766
21_learner.ipynb Normal file

File diff suppressed because one or more lines are too long

76
22_conclusion.ipynb Normal file
View File

@ -0,0 +1,76 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"[[chapter_conclusion]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Concluding thoughts"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Congratulations! You've made it! If you have worked through all of the notebooks to this point, then you have joined a small, but growing group of people that are able to harness the power of deep learning to solve real problems. You may not feel that way; in fact you probably do not feel that way. We have seen again and again that students that complete the fast.AI courses dramatically underestimate how effective they are as deep learning practitioners. We've also seen that these people are often underestimated by those that have come out of a classic academic background. So for you to rise above your own expectations and the expectations of others what you do next, after closing this book, is even more important than what you've done to get to this point.\n",
"\n",
"The most important thing is to keep the momentum going. In fact, as you know from your study of optimisers, momentum is something which can build upon itself! So think about what it is you can do now to maintain and accelerate your deep learning journey. Here's a few ideas:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img alt=\"What to do next\" width=\"550\" caption=\"What to do next\" id=\"do_next\" src=\"images/att_00053.png\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We've talked a lot in this book about the value of writing, whether it be code or prose. But perhaps you haven't quite written as much as you had hoped so far. That's okay! Now is a great chance to turn that around. You have a lot to say, at this point. Perhaps you have tried some experiments on a dataset which other people don't seem to have looked at in quite the same way — so tell the world about it! Or perhaps you are just curious to try out some ideas that you had been thinking about why you are reading; now is a great chance to turn those ideas into code.\n",
"\n",
"One fairly low-key place for your writing is the fast.ai forums at forums.fast.ai. You will find that the community there is very supportive and helpful, so please do drop by and let us know what you've been up to. Or see if you can answer a few questions for those folks who are earlier in their journey then you.\n",
"\n",
"And if you do have some success, big or small, in your deep learning journey, be sure to let us know! It's especially helpful if you post about it on the forums, because for others to learn about the successes of other students can be extremely motivating.\n",
"\n",
"Perhaps the most important approach for many people to stay connected with their learning journey is to build a community around it. For instance, you could try to set up a small deep learning Meetup in your local neighbourhood, or a study group, or even offer to do a talk at a local meet up about what you've learned so far, or some particular aspect that interested you. It is okay that you are not the world's leading expert just yet the important thing to remember is that you now know about plenty of stuff that other people don't, so they are very likely to appreciate your perspective.\n",
"\n",
"Another community event which many people find useful is a regular book club or paper reading club. You might find that there are some in your neighbourhood already, or otherwise you could try to get one started yourself. Even if there is just one other person doing it with you, it will help give you the support and encouragement to get going.\n",
"\n",
"If you are not in a geography where it's easy to get together with like-minded folks in person, drop by the forums, because there are lots of people always starting up virtual study groups. These generally involve a bunch of people getting together over video chat once every week or so, and discussing some deep learning topic.\n",
"\n",
"Hopefully, by this point, you have a few little projects that you put together, and experiments that you've run. Our recommendation is generally to pick one of these and make it as good as you can. Really polish it up into the best piece of work that you can — something you really proud of. This will force you to go much deeper into a topic, which will really test out your understanding, and give you the opportunity to see what you can do when you really put your mind to it.\n",
"\n",
"Also, you may want to take a look at the fast.AI free online course which covers the same material as this book. Sometimes, seeing the same material in two different ways, can really help to crystallise the ideas. In fact, human learning researchers have found that this is one of the best ways to learn material — to see the same thing from different angles, described in different ways.\n",
"\n",
"Your final mission, should you choose to accept it, is to take this book, and give it to somebody that you know — and let somebody else start their way down their own deep learning journey!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"jupytext": {
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

961
23_middle_level.ipynb Normal file

File diff suppressed because one or more lines are too long

View File

@ -1,3 +1,6 @@
All code, including code in notebooks, but not including prose, is covered
under the following license:
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

8
README.md Normal file
View File

@ -0,0 +1,8 @@
# The fastai book
These notebooks cover an introduction to deep learning, fastai, and PyTorch. They are copyright Jeremy Howard and Sylvain Gugger, 2020 onwards. The code in the notebooks and python `.py` files is covered by the GPL v3 license; see the LICENSE file for details.
The remainder (including all markdown cells in the notebooks and other prose) is not licensed for any redistribution or change of format or medium, other than making copies of the notebooks or forking this repo for your own private use. No commercial or broadcast use is allowed.
This is an early draft. We will be providing a channel for comments and feedback soon.

325
app_blog.ipynb Normal file
View File

@ -0,0 +1,325 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"[appendix]\n",
"[role=\"Creating a blog\"]"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
"from utils import *\n",
"from fastai2.vision.widgets import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Creating a blog"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Blogging with GitHub Pages"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Unfortunately, when it comes to blogging, it seems like you have to make a decision: either use a platform that makes it easy, but subjects you and your readers to advertisements, pay walls, and fees, or spend hours setting up your own hosting and weeks learning about all kinds of intricate details. Perhaps the biggest benefit to the \"do-it-yourself\" approach is that you really owning your own posts, rather than being at the whim of a service provider, and their decisions about how to monetize your content in the future.\n",
"\n",
"It turns out, however, that you can have the best of both worlds! You can host on a platform called [GitHub Pages](https://pages.github.com/), which is free, has no ads or pay wall, and makes your data available in a standard way such that you can at any time move your blog to another host. But all the approaches Ive seen to using GitHub Pages have required knowledge of the command line and arcane tools that only software developers are likely to be familiar with. For instance, GitHub's [own documentation](https://help.github.com/en/github/working-with-github-pages/creating-a-github-pages-site-with-jekyll) on setting up a blog requires installing the Ruby programming language, using the git command line tool, copying over version numbers, and more. 17 steps in total!\n",
"\n",
"Weve curated an easy approach, which allows you to use an **entirely browser-based interface** for all your blogging needs. You will be up and running with your new blog within about five minutes. It doesnt cost anything, and you can easily add your own custom domain to it if you wish to. Heres how to do it, using a template we've created called **fast\\_template**. (NB: be sure to check the [book website](https://book.fast.ai) for the latest blog recommendations, since new tools are always coming out; for instance, we're currently working with GitHub on creating a new tool called \"fastpages\" which is a more advanced version of `fast_template` that's particularly designed for people using Jupyter Notebooks)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating the repository"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Youll need an account on GitHub. So, head over there now, and create an account if you dont have one already. Make sure that you are logged in. Normally, GitHub is used by software developers for writing code, and they use a sophisticated command line tool to work with it. But I'm going to show you an approach that doesn't use the command line at all!\n",
"\n",
"To get started, click on this link: [https://github.com/fastai/fast_template/generate](https://github.com/fastai/fast_template/generate) . This will allow you to create a place to store your blog, called a \"*repository*\". You will see the following screen; you have to enter your repository name using the **exact form you see below**, that is, the username you used at GitHub followed by `.github.io`.\n",
"\n",
"<img width=\"440\" src=\"images/fast_template/image1.png\" id=\"githup_repo\" caption=\"Creating your repository\" alt=\"Screebshot of the GitHub page for creating a new repository\">\n",
"\n",
"> Important: Note that if you don't use username.github.io as the name, it won't work!\n",
"\n",
"Once youve entered that, and any description you like, click on \"create repository from template\". You have the choice to make the repository \"private\" but since you are creating a blog that you want other people to read, having the underlying files publicly available hopefully won't be a problem for you."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setting up your homepage"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When readers first arrive at your blog the first thing that they will see is the content of a file called \"index.md\". This is a [markdown](https://guides.github.com/features/mastering-markdown/) file. Markdown is a powerful yet simple way of creating formatted text, such as bullet points, italics, hyperlinks, and so forth. It is very widely used, including all the formatting in Jupyter notebooks, nearly every part of the GitHub site, and many other places all over the Internet. To create markdown text, you can just type in plain regular English. But then you can add some special characters to add special behavior. For instance, if you type a `*` character around a word or phrase then that will put it in *italics*. Lets try it now.\n",
"\n",
"To open the file, click its file name in GitHub.\n",
"\n",
"<img width=\"140\" src=\"images/fast_template/image2.png\" alt=\"Screenshot showing a click on the index.md file\">\n",
"\n",
"To edit it, click on the pencil icon at the far right hand side of the\n",
"screen.\n",
"\n",
"<img width=\"800\" src=\"images/fast_template/image3.png\" alt=\"Screenshot showing where to click to edit the file\">"
]
},
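{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick illustration, a homepage written in markdown might look something like this (the text here is just a placeholder, not part of the template):\n",
"\n",
"    # About me\n",
"\n",
"    Welcome to my *new* blog! Here I write about [deep learning](https://www.fast.ai)\n",
"    and the experiments I try along the way.\n",
"\n",
"    - Formatted entirely with plain markdown\n",
"    - Hosted for free on GitHub Pages"
]
},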
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can add, edit, or replace the texts that you see. Click on the\n",
"\"preview changes\" button to see how well your markdown text will look\n",
"on your blog. Lines that you have added or changed will appear with a\n",
"green bar on the left-hand side.\n",
"\n",
"<img width=\"350\" src=\"images/fast_template/image4.png\" alt=\"Screenshot showing where to click to preview changes\">\n",
"\n",
"To save your changes to your blog, you must scroll to the bottom and\n",
"click on the \"commit changes\" green button. On GitHub, to \"commit\"\n",
"something means to save it to the GitHub server.\n",
"\n",
"<img width=\"600\" src=\"images/fast_template/image5.png\" alt=\"Screenshot showing where to click to commit the changes\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, you should configure your blogs settings. To do so, click on the\n",
"file called \"\\_config.yml\", and then click on the edit button like you\n",
"did for the index file above. Change the title, description, and GitHub\n",
"username values. You need to leave the names before the colons in place\n",
"and type your new values in after the colon and space on each line. You\n",
"can also add to your email and Twitter username if you wish — but note\n",
"that these will appear on your public blog if you do fill them in here.\n",
"\n",
"<img width=\"800\" src=\"images/fast_template/image6.png\" id=\"github_config\" caption=\"Fill the config file\" alt=\"Screenshot showing the config file and how to fill it\">\n",
"\n",
"After youre done, commit your changes just like you did with the index\n",
"file before. Then wait about a minute, whilst GitHub processes your new\n",
"blog. Then you will be able to go to your blog in your web browser, by\n",
"opening the URL: username.github.io (replace \"username\" with your\n",
"GitHub username). You should see your blog!\n",
"\n",
"<img width=\"540\" src=\"images/fast_template/image7.png\" id=\"github_blog\" caption=\"Your blog is onlyine!\" alt=\"Screenshot showing the website username.github.io\">"
]
},
{
"cell_type": "markdown",
"metadata": {
"heading_collapsed": true
},
"source": [
"### Creating posts"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"Now youre ready to create your first post. All your posts will go in\n",
"the \"\\_posts\" folder. Click on that now, and then click on the \"create\n",
"file\" button. You need to be careful to name your file in the following\n",
"format: \"year-month-day-name.md\", where year is a four-digit number, and\n",
"month and day are two-digit numbers. \"Name\" can be anything you want,\n",
"that will help you remember what this post was about. The \"md\" extension\n",
"is for markdown documents.\n",
"\n",
"<img width=\"440\" src=\"images/fast_template/image8.png\" alt=\"Screenshot showing the right syntax to create a new blog post\">\n",
"\n",
"You can then type the contents of your first post. The only rule is that\n",
"the first line of your post must be a markdown heading. This is created\n",
"by putting `# ` at the start of a line (that creates a level 1\n",
"heading, which you should just use once at the start of your document;\n",
"you create level 2 headings using `## `, level 3 with `###`, and so forth.)\n",
"\n",
"<img width=\"300\" src=\"images/fast_template/image9.png\" alt=\"Screenshot showing the start of a blog post\">"
]
},
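{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, a post created on 28 February 2020 might be named as follows (the \"name\" part is entirely up to you):\n",
"\n",
"    2020-02-28-my-first-post.md"
]
},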
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"As before, you can click on the \"preview\" button to see how your\n",
"markdown formatting will look.\n",
"\n",
"<img width=\"400\" src=\"images/fast_template/image10.png\" alt=\"Screenshot showing the same blog post interpreted in HTML\">\n",
"\n",
"And you will need to click the \"commit new file\" button to save it to\n",
"GitHub.\n",
"\n",
"<img width=\"700\" src=\"images/fast_template/image11.png\" alt=\"Screenshot showing where to click to commit the new file\">"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"Have a look at your blog homepage again, and you will see that this post\n",
"has now appeared! (Remember that you will need to wait a minute or so\n",
"for GitHub to process it.)\n",
"\n",
"<img width=\"500\" src=\"images/fast_template/image12.png\" alt=\"Screenshot showing the first post on the blog website\">\n",
"\n",
"Youll also see that we provided a sample blog post, which you can go\n",
"ahead and delete now. Go to your posts folder, as before, and click on\n",
"\"2020-01-14-welcome.md\". Then click on the trash icon on the far\n",
"right.\n",
"\n",
"<img width=\"400\" src=\"images/fast_template/image13.png\" alt=\"Screenshot showing how to delete the mock post\">"
]
},
{
"cell_type": "markdown",
"metadata": {
"hidden": true
},
"source": [
"In GitHub, nothing actually changes until you commit— including deleting\n",
"a file! So, after you click the trash icon, scroll down to the bottom\n",
"and commit your changes.\n",
"\n",
"You can include images in your posts by adding a line of markdown like\n",
"the following:\n",
"\n",
" ![Image description](images/filename.jpg)\n",
"\n",
"For this to work, you will need to put the image inside your \"images\"\n",
"folder. To do this, click on the images folder to go into it in GitHub,\n",
"and then click the \"upload files\" button.\n",
"\n",
"<img width=\"400\" src=\"images/fast_template/image14.png\" alt=\"Screenshot showing how to upload new files\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Synchronizing GitHub and your computer"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Theres lots of reasons you might want to copy your blog content from GitHub to your computer. Perhaps you want to read or edit your posts offline. Or maybe youd like a backup in case something happens to your GitHub repository.\n",
"\n",
"GitHub does more than just let you copy your repository to your computer; it lets you *synchronize* it with your computer. So, you can make changes on GitHub, and theyll copy over to your computer, and you can make changes on your computer, and theyll copy over to GitHub. You can even let other people access and modify your blog, and their changes and your changes will be automatically combined together next time you sync.\n",
"\n",
"To make this work, you have to install an application called [GitHub Desktop](https://desktop.github.com/) to your computer. It runs on Mac, Windows, and Linux. Follow the directions at the link to install it, then when you run it itll ask you to login to GitHub, and then to select your repository to sync; click \"Clone a repository from the Internet\".\n",
"\n",
"<img src=\"images/gitblog/image1.png\" width=\"400\" alt=\"A screenshot showing how to clone your repository\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once GitHub has finished syncing your repo, youll be able to click \"View the files of your repository in Finder\" (or Explorer), and youll see the local copy of your blog! Try editing one of the files on your computer. Then return to GitHub Desktop, and youll see the \"Sync\" button is waiting for you to press it. When you click it, your changes will be copied over to GitHub, where youll see them reflected on the web site.\n",
"\n",
"<img src=\"images/gitblog/image2.png\" width=\"600\" alt=\"A screenshot showing the cloned repository\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you haven't used git before, GitHub Desktop and a blog is a great way to get started. As you'll discover, it's a fundamental tool used by most data scientists."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Jupyter for blogging"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also write blog posts using Jupyter Notebooks! Your markdown cells, code cells, and all outputs will appear in your exported blog post. The best way to do this may have changed by the time you are reading this book, so be sure to check out the [book website](https://book.fast.ai) for the latest information. As we write this, the easiest way to create a blog from notebooks is to use [fastpages](http://fastpages.fast.ai/), which is a more advanced version of `fast_template`. \n",
"\n",
"To blog with a notebook, just pop it in the `_notebooks` folder in your blog repo, and it will appear in your blog. When you write your notebook, write whatever you want your audience to see. Since most writing platforms make it much harder to include code and outputs, many of us are in a habit of including less real examples than we should. So try to get into a new habit of including lots of examples as you write.\n",
"\n",
"Often you'll want to hide boilerplate such as import statements. Add `#hide` to the top of any cell to make it not show up in output. Jupyter displays the result of the last line of a cell, so there's no need to include `print()`. (And including extra code that isn't needed means there's more cognitive overhead for the reader; so don't include code that you don't really need!)"
]
},
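{
"cell_type": "markdown",
"metadata": {},
"source": [
"For instance, the first cell of a notebook post might look like the following; because it starts with `#hide`, it will still run, but none of it will show up in the published post (the particular imports here are just typical boilerplate, as used in this book's own notebooks):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#hide\n",
"from utils import *\n",
"from fastai2.vision.widgets import *"
]
},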
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"jupytext": {
"split_at_heading": true
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}

565
app_jupyter.ipynb Normal file

File diff suppressed because one or more lines are too long

Binary image files added under images/ (previews not shown)

Some files were not shown because too many files have changed in this diff