diff --git a/02-Discrete-Bayes.ipynb b/02-Discrete-Bayes.ipynb
index 59a5de4..84d9f9a 100644
--- a/02-Discrete-Bayes.ipynb
+++ b/02-Discrete-Bayes.ipynb
@@ -1671,15 +1671,21 @@
    "source": [
     "We developed the math in this chapter merely by reasoning about the information we have at each moment. In the process we discovered **Bayes Theorem**. We will go into the specifics of the math of Bayes theorem later in the book. For now we will take a more intuitive approach. Recall from the preface that Bayes theorem tells us how to compute the probability of an event given previous information. That is exactly what we have been doing in this chapter. With luck our code should match the Bayes Theorem equation! \n",
     "\n",
-    "Bayes theorem is written as\n",
+    "We implemented the `update()` function with this probability calculation:\n",
+    "\n",
+    "$$ \\mathtt{posterior} = \\frac{\\mathtt{evidence}\\times \\mathtt{prior}}{\\mathtt{normalization}}$$ \n",
+    "\n",
+    "To review, the *prior* is the probability of something happening before we include the measurement, and the *posterior* is the probability we compute after incorporating the information from the measurement.\n",
+    "\n",
+    "Bayes theorem is\n",
     "\n",
     "$$P(A|B) = \\frac{P(B | A)\\, P(A)}{P(B)}\\cdot$$\n",
     "\n",
     "If you are not familiar with this notation, let's review. $P(A)$ means the probability of event $A$. If $A$ is the event of a fair coin landing heads, then $P(A) = 0.5$.\n",
     "\n",
-    "$P(A|B)$ is called a **conditional probability**. That is, it represents the probability of $A$ happening *if* $B$ happened. For example, it is more likely to rain today if it also rained yesterday because rain systems tend to last more than one day. We'd write the probability of it raining today given that it rained yesterday as $P(rain_{today}|rain_{yesterday})$.\n",
+    "$P(A|B)$ is called a **conditional probability**. That is, it represents the probability of $A$ happening *if* $B$ happened. For example, it is more likely to rain today if it also rained yesterday because rain systems tend to last more than one day. We'd write the probability of it raining today given that it rained yesterday as $P(\\mathtt{rain_{today}}|\\mathtt{rain_{yesterday}})$.\n",
     "\n",
-    "In Bayesian statistics $P(A)$ is called the **prior**, and $P(A|B)$ is called the **posterior**. To see why, let's rewrite the equation in terms of our problem. We will use $x_i$ for the position at *i*, and $Z$ for the measurement. Hence, we want to know $P(x_i|Z)$, that is, the probability of the dog being at $x_i$ given the measurement $Z$. \n",
+    "In Bayes theorem $P(A)$ is the *prior*, $P(B|A)$ is the *evidence*, $P(B)$ is the *normalization*, and $P(A|B)$ is the *posterior*. By substituting the mathematical terms with the corresponding words you can see that Bayes theorem matches our update equation. Let's rewrite the equation in terms of our problem. We will use $x_i$ for the position at *i*, and $Z$ for the measurement. Hence, we want to know $P(x_i|Z)$, that is, the probability of the dog being at $x_i$ given the measurement $Z$. \n",
     "\n",
     "So, let's plug that into the equation and solve it.\n",
     "\n",
@@ -1697,11 +1703,7 @@
     "\n",
     "I added the `else` here, which has no mathematical effect, to point out that every element in $x$ (called `belief` in the code) is multiplied by a probability. You may object that I am multiplying by a scale factor, which I am, but this scale factor is derived from the probability of the measurement being correct vs the probability being incorrect.\n",
     "\n",
-    "The last term to consider is the denominator $P(Z)$. This is the probability of getting the measurement $Z$ without taking the location into account. We compute that by taking the sum of $x$, or `sum(belief)` in the code. That is how we compute the normalization! So, the `update()` function is doing nothing more than computing Bayes theorem. Recall this equation from earlier in the chapter:\n",
-    "\n",
-    "$$ \\mathtt{posterior} = \\frac{\\mathtt{prior}\\times \\mathtt{evidence}}{\\mathtt{normalization}}$$ \n",
-    "\n",
-    "That is the Bayes theorem written in words instead of mathematical symbols. I could have given you Bayes theorem and then written a function, but I doubt that would have been illuminating unless you already know Bayesian statistics. Instead, we figured out what to do just by reasoning about the situation, and so of course the resulting code ended up implementing Bayes theorem. Students spend a lot of time struggling to understand this theorem; I hope you found it relatively straightforward."
+    "The last term to consider is the denominator $P(Z)$. This is the probability of getting the measurement $Z$ without taking the location into account. We compute that by taking the sum of $x$, or `sum(belief)` in the code. That is how we compute the normalization! So, the `update()` function is doing nothing more than computing Bayes theorem."
    ]
   },
  {
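To make the correspondence concrete, here is a minimal sketch of what such an `update()` function can look like. The hallway representation (1 for a door, 0 otherwise) follows the chapter; the argument names `hall`, `z`, and `z_prob` and the exact signature are illustrative assumptions, not the book's verbatim code.

```python
import numpy as np

def update(belief, hall, z, z_prob):
    """One discrete Bayes update: posterior = evidence * prior / normalization."""
    # Scale factor derived from the probability of the measurement being
    # correct vs the probability of it being incorrect (as in the text).
    scale = z_prob / (1. - z_prob)
    posterior = np.array(belief, dtype=float)  # start from the prior
    posterior[hall == z] *= scale              # evidence * prior where z matches
    return posterior / sum(posterior)          # divide by the normalization, P(Z)

# Uniform prior over a 10-position hallway with doors at positions 0, 1, and 8.
hallway = np.array([1, 1, 0, 0, 0, 0, 0, 0, 1, 0])
prior = np.ones(10) / 10.
print(update(prior, hallway, z=1, z_prob=0.75))
```

With `z_prob=0.75` the scale factor is 3, so after dividing by `sum(posterior)` each door position is three times as probable as each non-door position: exactly the posterior = evidence × prior / normalization computation described above.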