diff --git a/02-Discrete-Bayes.ipynb b/02-Discrete-Bayes.ipynb
index d7e060c..cc2b3d5 100644
--- a/02-Discrete-Bayes.ipynb
+++ b/02-Discrete-Bayes.ipynb
@@ -357,7 +357,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "In [Bayesian statistics](https://en.wikipedia.org/wiki/Bayesian_probability) this is called a [*prior*](https://en.wikipedia.org/wiki/Prior_probability). It is the probability prior to incorporating measurements or other information. More completely, this is called the *prior probability distribution*. A [*probability distribution*](https://en.wikipedia.org/wiki/Probability_distribution) is a collection of all possible probabilities for an event. Probability distributions always to sum to 1 because something had to happen; the distribution lists all possible events and the probability of each.\n",
+    "In [Bayesian statistics](https://en.wikipedia.org/wiki/Bayesian_probability) this is called a [*prior*](https://en.wikipedia.org/wiki/Prior_probability). It is the probability prior to incorporating measurements or other information. More completely, this is called the *prior probability distribution*. A [*probability distribution*](https://en.wikipedia.org/wiki/Probability_distribution) is a collection of all possible probabilities for an event. Probability distributions always sum to 1 because something had to happen; the distribution lists all possible events and the probability of each.\n",
     "\n",
-    "I'm sure you've used probabilities before - as in \"the probability of rain today is 30%\". The last paragraph sounds like more of that. But Bayesian statistics was a revolution in probability because it treats probability as a belief about a single event. Let's take an example. I know that if I flip a fair coin infinitely many times I will get 50% heads and 50% tails. This is called [*frequentist statistics*](https://en.wikipedia.org/wiki/Frequentist_inference) to distinguish it from Bayesian statistics. Computations are based on the frequency in which events occur.\n",
+    "I'm sure you've used probabilities before - as in \"the probability of rain today is 30%\". The last paragraph sounds like more of that. But Bayesian statistics was a revolution in probability because it treats probability as a belief about a single event. Let's take an example. I know that if I flip a fair coin infinitely many times I will get 50% heads and 50% tails. This is called [*frequentist statistics*](https://en.wikipedia.org/wiki/Frequentist_inference) to distinguish it from Bayesian statistics. Computations are based on the frequency with which events occur.\n",
     "\n",
@@ -5504,7 +5504,7 @@
     "\n",
     "The second problem is that the filter is discrete, but we live in a continuous world. The histogram requires that you model the output of your filter as a set of discrete points. A 100 meter hallway requires 10,000 positions to model the hallway to 1cm accuracy. So each update and predict operation would entail performing calculations for 10,000 different probabilities. It gets exponentially worse as we add dimensions. A 100x100 m$^2$ courtyard requires 100,000,000 bins to get 1cm accuracy.\n",
     "\n",
-    "A third problem is that the filter is multimodal. In the least example we ended up with strong beliefs that the dog was in position 4 or 9. This is not always a problem. Particle filters, which we will study later, are multimodal and are often used because of this property. But imagine if the GPS in your car reported to you that it is 40% sure that you are on D street, and 30% sure you are on Willow Avenue. \n",
+    "A third problem is that the filter is multimodal. In the last example we ended up with strong beliefs that the dog was in position 4 or 9. This is not always a problem. Particle filters, which we will study later, are multimodal and are often used because of this property. But imagine if the GPS in your car reported to you that it is 40% sure that you are on D street, and 30% sure you are on Willow Avenue. \n",
     "\n",
-    "A forth problem is that it requires a measurement of the change in state. We need a motion sensor to detect how much the dog moves. There are ways to work around this problem, but it would complicate the exposition of this chapter, so, given the aforementioned problems, I will not discuss it further.\n",
+    "A fourth problem is that it requires a measurement of the change in state. We need a motion sensor to detect how much the dog moves. There are ways to work around this problem, but it would complicate the exposition of this chapter, so, given the aforementioned problems, I will not discuss it further.\n",
     "\n",
@@ -5824,7 +5824,7 @@
     "\n",
-    "$$P(A \\mid B) = \\frac{P(B \\mid A)\\, P(A)}{\\int P(B \\mid A) P(B) \\mathtt{d}y}\\cdot$$\n",
+    "$$P(A \\mid B) = \\frac{P(B \\mid A)\\, P(A)}{\\int P(B \\mid A)\\, P(A) \\, \\mathtt{d}A}$$\n",
     "\n",
-    "In practice the denominator can be fiendishly difficult to solve analytically (a recent opinion piece for the Royal Statistical Society [called it](http://www.statslife.org.uk/opinion/2405-we-need-to-rethink-how-we-teach-statistics-from-the-ground-up) a \"dog's breakfast\" [8]. Filtering textbooks are filled with integral laden equations which you cannot be expected to solve. We will learn more techniques to handle this in the **Particle Filters** chapter. Until then, recognize that in practice it is just a normalization term over which we can sum. What I'm trying to say is that when you are faced with a page of integrals, just think of them of sums, and relate them back to this chapter, and often the difficulties will fade. Ask yourself \"why are we summing these values\", and \"why am I dividing by this term\". Surprisingly often the answer is readily apparent."
+    "In practice the denominator can be fiendishly difficult to solve analytically (a recent opinion piece for the Royal Statistical Society [called it](http://www.statslife.org.uk/opinion/2405-we-need-to-rethink-how-we-teach-statistics-from-the-ground-up) a \"dog's breakfast\" [8]). Filtering textbooks are filled with integral-laden equations which you cannot be expected to solve. We will learn more techniques to handle this in the **Particle Filters** chapter. Until then, recognize that in practice it is just a normalization term over which we can sum. What I'm trying to say is that when you are faced with a page of integrals, just think of them as sums, and relate them back to this chapter, and often the difficulties will fade. Ask yourself \"why are we summing these values\", and \"why am I dividing by this term\". Surprisingly often the answer is readily apparent."
    ]
   },
   {
diff --git a/04-One-Dimensional-Kalman-Filters.ipynb b/04-One-Dimensional-Kalman-Filters.ipynb
index 569445c..8b336e3 100644
--- a/04-One-Dimensional-Kalman-Filters.ipynb
+++ b/04-One-Dimensional-Kalman-Filters.ipynb
@@ -736,7 +736,7 @@
    "source": [
-    "The result of the multiplication is taller and narrow than the original Gaussian but the mean is unchanged. Does this match your intuition?\n",
+    "The result of the multiplication is taller and narrower than the original Gaussian but the mean is unchanged. Does this match your intuition?\n",
     "\n",
-    "Think of the Gaussians as two measurements. If I measure twice and get 10 meters each time, I should conclude that the length is close to 10 meters. Thus the mean should be 10. It would make no sense to conclude the length is actually 11, or 9.5. Aslo, I am more confident with two measurements than with one, so the variance of the result should be smaller. \n",
+    "Think of the Gaussians as two measurements. If I measure twice and get 10 meters each time, I should conclude that the length is close to 10 meters. Thus the mean should be 10. It would make no sense to conclude the length is actually 11, or 9.5. Also, I am more confident with two measurements than with one, so the variance of the result should be smaller. \n",
     "\n",
     "\"Measure twice, cut once\" is a well known saying. Gaussian multiplication is a mathematical model of this physical fact. \n",
     "\n",
@@ -840,7 +840,7 @@
    "source": [
-    "The result is a Gaussian that is taller than either input. This makes sense - we have incorporated information, so our variance should have been reduced. And notice how the result is far closer to the the input with the smaller variance. We have more confidence in that value, so it makes sense to weight it more heavily.\n",
+    "The result is a Gaussian that is taller than either input. This makes sense - we have incorporated information, so our variance should have been reduced. And notice how the result is far closer to the input with the smaller variance. We have more confidence in that value, so it makes sense to weight it more heavily.\n",
     "\n",
-    "It *seems* to work, but it is really correct? There is more to say about this, but I want to get a working filter going so you can it experience it in concrete terms. After that we will revisit Gaussian multiplication and determine why it is correct."
+    "It *seems* to work, but is it really correct? There is more to say about this, but I want to get a working filter going so you can experience it in concrete terms. After that we will revisit Gaussian multiplication and determine why it is correct."
    ]
   },
   {
@@ -1397,7 +1397,7 @@
     " 5. update belief in the state based on how certain we are \n",
     "    in the measurement\n",
     "\n",
-    "You will be hard pressed to find a Bayesian filter algorithm that does not fit into this form. Some filters will not include some aspect, such as error in the prediction, and others will have very complicated methods of computation, but this is what they all do. \n",
+    "You will be hard pressed to find a Bayesian filter algorithm that does not fit into this form. Some filters will not include some aspects, such as error in the prediction, and others will have very complicated methods of computation, but this is what they all do. \n",
     "\n",
     "The equations for the univariate Kalman filter are:\n",
     "\n",
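A note on the `@@ -5824,7 +5824,7 @@` hunk: the text's claim that the denominator "is just a normalization term over which we can sum" is easy to verify numerically. Below is a minimal Python sketch of the discrete case; this is not code from the notebooks, and the five-element arrays are made-up values standing in for a 5-position hallway like the chapter's dog example.

```python
import numpy as np

# Discrete analogue of Bayes' theorem: the integral in the denominator
# becomes a plain sum over every position.
prior = np.array([0.1, 0.4, 0.2, 0.2, 0.1])       # belief before the measurement
likelihood = np.array([0.2, 0.9, 0.2, 0.2, 0.2])  # P(measurement | position)

unnormalized = likelihood * prior
posterior = unnormalized / unnormalized.sum()     # the denominator, as a sum
print(posterior)        # [0.0417 0.75   0.0833 0.0833 0.0417] (rounded)
print(posterior.sum())  # 1.0
```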
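The "measure twice" discussion in the `@@ -736,7 +736,7 @@` and `@@ -840,7 +840,7 @@` hunks rests on the closed form for the product of two Gaussian PDFs. Here is a short sketch for checking it numerically; `gaussian_multiply` is an illustrative helper, not the notebooks' own code.

```python
def gaussian_multiply(mu1, var1, mu2, var2):
    """Product of two Gaussian PDFs, renormalized; the result is Gaussian."""
    mean = (var2 * mu1 + var1 * mu2) / (var1 + var2)  # variance-weighted mean
    var = (var1 * var2) / (var1 + var2)               # always < min(var1, var2)
    return mean, var

# Two identical 10 m measurements: the mean stays 10 and the
# variance halves -- "measure twice, cut once".
print(gaussian_multiply(10., 1., 10., 1.))   # (10.0, 0.5)

# Unequal variances: the result sits closer to the more certain input.
print(gaussian_multiply(10., 0.2, 11., 1.))  # (10.166..., 0.166...)
```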
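Finally, the generic algorithm in the `@@ -1397,7 +1397,7 @@` hunk reduces, in the univariate Gaussian case, to adding Gaussians in the predict step and multiplying them in the update step. A self-contained sketch under that assumption; the function names and numbers are illustrative, not the book's API.

```python
def predict(mean, var, move_mean, move_var):
    # Predict: shift the belief by the estimated motion; variances of
    # independent Gaussians add, so uncertainty grows.
    return mean + move_mean, var + move_var

def update(mean, var, z, z_var):
    # Update: multiply the prior Gaussian by the measurement Gaussian;
    # uncertainty shrinks.
    return ((z_var * mean + var * z) / (var + z_var),
            var * z_var / (var + z_var))

# One predict/update cycle.
mean, var = 0., 400.   # wide, uncertain initial belief
mean, var = predict(mean, var, move_mean=1., move_var=2.)
mean, var = update(mean, var, z=1.2, z_var=5.)
print(mean, var)       # pulled strongly toward the measurement
```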