Merge pull request #190 from ernstklrb/typos

Typos
2018-01-01 11:24:10 -08:00 · 2018-01-01 11:24:10 -08:00 · adff5a2b30
commit adff5a2b30
parent 850c7fc0dd 7500b49bed
6 changed files with 35 additions and 35 deletions
--- a/00-Preface.ipynb
+++ b/00-Preface.ipynb
@ -84,7 +84,7 @@
    "\n",
    "As I began to understand the math and theory more difficulties appeared. A book or paper will make some statement of fact and presents a graph as proof.  Unfortunately, why the statement is true is not clear to me, or I cannot reproduce the plot. Or maybe I wonder \"is this true if R=0?\"  Or the author provides pseudocode at such a high level that the implementation is not obvious. Some books offer Matlab code, but I do not have a license to that expensive package. Finally, many books end each chapter with many useful exercises. Exercises which you need to understand if you want to implement Kalman filters for yourself, but exercises with no answers. If you are using the book in a classroom, perhaps this is okay, but it is terrible for the independent reader. I loathe that an author withholds information from me, presumably to avoid 'cheating' by the student in the classroom.\n",
    "\n",
-    "All of this impedes learning.I want to track an image on a screen, or write some code for my Arduino project. I want to know how the plots in the book are made, and to choose different parameters than the author chose. I want to run simulations. I want to inject more noise into the signal and see how a filter performs. There are thousands of opportunities for using Kalman filters in everyday code, and yet this fairly straightforward topic is the provenance of rocket scientists and academics.\n",
+    "All of this impedes learning. I want to track an image on a screen, or write some code for my Arduino project. I want to know how the plots in the book are made, and to choose different parameters than the author chose. I want to run simulations. I want to inject more noise into the signal and see how a filter performs. There are thousands of opportunities for using Kalman filters in everyday code, and yet this fairly straightforward topic is the provenance of rocket scientists and academics.\n",
    "\n",
    "I wrote this book to address all of those needs. This is not the sole book for you if you design military radars. Go get a Masters or PhD at a great STEM school, because you'll need it. This book is for the hobbyist, the curious, and the working engineer that needs to filter or smooth data. If you are a hobbyist this book should provide everything you need. If you are serious about Kalman filters you'll need more. My intention is to introduce enough of the concepts and mathematics to make the textbooks and papers approachable.\n",
    "\n",
--- a/01-g-h-filter.ipynb
+++ b/01-g-h-filter.ipynb
@ -286,7 +286,7 @@
    "\n",
    "Once cells are run you can often jump around and rerun cells in different orders, but not always. I'm trying to fix this, but there is a tradeoff. I'll define a variable in cell 10 (say), and then run code that modifies that variable in cells 11 and 12. If you go back and run cell 11 again the variable will have the value that was set in cell 12, and the code expects the value that was set in cell 10. So, occasionally you'll get weird results if you run cells out of order. My advise is to backtrack a bit, and run cells in order again to get back to a proper state. It's annoying, but the interactive aspect of Jupyter notebooks more than makes up for it. Better yet, submit an issue on GitHub so I know about the problem and fix it!\n",
    "\n",
-    "Finally, some readers have reported problems with the animated plotting features in some browsers. I have not been able to duplicate this. In parts of the book I use the `%matplot notebook` magic, which enables interactive plotting. If these plots are not working for you, try changing this to read `%matplotlib inline`. You will lose the animated plotting, but it seems to work on all platforms and browsers."
+    "Finally, some readers have reported problems with the animated plotting features in some browsers. I have not been able to reproduce this. In parts of the book I use the `%matplotlib notebook` magic, which enables interactive plotting. If these plots are not working for you, try changing this to read `%matplotlib inline`. You will lose the animated plotting, but it seems to work on all platforms and browsers."
   ]
  },
  {
@ -304,7 +304,7 @@
    "\n",
    "Another co-worker hears the commotion and comes over to find out what has you so excited. You explain the invention and once again step onto the scale, and proudly proclaim the result: \"161 lbs.\" And then you hesitate, confused.\n",
    "\n",
-    "\"It read 172 lbs a few seconds ago\" you complain to your co-worker. \n",
+    "\"It read 172 lbs a few seconds ago\", you complain to your co-worker. \n",
    "\n",
    "\"I never said it was accurate,\" she replies.\n",
    "\n",
--- a/05-Multivariate-Gaussians.ipynb
+++ b/05-Multivariate-Gaussians.ipynb
@ -1940,7 +1940,7 @@
   "source": [
    "What makes this possible? Imagine for a moment that we superimposed the velocity from a different airplane over the position graph. Clearly the two are not related, and there is no way that combining the two could possibly yield any additional information. In contrast, the velocity of this airplane tells us something very important - the direction and speed of travel. So long as the aircraft does not alter its velocity the velocity allows us to predict where the next position is. After a relatively small amount of error in velocity the probability that it is a good match with the position is very small. Think about it - if you suddenly change direction your position is also going to change a lot. If the measurement of the position is not in the direction of the velocity change it is very unlikely to be true. The two are correlated, so if the velocity changes so must the position, and in a predictable way. \n",
    "\n",
-    "It is important to understand that we are taking advantage of the fact that velocity and position are correlated. We get a rough estimate of velocity from the distance and time between two measurement, and use Bayes theorem to produce very accurate estimates after only a few observations. Please reread this section if you have any doubts. If you do not understand this you will quickly find it impossible to reason about what you will learn in the following chapters."
+    "It is important to understand that we are taking advantage of the fact that velocity and position are correlated. We get a rough estimate of velocity from the distance and time between two measurements, and use Bayes theorem to produce very accurate estimates after only a few observations. Please reread this section if you have any doubts. If you do not understand this you will quickly find it impossible to reason about what you will learn in the following chapters."
   ]
  },
  {
--- a/06-Multivariate-Kalman-Filters.ipynb
+++ b/06-Multivariate-Kalman-Filters.ipynb
@ -1354,7 +1354,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Keeping track of all of these variables is burdomsome, so FilterPy also implements the filter with the class `KalmanFilter`. I will use the class in the rest of this book, but I wanted you to see the procedural form of these functions since I know some of you are not fans of object oriented programming."
+    "Keeping track of all of these variables is burdensome, so FilterPy also implements the filter with the class `KalmanFilter`. I will use the class in the rest of this book, but I wanted you to see the procedural form of these functions since I know some of you are not fans of object oriented programming."
   ]
  },
  {
@ -1363,14 +1363,14 @@
   "source": [
    "## Implementing the Kalman Filter\n",
    "\n",
-    "I've given you all of the code for the filter, but now let's collect it in one place. First we construct an `KalmanFilter` object. We have to specify the number of variables in the state with the `dim_x` parameter, and the number of measurements with `dim_z`. We have two random variables in the state and one measurement, so we write:\n",
+    "I've given you all of the code for the filter, but now let's collect it in one place. First we construct a `KalmanFilter` object. We have to specify the number of variables in the state with the `dim_x` parameter, and the number of measurements with `dim_z`. We have two random variables in the state and one measurement, so we write:\n",
    "\n",
    "```python\n",
    "from filterpy.kalman import KalmanFilter\n",
    "dog_filter = KalmanFilter(dim_x=2, dim_z=1)\n",
    "```\n",
    "\n",
-    "This creates an obect with default values for all the Kalman filter matrices:"
+    "This creates an object with default values for all the Kalman filter matrices:"
   ]
  },
  {
@ -1585,7 +1585,7 @@
    "\n",
    "The covariance matrix $\\mathbf P$ tells us the *theoretical* performance of the filter *assuming* everything we tell it is true. Recall that the standard deviation is the square root of the variance, and that approximately 68% of a Gaussian distribution occurs within one standard deviation. If at least 68% of the filter output is within one standard deviation  the filter may be performing well. In the top chart I have displayed the one standard deviation as the yellow shaded area between the two dotted lines. To my eye it looks like perhaps the filter is slightly exceeding that bounds, so the filter probably needs some tuning.\n",
    "\n",
-    "In the univariate chapter we filtered very noisy signals with much simpler code than the code above. However, realize that right now we are working with a very simple example - an object moving through 1-D space and one sensor. That is about the limit of what we can compute with the code in the last chapter. In contrast, we can implement very complicated, multidimensional filter with this code merely by altering our assignments to the filter's variables. Perhaps we want to track 100 dimensions in financial models. Or we have an aircraft with a GPS, INS, TACAN, radar altimeter, baro altimeter, and airspeed indicator, and we want to integrate all those sensors into a model that predicts position, velocity, and accelerations in 3D space. We can do that with the code in this chapter.\n",
+    "In the univariate chapter we filtered very noisy signals with much simpler code than the code above. However, realize that right now we are working with a very simple example - an object moving through 1-D space and one sensor. That is about the limit of what we can compute with the code in the last chapter. In contrast, we can implement very complicated, multidimensional filters with this code merely by altering our assignments to the filter's variables. Perhaps we want to track 100 dimensions in financial models. Or we have an aircraft with a GPS, INS, TACAN, radar altimeter, baro altimeter, and airspeed indicator, and we want to integrate all those sensors into a model that predicts position, velocity, and acceleration in 3D space. We can do that with the code in this chapter.\n",
    "\n",
    "I want you to get a better feel for how the Gaussians change over time, so here is a 3D plot showing the Gaussians every 7th epoch (time step). Every 7th separates them enough so can see each one independently. The first Gaussian at $t=0$ is to the left."
   ]
@ -1658,7 +1658,7 @@
    "\n",
    "As a reminder, the linear equation $\\mathbf{Ax} = \\mathbf B$ represents a system of equations, where $\\mathbf{A}$ holds the coefficients of the state $\\mathbf x$. Performing the multiplication $\\mathbf{Ax}$ computes the right hand side values for that set of equations.\n",
    "\n",
-    "If $\\mathbf F$ contains the state transition for a given time step, then the product $\\mathbf{Fx}$ computes the state after that transition. Easy! Likewise, $\\mathbf B$ is the control function, $\\mathbf u$ is the control input, so $\\mathbf{Bu}$ computes the contribution of the controls to the state after the transition. Thus, the prior$\\mathbf{\\bar x}$ is computed as the sum of $\\mathbf{Fx}$ and $\\mathbf{Bu}$.\n",
+    "If $\\mathbf F$ contains the state transition for a given time step, then the product $\\mathbf{Fx}$ computes the state after that transition. Easy! Likewise, $\\mathbf B$ is the control function, $\\mathbf u$ is the control input, so $\\mathbf{Bu}$ computes the contribution of the controls to the state after the transition. Thus, the prior $\\mathbf{\\bar x}$ is computed as the sum of $\\mathbf{Fx}$ and $\\mathbf{Bu}$.\n",
    "\n",
    "The equivalent univariate equation is\n",
    "\n",
@ -2698,7 +2698,7 @@
   "source": [
    "### Solution\n",
    "\n",
-    "The x-axis is for position, and x-axis is velocity. An ellipse that is vertical, or nearly so, says there is no correlation between position and velocity, and an ellipse that is diagonal says that there is a lot of correlation. Phrased that way, the results sound unlikely. The tilt of the ellipse changes, but the correlation shouldn't be changing over time. But this is a measure of the *output of the filter*, not a description of the actual, physical world. When $\\mathbf R$ is very large we are telling the filter that there is a lot of noise in the measurements. In that case the Kalman gain $\\mathbf K$ is set to favor the prediction over the measurement, and the prediction comes from the velocity state variable. Thus there is a large correlation between $x$ and $\\dot x$. Conversely, if $\\mathbf R$ is small, we are telling the filter that the measurement is very trustworthy, and $\\mathbf K$ is set to favor the measurement over the prediction. Why would the filter want to use the prediction if the measurement is nearly perfect? If the filter is not using much from the prediction there will be very little correlation reported. \n",
+    "The x-axis is for position, and y-axis is velocity. An ellipse that is vertical, or nearly so, says there is no correlation between position and velocity, and an ellipse that is diagonal says that there is a lot of correlation. Phrased that way, the results sound unlikely. The tilt of the ellipse changes, but the correlation shouldn't be changing over time. But this is a measure of the *output of the filter*, not a description of the actual, physical world. When $\\mathbf R$ is very large we are telling the filter that there is a lot of noise in the measurements. In that case the Kalman gain $\\mathbf K$ is set to favor the prediction over the measurement, and the prediction comes from the velocity state variable. Thus there is a large correlation between $x$ and $\\dot x$. Conversely, if $\\mathbf R$ is small, we are telling the filter that the measurement is very trustworthy, and $\\mathbf K$ is set to favor the measurement over the prediction. Why would the filter want to use the prediction if the measurement is nearly perfect? If the filter is not using much from the prediction there will be very little correlation reported. \n",
    "\n",
    "**This is a critical point to understand!**. The Kalman filter is a mathematical model for a real world system. A report of little correlation *does not mean* there is no correlation in the physical system, just that there was no *linear* correlation in the mathematical model. It's a report of how much measurement vs prediction was incorporated into the model.  \n",
    "\n",
@ -2907,7 +2907,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "This output is fantastic! Two things are very apparent to me in this chart. First, the RTS smoother's output is much smoother than the KF output. Second, it is almost always more accurate than the KF output (we will examine this claim in detail in the **Smoothing** chapter). The improvement in the velocity, which is an hidden variable, is even more dramatic:"
+    "This output is fantastic! Two things are very apparent to me in this chart. First, the RTS smoother's output is much smoother than the KF output. Second, it is almost always more accurate than the KF output (we will examine this claim in detail in the **Smoothing** chapter). The improvement in the velocity, which is a hidden variable, is even more dramatic:"
   ]
  },
  {
@ -3026,7 +3026,7 @@
    "\n",
    "The Kalman filter is a mathematical model of the world. The output is only as accurate as that model. To make the math tractable we had to make some assumptions. We assume that the sensors and motion model have Gaussian noise. We assume that everything is linear. If that is true, the Kalman filter is *optimal* in a least squares sense. This means that there is no way to make a better estimate than what the filter gives us. However, these assumption are almost never true, and hence the model is necessarily limited, and a working filter is rarely optimal.\n",
    "\n",
-    "In later chapters we will deal with the problem of nonlinearity. For now I want you to understand that designing the matrices of a linear filter is an experimental procedure more than a mathematical one. Use math to establish the initial values, but then you need to experiment. If there is a lot of unaccounted noise in the world (wind, etc) you may have to make $\\mathbf Q$ larger. If you make it too large the filter fails to respond quickly to changes. In the **Adaptive Filters** chapter you will learn some alternative techniques that allow you to change the filter design in real time in response to the inputs and performance, but for now you need to find one set of values that works for the conditions your filter will encounter. Noise matrices for an acrobatic plane might be different if the pilot is a student than if the pilot is an expert as the dynamics will be quite different."
+    "In later chapters we will deal with the problem of nonlinearity. For now I want you to understand that designing the matrices of a linear filter is an experimental procedure more than a mathematical one. Use math to establish the initial values, but then you need to experiment. If there is a lot of unaccounted noise in the world (wind, etc.) you may have to make $\\mathbf Q$ larger. If you make it too large the filter fails to respond quickly to changes. In the **Adaptive Filters** chapter you will learn some alternative techniques that allow you to change the filter design in real time in response to the inputs and performance, but for now you need to find one set of values that works for the conditions your filter will encounter. Noise matrices for an acrobatic plane might be different if the pilot is a student than if the pilot is an expert as the dynamics will be quite different."
   ]
  },
  {
--- a/07-Kalman-Filter-Math.ipynb
+++ b/07-Kalman-Filter-Math.ipynb
@ -327,7 +327,7 @@
    "\n",
    "$\\mathbf w$ may strike you as a poor choice for the name, but you will soon see that the Kalman filter assumes *white* noise.\n",
    "\n",
-    "Finally, we need to consider any inputs into the system. We assume an input $\\mathbf u$, and that there exists a linear model that defines how that input changes the system. For example, pressing the accelerator in your car makes it accelerate, and gravity causes balls to fall. Both are contol inputs. We will need a matrix $\\mathbf B$ to convert $u$ into the effect on the system. We add that into our equation:\n",
+    "Finally, we need to consider any inputs into the system. We assume an input $\\mathbf u$, and that there exists a linear model that defines how that input changes the system. For example, pressing the accelerator in your car makes it accelerate, and gravity causes balls to fall. Both are control inputs. We will need a matrix $\\mathbf B$ to convert $u$ into the effect on the system. We add that into our equation:\n",
    "\n",
    "$$ \\dot{\\mathbf x} = \\mathbf{Ax} + \\mathbf{Bu} + \\mathbf{w}$$\n",
    "\n",
@ -520,7 +520,7 @@
    "\n",
    "That series is found by doing a Taylor series expansion of $e^{\\mathbf At}$, which I will not cover here.\n",
    "\n",
-    "Let's use this to find the solution to Newton's equations. Using $v$ as an substitution for $\\dot x$, and assuming constant velocity we get the linear matrix-vector form \n",
+    "Let's use this to find the solution to Newton's equations. Using $v$ as a substitution for $\\dot x$, and assuming constant velocity we get the linear matrix-vector form \n",
    "\n",
    "$$\\begin{bmatrix}\\dot x \\\\ \\dot v\\end{bmatrix} =\\begin{bmatrix}0&1\\\\0&0\\end{bmatrix} \\begin{bmatrix}x \\\\ v\\end{bmatrix}$$\n",
    "\n",
@ -715,7 +715,7 @@
   "source": [
    "## Design of the Process Noise Matrix\n",
    "\n",
-    "In general the design of the $\\mathbf Q$ matrix is among the most difficult aspects of Kalman filter design. This is due to several factors. First, the math requires a good foundation in signal theory. Second, we are trying to model the noise in something for which we have little information. Consider trying to model the process noise for a thrown baseball. We can model it as a sphere moving through the air, but that leaves many unknown factors - the wind, ball rotation and spin decay, the coefficient of drag of a ball with stitches, the effects of wind and air density, and so on. We develop the equations for an exact mathematical solution for a given process model, but since the process model is incomplete the result for $\\mathbf Q$ will also be incomplete. This has a lot of ramifications for the behavior of the Kalman filter. If $\\mathbf Q$ is too small then the filter will be overconfident in its prediction model and will diverge from the actual solution. If $\\mathbf Q$ is too large than the filter will be unduly influenced by the noise in the measurements and perform sub-optimally. In practice we spend a lot of time running simulations and evaluating collected data to try to select an appropriate value for $\\mathbf Q$. But let's start by looking at the math.\n",
+    "In general the design of the $\\mathbf Q$ matrix is among the most difficult aspects of Kalman filter design. This is due to several factors. First, the math requires a good foundation in signal theory. Second, we are trying to model the noise in something for which we have little information. Consider trying to model the process noise for a thrown baseball. We can model it as a sphere moving through the air, but that leaves many unknown factors - ball rotation and spin decay, the coefficient of drag of a ball with stitches, the effects of wind and air density, and so on. We develop the equations for an exact mathematical solution for a given process model, but since the process model is incomplete the result for $\\mathbf Q$ will also be incomplete. This has a lot of ramifications for the behavior of the Kalman filter. If $\\mathbf Q$ is too small then the filter will be overconfident in its prediction model and will diverge from the actual solution. If $\\mathbf Q$ is too large than the filter will be unduly influenced by the noise in the measurements and perform sub-optimally. In practice we spend a lot of time running simulations and evaluating collected data to try to select an appropriate value for $\\mathbf Q$. But let's start by looking at the math.\n",
    "\n",
    "\n",
    "Let's assume a kinematic system - some system that can be modeled using Newton's equations of motion. We can make a few different assumptions about this process. \n",
@ -1589,9 +1589,9 @@
    "\n",
    "The details of the mathematics for this computation varies based on the problem. The **Discrete Bayes** and **Univariate Kalman Filter** chapters gave two different formulations which you should have been able to reason through. The univariate Kalman filter assumes that for a scalar state both the noise and process are linear model are affected by zero-mean, uncorrelated Gaussian noise. \n",
    "\n",
-    "The Multivariate Kalman filter make the same assumption but for states and measurements that are vectors, not scalars. Dr. Kalman was able to prove that if these assumptions hold true then the Kalman filter is *optimal* in a least squares sense. Colloquially this means there is no way to derive more information from the noise. In the remainder of the book I will present filters that relax the constraints on linearity and Gaussian noise.\n",
+    "The Multivariate Kalman filter make the same assumption but for states and measurements that are vectors, not scalars. Dr. Kalman was able to prove that if these assumptions hold true then the Kalman filter is *optimal* in a least squares sense. Colloquially this means there is no way to derive more information from the noisy measurements. In the remainder of the book I will present filters that relax the constraints on linearity and Gaussian noise.\n",
    "\n",
-    "Before I go on, a few more words about statistical inversion. As Calvetti and Somersalo write in *Introduction to Bayesian Scientific Computing*, \"we adopt the Bayesian point of view: *randomness simply means lack of information*.\"[3] Our state parametize physical phenomena that we could in principle measure or compute: velocity, air drag, and so on. We lack enough information to compute or measure their value, so we opt to consider them as random variables. Strictly speaking they are not random, thus this is a subjective position. \n",
+    "Before I go on, a few more words about statistical inversion. As Calvetti and Somersalo write in *Introduction to Bayesian Scientific Computing*, \"we adopt the Bayesian point of view: *randomness simply means lack of information*.\"[3] Our state parameterizes physical phenomena that we could in principle measure or compute: velocity, air drag, and so on. We lack enough information to compute or measure their value, so we opt to consider them as random variables. Strictly speaking they are not random, thus this is a subjective position. \n",
    "\n",
    "They devote a full chapter to this topic. I can spare a paragraph. Bayesian filters are possible because we ascribe statistical properties to unknown parameters. In the case of the Kalman filter we have closed-form solutions to find an optimal estimate. Other filters, such as the discrete Bayes filter or the particle filter which we cover in a later chapter, model the probability in a more ad-hoc, non-optimal manner. The power of our technique comes from treating lack of information as a random variable, describing that random variable as a probability distribution, and then using Bayes Theorem to solve the statistical inference problem."
   ]
--- a/08-Designing-Kalman-Filters.ipynb
+++ b/08-Designing-Kalman-Filters.ipynb
@ -445,7 +445,7 @@
   "source": [
    "### Design the Process Noise Matrix\n",
    "\n",
-    "FilterPy can compute $\\mathbf Q$ matrix for us. For simplicity I will assume the noise is a discrete time Wiener process - that it is constant for each time period. This assumption allows me to use a variance to specify how much I think the model changes between steps. Revisit the Kalman Filter Math chapter if this is not clear."
+    "FilterPy can compute the $\\mathbf Q$ matrix for us. For simplicity I will assume the noise is a discrete time Wiener process - that it is constant for each time period. This assumption allows me to use a variance to specify how much I think the model changes between steps. Revisit the Kalman Filter Math chapter if this is not clear."
   ]
  },
  {
@ -860,7 +860,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "First we need a system to filter. I'll write a class to simulate an object with constant velocity. Essentially no physical system has a truly constant velocity, so on each update we alter the velocity by a small amount. I also write a sensor to simulate Gaussian noise in a sensor. The code is below, and a plot an example run to verify that it is working correctly. "
+    "First we need a system to filter. I'll write a class to simulate an object with constant velocity. Essentially no physical system has a truly constant velocity, so on each update we alter the velocity by a small amount. I also write a sensor to simulate Gaussian noise in a sensor. The code is below, and I plot an example run to verify that it is working correctly. "
   ]
  },
  {
@ -1113,7 +1113,7 @@
    "\n",
    "Now we can run each Kalman filter against the simulation and evaluate the results. \n",
    "\n",
-    "How do we evaluate the results? We can do this qualitatively by plotting the track and the Kalman filter output and eyeballing the results. However, a rigorous approach uses mathematics. Recall that system covariance matrix $\\mathbf P$ contains the computed variance and covariances for each of the state variables. The diagonal contains the variance. Remember that roughly 99% of all measurements fall within $3\\sigma$ if the noise is Gaussian. If this is not clear please review the Gaussian chapter before continuing, as this is an important point. \n",
+    "How do we evaluate the results? We can do this qualitatively by plotting the track and the Kalman filter output and eyeballing the results. However, a rigorous approach uses mathematics. Recall that the system covariance matrix $\\mathbf P$ contains the computed variance and covariances for each of the state variables. The diagonal contains the variance. Remember that roughly 99% of all measurements fall within $3\\sigma$ if the noise is Gaussian. If this is not clear please review the Gaussian chapter before continuing, as this is an important point. \n",
    "\n",
    "So we can evaluate the filter by looking at the residuals between the estimated state and actual state and comparing them to the standard deviations which we derive from $\\mathbf P$. If the filter is performing correctly 99% of the residuals will fall within $3\\sigma$. This is true for all the state variables, not just for the position. \n",
    "\n",
@ -1462,7 +1462,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Here the story is very different. While the residuals of the second order system fall within the theoretical bounds of the filter's performance, we can see that the residuals are *far* worse than for the first order filter. This is the usual result for this scenario. The filter is assuming that there is acceleration that does not exist. It mistakes noise in the measurement as acceleration and this gets added into the velocity estimate on every predict cycle. Of course the acceleration is not actually there and so the residual for the velocity is much larger than is optimum."
+    "Here the story is very different. While the residuals of the second order system fall within the theoretical bounds of the filter's performance, we can see that the residuals are *far* worse than for the first order filter. This is the usual result for this scenario. The filter is assuming that there is acceleration that does not exist. It mistakes noise in the measurement as acceleration and this gets added into the velocity estimate on every predict cycle. Of course the acceleration is not actually there and so the residual for the velocity is much larger than its optimum."
   ]
  },
  {
@ -1632,7 +1632,7 @@
    "\n",
    "But suppose I told you to fit a higher order polynomial to those two points. There is now an infinite number of answers to the problem. For example, an infinite number of second order parabolas pass through those points. When the Kalman filter is of higher order than your physical process it also has an infinite number of solutions to choose from. The answer is not just non-optimal, it often diverges and never recovers. \n",
    "\n",
-    "For best performance you need a filter whose order matches the system's order.. In many cases that will be easy to do - if you are designing a Kalman filter to read the thermometer of a freezer it seems clear that a zero order filter is the right choice. But what order should we use if we are tracking a car? Order one will work well while the car is moving in a straight line at a constant speed, but cars turn, speed up, and slow down, in which case a second order filter will perform better. That is the problem addressed in the Adaptive Filters chapter. There we will learn how to design a filter that *adapts* to changing order in the tracked object's behavior.\n",
+    "For best performance you need a filter whose order matches the system's order. In many cases that will be easy to do - if you are designing a Kalman filter to read the thermometer of a freezer it seems clear that a zero order filter is the right choice. But what order should we use if we are tracking a car? Order one will work well while the car is moving in a straight line at a constant speed, but cars turn, speed up, and slow down, in which case a second order filter will perform better. That is the problem addressed in the Adaptive Filters chapter. There we will learn how to design a filter that *adapts* to changing order in the tracked object's behavior.\n",
    "\n",
    "With that said, a lower order filter can track a higher order process so long as you add enough process noise and you keep the discretization period small (100 samples a second are usually locally linear). The results will not be optimal, but they can still be very good, and I always reach for this tool first before trying an adaptive filter. Let's look at an example with acceleration. First, the simulation."
   ]
@ -1826,11 +1826,11 @@
    "\n",
    "It is easy to design a Kalman filter for a simulated situation. You know how much noise you are injecting in your process model, so you specify $\\mathbf Q$ to have the same value. You also know how much noise is in the measurement simulation, so the measurement noise matrix $\\mathbf R$ is equally trivial to define. \n",
    "\n",
-    "In practice design is more ad hoc. Real sensors rarely perform to spec, and they rarely perform in a Gaussian manner. They are also easily fooled by environmental noise. For example, circuit noise causes voltage flucuations which can affect the output of a sensor. Creating a process model and noise is even more difficult. Modelling an automobile is very difficult. The steering causes nonlinear behavior, tires slip, people brake and accelerate hard enough to cause tire slips, winds push the car off course. The end result is the Kalman filter is an *inexact* model of the system. This inexactness causes suboptimal behavior, which in the worst case causes the filter to diverge completely. \n",
+    "In practice design is more ad hoc. Real sensors rarely perform to spec, and they rarely perform in a Gaussian manner. They are also easily fooled by environmental noise. For example, circuit noise causes voltage fluctuations which can affect the output of a sensor. Creating a process model and noise is even more difficult. Modelling an automobile is very difficult. The steering causes nonlinear behavior, tires slip, people brake and accelerate hard enough to cause tire slips, winds push the car off course. The end result is the Kalman filter is an *inexact* model of the system. This inexactness causes suboptimal behavior, which in the worst case causes the filter to diverge completely. \n",
    "\n",
    "Because of the unknowns you will be unable to analytically compute the correct values for the filter's matrices. You will start by making the best estimate possible, and then test your filter against a wide variety of simulated and real data. Your evaluation of the performance will guide you towards what changes you need to make to the matrices. We've done some of this already - I've shown you the effect of $\\mathbf Q$ being too large or small.\n",
    "\n",
-    "Now let's consider more analytical ways of accessing performance. If the Kalman filter is performing optimally its estimation errors (the difference between the true state and the estimated state will have these properties:\n",
+    "Now let's consider more analytical ways of accessing performance. If the Kalman filter is performing optimally its estimation errors (the difference between the true state and the estimated state) will have these properties:\n",
    "\n",
    "    1. The mean of the estimation error is zero\n",
    "    2. Its covariance is described by the Kalman filter's covariance matrix\n",
@ -2055,7 +2055,7 @@
    "$$\\begin{aligned}x &= x + \\dot x_\\mathtt{cmd} \\Delta t \\\\\n",
    "\\dot x &= \\dot x_\\mathtt{cmd}\\end{aligned}$$\n",
    "\n",
-    "We need to represent this set of equation in the form $\\bar{\\mathbf x} = \\mathbf{Fx} + \\mathbf{Bu}$.\n",
+    "We need to represent this set of equations in the form $\\bar{\\mathbf x} = \\mathbf{Fx} + \\mathbf{Bu}$.\n",
    "\n",
    "I will use the $\\mathbf{Fx}$ term to extract the $x$ for the top equation, and the $\\mathbf{Bu}$ term for the rest, like so:\n",
    "\n",
@ -2151,7 +2151,7 @@
    "\n",
    "To make it clearer, suppose that the wheel reports not position but the number of rotations of the wheels, where 1 revolution yields 2 meters of travel. In that case we would write\n",
    "\n",
-    "$$ \\begin{bmatrix}z_{wheel} \\\\ z_{ps}\\end{bmatrix} = \\begin{bmatrix}0.5 &0 \\\\ 1& 0\\end{bmatrix} \\begin{bmatrix}x \\\\ \\dot x\\end{bmatrix}$$\n",
+    "$$ \\begin{bmatrix}z_{rot} \\\\ z_{ps}\\end{bmatrix} = \\begin{bmatrix}0.5 &0 \\\\ 1& 0\\end{bmatrix} \\begin{bmatrix}x \\\\ \\dot x\\end{bmatrix}$$\n",
    "\n",
    "Now we have to design the measurement noise matrix $\\mathbf R$. Suppose that the measurement variance for the position is twice the variance of the wheel, and the standard deviation of the wheel is 1.5 meters. That gives us\n",
    "\n",
@ -2249,7 +2249,7 @@
    "\n",
    "It may be somewhat difficult to understand the previous example at an intuitive level. Let's look at a different problem. Suppose we are tracking an object in 2D space, and have two radar systems at different positions. Each radar system gives us a range and bearing to the target. How do the readings from each data affect the results?\n",
    "\n",
-    "This is a nonlinear problem because we need to use a trigonometry  to compute coordinates from a range and bearing, and we have not yet learned how to solve nonlinear problems with Kalman filters. So for this problem ignore the code that I use and just concentrate on the charts that the code outputs. We will revisit this problem in subsequent chapters and learn how to write this code.\n",
+    "This is a nonlinear problem because we need to use trigonometry  to compute coordinates from a range and bearing, and we have not yet learned how to solve nonlinear problems with Kalman filters. So for this problem ignore the code that I use and just concentrate on the charts that the code outputs. We will revisit this problem in subsequent chapters and learn how to write this code.\n",
    "\n",
    "I will position the target at (100, 100). The first radar will be at (50, 50), and the second radar at (150, 50). This will cause the first radar to measure a bearing of 45 degrees, and the second will report 135 degrees.\n",
    "\n",
@ -2361,7 +2361,7 @@
    }
   ],
   "source": [
-    "# compute position and covariance from first radar station\n",
+    "# compute position and covariance from second radar station\n",
    "set_radar_pos((150, 50))\n",
    "kf.predict()\n",
    "kf.update([radians(135), dist])\n",
@ -2388,7 +2388,7 @@
    "\n",
    "One final thing before we move on: sensor fusion is a vast topic, and my coverage is simplistic to the point of being misleading. For example, GPS uses iterated least squares to determine the position from a set of pseudorange readings from the satellites without using a Kalman filter. I cover this topic in the supporting notebook [**Iterative Least Squares for Sensor Fusion**](http://nbviewer.ipython.org/urls/raw.github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/master/Supporting_Notebooks/Iterative-Least-Squares-for-Sensor-Fusion.ipynb)\n",
    "\n",
-    "That is the usual but not exclusive way this computation is done in GPS receivers. If you are a hobbiest my coverage may get you started. A commercial grade filter requires very careful design of the fusion process. That is the topic of several books, and you will have to further your education by finding one that covers your domain. "
+    "That is the usual but not exclusive way this computation is done in GPS receivers. If you are a hobbyist my coverage may get you started. A commercial grade filter requires very careful design of the fusion process. That is the topic of several books, and you will have to further your education by finding one that covers your domain. "
   ]
  },
  {
@ -2420,7 +2420,7 @@
    "\n",
    "Well, what are the characteristics of that data stream, and more importantly, what are the fundamental requirements of the input to the Kalman filter?\n",
    "\n",
-    "Inputs to the Kalman filter must be *Gaussian* and *time independent*. this is because we imposed the requirement of the Markov property: the current state is dependent only on the previous state and current inputs. This makes the recursive form of the filter possible. The output of the GPS is *time dependent* because the filter bases its current estimate on the recursive estimates of all previous measurements. Hence, the signal is not white, it is not time independent, and if you pass that data into a Kalman filter you have violated the mathematical requirements of the filter. So, the answer is no, you cannot get better estimates by running a KF on the output of a commercial GPS. \n",
+    "Inputs to the Kalman filter must be *Gaussian* and *time independent*. This is because we imposed the requirement of the Markov property: the current state is dependent only on the previous state and current inputs. This makes the recursive form of the filter possible. The output of the GPS is *time dependent* because the filter bases its current estimate on the recursive estimates of all previous measurements. Hence, the signal is not white, it is not time independent, and if you pass that data into a Kalman filter you have violated the mathematical requirements of the filter. So, the answer is no, you cannot get better estimates by running a KF on the output of a commercial GPS. \n",
    "\n",
    "Another way to think of it is that Kalman filters are optimal in a least squares sense. There is no way to take an optimal solution, pass it through a filter, any filter, and get a 'more optimal' answer because it is a logical impossibility. At best the signal will be unchanged, in which case it will still be optimal, or it will be changed, and hence no longer optimal.\n",
    "\n",
@ -2906,7 +2906,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Because we don't have real data we will start by writing a simulator for a ball. As always, we add a noise term independent of time so we can simulate noise sensors."
+    "Because we don't have real data we will start by writing a simulator for a ball. As always, we add a noise term independent of time so we can simulate noisy sensors."
   ]
  },
  {
@ -3072,7 +3072,7 @@
   "source": [
    "### Design State Transition Function\n",
    "\n",
-    "Our next step is to design the state transition function. Recall that the state transistion function is implemented as a matrix $\\mathbf F$ that we multiply with the previous state of our system to get the next state, or prior $\\bar{\\mathbf x} = \\mathbf{Fx}$.\n",
+    "Our next step is to design the state transition function. Recall that the state transition function is implemented as a matrix $\\mathbf F$ that we multiply with the previous state of our system to get the next state, or prior $\\bar{\\mathbf x} = \\mathbf{Fx}$.\n",
    "\n",
    "I will not belabor this as it is very similar to the 1-D case we did in the previous chapter. Our state equations for position and velocity would be:\n",
    "\n",
@ -3106,7 +3106,7 @@
    "\n",
    "We will use the control input to account for the force of gravity. The term $\\mathbf{Bu}$ is added to $\\mathbf{\\bar x}$ to account for how much $\\mathbf{\\bar x}$ changes due to gravity. We can say that  $\\mathbf{Bu}$ contains $\\begin{bmatrix}\\Delta x_g & \\Delta \\dot{x_g} & \\Delta y_g & \\Delta \\dot{y_g}\\end{bmatrix}^\\mathsf T$.\n",
    "\n",
-    "If we look at the discretized equations we see that gravity only affect the velocity for $y$.\n",
+    "If we look at the discretized equations we see that gravity only affects the velocity for $y$.\n",
    "\n",
    "$$\\begin{aligned}\n",
    "x_t &=  x_{t-1} + v_{x(t-1)} {\\Delta t} \\\\\n",
@ -3169,7 +3169,7 @@
   "source": [
    "### Design the Measurement Noise Matrix\n",
    "\n",
-    "As with the robot, we will assume that the error is independent in $x$ and $y$. In this case we will start by assuming that the measurement error in x and y are 0.5 meters. Hence,\n",
+    "As with the robot, we will assume that the error is independent in $x$ and $y$. In this case we will start by assuming that the measurement errors in x and y are 0.5 meters squared. Hence,\n",
    "\n",
    "$$\\mathbf R = \\begin{bmatrix}0.5&0\\\\0&0.5\\end{bmatrix}$$"
   ]
@ -3358,7 +3358,7 @@
    "\\end{aligned}\n",
    "$$\n",
    "\n",
-    "We can incorporate this force (acceleration) into our equations by incorporating $accel * \\Delta t$ into the velocity update equations. We should subtract this component because drag will reduce the velocity. The code to do this is quite straightforward, we just need to break out the Force into $x$ and $y$ components. \n",
+    "We can incorporate this force (acceleration) into our equations by incorporating $accel * \\Delta t$ into the velocity update equations. We should subtract this component because drag will reduce the velocity. The code to do this is quite straightforward, we just need to break out the force into $x$ and $y$ components. \n",
    "\n",
    "I will not belabor this issue further because computational physics is beyond the scope of this book. Recognize that a higher fidelity simulation would require incorporating things like altitude, temperature, ball spin, and several other factors. The aforementioned work by Alan Nathan covers this if you are interested. My intent here is to impart some real-world behavior into our simulation to test how  our simpler prediction model used by the Kalman filter reacts to this behavior. Your process model will never exactly model what happens in the world, and a large factor in designing a good Kalman filter is carefully testing how it performs against real world data. \n",
    "\n",