Fixed math symbology of equations.

2015-05-19 16:53:41 -07:00 · 2015-05-19 16:53:41 -07:00 · 3bc058f2bb
commit 3bc058f2bb
parent f92ea90944
1 changed files with 27 additions and 24 deletions
--- a/07_Kalman_Filter_Math.ipynb
+++ b/07_Kalman_Filter_Math.ipynb
@@ -455,22 +455,23 @@
   "source": [
    "I promised that you would not have to understand how to derive Kalman filter equations, and that is true. However, I do think it is worth walking through the equations one by one and becoming familiar with the variables. If this is your first time through the material feel free to skip ahead to the next section. However, you will eventually want to work through this material, so why not now? You will need to have passing familiarity with these equations to read material written about the Kalman filter, as they all presuppose that you are familiar with them. I will reiterate them here for easy reference.\n",
    "\n",
-    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\text{Predict Step}\\\\\n",
-    "\\mathbf{x} &= \\mathbf{F x} + \\mathbf{B u}\\;\\;\\;\\;&(1) \\\\\n",
-    "\\mathbf{P} &= \\mathbf{FP{F}}^\\mathsf{T} + \\mathbf{Q}\\;\\;\\;\\;&(2) \\\\\n",
+    "\\mathbf{x^-} &= \\mathbf{F x} + \\mathbf{B u}\\;\\;\\;&(1) \\\\\n",
+    "\\mathbf{P^-} &= \\mathbf{FP{F}}^\\mathsf{T} + \\mathbf{Q}\\;\\;\\;&(2) \\\\\n",
    "\\\\\n",
    "\\text{Update Step}\\\\\n",
-    "\\textbf{y} &= \\mathbf{z} - \\mathbf{H}\\mathbf{x}\\;\\;\\;&(3) \\\\\n",
-    "\\mathbf{S} &= \\mathbf{HPH}^\\mathsf{T} + \\mathbf{R} \\;\\;\\;&(4) \\\\\n",
-    "\\mathbf{K} &= \\mathbf{PH}^\\mathsf{T}\\mathbf{S}^{-1}\\;\\;\\;&(5) \\\\\n",
-    "\\mathbf{x} &= \\mathbf{x} +\\mathbf{K}\\mathbf{y} \\;\\;\\;&(6)\\\\\n",
-    "\\mathbf{P} &= (\\mathbf{I}-\\mathbf{K}\\mathbf{H})\\mathbf{P}\\;\\;\\;&(7)\n",
+    "\\textbf{y} &= \\mathbf{z} - \\mathbf{H x^-} \\;\\;\\;&(3)\\\\\n",
+    "\\textbf{S} &= \\mathbf{HP^-H}^\\mathsf{T} + \\mathbf{R} \\;\\;\\;&(4)\\\\\n",
+    "\\mathbf{K} &= \\mathbf{P^-H}^\\mathsf{T} \\mathbf{S}^{-1}\\;\\;\\;&(5) \\\\\n",
+    "\\mathbf{x} &=\\mathbf{x^-} +\\mathbf{K\\textbf{y}} \\;\\;\\;&(6)\\\\\n",
+    "\\mathbf{P} &= (\\mathbf{I}-\\mathbf{KH})\\mathbf{P^-}\\;\\;\\;&(7)\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
+    "Just a reminder: the superscript $^-$ denotes that the value is a prediction, not an estimate. But $\\mathbf{x}$ and $\\mathbf{x}^-$ are the same thing, the *state* of our system, just at different times of the algorithm. I am not entirely pleased with this notation for several reasons. First, it clutters the equations up, making them harder to read. More importantly, it isn't always correct. For example, consider a situation where we have no measurement for one or more time periods. In that case we just compute the predict step several times in a row without an intervening update step. Thus, $\\mathbf{x}$ is never computed, and the predict step is actually $\\mathbf{x}^-_t = \\mathbf{Fx}^-_{t-1} + \\mathbf{Bu}$. Alternatively, if you have several measurements for one time epoch (from different sensors, say) it is sometimes possible to perform the update once for each measurement instead of trying to incorporate all of the measurements at once. In that case the subsequent updates are not performing the computations on the predicted state, but on the partially updated state. The $^-$ notation does not capture either of these cases. For most of the book I will dispense with this notation unless I want to call out that I am computing a prediction.\n",
+    "\n",
    "I will start with the update step, as that is what we started with in the one dimensional Kalman filter case. The first equation is\n",
    "\n",
    "$$\n",
@@ -513,8 +514,8 @@
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
-    "\\mathbf{S} &= \\textbf{HPH}^\\mathsf{T} + \\textbf{R} \\;\\;\\;&(4) \\\\\n",
-    "\\textbf{K} &= \\textbf{PH}^\\mathsf{T}\\mathbf{S}^{-1}\\;\\;\\;&(5) \\\\\n",
+    "\\textbf{S} &= \\mathbf{HPH}^\\mathsf{T} + \\mathbf{R} \\;\\;\\;&(4)\\\\\n",
+    "\\mathbf{K} &= \\mathbf{PH}^\\mathsf{T} \\mathbf{S}^{-1}\\;\\;\\;&(5) \\\\\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
@@ -559,7 +560,7 @@
   "source": [
    "Our next line is:\n",
    "\n",
-    "$$\\mathbf{x}=\\mathbf{x}' +\\mathbf{Ky}\\tag{5}$$\n",
+    "$$\\mathbf{x}=\\mathbf{x} +\\mathbf{Ky}\\tag{6}$$\n",
    "\n",
    "This just multiplies the residual by the Kalman gain, and adds it to the state variable. In other words, this is the computation of our new estimate.\n",
    "\n",
@@ -606,8 +607,8 @@
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
-    "\\mathbf{x}^- &= \\mathbf{F x} + \\mathbf{B u} \\\\\n",
-    "\\mathbf{P^-} &= \\mathbf{FP{F}}^\\mathsf{T} + \\mathbf{Q}\n",
+    "\\mathbf{x} &= \\mathbf{F x} + \\mathbf{B u} \\\\\n",
+    "\\mathbf{P} &= \\mathbf{FPF}^\\mathsf{T} + \\mathbf{Q}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
@@ -627,15 +628,15 @@
    "\n",
    "Hopefully the general process is clear, so now I will go a bit faster on the rest. Our other equation for the predict step is\n",
    "\n",
-    "$$\\mathbf{P}^- = \\mathbf{FP{F}}^\\mathsf{T} + \\mathbf{Q}$$\n",
+    "$$\\mathbf{P} = \\mathbf{FPF}^\\mathsf{T} + \\mathbf{Q}$$\n",
    "\n",
    "Again, since our state only has one variable, $\\mathbf{P}$ and $\\mathbf{Q}$ must also be $1\\times 1$ matrices, which we can treat as scalars, yielding  \n",
    "\n",
-    "$$P^- = FPF^\\mathsf{T} + Q$$\n",
+    "$$P = FPF^\\mathsf{T} + Q$$\n",
    "\n",
    "We already know $F=1$. The transpose of a scalar is the scalar, so $F^\\mathsf{T} = 1$. This yields\n",
    "\n",
-    "$$P^- = P + Q$$\n",
+    "$$P = P + Q$$\n",
    "\n",
    "which is equivalent to the Gaussian equation of \n",
    "\n",
@@ -652,10 +653,10 @@
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
-    "\\textbf{y} &= \\mathbf{z} - \\mathbf{H x^-}\\\\\n",
-    "\\mathbf{K}&= \\mathbf{P^-H}^\\mathsf{T} (\\mathbf{HP^-H}^\\mathsf{T} + \\mathbf{R})^{-1} \\\\\n",
-    "\\mathbf{x}&=\\mathbf{x}^- +\\mathbf{K\\textbf{y}} \\\\\n",
-    "\\mathbf{P}&= (\\mathbf{I}-\\mathbf{KH})\\mathbf{P^-}\n",
+    "\\textbf{y} &= \\mathbf{z} - \\mathbf{H x}\\\\\n",
+    "\\mathbf{K}&= \\mathbf{PH}^\\mathsf{T} (\\mathbf{HPH}^\\mathsf{T} + \\mathbf{R})^{-1} \\\\\n",
+    "\\mathbf{x}&=\\mathbf{x} +\\mathbf{K\\textbf{y}} \\\\\n",
+    "\\mathbf{P}&= (\\mathbf{I}-\\mathbf{KH})\\mathbf{P}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
@@ -663,10 +664,10 @@
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
-    "y &= z - x^-\\\\\n",
-    "K &=P^- / (P^- + R) \\\\\n",
+    "y &= z - x\\\\\n",
+    "K &=P / (P + R) \\\\\n",
    "x &=x +Ky \\\\\n",
-    "P &= (1-K)P^-\n",
+    "P &= (1-K)P\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
@@ -1096,6 +1097,8 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
+    "**Author's note: this section contains some of the more challenging math in this book. Please bear with it, as few books cover this well, and an accurate design is imperative for good filter performance. At the end I present Python functions from FilterPy which will compute the math for you for common scenarios.**\n",
+    "\n",
    "In general the design of the $\\mathbf{Q}$ matrix is among the most difficult aspects of Kalman filter design. This is due to several factors. First, the math itself is somewhat difficult and requires a good foundation in signal theory. Second, we are trying to model the noise in something for which we have little information. For example, consider trying to model the process noise for a baseball. We can model it as a sphere moving through the air, but that leaves many unknown factors: the wind, ball rotation and spin decay, the coefficient of friction of a scuffed ball with stitches, the effects of air density, and so on. I will develop the equations for an exact mathematical solution for a given process model, but since the process model is incomplete the result for $\\mathbf{Q}$ will also be incomplete. This has a lot of ramifications for the behavior of the Kalman filter. If $\\mathbf{Q}$ is too small then the filter will be overconfident in its prediction model and will diverge from the actual solution. If $\\mathbf{Q}$ is too large then the filter will be unduly influenced by the noise in the measurements and perform sub-optimally. In practice we spend a lot of time running simulations and evaluating collected data to try to select an appropriate value for $\\mathbf{Q}$. But let's start by looking at the math.\n",
    "\n",
    "\n",
@@ -1591,7 +1594,7 @@
    "If the values for $\\mathbf{Q}$ are small relative to $\\mathbf{P}$\n",
    "then it will be contributing almost nothing to the computation of $\\mathbf{P}$. Setting $\\mathbf{Q}$ to \n",
    "\n",
-    "$$Q=\\begin{bmatrix}0&0&0\\\\0&0&0\\\\0&0&\\sigma^2\\end{bmatrix}$$\n",
+    "$$\\mathbf{Q}=\\begin{bmatrix}0&0&0\\\\0&0&0\\\\0&0&\\sigma^2\\end{bmatrix}$$\n",
    "\n",
    "while not correct, is often a useful approximation. If you do this you will have to perform quite a few studies to guarantee that your filter works in a variety of situations. Given the availability of functions to compute the correct values of $\\mathbf{Q}$ for you I would strongly recommend not using approximations. Perhaps it is justified for quick-and-dirty filters, or on embedded devices where you need to wring out every last bit of performance, and seek to minimize the number of matrix operations required. "
   ]
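
Illustration (not part of the commit): the scalar forms touched by this diff (`y = z - x`, `K = P / (P + R)`, `x = x + Ky`, `P = (1 - K)P`, and `P = P + Q` for the predict step) can be sketched in plain Python. This is a minimal sketch assuming F = 1, H = 1, and no control input. The function names `predict` and `update` and all numeric values are hypothetical and chosen only for illustration; they are not FilterPy's API. The sketch also exercises the two cases the new paragraph mentions, repeated predicts with no measurement and two updates within one epoch, which is why the code reuses `x` and `P` rather than tracking a separate `x^-`.

```python
def predict(x, P, Q):
    """Scalar predict step with F = 1 and no control input (assumptions)."""
    return x, P + Q


def update(x, P, z, R):
    """Scalar update step with H = 1 (assumption)."""
    y = z - x          # residual
    K = P / (P + R)    # Kalman gain
    x = x + K * y      # new state estimate
    P = (1 - K) * P    # new variance
    return x, P


# Hypothetical starting state and noise values.
x, P = 0.0, 1.0

# No measurement for two periods: predict runs twice with no update between.
x, P = predict(x, P, Q=1.0)
x, P = predict(x, P, Q=1.0)

# Two measurements in one epoch: the second update operates on the
# partially updated state, not on the prediction.
x, P = update(x, P, z=1.0, R=3.0)
x, P = update(x, P, z=1.0, R=1.5)
```

With these inputs the variance grows to 3.0 after the two predicts, and the two updates bring the state to 0.75 with variance 0.75.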