More copy edits, updated for filterpy 0.1.2

This commit is contained in:
Roger Labbe 2016-01-24 12:11:39 -08:00
parent 49f604be52
commit d5ea503fde
9 changed files with 812 additions and 999 deletions

View File

@ -1352,9 +1352,9 @@
"The filter equations are:\n",
"\n",
"$$\\begin{aligned} \\bar {\\mathbf x} &= \\mathbf x \\ast f_{\\mathbf x}(\\bullet)\\, \\, &\\text{Predict Step} \\\\\n",
"\\mathbf x &= L \\cdot \\bar{\\mathbf x}\\, \\, &\\text{Update Step}\\end{aligned}$$\n",
"\\mathbf x &= \\|L \\cdot \\bar{\\mathbf x}\\|\\, \\, &\\text{Update Step}\\end{aligned}$$\n",
"\n",
"$L$ is the usual way to write the Likelihood function, so I use that.\n",
"$L$ is the usual way to write the Likelihood function, so I use that. The $\\|\\bullet\\|$ notation denotes the norm.\n",
"\n",
"We can express this in pseudocode.\n",
"\n",

View File

@ -1342,18 +1342,139 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Computational Properties of the Gaussian"
"## Computational Properties of Gaussians"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Recall how our discrete Bayesian filter worked. We had a vector implemented as a NumPy array representing our belief at a certain moment in time. When we integrated another measurement into our belief using the `update()` function we had to multiply probabilities together, and when we performed the motion step using the `predict()` function we had to shift and add probabilities. I've promised you that the Kalman filter uses essentially the same process, and that it uses Gaussians instead of histograms, so you might reasonable expect that we will be multiplying, adding, and shifting Gaussians in the Kalman filter.\n",
"A remarkable property of Gaussians is that the product of two independent Gaussians is another Gaussian! The sum is not Gaussian, but proportional to a Gaussian.\n",
"\n",
"A typical textbook would directly launch into a multi-page proof of the behavior of Gaussians under these operations, but I don't see the value in that right now. I think the math will be much more intuitive and clear if we just start developing a Kalman filter using Gaussians. I will provide the equations for multiplying and shifting Gaussians at the appropriate time. You will then be able to develop a physical intuition for what these operations do, rather than be forced to digest a lot of fairly abstract math.\n",
"The discrete Bayes filter works by multiplying and adding probabilities. I'm getting ahead of myself, but the Kalman filter uses Gaussians instead of probabilities, but the rest of the algorithm remains the same. This means we will need to multiply and add Gaussians. \n",
"\n",
"The key point, which I will only assert for now, is that all the operations are very simple, and that they preserve the properties of the Gaussian. This is somewhat remarkable, in that the Gaussian is a nonlinear function, and typically if you multiply a nonlinear equation with itself you end up with a different equation. For example, the shape of `sin(x)sin(x)` is very different from `sin(x)`. But the result of multiplying two Gaussians is yet another Gaussian. This is a fundamental property, and a key reason why Kalman filters are computationally feasible. Said another way, Kalman filters use Gaussians *because* they are computationally nice. "
"The Gaussian is a nonlinear function, and typically if you multiply a nonlinear equation with itself you end up with a different equation. For example, the shape of `sin(x)sin(x)` is very different from `sin(x)`. But the result of multiplying two Gaussians is yet another Gaussian. This is a fundamental property, and a key reason why Kalman filters are computationally feasible. Said another way, Kalman filters use Gaussians *because* they are computationally nice. \n",
"\n",
"The remainder of this section is optiona. I will derive the equations for the sum and product of two Gaussians. You will not need to understand this material to understand the rest of the book, so long as you accept the results. "
]
},
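The claim that the product of two Gaussians stays Gaussian is easy to verify numerically. The check below is my own addition, not the notebook's code (the `gaussian` helper is defined inline), and it uses the product equations derived in the next subsection:

```python
import numpy as np

def gaussian(x, mu, var):
    # Gaussian pdf with mean mu and variance var
    return np.exp(-(x - mu)**2 / (2.*var)) / np.sqrt(2.*np.pi*var)

xs = np.linspace(-10., 10., 10001)
g1 = gaussian(xs, 1.0, 4.0)     # N(1, 4)
g2 = gaussian(xs, 4.0, 2.25)    # N(4, 2.25)

prod = g1 * g2
prod /= np.trapz(prod, xs)      # renormalize to a valid pdf

# mean and variance predicted by the product equations below
mu  = (4.0*4.0 + 2.25*1.0) / (4.0 + 2.25)   # 2.92
var = (4.0*2.25) / (4.0 + 2.25)             # 1.44

print(np.allclose(prod, gaussian(xs, mu, var), atol=1e-5))  # True
```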
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Product of Gaussians\n",
"\n",
"The product of two independent Gaussians is given by:\n",
"\n",
"$$\\begin{aligned}\\mu &=\\frac{\\sigma_1^2\\mu_2 + \\sigma_1^2\\mu_1}{\\sigma_1^2+\\sigma_2^2}\\\\\n",
"\\sigma^2 &=\\frac{\\sigma_1^2\\sigma_2^2}{\\sigma_1^2+\\sigma_2^2} \n",
"\\end{aligned}$$\n",
"\n",
"You can find this result by multiplying the equation for two Gaussians together and combining terms. The algebra gets messy. I will derive it using Bayes theorem. We can state the problem as: let the prior be $N(\\bar\\mu, \\bar\\sigma^2)$, and measurement be $z \\propto N(z, \\sigma_z^2)$. What is the posterior x given the measurement z?\n",
"\n",
"Write the posterior as $P(x \\mid z)$. Now we can use Bayes Theorem to state\n",
"\n",
"$$P(x \\mid z) = \\frac{P(z \\mid x)P(x)}{P(z)}$$\n",
"\n",
"$P(z)$ is a normalizing constant, so we can create a proportinality\n",
"\n",
"$$P(x \\mid z) \\propto P(z|x)P(x)$$\n",
"\n",
"Now we subtitute in the equations for the Gaussians, which are\n",
"\n",
"$$P(z \\mid x) = \\frac{1}{\\sqrt{2\\pi\\sigma_z^2}}\\exp \\Big[-\\frac{(z-x)^2}{2\\sigma_z^2}\\Big]$$\n",
"\n",
"$$P(x) = \\frac{1}{\\sqrt{2\\pi\\bar\\sigma^2}}\\exp \\Big[-\\frac{(x-\\bar\\mu)^2}{2\\bar\\sigma^2}\\Big]$$\n",
"\n",
"We can drop the leading terms, as they are constants, giving us\n",
"\n",
"$$\\begin{aligned}\n",
"P(x \\mid z) &\\propto \\exp \\Big[-\\frac{(z-x)^2}{2\\sigma_z^2}\\Big]\\exp \\Big[-\\frac{(x-\\bar\\mu)^2}{2\\bar\\sigma^2}\\Big]\\\\\n",
"&\\propto \\exp \\Big[-\\frac{(z-x)^2}{2\\sigma_z^2}-\\frac{(x-\\bar\\mu)^2}{2\\bar\\sigma^2}\\Big] \\\\\n",
"&\\propto \\exp \\Big[-\\frac{1}{2\\sigma_z^2\\bar\\sigma^2}[\\bar\\sigma^2(z-x)^2-\\sigma_z^2(x-\\bar\\mu)^2]\\Big]\n",
"\\end{aligned}$$\n",
"\n",
"Now we multiply out the squared terms and group in terms of the posterior $x$.\n",
"\n",
"$$\\begin{aligned}\n",
"P(x \\mid z) &\\propto \\exp \\Big[-\\frac{1}{2\\sigma_z^2\\bar\\sigma^2}[\\bar\\sigma^2(z^2 -2xz + x^2) + \\sigma_z^2(x^2 - 2x\\bar\\mu+\\bar\\mu^2)]\\Big ] \\\\\n",
"&\\propto \\exp \\Big[-\\frac{1}{2\\sigma_z^2\\bar\\sigma^2}[x^2(\\bar\\sigma^2+\\sigma_z^2)-2x(\\sigma_z^2\\bar\\mu + \\bar\\sigma^2z) + (\\bar\\sigma^2z^2+\\sigma_z^2\\bar\\mu^2)]\\Big ]\n",
"\\end{aligned}$$\n",
"\n",
"The last parentheses do not contain the posterior $x$, so it can be treated as a constant and discarded.\n",
"\n",
"$$P(x \\mid z) \\propto \\exp \\Big[-\\frac{1}{2}\\frac{x^2(\\bar\\sigma^2+\\sigma_z^2)-2x(\\sigma_z^2\\bar\\mu + \\bar\\sigma^2z)}{\\sigma_z^2\\bar\\sigma^2}\\Big ]\n",
"$$\n",
"\n",
"Divide numerator and denominator by $\\bar\\sigma^2+\\sigma_z^2$ to get\n",
"\n",
"$$P(x \\mid z) \\propto \\exp \\Big[-\\frac{1}{2}\\frac{x^2-2x(\\frac{\\sigma_z^2\\bar\\mu + \\bar\\sigma^2z}{\\bar\\sigma^2+\\sigma_z^2})}{\\frac{\\sigma_z^2\\bar\\sigma^2}{\\bar\\sigma^2+\\sigma_z^2}}\\Big ]\n",
"$$\n",
"\n",
"Proportionality lets us create or delete constants at will, so we can factor this into\n",
"\n",
"$$P(x \\mid z) \\propto \\exp \\Big[-\\frac{1}{2}\\frac{(x-\\frac{\\sigma_z^2\\bar\\mu + \\bar\\sigma^2z}{\\bar\\sigma^2+\\sigma_z^2})^2}{\\frac{\\sigma_z^2\\bar\\sigma^2}{\\bar\\sigma^2+\\sigma_z^2}}\\Big ]\n",
"$$\n",
"\n",
"A Gaussian is\n",
"\n",
"$$N(\\mu,\\, \\sigma^2) \\propto \\exp\\Big [-\\frac{1}{2}\\frac{(x - \\mu)^2}{\\sigma^2}\\Big ]$$\n",
"\n",
"So we can see that $P(x \\mid z)$ has a mean of\n",
"\n",
"$$\\mu_\\mathtt{posterior} = \\frac{\\sigma_z^2\\bar\\mu + \\bar\\sigma^2z}{\\bar\\sigma^2+\\sigma_z^2}$$\n",
"\n",
"and a variance of\n",
"$$\n",
"\\sigma_\\mathtt{posterior} = \\frac{\\sigma_z^2\\bar\\sigma^2}{\\bar\\sigma^2+\\sigma_z^2}\n",
"$$\n",
"\n",
"I've dropped the constants, and so the result is not a normal, but proportional to one. Bayes theorem normalizes with the $P(z)$ divisor, ensuring that the result is normal. We normalize in the update step of our filters, ensuring the filter estimate is Gaussian.\n",
"\n",
"$$\\mathcal N_1 = \\| \\mathcal N_2\\cdot \\mathcal N_3\\|$$"
]
},
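As a concrete illustration of the result (my own sketch; the `namedtuple` and function name are assumptions, not FilterPy's API), the posterior equations can be applied directly:

```python
from collections import namedtuple

Gaussian = namedtuple('Gaussian', ['mean', 'var'])

def multiply(prior, measurement):
    # posterior mean and variance from the equations derived above
    mean = (measurement.var*prior.mean + prior.var*measurement.mean) / \
           (prior.var + measurement.var)
    var = (prior.var*measurement.var) / (prior.var + measurement.var)
    return Gaussian(mean, var)

# prior N(10, 4) updated with measurement N(12, 1)
print(multiply(Gaussian(10., 4.), Gaussian(12., 1.)))
# -> Gaussian(mean=11.6, var=0.8)
```

Note that the posterior variance (0.8) is smaller than either input variance: incorporating a measurement always reduces uncertainty.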
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sum of Gaussians\n",
"\n",
"The sum of two Gaussians is given by\n",
"\n",
"$$\\mu = \\mu_1 + \\mu_2 \\\\\n",
"\\sigma^2 = \\sigma^2_1 + \\sigma^2_2\n",
"$$\n",
"\n",
"There are several proofs for this. I will use convolution since we used convolution in the previous chapter for the histograms of probabilities. \n",
"\n",
"To find the density function of the sum of two Gaussian random variables we sum the density functions of each. They are nonlinear, continuous functions, so we need to compute the sum with an integral. If the random variables $p$ and $z$ (e.g. prior and measurement) are independent we can compute this with\n",
"\n",
"$p(x) = \\int\\limits_{-\\infty}^\\infty f_p(x-z)f_z(z)\\, dx$\n",
"\n",
"This is the equation for a convolution. Now we just do some math:\n",
"\n",
"\n",
"$p(x) = \\int\\limits_{-\\infty}^\\infty f_2(x-x_1)f_1(x_1)\\, dx \\\\\n",
"= \\int\\limits_{-\\infty}^\\infty \n",
"\\frac{1}{\\sqrt{2\\pi}\\sigma_z}\\exp\\left[-\\frac{x - z - \\mu_z}{2\\sigma^2_z}\\right]\n",
"\\frac{1}{\\sqrt{2\\pi}\\sigma_p}\\exp\\left[-\\frac{x - \\mu_p}{2\\sigma^2_p}\\right] \\, dx \\\\\n",
"= \\int\\limits_{-\\infty}^\\infty\n",
"\\frac{1}{\\sqrt{2\\pi}\\sqrt{\\sigma_p^2 + \\sigma_z^2}} \\exp\\left[ -\\frac{(x - (\\mu_p + \\mu_z)))^2}{2(\\sigma_z^2+\\sigma_p^2)}\\right]\n",
"\\frac{1}{\\sqrt{2\\pi}\\frac{\\sigma_p\\sigma_z}{\\sqrt{\\sigma_p^2 + \\sigma_z^2}}} \\exp\\left[ -\\frac{(x - \\frac{\\sigma_p^2(x-\\mu_z) + \\sigma_z^2\\mu_p}{}))^2}{2\\left(\\frac{\\sigma_p\\sigma_x}{\\sqrt{\\sigma_z^2+\\sigma_p^2}}\\right)^2}\\right] \\, dx\n",
"= \\frac{1}{\\sqrt{2\\pi}\\sqrt{\\sigma_p^2 + \\sigma_z^2}} \\exp\\left[ -\\frac{(x - (\\mu_p + \\mu_z)))^2}{2(\\sigma_z^2+\\sigma_p^2)}\\right] \\int\\limits_{-\\infty}^\\infty\n",
"\\frac{1}{\\sqrt{2\\pi}\\frac{\\sigma_p\\sigma_z}{\\sqrt{\\sigma_p^2 + \\sigma_z^2}}} \\exp\\left[ -\\frac{(x - \\frac{\\sigma_p^2(x-\\mu_z) + \\sigma_z^2\\mu_p}{}))^2}{2\\left(\\frac{\\sigma_p\\sigma_x}{\\sqrt{\\sigma_z^2+\\sigma_p^2}}\\right)^2}\\right] \\, dx\n",
"$\n",
"\n",
"\n",
"The expression inside the integral is a normal distribution. The sum of a normal distribution is one, hence the integral is one. This gives us\n",
"\n",
"$$p(x) = \\frac{1}{\\sqrt{2\\pi}\\sqrt{\\sigma_p^2 + \\sigma_z^2}} \\exp\\left[ -\\frac{(x - (\\mu_p + \\mu_z)))^2}{2(\\sigma_z^2+\\sigma_p^2)}\\right]$$\n",
"\n",
"This is in the form of a normal, where\n",
"\n",
"$$\\mu_x = \\mu_p + \\mu_z \\\\\n",
"\\sigma_x^2 = \\sigma_z^2+\\sigma_p^2\\, \\square$$"
]
},
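A quick Monte Carlo sanity check of this result (my own addition, not part of the commit): sampling two independent Gaussians and summing the samples should give a sample mean near $\mu_p+\mu_z$ and a sample variance near $\sigma_p^2+\sigma_z^2$.

```python
import numpy as np

np.random.seed(13)
p = np.random.normal(2.0, np.sqrt(3.0), 1000000)   # N(2, 3)
z = np.random.normal(-1.0, np.sqrt(2.0), 1000000)  # N(-1, 2)
s = p + z

print(s.mean(), s.var())  # approximately 1.0 and 5.0
```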
{

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -43,7 +43,7 @@ def test_filterpy_version():
from distutils.version import LooseVersion
v = filterpy.__version__
min_version = "0.1.1"
min_version = "0.1.2"
if LooseVersion(v) < LooseVersion(min_version):
raise Exception("Minimum FilterPy version supported is {}.\n"
"Please install a more recent version.\n"

View File

@ -60,7 +60,7 @@ def print_results(estimates, prediction, weight):
def create_predict_update_chart(box_bg = '#CCCCCC',
arrow1 = '#88CCFF',
arrow2 = '#88FF88'):
plt.figure(figsize=(4,4), facecolor='w')
plt.figure(figsize=(3,3), facecolor='w')
ax = plt.axes((0, 0, 1, 1),
xticks=[], yticks=[], frameon=False)
#ax.set_xlim(0, 10)
@ -123,7 +123,7 @@ def create_predict_update_chart(box_bg = '#CCCCCC',
ha='center', va='center', fontsize=18)
plt.axis('equal')
#plt.axis([0,8,0,8])
plt.show()
#plt.show()
def plot_estimate_chart_1():

View File

@ -36,11 +36,11 @@ def plot_dog_track(xs, dog, measurement_var, process_var):
def print_gh(predict, update, z):
predict_template = ' {: 7.3f} {: 8.3f}'
update_template = '{: 7.3f} {: 7.3f}\t {:.3f}'
predict_template = '{: 7.3f} {: 8.3f}'
update_template = '{:.3f}\t{: 7.3f} {: 7.3f}'
print(predict_template.format(predict[0], predict[1]),end='\t')
print(update_template.format(update[0], update[1], z))
print(update_template.format(z, update[0], update[1]))
def print_variance(positions):

View File

@ -181,8 +181,8 @@ def show_position_prediction_chart():
plt.xlim([0,5])
plt.ylim([0,5])
plt.xlabel("Position")
plt.ylabel("Time")
plt.xlabel("X")
plt.ylabel("Y")
plt.xticks(np.arange(1,5,1))
plt.yticks(np.arange(1,5,1))
@ -417,27 +417,27 @@ def plot_3_covariances():
plt.gca().grid(b=False)
plt.gca().set_xticks([0,1,2,3,4])
plot_covariance_ellipse((2, 7), cov=P, facecolor='g', alpha=0.2,
title='|2 0|\n|0 2|', axis_equal=False)
plt.ylim((4, 10))
title='|2 0|\n|0 2|', std=[1,2,3], axis_equal=False)
plt.ylim((0, 15))
plt.gca().set_aspect('equal', adjustable='box')
plt.subplot(132)
plt.gca().grid(b=False)
plt.gca().set_xticks([0,1,2,3,4])
P = [[2, 0], [0, 9]]
plt.ylim((4, 10))
P = [[2, 0], [0, 6]]
plt.ylim((0, 15))
plt.gca().set_aspect('equal', adjustable='box')
plot_covariance_ellipse((2, 7), P, facecolor='g', alpha=0.2,
axis_equal=False, title='|2 0|\n|0 9|')
std=[1,2,3],axis_equal=False, title='|2 0|\n|0 6|')
plt.subplot(133)
plt.gca().grid(b=False)
plt.gca().set_xticks([0,1,2,3,4])
P = [[2, 1.2], [1.2, 2]]
plt.ylim((4, 10))
plt.ylim((0, 15))
plt.gca().set_aspect('equal', adjustable='box')
plot_covariance_ellipse((2, 7), P, facecolor='g', alpha=0.2,
axis_equal=False,
axis_equal=False,std=[1,2,3],
title='|2 1.2|\n|1.2 2|')
plt.tight_layout()