From f4dfe4113f0452cc20b4a8327ca9d59b43475c1f Mon Sep 17 00:00:00 2001 From: Roger Labbe Date: Fri, 22 Aug 2014 11:15:23 -0700 Subject: [PATCH] Normalizing notation on Wikipedia's choices. Wikipedia is both the most readable of any source, and open and available to all. I like it. --- Appendix_Symbols_and_Notations.ipynb | 35 ++++++++++---- Multidimensional_Kalman_Filters.ipynb | 68 +++++++++++++-------------- 2 files changed, 61 insertions(+), 42 deletions(-) diff --git a/Appendix_Symbols_and_Notations.ipynb b/Appendix_Symbols_and_Notations.ipynb index 5fa8ea5..64a194a 100644 --- a/Appendix_Symbols_and_Notations.ipynb +++ b/Appendix_Symbols_and_Notations.ipynb @@ -1,7 +1,7 @@ { "metadata": { "name": "", - "signature": "sha256:a8b813a30b8d5fe32572b34d0caaf86174716f527fd81c8e83766fd8b0c6d59b" + "signature": "sha256:431420b9bd40c967d24256d0494aae8c585f43685d2ca3e9edae56b410d40d58" }, "nbformat": 3, "nbformat_minor": 0, @@ -355,15 +355,34 @@ "M_{k} &= \\Phi_k P_{k-1}\\phi_k^T + Q_{k} \\\\\n", "K_k &= M_kH^T[HM_kH^T + R_k]^{-1}\\\\\n", "P_k &= (I-K_kH)M_k\n", - "\\end{aligned}$$\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n" + "\\end{aligned}$$" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Wikipedia\n", + "$$\n", + "\\begin{aligned}\n", + "\\hat{\\textbf{x}}_{k\\mid k-1} &= \\textbf{F}_{k}\\hat{\\textbf{x}}_{k-1\\mid k-1} + \\textbf{B}_{k} \\textbf{u}_{k} \\\\\n", + "\\textbf{P}_{k\\mid k-1} &= \\textbf{F}_{k} \\textbf{P}_{k-1\\mid k-1} \\textbf{F}_{k}^{\\text{T}} + \\textbf{Q}_{k}\\\\\n", + "\\tilde{\\textbf{y}}_k &= \\textbf{z}_k - \\textbf{H}_k\\hat{\\textbf{x}}_{k\\mid k-1} \\\\\n", + "\\textbf{S}_k &= \\textbf{H}_k \\textbf{P}_{k\\mid k-1} \\textbf{H}_k^\\text{T} + \\textbf{R}_k \\\\\n", + "\\textbf{K}_k &= \\textbf{P}_{k\\mid k-1}\\textbf{H}_k^\\text{T}\\textbf{S}_k^{-1} \\\\\n", + "\\hat{\\textbf{x}}_{k\\mid k} &= \\hat{\\textbf{x}}_{k\\mid k-1} + \\textbf{K}_k\\tilde{\\textbf{y}}_k \\\\\n", + "\\textbf{P}_{k|k} &= (I - \\textbf{K}_k \\textbf{H}_k) \\textbf{P}_{k|k-1}\n", + "\\end{aligned}$$" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [], + "language": "python", + "metadata": {}, + "outputs": [] + }, { "cell_type": "code", "collapsed": false, diff --git a/Multidimensional_Kalman_Filters.ipynb b/Multidimensional_Kalman_Filters.ipynb index f2aaa1e..9df702c 100644 --- a/Multidimensional_Kalman_Filters.ipynb +++ b/Multidimensional_Kalman_Filters.ipynb @@ -1,7 +1,7 @@ { "metadata": { "name": "", - "signature": "sha256:2018504aed707108ce52dbe73eda7120be80a09fd1ca2e26123013828660bdc6" + "signature": "sha256:326afc9d59ba0acf8577d2d3b4ccdfe95d26c615a934eef4c0991fbd68930da8" }, "nbformat": 3, "nbformat_minor": 0, @@ -835,8 +835,8 @@ "\n", "> $$\n", "\\begin{aligned}\n", - "\\hat{\\mathbf{x}}_{t|t-1} &= \\mathbf{\\Phi_t}\\hat{\\mathbf{x}}_{t-1} + \\mathbf{B u}_t\\;\\;\\;&(1) \\\\\n", - "\\mathbf{P}_{t|t-1} &= \\mathbf{\\Phi_tP}_{t-1}\\mathbf{\\Phi}^T_t + \\mathbf{Q}_t\\;\\;\\;&(2)\n", + "\\hat{\\mathbf{x}}_{k|k-1} &= \\mathbf{\\Phi_k}\\hat{\\mathbf{x}}_{k-1} + \\mathbf{B u}_k\\;\\;\\;&(1) \\\\\n", + "\\mathbf{P}_{k|k-1} &= \\mathbf{\\Phi_kP}_{k-1}\\mathbf{\\Phi}^T_k + \\mathbf{Q}_k\\;\\;\\;&(2)\n", "\\end{aligned}\n", "$$\n", "\n", @@ -844,11 +844,11 @@ "\n", ">$$\n", "\\begin{aligned}\n", - "\\mathbf{\\gamma} &= \\mathbf{z}_t - \\mathbf{H}_t\\hat{\\mathbf{x}}_t\\;\\;\\;&(3) \\\\\n", - "\\mathbf{K}_t &= \\mathbf{P}_t \\mathbf{H}^T_t (\\mathbf{H}_t \\mathbf{P}_t \\mathbf{H}^T_t + \\mathbf{R}_t)^{-1}\\;\\;\\;(4) \\\\\n", - "\\\\\n", - "\\hat{\\mathbf{x}}_t &= \\hat{\\mathbf{x}}_{t|t-1} + \\mathbf{K}_t \\gamma \\;\\;\\;&(5) \\\\\n", - "\\mathbf{P}_{t|t} &= (\\mathbf{I} - \\mathbf{K}_t \\mathbf{H}_t)\\mathbf{P}_{t|t-1} \\;\\;\\;&(6)\n", + "\\tilde{\\textbf{y}}_k &= \\textbf{z}_k - \\textbf{H}_k\\hat{\\textbf{x}}_{k\\mid k-1}\\;\\;\\;&(3) \\\\\n", + "\\textbf{S}_k &= \\textbf{H}_k \\textbf{P}_{k\\mid k-1} \\textbf{H}_k^\\text{T} + \\textbf{R}_k\\;\\;\\;&(4) \\\\\n", + "\\textbf{K}_k &= \\textbf{P}_{k\\mid k-1}\\textbf{H}_k^\\text{T}\\textbf{S}_k^{-1}\\;\\;\\;&(5) \\\\\n", + "\\hat{\\textbf{x}}_{k\\mid k} &= \\hat{\\textbf{x}}_{k\\mid k-1} + \\textbf{K}_k\\tilde{\\textbf{y}}_k \\;\\;\\;&(6)\\\\\n", + "\\textbf{P}_{k|k} &= (I - \\textbf{K}_k \\textbf{H}_k) \\textbf{P}_{k|k-1}\\;\\;\\;&(7)\n", "\\end{aligned}\n", "$$\n", "\n", @@ -859,7 +859,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "These are nothing more than linear algebra equations that implement the algorithm we used in the last chapter, but using multidimensional Gaussians instead of univariate Gaussians, and optimized for a least squares fit. Each capital letter denotes a matrix or vector. The subscripts indicate which time step the data comes from; $t$ is now, $t-1$ is the previous step. $A^T$ is the transpose of A, and $A^{-1}$ is the inverse. Finally, the hat denotes an estimate, so $\\hat{x}_t$ is the estimate of $x$ at time $t$." + "These are nothing more than linear algebra equations that implement the algorithm we used in the last chapter, but using multidimensional Gaussians instead of univariate Gaussians, and optimized for a least squares fit. Each capital letter denotes a matrix or vector. The subscripts indicate which time step the data comes from; $k$ is now, $k-1$ is the previous step. $A^T$ is the transpose of the matrix A, and $A^{-1}$ is the inverse. Finally, the hat denotes an estimate, so $\\hat{x}_k$ is the estimate of $x$ at time $k$." ] }, { @@ -874,18 +874,20 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Different texts use different notation and variable names for the Kalman filter. Later we will expose you to these different forms to prepare you for reading the original literature. However, I find much of the notation very dense, and unnecessary for writing code. The subscripts indicate the time step, but we know the left hand side is for this time step, and the right hand side is for the previous step. For most of this book I'm going to use the following simplified equations, which express an algorithm.\n", + "Different texts use different notation and variable names for the Kalman filter. Later we will expose you to these different forms to prepare you for reading the original literature. In the equations above I adopted the notation used by the Wikipedia article[2] on Kalman filters. I find the choices made there particularly clear and easy to read, and it has the added benefit of being open to all.\n", + "\n", + "However, I still find much of the notation very dense, and unnecessary for writing code. The subscripts indicate the time step, but we know the left hand side is for this time step, and the right hand side is for the previous step. For most of this book I'm going to use the following simplified equations, which express an algorithm.\n", "\n", "$$\n", "\\begin{aligned}\n", "\\text{Predict Step}\\\\\n", - "\\mathbf{x}' &= \\mathbf{F x} + \\mathbf{B u}\\;\\;\\;&(1) \\\\\n", + "\\mathbf{x} &= \\mathbf{F x} + \\mathbf{B u}\\;\\;\\;&(1) \\\\\n", "\\mathbf{P} &= \\mathbf{FP{F}}^T + \\mathbf{Q}\\;\\;\\;&(2) \\\\\n", "\\\\\n", "\\text{Update Step}\\\\\n", - "\\mathbf{\\gamma} &= \\mathbf{z} - \\mathbf{H x} \\;\\;\\;&(3)\\\\\n", + "\\tilde{\\textbf{y}} &= \\mathbf{z} - \\mathbf{H x} \\;\\;\\;&(3)\\\\\n", "\\mathbf{K}&= \\mathbf{PH}^T (\\mathbf{HPH}^T + \\mathbf{R})^{-1}\\;\\;\\;&(4) \\\\\n", - "\\mathbf{x}&=\\mathbf{x}' +\\mathbf{K\\gamma} \\;\\;\\;&(5)\\\\\n", + "\\mathbf{x}&=\\mathbf{x} +\\mathbf{K\\tilde{\\textbf{y}}} \\;\\;\\;&(5)\\\\\n", "\\mathbf{P}&= (\\mathbf{I}-\\mathbf{KH})\\mathbf{P}\\;\\;\\;&(6)\n", "\\end{aligned}\n", "$$" @@ -897,8 +899,6 @@ "source": [ "This is an algorithm, so $=$ denotes assignment, not equality. For example, equation (6) has P on both sides of the $=$. This equation updates the value of P by the computation on the right hand side. \n", "\n", - "Here, a $'$ means estimate, so $\\mathbf{x}'$ is the estimate of the state $\\mathbf{x}$. Many texts use $\\hat{\\mathbf{x}}$ or $\\mathbf{x}^*$ to express this. I find these choices unfortunate because we will often want to express these in matrix form. The notation of a hat over a matrix is clumsy at best, and an asterisk followin a matrix normally means the *complex congugate* of the matrix, which is *not* what is intended. So I use $'$. \n", - "\n", "What do all of the variables mean? What is $\\mathbf{P}$, for example? Don't worry right now. Instead, I am just going to design a Kalman filter, and introduce the names as we go. Then we will just pass them into Python function that implement the equations above, and we will have our solution. Later sections will then delve into more detail about each step and equation. I think learning by example and practice is far easier than trying to memorize a dozen abstract facts at once. \n", "\n", "Look at the code below for the predict step (which we will present a bit later). \n", @@ -959,17 +959,17 @@ "\n", "We know from elementary physics how to compute a new position given a previous position, velocity, and time, like so:\n", "\n", - "$$ x' = {velocity}*{time} + x_{previous}$$\n", + "$$ \\hat{x} = {velocity}*{time} + x_{previous}$$\n", "\n", "In more formal mathematics we would write:\n", "\n", - "$$x' = \\dot{x}(\\Delta t) + x$$\n", + "$$\\hat{x} = \\dot{x}(\\Delta t) + x$$\n", "\n", "where $\\dot{x}$ is velocity, and $\\Delta t$ is the amount of time between $t-1$ and $t$. In our problems we will be running the Kalman filter at fixed time intervals, so $\\Delta t$ is a constant for us. We will just set it to $1$ and worry about the units later.\n", "\n", "As in step one we must express this in the form of matrices so that our linear algebra software and solve the equations for us. The Kalman filter equations require that we write it in the form:\n", "\n", - "$$ \\mathbf{x}' = \\mathbf{Fx}$$\n", + "$$ \\mathbf{\\hat{x}} = \\mathbf{Fx}$$\n", "\n", "where as in step 1 $\\mathbf{x}$ is the matrix containing the state variables, and $\\mathbf{F}$ is the matrix that when multiplied by $\\mathbf{x}$ yields our equations. Note that this is just part of the Kalman filter equation (1) above. We will deal with the second half of equation (1) in the next step.\n", "\n", @@ -985,8 +985,8 @@ "When we multiply the first row of $\\mathbf{F}$ that out we get:\n", "$$ \n", "\\begin{aligned}\n", - "x' &= 1 \\times x + \\Delta t * \\dot{x} \\mbox{, or} \\\\\n", - "x' &= \\dot{x}(\\Delta t) + x\n", + "\\hat{x} &= 1 \\times x + \\Delta t * \\dot{x} \\mbox{, or} \\\\\n", + "\\hat{x} &= \\dot{x}(\\Delta t) + x\n", "\\end{aligned}\n", "$$\n", "\n", @@ -994,21 +994,21 @@ "\n", "Therefore we will assume that\n", "\n", - "$$\\dot{x}' = \\dot{x}$$\n", + "$$\\hat{\\dot{x}} = \\dot{x}$$\n", "\n", "which gives us the second row of $\\mathbf{F}$ as follows, once we set $\\Delta t = 1$:\n", "\n", "\n", "$$\n", - "{\\begin{bmatrix}x\\\\\\dot{x}\\end{bmatrix}}' =\\begin{bmatrix}1&1 \\\\ 0&1\\end{bmatrix} \\times \\begin{bmatrix}x \\\\ \\dot{x}\\end{bmatrix}\n", + "\\hat{x} = {\\begin{bmatrix}x\\\\\\dot{x}\\end{bmatrix}} =\\begin{bmatrix}1&1 \\\\ 0&1\\end{bmatrix} \\times \\begin{bmatrix}x \\\\ \\dot{x}\\end{bmatrix}\n", "$$\n", "\n", "Which, when multiplied out, yields our desired equations:\n", "\n", "$$\n", "\\begin{aligned}\n", - "x' &= x + \\dot{x} \\\\\n", - "\\dot{x}' &= \\dot{x}\n", + "\\hat{x} &= x + \\dot{x} \\\\\n", + "\\hat{\\dot{x}} &= \\dot{x}\n", "\\end{aligned}\n", "$$\n", "\n", @@ -1034,7 +1034,7 @@ "\n", "We will cover this use case later, but for now passive tracking applications we set those terms to 0. In step 2 there was the unexplained term $\\mathbf{Bu}$ in equation (1):\n", "\n", - "$$\\mathbf{x}' = \\mathbf{Fx} + \\mathbf{Bu}$$.\n", + "$$\\mathbf{\\hat{x}} = \\mathbf{Fx} + \\mathbf{Bu}$$.\n", "\n", "Here $\\mathbf{u}$ is the control input, and $\\mathbf{B}$ is its transfer function. For example, $\\mathbf{u}$ might be a voltage controlling how fast the wheel's motor turns, and multiplying by $\\mathbf{B}$ yields $\\begin{smallmatrix}x\\\\\\dot{x}\\end{smallmatrix}$. Since we do not need these terms we will set them both to zero and not concern ourselves with them for now.\n" ] @@ -1055,7 +1055,7 @@ "\n", "In the nomenclature of Kalman filters the $[1\\space\\space0]$ matrix is called $H$. If you scroll up to the Kalman filter equations you will see an $H$ term in the update step.\n", "\n", - "$$\\mathbf{\\gamma} = \\mathbf{z} - \\mathbf{H x}\\tag{3}$$\n", + "$$\\tilde{y} = \\mathbf{z} - \\mathbf{H x}\\tag{3}$$\n", "\n", "Believe it or not, we have designed the majority of our Kalman filter!! All that is left is to model the noise in our sensors." ] @@ -1435,7 +1435,7 @@ "\n", "$$\n", "\\begin{aligned}\n", - "\\mathbf{x}' &= \\mathbf{F x} + \\mathbf{B u} \\\\\n", + "\\mathbf{\\hat{x}} &= \\mathbf{F x} + \\mathbf{B u} \\\\\n", "\\mathbf{P} &= \\mathbf{FP{F}}^T + \\mathbf{Q} \n", "\\end{aligned}\n", "$$\n", @@ -1461,9 +1461,9 @@ "\n", "$$\n", "\\begin{aligned}\n", - "\\mathbf{\\gamma} &= \\mathbf{z} - \\mathbf{H}\\hat{\\mathbf{x}} \\\\\n", + "\\mathbf{\\tilde{y}} &= \\mathbf{z} - \\mathbf{H}\\hat{\\mathbf{x}} \\\\\n", "\\mathbf{K} &= \\mathbf{P} \\mathbf{H}^T (\\mathbf{H} \\mathbf{P} \\mathbf{H}^T +\\mathbf{R})^{-1} \\\\\n", - "\\hat{\\mathbf{x}} &= \\hat{\\mathbf{x}} + \\mathbf{K} \\gamma \\\\\n", + "\\hat{\\mathbf{x}} &= \\hat{\\mathbf{x}} + \\mathbf{K} \\mathbf{\\tilde{y}} \\\\\n", "\\mathbf{P} &= (\\mathbf{I} - \\mathbf{K} \\mathbf{H})\\mathbf{P}\n", "\\end{aligned}\n", "$$\n", @@ -1548,11 +1548,9 @@ " \n", "This code will use self.R if you do not provide a value for `R`. If you did provide a value, it will check if you provided a scalar (number); if so it constructs a matrix of the correct dimension for you. Otherwise it assumes that you passed in a matrix of the correct dimension.\n", "\n", - "The rest of the code implements the Kalman filter equations, with one exception. Instead of implementing $\\mathbf{P} = (\\mathbf{I} - \\mathbf{KH})\\mathbf{P}$ it implements the somewhat more complicated form $\\mathbf{P} = (\\mathbf{I} - \\mathbf{KH})\\mathbf{P}(\\mathbf{I} - \\mathbf{KRK})^T$.\n", + "The rest of the code implements the Kalman filter equations, with one exception. Instead of implementing $\\mathbf{P} = (\\mathbf{I} - \\mathbf{KH})\\mathbf{P}$ it implements the somewhat more complicated form $\\mathbf{P} = (\\mathbf{I} - \\mathbf{KH})\\mathbf{P}(\\mathbf{I} - \\mathbf{KRK})^T$. The reason for this altered equation is somewhat beyond the scope of this chapter. For now, I will say that this equation is more numerically stable than the former equation, at the cost of being a bit more expensive to compute. It is not always possible to find the optimal value for $\\text{K}$, in which case the former equation will not produce good results because it assumes optimality. The longer reformulation used in the code is derived from more general math that does not assume optimality, and hence provides good results for nonoptimal filters (such as when we can not correctly model our measurement error).\n", "\n", - "The reason for this altered equation is somewhat beyond the scope of this chapter. For now, I will say that this equation is more numerically stable than the former equation, at the cost of being a bit more expensive to compute. It is not always possible to find the optimal value for $\\text{K}$, in which case the former equation will not produce good results because it assumes optimality. The longer reformulation used in the code is derived from more general math that does not assume optimality, and hence provides good results for nonoptimal filters (such as when we can not correctly model our measurement error).\n", - "\n", - "Various texts differ as to whether this form of the equation should always be used, or only used when you know you need it. I choose to expend a bit more processing power to ensure stability; if your hardware is very constrained and you are able to prove that the simpler equation is correct for your problem then you might choose to use it instead. Personally, I find that a risky approach and do not recommend it to non-experts. Brown's *Introduction to Random Signals and Applied Kalman Filtering*[2] discusses this issue in some detail, if you are interested." + "Various texts differ as to whether this form of the equation should always be used, or only used when you know you need it. I choose to expend a bit more processing power to ensure stability; if your hardware is very constrained and you are able to prove that the simpler equation is correct for your problem then you might choose to use it instead. Personally, I find that a risky approach and do not recommend it to non-experts. Brown's *Introduction to Random Signals and Applied Kalman Filtering* [3] discusses this issue in some detail, if you are interested." ] }, { @@ -2250,7 +2248,9 @@ "source": [ "[1] http://docs.scipy.org/doc/scipy/reference/tutorial/stats.htmly\n", "\n", - "[2] Brown, Robert Grover. *Introduction to Random Signals and Applied Kalman Filtering* John Wiley & Sons, Inc. 2012" + "[2] https://en.wikipedia.org/wiki/Kalman_filter\n", + "\n", + "[3] Brown, Robert Grover. *Introduction to Random Signals and Applied Kalman Filtering* John Wiley & Sons, Inc. 2012" ] } ],