Improved description of state-space methods.
This commit is contained in:
parent
26125c538f
commit
7905298583
@ -292,7 +292,7 @@
|
||||
"source": [
|
||||
"## Modeling a Dynamic System that Has Noise\n",
|
||||
"\n",
|
||||
"Modeling dynamic systems is properly the topic of several undergraduate and graduate courses in mathematics. To an extent there is no substitute for a few semesters of ordinary and partial differential equations followed by a graduate course in control sytem theory. If you are a hobbyist, or trying to solve one very specific filtering problem at work you probably do not have the time and/or inclination to devote a year or more to that education.\n",
|
||||
"A *dynamic system* is a physical systems whose state evolves over time. Modeling dynamic systems is properly the topic of several undergraduate and graduate courses in mathematics. To an extent there is no substitute for a few semesters of ordinary and partial differential equations followed by a graduate course in control sytem theory. If you are a hobbyist, or trying to solve one very specific filtering problem at work you probably do not have the time and/or inclination to devote a year or more to that education.\n",
|
||||
"\n",
|
||||
"However, I can present enough of the theory to allow us to create the system equations for many different Kalman filters, and give you enough background to at least follow the mathematics in the literature. My goal is to get you to the stage where you can read a Kalman filtering book or paper and understand it well enough to implement the algorithms. The background math is deep, but we end up using a few simple techniques over and over again in practice.\n",
|
||||
"\n",
|
||||
@ -338,7 +338,7 @@
|
||||
"\n",
|
||||
"$$ f(\\mathbf{x}) = \\mathbf{Ax} + \\mathbf{w}$$\n",
|
||||
"\n",
|
||||
"Finally, we need to consider inputs into the system. We are dealing with linear problems here, so we will assume that there is some input $u$ into the system, and that we have some linear model that defines how that input changes the system. For example, if you press down on the accelerator in your car the car will accelerate. We will need a matrix $\\mathbf{B}$ to convert $u$ into the effect on the system. We add that into our equation:\n",
|
||||
"Finally, we need to consider inputs into the system. We are dealing with linear problems here, so we will assume that there is some input $u$ into the system, and that we have some linear model that defines how that input changes the system. For example, pressing the accelerator in your car makes it accelerate, and gravity causes balls to fall. Both are contol inputs. We will need a matrix $\\mathbf{B}$ to convert $u$ into the effect on the system. We add that into our equation:\n",
|
||||
"\n",
|
||||
"$$ f(\\mathbf{x}) = \\mathbf{Ax} + \\mathbf{Bu} + \\mathbf{w}$$\n",
|
||||
"\n",
|
||||
@ -356,16 +356,18 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The equation $\\mathbf{v} = \\frac{d \\mathbf{x}}{d t}$ is the simplest possible differential equation. We trivially integrate it into the high school physics equation $x = vt + x_0$. Almost all other differential equations encountered in physical systems will not yield to this approach. \n",
|
||||
"The equation $\\mathbf{v} = \\frac{d \\mathbf{x}}{d t}$ is the simplest possible differential equation. We trivially integrate it into the high school physics equation $x = v\\Delta t + x_0$. We then use a computer to compute the position $\\mathbf{x}$ for any arbitrary time step $\\Delta t$. We call this equation *discretized* because we can compute the state for a discrete time step $\\Delta t$. Almost all other differential equations encountered in physical systems will not yield to this approach. \n",
|
||||
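"\n",
"To make the discretization concrete, here is a minimal sketch in code (the velocity and time step are arbitrary values chosen only for illustration):\n",
"\n",
"```python\n",
"x, v, dt = 0., 2., 0.25    # initial position, velocity, time step\n",
"for _ in range(10):\n",
"    x = v*dt + x           # x_k = v*dt + x_(k-1)\n",
"print(x)                   # 5.0 -- ten 0.25 s steps at 2 m/s\n",
"```\n",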
"\n",
|
||||
"*State-space* methods became popular around the time of the Apollo missions, in no small part due to the work of Dr. Kalman. The idea is simple. First, we convert a system of $n^{th}$-order differential equations into an equivalent set of first-order differential systems. We can then represent that set of first order equations as in vector matrix form. Once in this form we use the formidable powers of linear algebra to solve the system of equations. The Kalman filter is an example of this power. "
|
||||
"*State-space* methods became popular around the time of the Apollo missions, largely due to the work of Dr. Kalman. The idea is simple. Start with a system of $n^{th}$-order differential equations which model a dynamic system. Next, convert the equations into an equivalent set of first-order differential equations. We can then represent that set of first order equations in vector matrix form. These equations are continuous. We wish to computes these equations with a computer, so we use another technique to *discretize* the equations. Once in this form we use the formidable powers of linear algebra to solve the system of equations. The Kalman filter is an example of this power. \n",
|
||||
"\n",
|
||||
"This presentation and this book is limited to dynamic systems can be represented with differential equation. Not all systems can be represented in this form. *Hybrid System Theory* tackles those sorts of systems."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Let's put this into mathematical notation so you can see what I mean. You will not need to memorize these equations, so don't try to work through each step. We have a $n^{th}$-order differential equation with control input $u$ which describes our system:\n",
|
||||
"Let's put this into mathematical notation. We have a $n^{th}$-order differential equation with control input $u$ which describes our system:\n",
|
||||
"\n",
|
||||
"$$a_n \\frac{d^ny}{dt^n} + a_{n-1} \\frac{d^{n-1}y}{dt^{n-1}} + \\dots + a_2 \\frac{d^2y}{dt^2} + a_1 \\frac{dy}{dt} + a_0 = u$$\n",
|
||||
"\n",
|
||||
@ -392,11 +394,11 @@
|
||||
"\n",
|
||||
"$$\\frac{d\\mathbf{x}}{dt} = \\mathbf{Ax} + \\mathbf{bu}$$\n",
|
||||
"\n",
|
||||
"which you will recognize this form from the previous section. We are interested in the state $\\mathbf{x}$, not $\\frac{d\\mathbf{x}}{dt}$, so we will need a technique to find the *fundamental matrix* $\\Phi$. The fundamental matrix propogates the state from time $t_0$ to time $t_1$ with\n",
|
||||
"which you will recognize this form from the previous section. We are interested in the state $\\mathbf{x}$, not $\\frac{d\\mathbf{x}}{dt}$, so we will need a technique to find the *fundamental matrix* $\\Phi$ for that first order differential equation. The fundamental matrix discretizes the differential equation by propogating the system state from time $t_0$ to time $t_1$ with the recursive equation:\n",
|
||||
"\n",
|
||||
"$$\\mathbf{x}(t) = \\Phi(t_1-t_0)\\mathbf{x}(t_0)$$\n",
|
||||
"\n",
|
||||
"This is the state transition function matrix that we use in the Kalman filter predict step."
|
||||
"$\\Phi$ is the state transition function matrix $\\mathbf{F}$ that we use in the Kalman filter predict step."
|
||||
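"\n",
"As a quick sketch (with made-up numbers), here is what this propagation looks like for the constant velocity model used throughout this book:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"dt = 1.\n",
"phi = np.array([[1., dt],\n",
"                [0., 1.]])   # Phi for the constant velocity model\n",
"x0 = np.array([[0.],         # position\n",
"               [2.]])        # velocity\n",
"x1 = phi @ x0                # x(t1) = Phi(t1 - t0) x(t0)\n",
"print(x1.T)                  # [[2. 2.]]: position advanced by v*dt\n",
"```"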
]
|
||||
},
|
||||
{
|
||||
@ -432,7 +434,7 @@
|
||||
"\\frac{dx}{dt} = f(x)\n",
|
||||
"$$ \n",
|
||||
"\n",
|
||||
"Using the *separation of variables* techniques, we divide by $f(x)$ and move the $dt$ term to the right so we can integrate each side:\n",
|
||||
"Using the *separation of variables* techniques we divide by $f(x)$ and move the $dt$ term to the right so we can integrate each side:\n",
|
||||
"\n",
|
||||
"$$\n",
|
||||
"\\int^x_{x_0} \\frac{1}{f(x)} dx = \\int^t_{t_0} dt\\\\\n",
|
||||
@ -449,7 +451,7 @@
|
||||
"\n",
|
||||
"In other words, we need to find the inverse of $F$. This is not at all trivial, and a significant amount of coursework in a STEM education is devoted to finding tricky, analytic solutions to this problem. \n",
|
||||
"\n",
|
||||
"In the end, however, they are tricks, and many simple forms of $f(x)$ either have no closed form solution or pose extreme difficulties. Instead, the practicing engineer turns to state-space methods to find solutions."
|
||||
"However, they are tricks, and many simple forms of $f(x)$ either have no closed form solution or pose extreme difficulties. Instead, the practicing engineer turns to state-space methods to find solutions."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -458,7 +460,7 @@
|
||||
"source": [
|
||||
"### Forming First Order Equations from Higher Order Equations\n",
|
||||
"\n",
|
||||
"Models of physical systems often require second or higher order equations. However, state-space methods require first-order differential equations, which are equations with only first derivatives. Any higher order system of equations can be converted to a first order set of equations by defining extra variables for the first order terms and then solving. \n",
|
||||
"Many models of physical systems require second or higher order equations. State-space methods require first-order differential equations. Any higher order system of equations can be converted to a first order set of equations by defining extra variables for the first order terms and then solving. \n",
|
||||
"\n",
|
||||
"Let's do an example. Given the system $\\ddot{x} - 6\\dot{x} + 9x = t$ find the first order equations.\n",
|
||||
"\n",
|
||||
@ -472,7 +474,7 @@
|
||||
"x_2(t) = \\dot{x}\n",
|
||||
"$$\n",
|
||||
"\n",
|
||||
"Now we will substitute these into the original equation and solve, giving us a set of first order equations in terms of these new variables.\n",
|
||||
"Now we will substitute these into the original equation and solve, giving us a set of first order equations in terms of these new variables. It is conventional to drop the $(t)$ for notational convenience.\n",
|
||||
"\n",
|
||||
"First, we know that $\\dot{x}_1 = x_2$ and that $\\dot{x}_2 = \\ddot{x}$. Therefore\n",
|
||||
"\n",
|
||||
@ -510,6 +512,12 @@
|
||||
"\n",
|
||||
"You will recognize as the state transition matrix we use in the prediction step of the Kalman filter. We want to compute the value of $\\mathbf{x}$ at time $t$ by multiplying its previous value by some matrix $\\Phi$.\n",
|
||||
"\n",
|
||||
"It is conventional to drop the $t_k$ and use the notation\n",
|
||||
"\n",
|
||||
"$$\\mathbf{x}_k = \\Phi(\\Delta t)\\mathbf{x}_{k-1}$$\n",
|
||||
"\n",
|
||||
"$\\mathbf{x}_k$ does not mean the k$^{th}$ value of $\\mathbf{x}$, but the value of $\\mathbf{x}$ at the k$^{th}$ value of $t$.\n",
|
||||
"\n",
|
||||
"Broadly speaking there are three common ways to find this matrix for Kalman filters. The technique most often used with Kalman filters is to use a Taylor-series expansion. Linear Time Invariant Theory, also known as LTI System Theory, is a second technique. Finally, there are numerical techniques. You may know of others, but these three are what you will most likely encounter in the Kalman filter literature and praxis."
|
||||
]
|
||||
},
|
||||
@ -521,8 +529,6 @@
|
||||
"\n",
|
||||
"Taylor series represents a function as an infinite sum of terms. The terms are linear, even for a nonlinear function, so we can express any arbitrary nonlinear function using linear algebra. The cost of this choice is that unless we use an infinite number of terms the value we compute will be approximate rather than exact.\n",
|
||||
"\n",
|
||||
"For the Kalman filter we will be using a form of the series that uses a matrix. But before we do that, let's work through a couple of examples with real functions since real functions are easier to plot and reason about. The Taylor series for either are nearly identical, so this is a good first step.\n",
|
||||
"\n",
|
||||
"For a real or complex function the Taylor series of a function $f(x)$ evaluated at $t$ is defined as \n",
|
||||
"\n",
|
||||
"$$ \\Phi(t) = e^{\\mathbf{F}t} = \\mathbf{I} + \\mathbf{F}t + \\frac{(\\mathbf{F}t)^2}{2!} + \\frac{(\\mathbf{F}t)^3}{3!} + ... $$\n",
|
||||
@ -547,7 +553,7 @@
|
||||
"\n",
|
||||
"Now we can substitute these values into the equation.\n",
|
||||
"\n",
|
||||
"$$f(x) = \\frac{0}{0!}(x)^0 + \\frac{1}{1!}(x)^1 + \\frac{0}{2!}(x)^2 + \\frac{-1}{3!}(x)^3 + \\frac{0}{4!}(x)^4 + \\frac{-1}{5!}(x)^5 + ... $$\n",
|
||||
"$$\\sin(x) = \\frac{0}{0!}(x)^0 + \\frac{1}{1!}(x)^1 + \\frac{0}{2!}(x)^2 + \\frac{-1}{3!}(x)^3 + \\frac{0}{4!}(x)^4 + \\frac{-1}{5!}(x)^5 + ... $$\n",
|
||||
"\n",
|
||||
"And let's test this with some code:"
|
||||
]
|
||||
@ -597,7 +603,7 @@
|
||||
"\n",
|
||||
"$$\\Phi(t) = e^{\\mathbf{F}t} = \\mathbf{I} + \\mathbf{F}t + \\frac{(\\mathbf{F}t)^2}{2!} + \\frac{(\\mathbf{F}t)^3}{3!} + ... $$\n",
|
||||
"\n",
|
||||
"If you perform the multiplication you will find that $\\mathbf{F}^2=\\begin{bmatrix}0&0\\\\0&0\\end{bmatrix}$, which means that all higher powers of $\\mathbf{F}$ are also $\\mathbf{0}$. This makes the computation very easy.\n",
|
||||
"If you perform the multiplication you will find that $\\mathbf{F}^2=\\begin{bmatrix}0&0\\\\0&0\\end{bmatrix}$, which means that all higher powers of $\\mathbf{F}$ are also $\\mathbf{0}$. Thus we get an exact answer without an infinite number of terms:\n",
|
||||
"\n",
|
||||
"$$\n",
|
||||
"\\begin{aligned}\n",
|
||||
@ -624,7 +630,7 @@
|
||||
"\n",
|
||||
"We derived this equation in that chapter by using techniques that are much easier to understand. The advantage of the Taylor series expansion is that we can use it for any arbitrary set of differential equations which are time invariant. \n",
|
||||
"\n",
|
||||
"However, we often use a Taylor expansion even when the equations are not time invariant. The answer will still be reasonably accurate so long as the time step is short and the system is nearly constant over that time step."
|
||||
"However, we often use a Taylor expansion even when the equations are not time invariant. As an aircraft flies it loses weight as it burns fuel. The answer will still be reasonably accurate so long as the time step is short and the system is nearly constant over that time step. The weight loss is neglible over one second, so the Taylor expansion will be sufficiently accurate for our purposes."
|
||||
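"\n",
"If you want to verify the result above numerically, SciPy computes the matrix exponential directly; for a constant velocity $\\mathbf{F}$ it reproduces the $\\Phi$ we just derived (a quick check, not something the filter needs at run time):\n",
"\n",
"```python\n",
"import numpy as np\n",
"from scipy.linalg import expm\n",
"\n",
"dt = 0.5\n",
"F = np.array([[0., 1.],\n",
"              [0., 0.]])\n",
"phi = expm(F*dt)   # the matrix exponential e^(F dt) that the series converges to\n",
"print(phi)         # [[1, 0.5], [0, 1]] -- the familiar [[1, dt], [0, 1]]\n",
"```"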
]
|
||||
},
|
||||
{
|
||||
@ -633,11 +639,11 @@
|
||||
"source": [
|
||||
"### Linear Time Invariant Theory\n",
|
||||
"\n",
|
||||
"*Linear Time Invariant Theory*, also known as LTI System Theory, gives us a way to find $\\Phi$ using the inverse Laplace transform. You are either nodding your head now, or completely lost. Don't worry, I will not be using the Laplace transform in this book except in this paragraph, as the computation is quite difficult to perform in practice. LTI system theory tells us that \n",
|
||||
"*Linear Time Invariant Theory*, also known as LTI System Theory, gives us a way to find $\\Phi$ using the inverse Laplace transform. You are either nodding your head now, or completely lost. Don't worry, I will not be using the Laplace transform in this book except in this paragraph, as the computation can be quite difficult. LTI system theory tells us that \n",
|
||||
"\n",
|
||||
"$$ \\Phi(t) = \\mathcal{L}^{-1}[(s\\mathbf{I} - \\mathbf{F})^{-1}]$$\n",
|
||||
"\n",
|
||||
"I have no intention of going into this other than to say that the inverse Laplace transform converts a signal into the frequency (time) domain, but finding a solution to the equation above is non-trivial. If you are interested, the Wikipedia article on LTI system theory provides an introduction [2]. I mention LTI because you will find some literature using it to design the Kalman filter matrices for difficult problem. "
|
||||
"I have no intention of going into this other than to say that the inverse Laplace transform $\\mathcal{L}^{-1}$ converts a signal into the frequency (time) domain, but finding a solution to the equation above is non-trivial. If you are interested, the Wikipedia article on LTI system theory provides an introduction [2]. I mention LTI because you will find some literature using it to design the Kalman filter matrices for difficult problems. "
|
||||
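"\n",
"If you are curious, SymPy can carry out the symbolic computation for small systems. This sketch finds $\\Phi$ for the constant velocity model (purely illustrative; we will not use this approach again):\n",
"\n",
"```python\n",
"import sympy as sp\n",
"\n",
"t, s = sp.symbols('t s')\n",
"F = sp.Matrix([[0, 1],\n",
"               [0, 0]])\n",
"M = (s*sp.eye(2) - F).inv()   # (sI - F)^-1\n",
"phi = M.applyfunc(lambda e: sp.inverse_laplace_transform(e, s, t))\n",
"print(phi)   # [[1, t], [0, 1]], possibly multiplied by Heaviside(t) unit steps\n",
"```"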
]
|
||||
},
|
||||
{
|
||||
@ -645,21 +651,24 @@
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Numerical Solutions\n",
|
||||
"Finally, there are numerical techniques to find $\\Phi$. As filters get larger finding analytical solutions becomes very tedious (though packages like SymPy make it easier). C. F. van Loan [3] has developed a technique that finds both $\\Phi$ and $Q$ numerically. Given the continuous model\n",
|
||||
"\n",
|
||||
"Finally, there are numerous numerical techniques to find $\\Phi$. As filters get larger finding analytical solutions becomes very tedious (though packages like SymPy make it easier). C. F. van Loan [3] has developed a technique that finds both $\\Phi$ and $\\mathbf{Q}$ numerically. Given the continuous model\n",
|
||||
"\n",
|
||||
"$$ x' = Fx + Gu$$\n",
|
||||
"\n",
|
||||
"where u is the unity white noise, we compute and return the $\\sigma$ and $Q_k$ that discretizes that equation.\n",
|
||||
"where $u$ is the unity white noise, van Loan's method computes the $\\Phi$ and $\\mathbf{Q}_k$ which discretizes that equation.\n",
|
||||
" \n",
|
||||
"I have implemented van Loan's method in `FilterPy`. You may use it as follows:\n",
|
||||
"\n",
|
||||
" from filterpy.common import van_loan_discretization\n",
|
||||
" \n",
|
||||
" F = np.array([[0,1],[-1,0]], dtype=float)\n",
|
||||
" G = np.array([[0.],[2.]]) # white noise scaling\n",
|
||||
" phi, Q = van_loan_discretization(F, G, dt=0.1)\n",
|
||||
"```python\n",
|
||||
"from filterpy.common import van_loan_discretization\n",
|
||||
"\n",
|
||||
"F = np.array([[0., 1.], [-1., 0.]])\n",
|
||||
"G = np.array([[0.], [2.]]) # white noise scaling\n",
|
||||
"phi, Q = van_loan_discretization(F, G, dt=0.1)\n",
|
||||
"```\n",
|
||||
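"\n",
"For a time invariant `F` such as this one, the returned `phi` is simply the matrix exponential $e^{\\mathbf{F}\\Delta t}$ from the Taylor series discussion above, which gives you a quick way to sanity check the result.\n",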
" \n",
|
||||
"As with LTI system theory, I do not intend to teach the topic of solving differential equations so I will not pursue this further."
|
||||
"In the section *Numeric Integration of Differential Equations* I present alternative methods which are very commonly used in Kalman filtering."
|
||||
]
|
||||
},
|
||||
{
|
||||
@ -1224,7 +1233,7 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Euler's Method\n",
|
||||
"### Euler's Method\n",
|
||||
"\n",
|
||||
"Let's say we have the initial condition problem of \n",
|
||||
"\n",
|
||||
|