Copy edit of math chapter.

I moved the conversion of the multivariate equations to univariate
equations to the supporting textbook. It's not terribly necessary,
especially since I converted the univariate equations to look like
the multivariate ones.
This commit is contained in:
Roger Labbe 2016-01-24 15:34:57 -08:00
parent 71da22ad8c
commit 53b058bbda
3 changed files with 450 additions and 135 deletions

View File

@ -606,7 +606,7 @@
"\n",
"A counter-example is $x(t) = \\sin(t)$, with the system $f(x) = t\\, x(t) = t \\sin(t)$. This is not time invariant; the value will be different at different times due to the multiplication by t. An aircraft is not time invariant. If you make a control input to the aircraft at a later time its behavior will be different because it will have burned fuel and thus lost weight. Lower weight results in different behavior.\n",
"\n",
"We can solve these equations by integrating each side. I demonstrated integrating the time invariants system $v = \\dot x$ above. However, integrating the time invariant equation $\\dot x = f(x)$ is not so straightforward. Using the *separation of variables* techniques we divide by $f(x)$ and move the $dt$ term to the right so we can integrate each side:\n",
"We can solve these equations by integrating each side. I demonstrated integrating the time invariant system $v = \\dot x$ above. However, integrating the time invariant equation $\\dot x = f(x)$ is not so straightforward. Using the *separation of variables* techniques we divide by $f(x)$ and move the $dt$ term to the right so we can integrate each side:\n",
"\n",
"$$\\begin{gathered}\n",
"\\frac{dx}{dt} = f(x) \\\\\n",
@ -685,7 +685,7 @@
"source": [
"### Linear Time Invariant Theory\n",
"\n",
"*Linear Time Invariant Theory*, also known as LTI System Theory, gives us a way to find $\\Phi$ using the inverse Laplace transform. You are either nodding your head now, or completely lost. Don't worry, I will not be using the Laplace transform in this book except in this paragraph, as the computation can be quite difficult. LTI system theory tells us that \n",
"*Linear Time Invariant Theory*, also known as LTI System Theory, gives us a way to find $\\Phi$ using the inverse Laplace transform. You are either nodding your head now, or completely lost. I will not be using the Laplace transform in this book. LTI system theory tells us that \n",
"\n",
"$$ \\Phi(t) = \\mathcal{L}^{-1}[(s\\mathbf{I} - \\mathbf{F})^{-1}]$$\n",
"\n",
@ -698,7 +698,7 @@
"source": [
"### Numerical Solutions\n",
"\n",
"Finally, there are numerous numerical techniques to find $\\mathbf F$. As filters get larger finding analytical solutions becomes very tedious (though packages like SymPy make it easier). C. F. van Loan [3] has developed a technique that finds both $\\Phi$ and $\\mathbf Q$ numerically. Given the continuous model\n",
"Finally, there are numerical techniques to find $\\mathbf F$. As filters get larger finding analytical solutions becomes very tedious (though packages like SymPy make it easier). C. F. van Loan [3] has developed a technique that finds both $\\Phi$ and $\\mathbf Q$ numerically. Given the continuous model\n",
"\n",
"$$ \\dot x = Ax + Gw$$\n",
"\n",
@ -721,13 +721,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Design of the Process Noise Matrix"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Design of the Process Noise Matrix\n",
"\n",
"In general the design of the $\\mathbf Q$ matrix is among the most difficult aspects of Kalman filter design. This is due to several factors. First, the math requires a good foundation in signal theory. Second, we are trying to model the noise in something for which we have little information. Consider trying to model the process noise for a thrown baseball. We can model it as a sphere moving through the air, but that leave many unknown factors - the wind, ball rotation and spin decay, the coefficient of drag of a ball with stitches, the effects of wind and air density, and so on. We develop the equations for an exact mathematical solution for a given process model, but since the process model is incomplete the result for $\\mathbf Q$ will also be incomplete. This has a lot of ramifications for the behavior of the Kalman filter. If $\\mathbf Q$ is too small then the filter will be overconfident in its prediction model and will diverge from the actual solution. If $\\mathbf Q$ is too large than the filter will be unduly influenced by the noise in the measurements and perform sub-optimally. In practice we spend a lot of time running simulations and evaluating collected data to try to select an appropriate value for $\\mathbf Q$. But let's start by looking at the math.\n",
"\n",
"\n",
@ -737,7 +732,7 @@
"\n",
"$$ \\dot{\\mathbf x} = \\mathbf{Ax} + \\mathbf{Bu} + \\mathbf{w}$$\n",
"\n",
"where $\\mathbf{w}$ is the process noise. Kinematic systems are *continuous* - their inputs and outputs can vary at any arbitrary point in time. However, our Kalman filters are *discrete*. We sample the system at regular intervals. Therefore we must find the discrete representation for the noise term in the equation above. This depends on what assumptions we make about the behavior of the noise. We will consider two different models for the noise."
"where $\\mathbf{w}$ is the process noise. Kinematic systems are *continuous* - their inputs and outputs can vary at any arbitrary point in time. However, our Kalman filters are *discrete* (there are continuous forms for Kalman filters, but we do not cover them in this book). We sample the system at regular intervals. Therefore we must find the discrete representation for the noise term in the equation above. This depends on what assumptions we make about the behavior of the noise. We will consider two different models for the noise."
]
},
{
@ -753,7 +748,7 @@
"source": [
"We model kinematic systems using Newton's equations. We have either used position and velocity, or position, velocity, and acceleration as the models for our systems. There is nothing stopping us from going further - we can model jerk, jounce, snap, and so on. We don't do that normally because adding terms beyond the dynamics of the real system degrades the estimate. \n",
"\n",
"Let's say that we need to model the position, velocity, and acceleration. We can then assume that acceleration is constant for each discrete time step. Of course, there is process noise in the system and so the acceleration is not actually constant. The tracked object will alter the acceleration over time due to external, unmodeled forces. In this section we will assume that the acceleration changes by a continuous time zero-mean white noise $w(t)$. In other words, we are assuming that velocity is acceleration changing by small amounts that over time average to 0 (zero-mean). \n",
"Let's say that we need to model the position, velocity, and acceleration. We can then assume that acceleration is constant for each discrete time step. Of course, there is process noise in the system and so the acceleration is not actually constant. The tracked object will alter the acceleration over time due to external, unmodeled forces. In this section we will assume that the acceleration changes by a continuous time zero-mean white noise $w(t)$. In other words, we are assuming that the small changes in velocity average to 0 over time (zero-mean). \n",
"\n",
"Since the noise is changing continuously we will need to integrate to get the discrete noise for the discretization interval that we have chosen. We will not prove it here, but the equation for the discretization of the noise is\n",
"\n",
@ -1215,7 +1210,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that most of the terms are very small. Recall that the only Kalman filter using this matrix is\n",
"We can see that most of the terms are very small. Recall that the only equation using this matrix is\n",
"\n",
"$$ \\mathbf P=\\mathbf{FPF}^\\mathsf{T} + \\mathbf Q$$\n",
"\n",
@ -1464,7 +1459,7 @@
"metadata": {},
"source": [
"\n",
"Runge Kutta integration is the workhorse of numerical integration. There are a vast number of methods in the literature. In practice, using the Runge Kutta algorithm that I present here will solve most any problem you will face. It offers a very good balance of speed, precision, and stability, and it is the 'go to' numerical integration method unless you have a very good reason to choose something different.\n",
"Runge Kutta is the workhorse of numerical integration. There are a vast number of methods in the literature. In practice, using the Runge Kutta algorithm that I present here will solve most any problem you will face. It offers a very good balance of speed, precision, and stability, and it is the 'go to' numerical integration method unless you have a very good reason to choose something different.\n",
"\n",
"Let's dive in. We start with some differential equation\n",
"\n",
@ -1591,7 +1586,7 @@
"\n",
"Starting in the Discrete Bayes chapter I used a Bayesian formulation for filtering. Suppose we are tracking an object. We define its *state* at a specific time as its position, velocity, and so on. For example, we might write the state at time $t$ as $\\mathbf x_t = \\begin{bmatrix}x_t &\\dot x_t \\end{bmatrix}^\\mathsf T$. \n",
"\n",
"When we take a measurement of the object we are measuring the state. Sensors are noisy, so the measurement is corrupted with noise. Clearly though, the measurement is determined by the state. That is, a change in state may change the measurement, but a change in measurement will not change the state.\n",
"When we take a measurement of the object we are measuring the state or part of it. Sensors are noisy, so the measurement is corrupted with noise. Clearly though, the measurement is determined by the state. That is, a change in state may change the measurement, but a change in measurement will not change the state.\n",
"\n",
"In filtering our goal is to compute an optimal estimate for a set of states $\\mathbf x_{0:t}$ from time 0 to time $t$. If we knew $\\mathbf x_{0:t}$ then it would be trivial to compute a set of measurements $\\mathbf z_{0:t}$ corresponding to those states. However, we receive a set of measurements $\\mathbf z_{0:t}$, and want to compute the corresponding states $\\mathbf x_{0:t}$. This is called *statistical inversion* because we are trying to compute the input from the output. \n",
"\n",
@ -1629,132 +1624,13 @@
"\n",
"The details of the mathematics for this computation varies based on the problem. The **Discrete Bayes** and **Univariate Kalman Filter** chapters gave two different formulations which you should have been able to reason through. The univariate Kalman filter assumes that for a scalar state both the noise and process are linear model are affected by zero-mean, uncorrelated Gaussian noise. \n",
"\n",
"The Multivariate Kalman filter make the same assumption but for states and measurements that are vectors, not scalars. Dr. Kalman was able to prove that if these assumptions hold true then the Kalman filter is *optimal*. Colloquially this means there is no way to derive more information from the noise. In the remainder of the book I will present filters that relax the constraints on linearity and Gaussian noise.\n",
"The Multivariate Kalman filter make the same assumption but for states and measurements that are vectors, not scalars. Dr. Kalman was able to prove that if these assumptions hold true then the Kalman filter is *optimal* in a least squares sense. Colloquially this means there is no way to derive more information from the noise. In the remainder of the book I will present filters that relax the constraints on linearity and Gaussian noise.\n",
"\n",
"Before I go on, a few more words about statistical inversion. As Calvetti and Somersalo write in *Introduction to Bayesian Scientific Computing*, \"we adopt the Bayesian point of view: *randomness simply means lack of information*.\"[4] Our state parametize physical phenomena that we could in principle measure or compute: velocity, air drag, and so on. We lack enough information to compute or measure their value, so we opt to consider them as random variables. Strictly speaking they are not random, thus this is a subjective position. \n",
"\n",
"They devote a full chapter to this topic. I can spare a paragraph. Bayesian filters are possible because we ascribe statistical properties to unknown parameters. In the case of the Kalman filter we have closed-form solutions to find an optimal estimate. Other filters, such as the discrete Bayes filter or the particle filter which we cover in a later chapter, model the probability in a more ad-hoc, non-optimal manner. The power of our technique comes from treating lack of information as a random variable, describing that random variable as a probability distribution, and then using Bayes Theorem to solve the statistical inference problem."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Converting the Multivariate Equations to the Univariate Case\n",
"\n",
"The multivariate Kalman filter equations do not resemble the equations for the univariate filter. However, if we use one dimensional states and measurements the equations do reduce to the univariate equations. This section will provide you with a strong intuition into what the Kalman filter equations are actually doing. While reading this section is not required to understand the rest of the book, I recommend reading this section carefully as it should make the rest of the material easier to understand.\n",
"\n",
"Here are the multivariate equations for the prediction. \n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"\\mathbf{\\bar{x}} &= \\mathbf{F x} + \\mathbf{B u} \\\\\n",
"\\mathbf{\\bar{P}} &= \\mathbf{FPF}^\\mathsf{T} + \\mathbf Q\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"For a univariate problem the state $\\mathbf x$ only has one variable, so it is a $1\\times 1$ matrix. Our motion $\\mathbf{u}$ is also a $1\\times 1$ matrix. Therefore, $\\mathbf{F}$ and $\\mathbf B$ must also be $1\\times 1$ matrices. That means that they are all scalars, and we can write\n",
"\n",
"$$\\bar{x} = Fx + Bu$$\n",
"\n",
"Here the variables are not bold, denoting that they are not matrices or vectors. \n",
"\n",
"Our state transition is simple - the next state is the same as this state, so $F=1$. The same holds for the motion transition, so, $B=1$. Thus we have\n",
"\n",
"$$x = x + u$$\n",
"\n",
"which is equivalent to the Gaussian equation from the last chapter\n",
"\n",
"$$ \\mu = \\mu_1+\\mu_2$$\n",
"\n",
"Hopefully the general process is clear, so now I will go a bit faster on the rest. We have\n",
"\n",
"$$\\mathbf{\\bar{P}} = \\mathbf{FPF}^\\mathsf{T} + \\mathbf Q$$\n",
"\n",
"Again, since our state only has one variable $\\mathbf P$ and $\\mathbf Q$ must also be $1\\times 1$ matrix, which we can treat as scalars, yielding \n",
"\n",
"$$\\bar{P} = FPF^\\mathsf{T} + Q$$\n",
"\n",
"We already know $F=1$. The transpose of a scalar is the scalar, so $F^\\mathsf{T} = 1$. This yields\n",
"\n",
"$$\\bar{P} = P + Q$$\n",
"\n",
"which is equivalent to the Gaussian equation of \n",
"\n",
"$$\\sigma^2 = \\sigma_1^2 + \\sigma_2^2$$\n",
"\n",
"This proves that the multivariate prediction equations are performing the same math as the univariate equations for the case of the dimension being 1.\n",
"\n",
"These are the equations for the update step:\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"\\mathbf{K}&= \\mathbf{\\bar{P}H}^\\mathsf{T} (\\mathbf{H\\bar{P}H}^\\mathsf{T} + \\mathbf R)^{-1} \\\\\n",
"\\textbf{y} &= \\mathbf z - \\mathbf{H \\bar{x}}\\\\\n",
"\\mathbf x&=\\mathbf{\\bar{x}} +\\mathbf{K\\textbf{y}} \\\\\n",
"\\mathbf P&= (\\mathbf{I}-\\mathbf{KH})\\mathbf{\\bar{P}}\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"As above, all of the matrices become scalars. $H$ defines how we convert from a position to a measurement. Both are positions, so there is no conversion, and thus $H=1$. Let's substitute in our known values and convert to scalar in one step. The inverse of a 1x1 matrix is the reciprocal of the value so we will convert the matrix inversion to division.\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"K &=\\frac{\\bar{P}}{\\bar{P} + R} \\\\\n",
"y &= z - \\bar{x}\\\\\n",
"x &=\\bar{x}+Ky \\\\\n",
"P &= (1-K)\\bar{P}\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"Before we continue with the proof, I want you to look at those equations to recognize what a simple concept these equations implement. The residual $y$ is nothing more than the measurement minus the prediction. The gain $K$ is scaled based on how certain we are about the last prediction vs how certain we are about the measurement. We choose a new state $x$ based on the old value of $x$ plus the scaled value of the residual. Finally, we update the uncertainty based on how certain we are about the measurement. Algorithmically this should sound exactly like what we did in the last chapter.\n",
"\n",
"Let's finish off the algebra to prove this. Recall that the univariate equations for the update step are:\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"\\mu &=\\frac{\\sigma_1^2 \\mu_2 + \\sigma_2^2 \\mu_1} {\\sigma_1^2 + \\sigma_2^2}, \\\\\n",
"\\sigma^2 &= \\frac{1}{\\frac{1}{\\sigma_1^2} + \\frac{1}{\\sigma_2^2}}\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"Here we will say that $\\mu_1$ is the state $x$, and $\\mu_2$ is the measurement $z$. Thus it follows that that $\\sigma_1^2$ is the state uncertainty $P$, and $\\sigma_2^2$ is the measurement noise $R$. Let's substitute those in.\n",
"\n",
"$$\\begin{aligned} \\mu &= \\frac{Pz + Rx}{P+R} \\\\\n",
"\\sigma^2 &= \\frac{1}{\\frac{1}{P} + \\frac{1}{R}}\n",
"\\end{aligned}$$\n",
"\n",
"I will handle $\\mu$ first. The corresponding equation in the multivariate case is\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"x &= x + Ky \\\\\n",
"&= x + \\frac{P}{P+R}(z-x) \\\\\n",
"&= \\frac{P+R}{P+R}x + \\frac{Pz - Px}{P+R} \\\\\n",
"&= \\frac{Px + Rx + Pz - Px}{P+R} \\\\\n",
"&= \\frac{Pz + Rx}{P+R}\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"Now let's look at $\\sigma^2$. The corresponding equation in the multivariate case is\n",
"\n",
"$$ \n",
"\\begin{aligned}\n",
"P &= (1-K)P \\\\\n",
"&= (1-\\frac{P}{P+R})P \\\\\n",
"&= (\\frac{P+R}{P+R}-\\frac{P}{P+R})P \\\\\n",
"&= (\\frac{P+R-P}{P+R})P \\\\\n",
"&= \\frac{RP}{P+R}\\\\\n",
"&= \\frac{1}{\\frac{P+R}{RP}}\\\\\n",
"&= \\frac{1}{\\frac{R}{RP} + \\frac{P}{RP}} \\\\\n",
"&= \\frac{1}{\\frac{1}{P} + \\frac{1}{R}}\n",
"\\quad\\blacksquare\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"We have proven that the multivariate equations are equivalent to the univariate equations when we only have one state variable. I'll close this section by recognizing one quibble - I hand waved my assertion that $H=1$ and $F=1$. In general we know this is not true. For example, a digital thermometer may provide measurement in volts, and we need to convert that to temperature, and we use $H$ to do that conversion. I left that issue out to keep the explanation as simple and streamlined as possible. It is very straightforward to add that generalization to the equations above, redo the algebra, and still have the same results."
]
},
{
"cell_type": "markdown",
"metadata": {},

View File

@ -0,0 +1,436 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<style>\n",
"@import url('http://fonts.googleapis.com/css?family=Source+Code+Pro');\n",
"@import url('http://fonts.googleapis.com/css?family=Lora');\n",
"\n",
"//@import url('http://fonts.googleapis.com/css?family=Open+Sans');\n",
"//@import url('http://fonts.googleapis.com/css?family=Vollkorn');\n",
"//@import url('http://fonts.googleapis.com/css?family=Karla');\n",
"//@import url('http://fonts.googleapis.com/css?family=Poppins');\n",
"//@import url('http://fonts.googleapis.com/css?family=Arimo');\n",
"//@import url('http://fonts.googleapis.com/css?family=Roboto');\n",
"//@import url('http://fonts.googleapis.com/css?family=Lato');\n",
"//@import url('http://fonts.googleapis.com/css?family=Domine');\n",
"//@import url('http://fonts.googleapis.com/css?family=Chivo');\n",
"//@import url('http://fonts.googleapis.com/css?family=Cardo');\n",
"//@import url('http://fonts.googleapis.com/css?family=Arvo');\n",
"//@import url('http://fonts.googleapis.com/css?family=Crimson+Text');\n",
"//@import url('http://fonts.googleapis.com/css?family=Ubuntu');\n",
"//@import url('http://fonts.googleapis.com/css?family=Fontin');\n",
"//@import url('http://fonts.googleapis.com/css?family=Raleway');\n",
"//@import url('http://fonts.googleapis.com/css?family=Merriweather');\n",
"\n",
"\n",
".CodeMirror pre {\n",
" font-family: 'Source Code Pro', Consolas, monocco, monospace;\n",
"}\n",
" div.cell{\n",
" width: 850px;\n",
" margin-left: 0% !important;\n",
" margin-right: auto;\n",
" }\n",
" div.text_cell_render{\n",
" font-family: 'Lora';\n",
" //font-family: 'Open Sans';\n",
" //font-family: 'Karla',verdana,arial,sans-serif;\n",
" //font-family: 'Roboto',verdana,arial,sans-serif;\n",
" //font-family: 'Lato',verdana,arial,sans-serif;\n",
" //font-family: 'Domine',verdana,arial,sans-serif;\n",
" //font-family: 'Chivo',verdana,arial,sans-serif;\n",
" //font-family: 'Cardo',verdana,arial,sans-serif;\n",
" //font-family: 'Arvo',verdana,arial,sans-serif;\n",
" //font-family: 'Poppins',verdana,arial,sans-serif; \n",
" //font-family: 'Ubuntu',verdana,arial,sans-serif;\n",
" //font-family: 'Fontin',verdana,arial,sans-serif;\n",
" //font-family: 'Raleway',verdana,arial,sans-serif;\n",
" //font-family: 'Merriweather',verdana,arial,sans-serif;\n",
" //font-family: 'Crimson Text', verdana,arial,sans-serif;\n",
" //font-family: verdana,arial,sans-serif;\n",
" //font-family: arial,sans-serif;\n",
" line-height: 125%;\n",
" font-size: 130%;\n",
" text-align: justify;\n",
" text-justify:inter-word;\n",
" }\n",
" div.text_cell code {\n",
" background: transparent;\n",
" color: #000000;\n",
" font-weight: 400;\n",
" font-size: 12pt;\n",
" //font-style: bold;\n",
" font-family: 'Source Code Pro', Consolas, monocco, monospace;\n",
" }\n",
" h1 {\n",
" font-family: 'Open sans',verdana,arial,sans-serif;\n",
"\t}\n",
"\t\n",
" div.input_area {\n",
" background: #F6F6F9;\n",
" border: 1px solid #586e75; \n",
" }\n",
"\n",
" .text_cell_render h1 {\n",
" font-weight: 200;\n",
" font-size: 30pt;\n",
" line-height: 100%;\n",
" color:#c76c0c;\n",
" margin-bottom: 0.5em;\n",
" margin-top: 1em;\n",
" display: block;\n",
" white-space: wrap;\n",
" text-align: left;\n",
" } \n",
" h2 {\n",
" font-family: 'Open sans',verdana,arial,sans-serif;\n",
" text-align: left;\n",
" }\n",
" .text_cell_render h2 {\n",
" font-weight: 200;\n",
" font-size: 16pt;\n",
" font-style: italic;\n",
" line-height: 100%;\n",
" color:#c76c0c;\n",
" margin-bottom: 0.5em;\n",
" margin-top: 1.5em;\n",
" display: block;\n",
" white-space: wrap;\n",
" text-align: left;\n",
" } \n",
" h3 {\n",
" font-family: 'Open sans',verdana,arial,sans-serif;\n",
" }\n",
" .text_cell_render h3 {\n",
" font-weight: 200;\n",
" font-size: 14pt;\n",
" line-height: 100%;\n",
" color:#d77c0c;\n",
" margin-bottom: 0.5em;\n",
" margin-top: 2em;\n",
" display: block;\n",
" white-space: wrap;\n",
" text-align: left;\n",
" }\n",
" h4 {\n",
" font-family: 'Open sans',verdana,arial,sans-serif;\n",
" }\n",
" .text_cell_render h4 {\n",
" font-weight: 100;\n",
" font-size: 14pt;\n",
" color:#d77c0c;\n",
" margin-bottom: 0.5em;\n",
" margin-top: 0.5em;\n",
" display: block;\n",
" white-space: nowrap;\n",
" }\n",
" h5 {\n",
" font-family: 'Open sans',verdana,arial,sans-serif;\n",
" }\n",
"\n",
" .text_cell_render h5 {\n",
" font-weight: 200;\n",
" font-style: normal;\n",
" color: #1d3b84;\n",
" font-size: 16pt;\n",
" margin-bottom: 0em;\n",
" margin-top: 0.5em;\n",
" display: block;\n",
" white-space: nowrap;\n",
" }\n",
" div.output_subarea.output_text.output_pyout {\n",
" overflow-x: auto;\n",
" overflow-y: scroll;\n",
" max-height: 50000px;\n",
" }\n",
" div.output_subarea.output_stream.output_stdout.output_text {\n",
" overflow-x: auto;\n",
" overflow-y: scroll;\n",
" max-height: 50000px;\n",
" }\n",
" div.output_wrapper{\n",
" margin-top:0.2em;\n",
" margin-bottom:0.2em;\n",
"}\n",
"\n",
" code{\n",
" font-size: 6pt;\n",
"\n",
" }\n",
" .rendered_html code{\n",
" background-color: transparent;\n",
" }\n",
" ul{\n",
" margin: 2em;\n",
" }\n",
" ul li{\n",
" padding-left: 0.5em; \n",
" margin-bottom: 0.5em; \n",
" margin-top: 0.5em; \n",
" }\n",
" ul li li{\n",
" padding-left: 0.2em; \n",
" margin-bottom: 0.2em; \n",
" margin-top: 0.2em; \n",
" }\n",
" ol{\n",
" margin: 2em;\n",
" }\n",
" ol li{\n",
" padding-left: 0.5em; \n",
" margin-bottom: 0.5em; \n",
" margin-top: 0.5em; \n",
" }\n",
" ul li{\n",
" padding-left: 0.5em; \n",
" margin-bottom: 0.5em; \n",
" margin-top: 0.2em; \n",
" }\n",
" a:link{\n",
" font-weight: bold;\n",
" color:#447adb;\n",
" }\n",
" a:visited{\n",
" font-weight: bold;\n",
" color: #1d3b84;\n",
" }\n",
" a:hover{\n",
" font-weight: bold;\n",
" color: #1d3b84;\n",
" }\n",
" a:focus{\n",
" font-weight: bold;\n",
" color:#447adb;\n",
" }\n",
" a:active{\n",
" font-weight: bold;\n",
" color:#447adb;\n",
" }\n",
" .rendered_html :link {\n",
" text-decoration: underline; \n",
" }\n",
" .rendered_html :hover {\n",
" text-decoration: none; \n",
" }\n",
" .rendered_html :visited {\n",
" text-decoration: none;\n",
" }\n",
" .rendered_html :focus {\n",
" text-decoration: none;\n",
" }\n",
" .rendered_html :active {\n",
" text-decoration: none;\n",
" }\n",
" .warning{\n",
" color: rgb( 240, 20, 20 )\n",
" } \n",
" hr {\n",
" color: #f3f3f3;\n",
" background-color: #f3f3f3;\n",
" height: 1px;\n",
" }\n",
" blockquote{\n",
" display:block;\n",
" background: #fcfcfc;\n",
" border-left: 5px solid #c76c0c;\n",
" font-family: 'Open sans',verdana,arial,sans-serif;\n",
" width:680px;\n",
" padding: 10px 10px 10px 10px;\n",
" text-align:justify;\n",
" text-justify:inter-word;\n",
" }\n",
" blockquote p {\n",
" margin-bottom: 0;\n",
" line-height: 125%;\n",
" font-size: 100%;\n",
" }\n",
"</style>\n",
"<script>\n",
" MathJax.Hub.Config({\n",
" TeX: {\n",
" extensions: [\"AMSmath.js\"],\n",
" equationNumbers: { autoNumber: \"AMS\", useLabelIds: true}\n",
" },\n",
" tex2jax: {\n",
" inlineMath: [ ['$','$'], [\"\\\\(\",\"\\\\)\"] ],\n",
" displayMath: [ ['$$','$$'], [\"\\\\[\",\"\\\\]\"] ]\n",
" },\n",
" displayAlign: 'center', // Change this to 'center' to center equations.\n",
" \"HTML-CSS\": {\n",
" scale:95,\n",
" availableFonts: [],\n",
" preferredFont:null,\n",
" webFont: \"TeX\",\n",
" styles: {'.MathJax_Display': {\"margin\": 4}}\n",
" }\n",
" });\n",
"</script>\n"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#format the book\n",
"%matplotlib inline\n",
"from __future__ import division, print_function\n",
"import sys;sys.path.insert(0,'..')\n",
"from book_format import load_style;load_style('..')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Converting the Multivariate Equations to the Univariate Case\n",
"\n",
"The multivariate Kalman filter equations do not resemble the equations for the univariate filter. However, if we use one dimensional states and measurements the equations do reduce to the univariate equations. This section will provide you with a strong intuition into what the Kalman filter equations are actually doing. While reading this section is not required to understand the rest of the book, I recommend reading this section carefully as it should make the rest of the material easier to understand.\n",
"\n",
"Here are the multivariate equations for the prediction. \n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"\\mathbf{\\bar{x}} &= \\mathbf{F x} + \\mathbf{B u} \\\\\n",
"\\mathbf{\\bar{P}} &= \\mathbf{FPF}^\\mathsf{T} + \\mathbf Q\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"For a univariate problem the state $\\mathbf x$ only has one variable, so it is a $1\\times 1$ matrix. Our motion $\\mathbf{u}$ is also a $1\\times 1$ matrix. Therefore, $\\mathbf{F}$ and $\\mathbf B$ must also be $1\\times 1$ matrices. That means that they are all scalars, and we can write\n",
"\n",
"$$\\bar{x} = Fx + Bu$$\n",
"\n",
"Here the variables are not bold, denoting that they are not matrices or vectors. \n",
"\n",
"Our state transition is simple - the next state is the same as this state, so $F=1$. The same holds for the motion transition, so, $B=1$. Thus we have\n",
"\n",
"$$x = x + u$$\n",
"\n",
"which is equivalent to the Gaussian equation from the last chapter\n",
"\n",
"$$ \\mu = \\mu_1+\\mu_2$$\n",
"\n",
"Hopefully the general process is clear, so now I will go a bit faster on the rest. We have\n",
"\n",
"$$\\mathbf{\\bar{P}} = \\mathbf{FPF}^\\mathsf{T} + \\mathbf Q$$\n",
"\n",
"Again, since our state only has one variable $\\mathbf P$ and $\\mathbf Q$ must also be $1\\times 1$ matrix, which we can treat as scalars, yielding \n",
"\n",
"$$\\bar{P} = FPF^\\mathsf{T} + Q$$\n",
"\n",
"We already know $F=1$. The transpose of a scalar is the scalar, so $F^\\mathsf{T} = 1$. This yields\n",
"\n",
"$$\\bar{P} = P + Q$$\n",
"\n",
"which is equivalent to the Gaussian equation of \n",
"\n",
"$$\\sigma^2 = \\sigma_1^2 + \\sigma_2^2$$\n",
"\n",
"This proves that the multivariate prediction equations are performing the same math as the univariate equations for the case of the dimension being 1.\n",
"\n",
"These are the equations for the update step:\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"\\mathbf{K}&= \\mathbf{\\bar{P}H}^\\mathsf{T} (\\mathbf{H\\bar{P}H}^\\mathsf{T} + \\mathbf R)^{-1} \\\\\n",
"\\textbf{y} &= \\mathbf z - \\mathbf{H \\bar{x}}\\\\\n",
"\\mathbf x&=\\mathbf{\\bar{x}} +\\mathbf{K\\textbf{y}} \\\\\n",
"\\mathbf P&= (\\mathbf{I}-\\mathbf{KH})\\mathbf{\\bar{P}}\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"As above, all of the matrices become scalars. $H$ defines how we convert from a position to a measurement. Both are positions, so there is no conversion, and thus $H=1$. Let's substitute in our known values and convert to scalar in one step. The inverse of a 1x1 matrix is the reciprocal of the value so we will convert the matrix inversion to division.\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"K &=\\frac{\\bar{P}}{\\bar{P} + R} \\\\\n",
"y &= z - \\bar{x}\\\\\n",
"x &=\\bar{x}+Ky \\\\\n",
"P &= (1-K)\\bar{P}\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"Before we continue with the proof, I want you to look at those equations to recognize what a simple concept these equations implement. The residual $y$ is nothing more than the measurement minus the prediction. The gain $K$ is scaled based on how certain we are about the last prediction vs how certain we are about the measurement. We choose a new state $x$ based on the old value of $x$ plus the scaled value of the residual. Finally, we update the uncertainty based on how certain we are about the measurement. Algorithmically this should sound exactly like what we did in the last chapter.\n",
"\n",
"Let's finish off the algebra to prove this. Recall that the univariate equations for the update step are:\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"\\mu &=\\frac{\\sigma_1^2 \\mu_2 + \\sigma_2^2 \\mu_1} {\\sigma_1^2 + \\sigma_2^2}, \\\\\n",
"\\sigma^2 &= \\frac{1}{\\frac{1}{\\sigma_1^2} + \\frac{1}{\\sigma_2^2}}\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"Here we will say that $\\mu_1$ is the state $x$, and $\\mu_2$ is the measurement $z$. Thus it follows that that $\\sigma_1^2$ is the state uncertainty $P$, and $\\sigma_2^2$ is the measurement noise $R$. Let's substitute those in.\n",
"\n",
"$$\\begin{aligned} \\mu &= \\frac{Pz + Rx}{P+R} \\\\\n",
"\\sigma^2 &= \\frac{1}{\\frac{1}{P} + \\frac{1}{R}}\n",
"\\end{aligned}$$\n",
"\n",
"I will handle $\\mu$ first. The corresponding equation in the multivariate case is\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
"x &= x + Ky \\\\\n",
"&= x + \\frac{P}{P+R}(z-x) \\\\\n",
"&= \\frac{P+R}{P+R}x + \\frac{Pz - Px}{P+R} \\\\\n",
"&= \\frac{Px + Rx + Pz - Px}{P+R} \\\\\n",
"&= \\frac{Pz + Rx}{P+R}\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"Now let's look at $\\sigma^2$. The corresponding equation in the multivariate case is\n",
"\n",
"$$ \n",
"\\begin{aligned}\n",
"P &= (1-K)P \\\\\n",
"&= (1-\\frac{P}{P+R})P \\\\\n",
"&= (\\frac{P+R}{P+R}-\\frac{P}{P+R})P \\\\\n",
"&= (\\frac{P+R-P}{P+R})P \\\\\n",
"&= \\frac{RP}{P+R}\\\\\n",
"&= \\frac{1}{\\frac{P+R}{RP}}\\\\\n",
"&= \\frac{1}{\\frac{R}{RP} + \\frac{P}{RP}} \\\\\n",
"&= \\frac{1}{\\frac{1}{P} + \\frac{1}{R}}\n",
"\\quad\\blacksquare\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"We have proven that the multivariate equations are equivalent to the univariate equations when we only have one state variable. I'll close this section by recognizing one quibble - I hand waved my assertion that $H=1$ and $F=1$. In general we know this is not true. For example, a digital thermometer may provide measurement in volts, and we need to convert that to temperature, and we use $H$ to do that conversion. I left that issue out to keep the explanation as simple and streamlined as possible. It is very straightforward to add that generalization to the equations above, redo the algebra, and still have the same results.\\\\\\"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 0
}

View File

@ -128,6 +128,9 @@
"\n",
"Interactive simulations of various algorithms. Use sliders to change the output in real time.\n",
"\n",
"[**Converting the Multivariate Equations to the Univariate Case**](http://nbviewer.ipython.org/urls/raw.github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/master/Supporting_Notebooks/Converting-Multivariate-Equations-to-Univariate.ipynb)\n",
"\n",
"Demonstrates that the Multivariate equations are identical to the univariate Kalman filter equations by setting the dimension of all vectors and matrices to one.\n",
"\n",
"[**Iterative Least Squares for Sensor Fusion**](http://nbviewer.ipython.org/urls/raw.github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/master/Supporting_Notebooks/Iterative-Least-Squares-for-Sensor-Fusion.ipynb)\n",
"\n",