Minor typo fixes.
parent 77ce3e9951
commit 056ada57af
@@ -508,7 +508,7 @@
  "\n",
  "There's a better way. If I want to perform Runge Kutta I call `ode45`; I do not embed a Runge Kutta implementation in my code. I don't want to implement Runge Kutta multiple times and debug it several times. If I do find a bug, I can fix it once and be assured that it now works across all my different projects. And, it is readable. It is rare that I care about the implementation of Runge Kutta.\n",
  "\n",
- "This is a textbook on Kalman filtering, and you can argue that we *do* care about the implementation of Kalman filters. That is true, but you will find out the code that performs the filtering amounts to 10 or so lines of code. The code to implement the math is fairly trivial. Most of the work that Kalman filters requires is the design of the matrices that get fed into the math engine.\n",
+ "This is a textbook on Kalman filtering, and you can argue that we *do* care about the implementation of Kalman filters. That is true, but you will find out the code that performs the filtering amounts to 10 or so lines of code. The code to implement the math is fairly trivial. Most of the work that a Kalman filter requires is the design of the matrices that get fed into the math engine.\n",
  "\n",
  "A possible downside is that the equations that perform the filtering are hidden behind functions, which we could argue is a loss in a pedagogical text. I argue the converse. I want you to learn how to use Kalman filters in the real world, for real projects, and you shouldn't be cutting and pasting established algorithms all over the place.\n",
  "\n",
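The paragraph above argues for calling a debugged library solver rather than embedding your own Runge Kutta. A minimal sketch of that approach, assuming SciPy's `solve_ivp` (with its default RK45 method) as the Python analogue of MATLAB's `ode45`; the falling-object model here is just an illustration:

```python
import numpy as np
from scipy.integrate import solve_ivp

def falling_object(t, y):
    # y = [position, velocity]; constant gravity, no drag
    return [y[1], -9.8]

# one call to a reusable, already-debugged Runge Kutta implementation
sol = solve_ivp(falling_object, t_span=(0., 3.), y0=[100., 0.], method='RK45')
print(sol.y[0, -1])  # position at t=3
```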
@@ -938,7 +938,7 @@
  "\n",
  "Well, no. Recall the g-h filter chapter. In that chapter we agreed that if I weighed myself on two scales, and the first read 160 lbs while the second read 170 lbs, and both were equally accurate, the best estimate was 165 lbs. Furthermore, I should be a bit more confident about 165 lbs vs 160 lbs or 170 lbs because I now have two readings, both near this estimate, increasing my confidence that neither is wildly wrong.\n",
  "\n",
- "Of course, this example is quite exaggerated. The width of the Gaussians is fairly narrow, so this combination of measurements is extremely unlikely. It is hard to eyeball this, but the measurements are well over $3\\sigma$ apart, so the probability of this happening is less than 1%. Still, it can happen, and the math is correct. In practice this is the sort of thing we use to decide if \n",
+ "Of course, this example is quite exaggerated. The width of the Gaussians is fairly narrow, so this combination of measurements is extremely unlikely. It is hard to eyeball this, but the measurements are well over $3\\sigma$ apart, so the probability of this happening is less than 1%. Still, it can happen, and the math is correct.\n",
  "\n",
  "Let's look at the math again to convince ourselves that the physical interpretation of the Gaussian equations makes sense. I'm going to switch back to using priors and measurements. The math and reasoning are the same whether you are using a prior and incorporating a measurement, or just trying to compute the mean and variance of two measurements. For Kalman filters we will be doing a lot more of the former than the latter, so let's get used to it.\n",
  "\n",
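The two-scales claim above follows directly from the product of two Gaussians. A minimal sketch, assuming equal (made-up) variances of 9 for both scales:

```python
def gaussian_multiply(mu1, var1, mu2, var2):
    # product of two 1-D Gaussians, renormalized
    mean = (var2*mu1 + var1*mu2) / (var1 + var2)
    var = (var1*var2) / (var1 + var2)
    return mean, var

mean, var = gaussian_multiply(160., 9., 170., 9.)
print(mean, var)  # 165.0, 4.5 -- midway between the readings, and more certain than either
```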
@@ -992,7 +992,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "Here I have plotted the original estimate (prior) it a very transparent yellow, the radar reading in blue (evidence), and the finale estimate (posterior) in yellow.\n",
+ "Here I have plotted the original estimate (prior) in a very transparent yellow, the radar reading in blue (evidence), and the final estimate (posterior) in yellow.\n",
  "\n",
  "The Gaussian retained the same shape and position as the radar measurement, but is smaller. We've seen this with one-dimensional Gaussians. Multiplying two Gaussians makes the variance smaller because we are incorporating more information, hence we are less uncertain. But the main point I want to make is that the covariance shape reflects the physical layout of the aircraft and the radar system.\n",
  "\n",
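The same "multiplying Gaussians shrinks the variance" point holds in the multivariate case. A sketch with made-up covariances for the prior and the radar:

```python
import numpy as np

P_prior = np.array([[6., 4.], [4., 6.]])      # hypothetical prior covariance
R_radar = np.array([[2., -1.9], [-1.9, 2.]])  # hypothetical radar covariance

# posterior covariance of the product of the two Gaussians
P_post = np.linalg.inv(np.linalg.inv(P_prior) + np.linalg.inv(R_radar))
print(np.linalg.det(P_post) < np.linalg.det(R_radar))  # True: less uncertainty
```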
@@ -1428,7 +1428,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.4.3"
+ "version": "3.4.1"
  }
  },
  "nbformat": 4,
@@ -2920,7 +2920,7 @@
  "\n",
  "These problems led some researchers and engineers to derogatorily call the Kalman filter a 'ball of mud'. In other words, it doesn't always hold together so well. Another term to know - Kalman filters can become **smug**. Their estimates are based solely on what you tell them the noises are. Those values can lead to overly confident estimates. $\\mathbf{P}$ gets smaller and smaller while the filter is actually becoming more and more inaccurate! In the worst case the filter diverges. We will see a lot of that when we start studying nonlinear filters.\n",
  "\n",
- "The reality is that the Kalman filter is a mathematical model of the world. The output is only as accurate as that model. To make the math tractable we had to make some assumptions. I We assume that the sensors and motion model have Gaussian noise. We assume that everything is linear. If that is true, the Kalman filter is *optimal* in a least squares sense. This means that there is no way to make a better estimate than what the filter gives us. However, these assumption are almost never true, and hence the model is necessarily limited, and a working filter is rarely optimal.\n",
+ "The reality is that the Kalman filter is a mathematical model of the world. The output is only as accurate as that model. To make the math tractable we had to make some assumptions. We assume that the sensors and motion model have Gaussian noise. We assume that everything is linear. If that is true, the Kalman filter is *optimal* in a least squares sense. This means that there is no way to make a better estimate than what the filter gives us. However, these assumptions are almost never true, and hence the model is necessarily limited, and a working filter is rarely optimal.\n",
  "\n",
  "In later chapters we will deal with the problem of nonlinearity. For now I want you to understand that designing the matrices of a linear filter is an experimental procedure more than a mathematical one. Use math to establish the initial values, but then you need to experiment. If there is a lot of unaccounted noise in the world (wind, etc.) you may have to make $\\mathbf{Q}$ larger. If you make it too large the filter fails to respond quickly to changes. In the **Adaptive Filters** chapter you will learn some alternative techniques that allow you to change the filter design in real time in response to the inputs and performance, but for now you need to find one set of values that works for the conditions your filter will encounter. Noise matrices for an acrobatic plane might be different if the pilot is a student than if the pilot is an expert as the dynamics will be quite different."
  ]
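The "experiment with $\mathbf{Q}$" advice above can be made concrete. A rough sketch of one such experiment, sweeping the process noise variance and watching the innovations; the filter setup, measurements, and numbers here are all hypothetical, and the residual standard deviation is only one crude diagnostic:

```python
import numpy as np
from filterpy.kalman import KalmanFilter
from filterpy.common import Q_discrete_white_noise

def run(q_var, zs, dt=1.):
    kf = KalmanFilter(dim_x=2, dim_z=1)
    kf.x = np.array([0., 0.])                  # position, velocity
    kf.F = np.array([[1., dt], [0., 1.]])      # constant velocity model
    kf.H = np.array([[1., 0.]])                # we measure position
    kf.R *= 5.
    kf.Q = Q_discrete_white_noise(dim=2, dt=dt, var=q_var)
    residuals = []
    for z in zs:
        kf.predict()
        kf.update(z)
        residuals.append(kf.y[0])              # innovation after each update
    return np.std(residuals)

zs = np.arange(50) + np.random.randn(50)*2.    # simulated ramp plus noise
for q in (0.001, 0.1, 10.):
    print(q, run(q, zs))  # too-small Q lags the data; too-large Q chases the noise
```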
@@ -1503,7 +1503,7 @@
  "source": [
  "**Author's note: this section contains some of the more challenging math in this book. Please bear with it, as few books cover this well, and an accurate design is imperative for good filter performance. At the end I present Python functions from FilterPy which will compute the math for you for common scenarios.**\n",
  "\n",
- "In general the design of the $\\mathbf{Q}$ matrix is among the most difficult aspects of Kalman filter design. This is due to several factors. First, the math itself is somewhat difficult and requires a good foundation in signal theory. Second, we are trying to model the noise in something for which we have little information. For example, consider trying to model the process noise for a baseball. We can model it as a sphere moving through the air, but that leave many unknown factors - the wind, ball rotation and spin decay, the coefficient of friction of a scuffed ball with stitches, the effects of wind and air density, and so on. I will develop the equations for an exact mathematical solution for a given process model, but since the process model is incomplete the result for $\\mathbf{Q}$ will also be incomplete. This has a lot of ramifications for the behavior of the Kalman filter. If $\\mathbf{Q}$ is too small than the filter will be overconfident in it's prediction model and will diverge from the actual solution. If $\\mathbf{Q}$ is too large than the filter will be unduly influenced by the noise in the measurements and perform sub-optimally. In practice we spend a lot of time running simulations and evaluating collected data to try to select an appropriate value for $\\mathbf{Q}$. But let's start by looking at the math.\n",
+ "In general the design of the $\\mathbf{Q}$ matrix is among the most difficult aspects of Kalman filter design. This is due to several factors. First, the math itself is somewhat difficult and requires a good foundation in signal theory. Second, we are trying to model the noise in something for which we have little information. For example, consider trying to model the process noise for a baseball. We can model it as a sphere moving through the air, but that leaves many unknown factors - the wind, ball rotation and spin decay, the coefficient of friction of a scuffed ball with stitches, the effects of wind and air density, and so on. I will develop the equations for an exact mathematical solution for a given process model, but since the process model is incomplete the result for $\\mathbf{Q}$ will also be incomplete. This has a lot of ramifications for the behavior of the Kalman filter. If $\\mathbf{Q}$ is too small then the filter will be overconfident in its prediction model and will diverge from the actual solution. If $\\mathbf{Q}$ is too large then the filter will be unduly influenced by the noise in the measurements and perform sub-optimally. In practice we spend a lot of time running simulations and evaluating collected data to try to select an appropriate value for $\\mathbf{Q}$. But let's start by looking at the math.\n",
  "\n",
  "\n",
  "Let's assume a kinematic system - some system that can be modeled using Newton's equations of motion. We can make a few different assumptions about this process.\n",
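The FilterPy helpers the author's note above refers to presumably include the process noise constructors in `filterpy.common`. A sketch of both, for a second-order (constant velocity) kinematic model; the `dt` and variance values are made up:

```python
from filterpy.common import Q_discrete_white_noise, Q_continuous_white_noise

Q_d = Q_discrete_white_noise(dim=2, dt=0.1, var=2.35)                    # piecewise white noise model
Q_c = Q_continuous_white_noise(dim=2, dt=0.1, spectral_density=2.35)     # continuous white noise model
print(Q_d)
print(Q_c)
```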
@@ -360,7 +360,7 @@
  "source": [
  "I particularly like the following way of looking at the problem, which I am borrowing from Dan Simon's *Optimal State Estimation* [[1]](#[1]). Consider a tracking problem where we get the range and bearing to a target, and we want to track its position. The reported distance is 50 km, and the reported angle is 90$^\\circ$. Now, given that sensors are imperfect, assume that the errors in both range and angle are distributed in a Gaussian manner. Given an infinite number of measurements, what is the expected value of the position?\n",
  "\n",
- "I have been recommending using intuition to gain insight, so let's see how it fares for this problem (hint: nonlinear problems are *not* intuitive). We might reason that since the mean of the range will be 50 km, and the mean of the angle will be 90$^\\circ$, that clearly the answer will be x=0 km, y=90 km.\n",
+ "I have been recommending using intuition to gain insight, so let's see how it fares for this problem (hint: nonlinear problems are *not* intuitive). We might reason that since the mean of the range will be 50 km, and the mean of the angle will be 90$^\\circ$, clearly the answer will be x=0 km, y=50 km.\n",
  "\n",
  "Well, let's plot that and find out. Here are 3000 points, with the distance normally distributed with a standard deviation of 0.4 km and the angle normally distributed with a standard deviation of 0.35 radians. We compute the average of all of the positions and display it as a star. Our intuition is displayed with a large circle."
  ]
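A minimal sketch of the experiment described above, using the same parameters (3000 samples, 0.4 km and 0.35 radian standard deviations) to show that the mean position is *not* at y = 50 km:

```python
import numpy as np

N = 3000
ranges = np.random.normal(50., 0.4, N)         # range: mean 50 km, std 0.4 km
bearings = np.random.normal(np.pi/2, 0.35, N)  # bearing: mean 90 degrees, std 0.35 rad

xs = ranges * np.cos(bearings)
ys = ranges * np.sin(bearings)
print(np.mean(xs), np.mean(ys))  # x near 0, but y noticeably less than 50 km
```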
@@ -492,7 +492,7 @@
  "> I explain how to plot Gaussians, and much more, in the Notebook *Computing_and_Plotting_PDFs* in the \n",
  "Supporting_Notebooks folder. You can also read it online [here](https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python/blob/master/Supporting_Notebooks/Computing_and_plotting_PDFs.ipynb)[1]\n",
  "\n",
- "The plot labeled 'input' is the histogram of the original data. This is passed through the transfer function $f(x)=2x+1$ which is displayed in the chart on the bottom lweft\n",
+ "The plot labeled 'input' is the histogram of the original data. This is passed through the transfer function $f(x)=2x+1$, which is displayed in the chart on the bottom left\n",
  ". The red line shows how one value, $x=0$, is passed through the function. Each value from the input is passed through in the same way to the output function on the left. For the output I computed the mean by taking the average of all the points, and drew the results with the dotted blue line. A solid blue line shows the actual mean for the point $x=0$. The output looks like a Gaussian, and is in fact a Gaussian. We can see that it is altered - the variance in the output is larger than the variance in the input, and the mean has been shifted from 0 to 1, which is what we would expect given the transfer function $f(x)=2x+1$. The $2x$ affects the variance, and the $+1$ shifts the mean. The computed mean, represented by the dotted blue line, is nearly equal to the actual mean. If we used more points in our computation we could get arbitrarily close to the actual value.\n",
  "\n",
  "Now let's look at a nonlinear function and see how it affects the probability distribution."
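The linear case above is easy to verify numerically. A sketch, with an arbitrary sample count, passing standard normal samples through $f(x)=2x+1$ and checking the output statistics:

```python
import numpy as np

data = np.random.normal(0., 1., 500000)  # input: mean 0, variance 1
out = 2*data + 1                         # transfer function f(x) = 2x + 1
print(np.mean(out), np.var(out))         # ~1 and ~4: the +1 shifts the mean, the 2x scales variance by 2**2
```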
@@ -396,7 +396,7 @@
  "execution_count": 4,
  "metadata": {
  "collapsed": false,
- "scrolled": false
+ "scrolled": true
  },
  "outputs": [
  {
@@ -1144,7 +1144,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "This gave me a standard deviation 0f 0.013 which is quite small. \n",
+ "This gave me a standard deviation of 0.013, which is quite small.\n",
  "\n",
  "So far the implementation of the UKF is not *that* different from the linear Kalman filter. Instead of implementing the state function and measurement function as the matrices $\\mathbf{F}$ and $\\mathbf{H}$ you supply functions `f(x)` and `h(x)`. The rest of the theory and implementation remains the same. Of course the `FilterPy` code for the UKF is quite different from the Kalman filter code, but from a designer's point of view the problem formulation and filter design are very similar."
  ]
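A minimal sketch of the design difference described above: the UKF takes functions `fx` and `hx` where the linear filter took matrices $\mathbf{F}$ and $\mathbf{H}$. The constant velocity model and all parameter values here are hypothetical:

```python
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

dt = 1.

def fx(x, dt):
    # state transition: constant velocity
    return np.array([x[0] + x[1]*dt, x[1]])

def hx(x):
    # measurement: we observe position only
    return np.array([x[0]])

points = MerweScaledSigmaPoints(n=2, alpha=.1, beta=2., kappa=1.)
ukf = UnscentedKalmanFilter(dim_x=2, dim_z=1, dt=dt, fx=fx, hx=hx, points=points)
ukf.predict()
ukf.update(np.array([1.2]))
print(ukf.x)
```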
@@ -3085,7 +3085,7 @@
  "\\tan^{-1}(\\frac{p_y - y}{p_x - x}) - \\theta \n",
  "\\end{bmatrix}$$\n",
  "\n",
- "The expression $\\tan^{-1}(\\frac{p_y - y}{p_x - x}) - \\theta$ can produce a result outside the range $[\\pi, \\pi)$, so we should normalize the angle to that range.\n",
+ "The expression $\\tan^{-1}(\\frac{p_y - y}{p_x - x}) - \\theta$ can produce a result outside the range $[-\\pi, \\pi)$, so we should normalize the angle to that range.\n",
  "\n",
  "The function will be passed an array of landmarks and needs to produce an array of measurements in the form `[dist_to_1, bearing_to_1, dist_to_2, bearing_to_2, ...]`."
  ]
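One common way to do the normalization described above; the helper name is ours, not FilterPy's:

```python
import numpy as np

def normalize_angle(a):
    # wrap any angle into [-pi, pi)
    return (a + np.pi) % (2 * np.pi) - np.pi

print(normalize_angle(np.radians(350)))  # about -0.175 rad, i.e. -10 degrees
```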
@@ -3407,7 +3407,7 @@
  "\n",
  "However, the advantage of the UKF over the EKF is not only the relative ease of implementation. It is somewhat premature to discuss this because you haven't learned the EKF yet, but the EKF linearizes the problem at one point and passes that point through a linear Kalman filter. In contrast, the UKF takes $2n+1$ samples. Therefore the UKF is almost always more accurate than the EKF, especially when the problem is highly nonlinear. While it is not true that the UKF is guaranteed to always outperform the EKF, in practice it has been shown to perform at least as well, and usually much better than the EKF.\n",
  "\n",
- "Hence my recommendation is to always start by implementing the UKF. If your filter has real world consequences if it diverges (people die, lots of money lost, power plant blows up) of course you will have to engage in a lot of sophisticated analysis and experimentation to chose the best filter. That is beyond the scope of this book, and you should be going to graduate school to learn this theory in much greater detail than this book provides. \n",
+ "Hence my recommendation is to always start by implementing the UKF. If your filter has real world consequences should it diverge (people die, lots of money is lost, the power plant blows up), you will of course have to engage in a lot of sophisticated analysis and experimentation to choose the best filter. That is beyond the scope of this book, and you should be going to graduate school to learn this theory in much greater detail than this book provides.\n",
  "\n",
  "I have spoken of the UKF I presented in this chapter as *the* way to perform sigma point filters. This is not true. The specific version I chose is Julier's scaled unscented filter as parameterized by Van der Merwe in his 2004 dissertation. If you search for Julier, Van der Merwe, Uhlmann, and Wan you will find a family of similar sigma point filters that they developed. Each technique uses a different way of choosing and weighting the sigma points. But the choices don't stop there. For example, there is the SVD Kalman filter that uses singular value decomposition (SVD) to find the approximate mean and covariance of the probability distribution. There are many more. Think of this chapter as an introduction to sigma point filters, rather than a definitive treatment of how they work. If you have been reading carefully and writing your own code you should be able to read the literature and implement your own filters without needing FilterPy, which does not implement every possible sigma point filter."
  ]
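The $2n+1$ samples mentioned above are easy to see with FilterPy's Van der Merwe parameterization; the alpha/beta/kappa values here are arbitrary:

```python
import numpy as np
from filterpy.kalman import MerweScaledSigmaPoints

points = MerweScaledSigmaPoints(n=2, alpha=.3, beta=2., kappa=.1)
sigmas = points.sigma_points(np.array([0., 0.]), np.eye(2))
print(sigmas.shape)  # (5, 2): 2n+1 = 5 sigma points for an n=2 state
```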
@@ -1194,7 +1194,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "Now we can turn our attention to the noise. Here, the noise is in our control input, so it is in *control space*. In other words, we command a specific velocity and steering angle, but we need to convert that into errors in $x, y, \\theta$. In a real system this might vary depending on velocity, so it will need to be recomputed for every prediction. I will chose this as the noise model; for a real robot you will need to choose a model that accurately depicts the error in your system. \n",
+ "Now we can turn our attention to the noise. Here, the noise is in our control input, so it is in *control space*. In other words, we command a specific velocity and steering angle, but we need to convert that into errors in $x, y, \\theta$. In a real system this might vary depending on velocity, so it will need to be recomputed for every prediction. I will choose this as the noise model; for a real robot you will need to choose a model that accurately depicts the error in your system. \n",
  "\n",
  "$$\\mathbf{M} = \\begin{bmatrix}0.01 vel^2 & 0 \\\\ 0 & \\sigma_\\alpha^2\\end{bmatrix}$$\n",
  "\n",
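A sketch of computing $\mathbf{M}$ from the model above, recomputed each prediction because it depends on the commanded velocity; the steering angle standard deviation of 1 degree is a made-up value:

```python
import numpy as np

def control_noise(vel, sigma_alpha=np.radians(1.)):
    # M = diag(0.01*vel**2, sigma_alpha**2), per the noise model above
    return np.diag([0.01 * vel**2, sigma_alpha**2])

print(control_noise(vel=10.))  # higher commanded velocity -> larger velocity noise
```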
@@ -1814,7 +1814,7 @@
  "source": [
  "## UKF vs EKF\n",
  "\n",
- "I implemented this tracking problem using and unscented Kalman filter in the previous chapter. The difference in implementation should be very clear. Computing the Jacobians for the state and measurement models was not trivial and we used a very rudimentary model for the motion of the car. I am justified in using this model because the research resulting from the DARPA car challenges has shown that it works well in practice. Nonetheless, a different problem, such as an aircraft or rocket will yield a very difficult to impossible to compute Jacobian. In contrast, the UKF only requires you to provide a function that computes the system motion model and another for the measurement model. This is will always be easier than deriving a Jacobian analytically. In fact, there are many physical processes for which we cannot find an analytical solution. It is beyond the scope of this book, but in that case you have to design a numerical method to compute the Jacobian. That is a very nontrivial undertaking, and you will spend a significant portion of a master's degree at a STEM school learning various techniques to handle such situations. Even then you'll likely only be able to solve problems related to your field - an aeronautical engineer learns a lot about Navier Stokes equations, but not much about modelling chemical reaction rates. \n",
+ "I implemented this tracking problem using an unscented Kalman filter in the previous chapter. The difference in implementation should be very clear. Computing the Jacobians for the state and measurement models was not trivial, and we used a very rudimentary model for the motion of the car. I am justified in using this model because the research resulting from the DARPA car challenges has shown that it works well in practice. Nonetheless, a different problem, such as an aircraft or rocket, will yield a Jacobian that is very difficult or impossible to compute. In contrast, the UKF only requires you to provide a function that computes the system motion model and another for the measurement model. This will always be easier than deriving a Jacobian analytically. In fact, there are many physical processes for which we cannot find an analytical solution. It is beyond the scope of this book, but in that case you have to design a numerical method to compute the Jacobian. That is a very nontrivial undertaking, and you will spend a significant portion of a master's degree at a STEM school learning various techniques to handle such situations. Even then you'll likely only be able to solve problems related to your field - an aeronautical engineer learns a lot about Navier-Stokes equations, but not much about modelling chemical reaction rates.\n",
  "\n",
  "So, UKFs are easy. Are they accurate? Everything I have read states that there is no way to prove that a UKF will always perform as well or better than an EKF. However, in practice, they do perform better. You can search and find any number of research papers that demonstrate that the UKF outperforms the EKF in various problem domains. It's not hard to understand why this would be true. The EKF works by linearizing the system model and measurement model at a single point.\n",
  "\n",
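The simplest numerical method alluded to above is a finite difference Jacobian. A naive forward-difference sketch, illustrative rather than production grade (real implementations need careful step size selection):

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-7):
    # forward differences: J[:, i] ~ (f(x + eps*e_i) - f(x)) / eps
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x))
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (np.asarray(f(x + dx)) - fx) / eps
    return J

f = lambda x: np.array([x[0]**2, x[0]*x[1]])
print(numerical_jacobian(f, [1., 2.]))  # analytic answer: [[2, 0], [2, 1]]
```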
@@ -277,7 +277,7 @@
  "source": [
  "## Motivation\n",
  "\n",
- "Here is our problem. We have object moving in a space, and we want to track them. Maybe the objects are fighter jets and missiles in the sky, or maybe we are tracking people playing cricket in a field. It doesn't really matter. Which of the filters that we have learned can handle this problem? Well, none of them are ideal. Let's think about the characteristics of this problem. \n",
+ "Here is our problem. We have objects moving in a space, and we want to track them. Maybe the objects are fighter jets and missiles in the sky, or maybe we are tracking people playing cricket in a field. It doesn't really matter. Which of the filters that we have learned can handle this problem? Well, none of them are ideal. Let's think about the characteristics of this problem. \n",
  "\n",
  "1. **multi-modal**: We want to track zero, one, or more than one object simultaneously.\n",
  "\n",