Merge pull request #204 from ernstklrb/typos-09ff

Various typos
This commit is contained in:
Roger Labbe 2018-01-26 07:39:34 -08:00 committed by GitHub
commit ccaaa4b04b
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 26 additions and 26 deletions

View File

@ -310,7 +310,7 @@
"* homogeneity: $f(ax) = af(x)$\n",
"\n",
"\n",
"This leads us to say that a linear system is defined as a system whose output is linearly proportional to the sum of all its inputs. A consequence of this is that to be linear if the input is zero than the output must also be zero. Consider an audio amp - if I sing into a microphone, and you start talking, the output should be the sum of our voices (input) scaled by the amplifier gain. But if amplifier outputs a nonzero signal such as a hum for a zero input the additive relationship no longer holds. This is because you linearity requires that $amp(voice) = amp(voice + 0)$ This clearly should give the same output, but if amp(0) is nonzero, then\n",
"This leads us to say that a linear system is defined as a system whose output is linearly proportional to the sum of all its inputs. A consequence of this is that to be linear if the input is zero than the output must also be zero. Consider an audio amp - if I sing into a microphone, and you start talking, the output should be the sum of our voices (input) scaled by the amplifier gain. But if the amplifier outputs a nonzero signal such as a hum for a zero input the additive relationship no longer holds. This is because linearity requires that $amp(voice) = amp(voice + 0)$. This clearly should give the same output, but if amp(0) is nonzero, then\n",
"\n",
"$$\n",
"\\begin{aligned}\n",
@ -633,7 +633,7 @@
"source": [
"Unfortunately the nonlinear version is not stable. It drifted significantly from the mean of 0, and the variance is half an order of magnitude larger.\n",
"\n",
"I minimized the issue by using a function that is quite close to a straight line. What happens if the function is $y(x)=x^2$?"
"I minimized the issue by using a function that is quite close to a straight line. What happens if the function is $y(x)=-x^2$?"
]
},
{
@ -792,7 +792,7 @@
"source": [
"That sort of error often leads to disastrous results. The error in this estimate is large. But in the next innovation of the filter that very bad estimate will be used to linearize the next radar measurement, so the next estimate is likely to be markedly worse than this one. After only a few iterations the Kalman filter will diverge, and start producing results that have no correspondence to reality.\n",
"\n",
"This covariance ellipse spans miles. I exaggerated the size to illustrate the difficulties of highly nonlinear systems. In real radar tracking problems the nonlinearity is usually not that bad, but the errors will still accumulate. Other systems you may be work could have this amount of nonlinearity - this was not an exaggeration only to make a point. You will always be battling divergence when working with nonlinear systems."
"This covariance ellipse spans miles. I exaggerated the size to illustrate the difficulties of highly nonlinear systems. In real radar tracking problems the nonlinearity is usually not that bad, but the errors will still accumulate. Other systems you might work with could have this amount of nonlinearity - this was not an exaggeration only to make a point. You will always be battling divergence when working with nonlinear systems."
]
},
{
@ -814,9 +814,9 @@
"\n",
"Recently the field has been changing in exciting ways. First, computing power has grown to the point that we can use techniques that were once beyond the ability of a supercomputer. These use *Monte Carlo* techniques - the computer generates thousands to tens of thousands of random points and tests all of them against the measurements. It then probabilistically kills or duplicates points based on how well they match the measurements. A point far away from the measurement is unlikely to be retained, whereas a point very close is quite likely to be retained. After a few iterations there is a clump of particles closely tracking your object, and a sparse cloud of points where there is no object.\n",
"\n",
"This has two benefits. First, the algorithm is robust even for extremely nonlinear problems. Second, the algorithm can track arbitrarily many objects at once - some particles will match the behavior on one object, and other particles will match other objects. So this technique is often used to track automobile traffic, people in crowds, and so on. \n",
"This has two benefits. First, the algorithm is robust even for extremely nonlinear problems. Second, the algorithm can track arbitrarily many objects at once - some particles will match the behavior of one object, and other particles will match other objects. So this technique is often used to track automobile traffic, people in crowds, and so on. \n",
"\n",
"The costs should be clear. It is computationally expensive to test tens of thousands of points for every step in the filter. But modern CPUs are very fast, and this is a good problem for GPUs because the part of the algorithm is parallelizable. Another cost is that the answer is not mathematical. With a Kalman filter my covariance matrix gives me important information about the amount of error in the estimate. The particle filter does not give me a rigorous way to compute this. Finally, the output of the filter is a cloud of points; I then have to figure out how to interpret it. Usually you will be doing something like taking the mean and standard deviations of the points, but this is a difficult problem. There are still many points that do not 'belong' to a tracked object, so you first have to run some sort of clustering algorithm to first find the points that seem to be tracking an object, and then you need another algorithm to produce an state estimate from those points. None of this is intractable, but it is all quite computationally expensive. \n",
"The costs should be clear. It is computationally expensive to test tens of thousands of points for every step in the filter. But modern CPUs are very fast, and this is a good problem for GPUs because the part of the algorithm is parallelizable. Another cost is that the answer is not mathematical. With a Kalman filter my covariance matrix gives me important information about the amount of error in the estimate. The particle filter does not give me a rigorous way to compute this. Finally, the output of the filter is a cloud of points; I then have to figure out how to interpret it. Usually you will be doing something like taking the mean and standard deviations of the points, but this is a difficult problem. There are still many points that do not 'belong' to a tracked object, so you first have to run some sort of clustering algorithm to first find the points that seem to be tracking an object, and then you need another algorithm to produce a state estimate from those points. None of this is intractable, but it is all quite computationally expensive. \n",
"\n",
"\n",
"Finally, we have a new algorithm called the *unscented Kalman filter* (UKF). It does not require you to find analytic solutions to nonlinear equations, and yet almost always performs better than the EKF. It does well with nonlinear problems - problems where the EKF has significant difficulties. Designing the filter is extremely easy. Some will say the jury is still out on the UKF, but to my mind the UKF is superior in almost every way to the EKF. I suggest that the UKF should be the starting point for any implementation, especially if you are not a Kalman filter professional with a graduate degree in control theory. The main downside is that the UKF can be a few times slower than the EKF, but this really depends on whether the EKF solves the Jacobian analytically or numerically. If numerically the UKF is almost certainly faster. It has not been proven (and probably it cannot be proven) that the UKF always yields more accurate results than the EKF. In practice it almost always does, often significantly so. It is very easy to understand and implement, and I strongly suggest this filter as your starting point. "
@ -837,15 +837,15 @@
"\n",
"Until recently the linearized Kalman filter and EKF have been the standard way to solve these problems. They are very difficult to understand and use, and they are also potentially very unstable. \n",
"\n",
"Recent developments have offered what are to my mind superior approaches. The UKF dispenses with the need to find solutions to partial differential equations, yet it is also usually more accurate than the EKF. It is easy to use and understand. I can get a basic UKF going in a few minutes by using FilterPy. The particle filter dispenses with mathimatical modeling completely in favor of a Monte Carlo technique of generating a random cloud of thousands of points. It runs slowly, but it can solve otherwise intractable problems with relative ease.\n",
"Recent developments have offered what are to my mind superior approaches. The UKF dispenses with the need to find solutions to partial differential equations, yet it is also usually more accurate than the EKF. It is easy to use and understand. I can get a basic UKF going in a few minutes by using FilterPy. The particle filter dispenses with mathematical modeling completely in favor of a Monte Carlo technique of generating a random cloud of thousands of points. It runs slowly, but it can solve otherwise intractable problems with relative ease.\n",
"\n",
"I get more email about the EKF than anything else; I suspect that this is because most treatments in books, papers, and on the internet use the EKF. If your interest is in mastering the field of course you will want to learn about the EKF. But if you are just trying to get good results I point you to the UKF and particle filter first. They are much easier to implement, understand, and use, and they are typically far more stable than the EKF. \n",
"\n",
"Some will quibble with that advice. A lot of recent publications are devoted to a comparison of the EKF, UKF, and perhaps a few other choices for a given problem. Do you not need to perform a similar comparison for your problem? If you are sending a rocket to Mars then of course you do. You will be balancing issues such as accuracy, round off errors, divergence, mathematical proof of correctness, and the computational effort required. I can't imagine not knowing the EKF intimately. \n",
"\n",
"On the other hand the UKF works spectacularly! I use it at work for real world applications. I mostly haven't even tried to implement an EKF for these applications because I can verify that the UKF is working fine. Is it possible that I might eke out another 0.2% of performance from the EKF in certain situations? Sure! Do I care? No! I completely understand the UKF implementation, it is easy to test and verify, I can pass the code to others and be confident that they can understand and modify it, and I am not a masochist that wants to battle difficult equations when I already have a working solution. If the UKF or particle filters start to perform poorly for some problem then I will turn other to techniques, but not before then. And realistically, the UKF usually provides substantially better performance than the EKF over a wide range of problems and conditions. If \"really good\" is good enough I'm going to spend my time working on other problems. \n",
"On the other hand the UKF works spectacularly! I use it at work for real world applications. I mostly haven't even tried to implement an EKF for these applications because I can verify that the UKF is working fine. Is it possible that I might eke out another 0.2% of performance from the EKF in certain situations? Sure! Do I care? No! I completely understand the UKF implementation, it is easy to test and verify, I can pass the code to others and be confident that they can understand and modify it, and I am not a masochist that wants to battle difficult equations when I already have a working solution. If the UKF or particle filters start to perform poorly for some problem then I will turn to other techniques, but not before then. And realistically, the UKF usually provides substantially better performance than the EKF over a wide range of problems and conditions. If \"really good\" is good enough I'm going to spend my time working on other problems. \n",
"\n",
"I'm belaboring this point because in most textbooks the EKF is given center stage, and the UKF is either not mentioned at all or just given a 2 page gloss that leaves you completely unprepared to use the filter. The UKF is still relatively new, and it takes time to write new editions of books. At the time many books were written the UKF was either not discovered yet, or it was just an unproven but promising curiosity. But I am writing this now, the UKF has had enormous success, and it needs to be in your toolkit. That is what I will spend most of my effort trying to teach you. "
"I'm belaboring this point because in most textbooks the EKF is given center stage, and the UKF is either not mentioned at all or just given a 2 page gloss that leaves you completely unprepared to use the filter. The UKF is still relatively new, and it takes time to write new editions of books. At the time many books were written the UKF was either not discovered yet, or it was just an unproven but promising curiosity. But as I am writing this now, the UKF has had enormous success, and it needs to be in your toolkit. That is what I will spend most of my effort trying to teach you. "
]
},
{

View File

@ -381,8 +381,8 @@
"\n",
"through the nonlinear system:\n",
"\n",
"$$\\begin{cases}\\begin{aligned}x&=x+y\\\\\n",
"y &= 0.1x^2 + y^2\\end{aligned} \\end{cases}$$ "
"$$\\begin{cases}\\begin{aligned}\\bar x&=x+y\\\\\n",
"\\bar y&= 0.1x^2 + y^2\\end{aligned} \\end{cases}$$ "
]
},
{
@ -447,7 +447,7 @@
"\n",
"The fewest number of points that we can use is one per dimension. This is the number that the linear Kalman filter uses. The input to a Kalman filter for the distribution $\\mathcal{N}(\\mu,\\sigma^2)$ is $\\mu$ itself. So while this works for the linear case, it is not a good answer for the nonlinear case.\n",
"\n",
"Perhaps we can use one point per dimension, but altered somehow. However, if we were to pass some value $\\mu+\\Delta$ into the identity function $f(x)=x$ it would not converge, so this will not work. If we didn't alter $\\mu$ then this would the standard Kalman filter. We must conclude that one sample will not work.\n",
"Perhaps we can use one point per dimension, but altered somehow. However, if we were to pass some value $\\mu+\\Delta$ into the identity function $f(x)=x$ it would not converge, so this will not work. If we didn't alter $\\mu$ then this would be the standard Kalman filter. We must conclude that one sample will not work.\n",
"\n",
"What is the next lowest number we can choose? Two. Consider the fact that Gaussians are symmetric, and that we probably want to always have one of our sample points be the mean of the input for the identity function to work. Two points would require us to select the mean, and then one other point. That one other point would introduce an asymmetry in our input that we probably don't want. It would be very difficult to make this work for the identity function $f(x)=x$.\n",
"\n",
@ -877,7 +877,7 @@
"source": [
"## Using the UKF\n",
"\n",
"Let's solve some problems so you can gain confidence in how easy the UKF is to use. We will start with a linear problem you already know how to solve with the linear Kalman filter. Although the UKF was designed for nonlinear problems, it finds the same optimal result as the linear Kalman filter for linear problems. We will write a filter to track an object in 2D using a constant velocity model. This will allows us to focus on what is the same (and most is the same!) and what is different with the UKF. \n",
"Let's solve some problems so you can gain confidence in how easy the UKF is to use. We will start with a linear problem you already know how to solve with the linear Kalman filter. Although the UKF was designed for nonlinear problems, it finds the same optimal result as the linear Kalman filter for linear problems. We will write a filter to track an object in 2D using a constant velocity model. This will allow us to focus on what is the same (and most is the same!) and what is different with the UKF. \n",
"\n",
"Designing a Kalman filter requires you to specify the $\\bf{x}$, $\\bf{F}$, $\\bf{H}$, $\\bf{R}$, and $\\bf{Q}$ matrices. We have done this many times so I will give you the matrices without a lot of discussion. We want a constant velocity model, so we define $\\bf{x}$ to be\n",
"\n",
@ -1095,7 +1095,7 @@
"source": [
"## Tracking an Airplane\n",
"\n",
"Let's tackle our first nonlinear problem. We will write a filter to track an airplane using radar as the sensor. To keep the problem similar to the previous one as possible we will track in two dimensions. We will track one dimension on the ground and the altitude of the aircraft. Each dimension is independent so we can do this with no loss of generality.\n",
"Let's tackle our first nonlinear problem. We will write a filter to track an airplane using radar as the sensor. To keep the problem as similar to the previous one as possible we will track in two dimensions. We will track one dimension on the ground and the altitude of the aircraft. Each dimension is independent so we can do this with no loss of generality.\n",
"\n",
"Radars work by emitting radio waves or microwaves. Anything in the beam's path will reflect some of the signal back to the radar. By timing how long it takes for the reflected signal to return it can compute the *slant distance* to the target. Slant distance is the straight line distance from the radar to the object. Bearing is computed using the *directive gain* of the antenna.\n",
"\n",
@ -1434,7 +1434,7 @@
"source": [
"The filter is unable to track the changing altitude. What do we have to change in our design?\n",
"\n",
"I hope you answered add climb rate to the state, like so:\n",
"I hope you answered \"add climb rate to the state\", like so:\n",
"\n",
"\n",
"$$\\mathbf x = \\begin{bmatrix}\\mathtt{distance} \\\\\\mathtt{velocity}\\\\ \\mathtt{altitude} \\\\ \\mathtt{climb\\, rate}\\end{bmatrix}= \\begin{bmatrix}x \\\\\\dot x\\\\ y \\\\ \\dot y\\end{bmatrix}$$\n",
@ -1706,7 +1706,7 @@
"\n",
"The last sensor fusion problem was a toy example. Let's tackle a problem that is not so toy-like. Before GPS ships and aircraft navigated via various range and bearing systems such as VOR, LORAN, TACAN, DME, and so on. These systems emit beacons in the form of radio waves. The sensor extracts the range and/or bearing to the beacon from the signal. For example, an aircraft might have two VOR receivers. The pilot tunes each receiver to a different VOR station. Each VOR receiver displays the *radial* - the direction from the VOR station on the ground to the aircraft. The pilot uses a chart to find the intersection point of the radials, which identifies the location of the aircraft.\n",
"\n",
"That is a manual approach with low accuracy. A Kalman filter will produce far more accurate position estimates. Assume we have two sensors, each which provides a bearing only measurement to the target, as in the chart below. The width of the perimeters are proportional to the $3\\sigma$ of the sensor noise. The aircraft must be positioned somewhere within the intersection of the two perimeters with a high degee of probability."
"That is a manual approach with low accuracy. A Kalman filter will produce far more accurate position estimates. Assume we have two sensors, each of which provides a bearing only measurement to the target, as in the chart below. The width of the perimeters are proportional to the $3\\sigma$ of the sensor noise. The aircraft must be positioned somewhere within the intersection of the two perimeters with a high degree of probability."
]
},
{
@ -1937,7 +1937,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The geometry of the sensors relative to the tracked object imposes a physical limitation that can be extremely difficult to deal with when designing filters. If the radials of the VOR stations are nearly parallel to each other than a very small angular error translates into a very large distance error. What is worse, this behavior is nonlinear - the error in the *x-axis* vs the *y-axis* will vary depending on the actual bearing. These scatter plots show the error distribution for a 1°$\\sigma$ error for two different bearings."
"The geometry of the sensors relative to the tracked object imposes a physical limitation that can be extremely difficult to deal with when designing filters. If the radials of the VOR stations are nearly parallel to each other then a very small angular error translates into a very large distance error. What is worse, this behavior is nonlinear - the error in the *x-axis* vs the *y-axis* will vary depending on the actual bearing. These scatter plots show the error distribution for a 1°$\\sigma$ error for two different bearings."
]
},
{
@ -2022,7 +2022,7 @@
"\n",
"This graph makes it look easy because we have plotted 100 measurements for each position update. The movement of the aircraft is obvious. In contrast, the Kalman filter only gets one measurement per update. Therefore the filter will not be able to generate as good a fit as the dotted green line implies. \n",
"\n",
"Now consider that the bearing gives us no distance information. Suppose we set the initial estimate it 1,000 kilometers away from the sensor (vs the actual distance of 7.07 km) and make $\\mathbf P$ very small. At that distance a 1° error translates into a positional error of 17.5 km. The KF would never be able to converge onto the actual target position because the filter is incorrectly very certain about its position estimates and because there is no distance information provided in the measurements."
"Now consider that the bearing gives us no distance information. Suppose we set the initial estimate to 1,000 kilometers away from the sensor (vs the actual distance of 7.07 km) and make $\\mathbf P$ very small. At that distance a 1° error translates into a positional error of 17.5 km. The KF would never be able to converge onto the actual target position because the filter is incorrectly very certain about its position estimates and because there is no distance information provided in the measurements."
]
},
{
@ -2582,7 +2582,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"From these charts we can see that the improvement in the position is small, but the improvement in the velocity is good, and spectacular for the altitude. The difference in the position are very small, so I printed the difference between the UKF and the smoothed results for the last 5 points. I recommend always usng the RTS smoother if you can post-process your data."
"From these charts we can see that the improvement in the position is small, but the improvement in the velocity is good, and spectacular for the altitude. The difference in the position are very small, so I printed the difference between the UKF and the smoothed results for the last 5 points. I recommend always using the RTS smoother if you can post-process your data."
]
},
{
@ -2624,7 +2624,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"So what is going on here? We can see that for a mean of 0 the algorithm choose sigma points of 0, 3, and -3, but why? Recall the equation for computing the sigma points:\n",
"So what is going on here? We can see that for a mean of 0 the algorithm chooses sigma points of 0, 3, and -3, but why? Recall the equation for computing the sigma points:\n",
"\n",
"$$\\begin{aligned}\n",
"\\mathcal{X}_0 &= \\mu\\\\\n",
@ -2669,7 +2669,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that the sigma point spread over 100 standard deviations. If our data was Gaussian we'd be incorporating data many standard deviations away from the mean; for nonlinear problems this is unlikely to produce good results. But suppose our distribution was not Gaussian, but instead had very fat tails? We might need to sample from those tails to get a good estimate, and hence it would make sense to make $\\kappa$ larger (not 200, which was absurdly large to make the change in the sigma points stark). \n",
"We can see that the sigma points spread over 100 standard deviations. If our data was Gaussian we'd be incorporating data many standard deviations away from the mean; for nonlinear problems this is unlikely to produce good results. But suppose our distribution was not Gaussian, but instead had very fat tails? We might need to sample from those tails to get a good estimate, and hence it would make sense to make $\\kappa$ larger (not 200, which was absurdly large to make the change in the sigma points stark). \n",
"\n",
"With a similar line of reasoning, suppose that our distribution has nearly no tails - the probability distribution looks more like an inverted parabola. In such a case we'd probably want to pull the sigma points in closer to the mean to avoid sampling in regions where there will never be real data.\n",
"\n",
@ -2787,17 +2787,17 @@
"\n",
"After the move forward for time $\\Delta t$ the new position and orientation of the robot is\n",
"\n",
"$$\\begin{aligned} x &= C_x + R\\sin(\\theta + \\beta) \\\\\n",
"y &= C_y - R\\cos(\\theta + \\beta) \\\\\n",
"\\theta &= \\theta + \\beta\n",
"$$\\begin{aligned} \\bar x &= C_x + R\\sin(\\theta + \\beta) \\\\\n",
"\\bar y &= C_y - R\\cos(\\theta + \\beta) \\\\\n",
"\\bar \\theta &= \\theta + \\beta\n",
"\\end{aligned}\n",
"$$\n",
"\n",
"Once we substitute in for $C$ we get\n",
"\n",
"$$\\begin{aligned} x &= x - R\\sin(\\theta) + R\\sin(\\theta + \\beta) \\\\\n",
"y &= y + R\\cos(\\theta) - R\\cos(\\theta + \\beta) \\\\\n",
"\\theta &= \\theta + \\beta\n",
"$$\\begin{aligned} \\bar x &= x - R\\sin(\\theta) + R\\sin(\\theta + \\beta) \\\\\n",
"\\bar y &= y + R\\cos(\\theta) - R\\cos(\\theta + \\beta) \\\\\n",
"\\bar \\theta &= \\theta + \\beta\n",
"\\end{aligned}\n",
"$$\n",
"\n",