diff --git a/10-Unscented-Kalman-Filter.ipynb b/10-Unscented-Kalman-Filter.ipynb index d583c82..933122b 100644 --- a/10-Unscented-Kalman-Filter.ipynb +++ b/10-Unscented-Kalman-Filter.ipynb @@ -461,7 +461,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let's consider the simplest possible case and see if it offers any insight. The simplest possible system is the **identity function**: $f(x) = x$. If our algorithm does not work for the identity function then the filter will never converge. In other words, if the input is 1 (for a one dimensional system), the output must also be 1. If the output was different, such as 1.1, then when we fed 1.1 into the transform at the next time step, we'd get out yet another number, maybe 1.23. This filter diverges. \n", + "Let's consider the simplest possible case and see if it offers any insight. The simplest possible system is the **identity function**: $f(x) = x$. If our algorithm does not work for the identity function then the filter cannot converge. In other words, if the input is 1 (for a one dimensional system), the output must also be 1. If the output was different, such as 1.1, then when we fed 1.1 into the transform at the next time step, we'd get out yet another number, maybe 1.23. This filter diverges. \n", "\n", "The fewest number of points that we can use is one per dimension. This is the number that the linear Kalman filter uses. The input to a Kalman filter for the distribution $\\mathcal{N}(\\mu,\\sigma^2)$ is $\\mu$ itself. So while this works for the linear case, it is not a good answer for the nonlinear case.\n", "\n", @@ -619,9 +619,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Unscented Transform Example\n", + "### Accuracy of the Unscented Transform\n", "\n", - "Let's see how accurate the unscented transform is. Earlier we wrote a function that found the mean of a distribution by passing 50,000 points through a nonlinear function. 
Let's now pass 5 sigma points through the same function, and compute their mean with the unscented transform." + "Earlier we wrote a function that found the mean of a distribution by passing 50,000 points through a nonlinear function. Let's now pass 5 sigma points through the same function, and compute their mean with the unscented transform." ] }, { @@ -716,7 +716,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As with the linear Kalman filter, the UKF's predict step computes the prior. The process model $f()$ is assumed to be nonlinear, so we generate sigma points $\\mathcal{X}$ and their corresponding weights $W^m, W^c$\n", + "The UKF's predict step computes the prior using the process model $f()$. $f()$ is assumed to be nonlinear, so we generate sigma points $\\mathcal{X}$ and their corresponding weights $W^m, W^c$\n", "according to some function:\n", "\n", "$$\\begin{aligned}\n", @@ -774,11 +774,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now we can perform the update step of the filter. Recall that Kalman filters perform the update state in measurement space. Thus we must convert the sigma points of the prior into measurements using a measurement function $h(x)$ that you define.\n", + "Kalman filters perform the update in measurement space. Thus we must convert the sigma points of the prior into measurements using a measurement function $h(x)$ that you define.\n", "\n", "$$\\boldsymbol{\\mathcal{Z}} = h(\\boldsymbol{\\mathcal{Y}})$$\n", "\n", - "Now we can compute the mean and covariance of these points using the unscented transform. The $z$ subscript denotes that these are the mean and covariance of the measurement sigmas.\n", + "We compute the mean and covariance of these points using the unscented transform. 
The $z$ subscript denotes that these are the mean and covariance of the measurement sigma points.\n", "\n", "$$\\begin{aligned}\n", "\\boldsymbol\\mu_z, \\mathbf P_z &= \n", @@ -788,7 +788,7 @@ "\\end{aligned}\n", "$$\n", "\n", - "All that is left is to compute the residual and Kalman gain. The residual of the measurement $\\mathbf z$ is trivial to compute:\n", + "Next we compute the residual and Kalman gain. The residual of the measurement $\\mathbf z$ is trivial to compute:\n", "\n", "$$\\mathbf y = \\mathbf z - \\boldsymbol\\mu_z$$\n", "\n", @@ -815,7 +815,7 @@ "\n", "This step contains a few equations you have to take on faith, but you should be able to see how they relate to the linear Kalman filter equations. The linear algebra is slightly different from the linear Kalman filter, but the algorithm is the same Bayesian algorithm we have been implementing throughout the book. \n", "\n", - "This table comparing the Kalman filter equations with the UKF equations should help clarify the relationship between the filters." + "This table compares the equations of the linear KF and the UKF." ] }, { @@ -855,9 +855,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "There are many algorithms published for selecting the sigma points for the UKF. Since 2005 or so research and industry have mostly settled on the version published by Rudolph Van der Merwe in his 2004 PhD dissertation [1]. It performs well with a variety of problems and it has a good tradeoff between performance and accuracy. It is a slight reformulation of the *Scaled Unscented Transform* published by Simon J. Julier [2].\n", + "There are many algorithms for selecting sigma points. Since 2005 or so research and industry have mostly settled on the version published by Rudolph Van der Merwe in his 2004 PhD dissertation [1]. It performs well with a variety of problems and it has a good tradeoff between performance and accuracy. 
It is a slight reformulation of the *Scaled Unscented Transform* published by Simon J. Julier [2].\n", "\n", - "Van der Merwe's formulation uses 3 parameters to control how the sigma points are distributed and weighted: $\\alpha$, $\\beta$, and $\\kappa$. Before we work through the equations, let's look at an example. I will plot the sigma points on top of a covariance ellipse showing the first and second standard deviations, and scale the points based on the mean weights." + "This formulation uses 3 parameters to control how the sigma points are distributed and weighted: $\\alpha$, $\\beta$, and $\\kappa$. Before we work through the equations, let's look at an example. I will plot the sigma points on top of a covariance ellipse showing the first and second standard deviations, and scale the points based on the mean weights." ] }, { @@ -904,7 +904,7 @@ "source": [ "### Sigma Point Computation\n", "\n", - "Our first sigma point is always going to be the mean of our input. This is the sigma point displayed in the center of the ellipses in the diagram above. We will call this $\\boldsymbol{\\chi}_0$. So,\n", + "The first sigma point is the mean of the input. This is the sigma point displayed in the center of the ellipses in the diagram above. We will call this $\\boldsymbol{\\chi}_0$.\n", "\n", "$$ \\mathcal{X}_0 = \\mu$$\n", "\n", @@ -912,9 +912,10 @@ "\n", "$$ \n", "\\boldsymbol{\\chi}_i = \\begin{cases}\n", - "\\mu + (\\sqrt{(n+\\lambda)\\Sigma})& \\text{for i=1 .. n} \\\\\n", - "\\mu - (\\sqrt{(n+\\lambda)\\Sigma})_{i-n} &\\text{for i=(n+1) .. 2n}\\end{cases}\n", + "\\mu + \\left[ \\sqrt{(n+\\lambda)\\Sigma}\\right ]_{i}& \\text{for i=1 .. n} \\\\\n", + "\\mu - \\left[ \\sqrt{(n+\\lambda)\\Sigma}\\right]_{i-n} &\\text{for i=(n+1) .. 
2n}\\end{cases}\n", "$$\n", + "The $i$ subscript chooses the i$^{th}$ column vector of the matrix.\n", "\n", "In other words, we scale the covariance matrix by a constant, take the square root of it, and ensure symmetry by both adding and subtracting it from the mean. We will discuss how you take the square root of a matrix later.\n", "\n", @@ -1690,7 +1691,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To include Doppler we need to include the velocity in $x$ and $y$ into the measurement. The `ACSim` class stores velocity in the data member `vel`. To perform the Kalman filter update we just need to call `update` with a list containing the slant distance, bearing, and velocity in $x$ and $y$:\n", + "For Doppler we need to include the velocity in $x$ and $y$ into the measurement. The `ACSim` class stores velocity in the data member `vel`. To perform the Kalman filter update we just need to call `update` with a list containing the slant distance, bearing, and velocity in $x$ and $y$:\n", "\n", "$$z = [\\mathtt{slant\\_range},\\, \\text{bearing},\\, \\dot x,\\, \\dot y]$$\n", "\n", @@ -1797,7 +1798,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The last sensor fusion problem was a toy example. Let's tackle a problem that is not so toy-like. Before GPS, ships and etcand so on. I do not intend to cover the intricacies of these systems - Wikipedia will fill in the basics if you are interested. These systems emit beacons in the form of radio waves. The sensor extracts the range and/or bearing to the beacon from the signal. For example, an aircraft might have two VOR receivers. The pilot tunes each receiver to a different VOR station. Each VOR receiver displays what is called the *radial* - the direction from the VOR station on the ground to the aircraft. The pilot uses a chart to find the intersection point of the radials, which identifies the location of the aircraft.\n", + "The last sensor fusion problem was a toy example. 
Let's tackle a problem that is not so toy-like. Before GPS, ships and aircraft navigated via various range and bearing systems such as VOR, LORAN, TACAN, DME, and so on. These systems emit beacons in the form of radio waves. The sensor extracts the range and/or bearing to the beacon from the signal. For example, an aircraft might have two VOR receivers. The pilot tunes each receiver to a different VOR station. Each VOR receiver displays the *radial* - the direction from the VOR station on the ground to the aircraft. The pilot uses a chart to find the intersection point of the radials, which identifies the location of the aircraft.\n", "\n", "That is a manual approach with low accuracy. A Kalman filter will produce far more accurate position estimates. Assume we have two sensors, each of which provides a bearing only measurement to the target, as in the chart below. The width of the perimeters is proportional to the $3\\sigma$ of the sensor noise. The aircraft must be positioned somewhere within the intersection of the two perimeters with a high degree of probability." ] @@ -1891,7 +1892,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Next we need to implement the measurement function, which converts the prior to an array containing the measurement to both stations. I'm not a fan of global variables, but I put the position of the stations in the global variables `sa_pos` and `sb_pos` to demonstrate this method of sharing data with $h()$:" + "Next we implement the measurement function. It converts the prior to an array containing the measurement to both stations. 
I'm not a fan of global variables, but I put the position of the stations in the global variables `sa_pos` and `sb_pos` to demonstrate this method of sharing data with $h()$:" ] }, { @@ -2474,7 +2475,6 @@ "source": [ "The update step converts the sigmas into measurement space via the function `h(x)`.\n", "\n", - "\n", "$$\\mathcal{Z} = h(\\mathcal{Y})$$\n", "\n", "The mean and covariance of those points is computed with the unscented transform. The residual and Kalman gain is then computed. The cross variance is computed as:\n", @@ -2550,7 +2550,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The Kalman filter is designed as a recursive algorithm - as new measurements come in we immediately create a new estimate. But it is very common to have a set of data that have been already collected which we want to filter. Kalman filters can always be run in a *batch* mode, where all of the measurements are filtered at once.\n", + "The Kalman filter is recursive - estimates are based on the current measurement and prior estimate. But it is very common to have a set of data that have been already collected which we want to filter. In this case the filter can be run in a *batch* mode, where all of the measurements are filtered at once.\n", "\n", "Collect your measurements into an array or list.\n", "\n", @@ -2564,7 +2564,7 @@ "Xs, Ps = ukf.batch_filter(zs)\n", "```\n", "\n", - "The function takes the list/array of measurements, filters it, and returns an array of state estimates (Xs) and covariance matrices (Ps) for the entire data set. \n", + "The function takes the list/array of measurements, filters it, and returns an array of state estimates (`Xs`) and covariance matrices (`Ps`) for the entire data set. \n", "\n", "Here is a complete example drawing from the radar tracking problem above." ] @@ -3045,7 +3045,7 @@ "source": [ "### Design the Measurement Model\n", "\n", - "Now we need to design our measurement model. 
We are assuming that we have a sensor that receives a noisy bearing and range to multiple known locations in the landscape. The measurement model must convert the state $\\begin{bmatrix}x & y&\\theta\\end{bmatrix}^\\mathsf{T}$ into a range and bearing to the landmark. If $p$ is the position of a landmark, the range $r$ is\n", + "The sensor provides a noisy bearing and range to multiple known locations in the landscape. The measurement model must convert the state $\\begin{bmatrix}x & y&\\theta\\end{bmatrix}^\\mathsf{T}$ into a range and bearing to the landmark. If $p$ is the position of a landmark, the range $r$ is\n", "\n", "$$r = \\sqrt{(p_x - x)^2 + (p_y - y)^2}$$\n", "\n",