Correct some minor English issues.

Wrote a bunch of exercises and solutions for the Kalman filter.
2014-05-07 22:16:41 -07:00 · 2014-05-07 22:16:41 -07:00 · bcc62c7984
commit bcc62c7984
parent ef22d7e62a
2 changed files with 269 additions and 57 deletions
--- a/Gaussians.ipynb
+++ b/Gaussians.ipynb
@ -1,7 +1,7 @@
 {
 "metadata": {
  "name": "",
-  "signature": "sha256:167e76515c3a54031a76bf820666d9a7a52a0f919f127a68ea60c5e5e7821e63"
+  "signature": "sha256:5f19c6fe106aa81c2c579cadd757aca741e13f7982c7052c54831a08f4254eee"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
@ -156,7 +156,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-      "#### Interactive Gaussians\n",
+      "### Interactive Gaussians\n",
      "\n",
      "For those that are using this directly in IPython Notebook, here is an interactive version of the Gaussian plots. Use the sliders to modify $\\mu$ and $\\sigma^2$. Adjusting $\\mu$ will move the graph to the left and right because you are adjusting the mean, and adjusting $\\sigma^2$ will make the bell curve thicker and thinner."
     ]
--- a/Filters.ipynb
+++ b/Filters.ipynb
@ -1,7 +1,7 @@
 {
 "metadata": {
  "name": "",
-  "signature": "sha256:6ea7b0e562518e41af4e09c3d07bfdc49bd67a20f9d28c17a5b5a053444de18c"
+  "signature": "sha256:71935257a6cdd6e96f6fc2ae944dd43a6f4b1bd461e175fa7bb4745c9623dcf4"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
@ -15,7 +15,7 @@
      "#Kalman Filters\n",
      "\n",
      "\n",
-      "Now that we understand the histogram filter and gaussians we are prepared to implement a 1D Kalman filter. We will do this exactly as we did the histogram filter - rather than going into the theory we will just develop the code step by step. \n",
+      "Now that we understand the histogram filter and Gaussians we are prepared to implement a 1D Kalman filter. We will do this exactly as we did the histogram filter - rather than going into the theory we will just develop the code step by step. \n",
      "\n",
      "#Tracking A Dog\n",
      "\n",
@ -30,6 +30,8 @@
     "cell_type": "code",
     "collapsed": false,
     "input": [
+      "from __future__ import print_function, division\n",
+      "\n",
      "import numpy.random as random\n",
      "import math\n",
      "\n",
@ -140,7 +142,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-      "**note**:*numpy uses a random number generator to generate the normal distribution samples. The numbers I see as I write this are unlikely to be the ones that you see. If you run the cell above multiple times, you should get a slightly different result each time. I could use numpy.random.seed(some_value) to force the results to be the same each time. This would simplify my explanations in some cases, but would ruin the interactive nature of this chapter. To get a real feel for how normal distributions and Kalman filters work you will probably want to run cells several times, observing what changes, and what stays roughly the same.*\n",
+      "> **Note**: numpy uses a random number generator to generate the normal distribution samples. The numbers I see as I write this are unlikely to be the ones that you see. If you run the cell above multiple times, you should get a slightly different result each time. I could use *numpy.random.seed(some_value)* to force the results to be the same each time. This would simplify my explanations in some cases, but would ruin the interactive nature of this chapter. To get a real feel for how normal distributions and Kalman filters work you will probably want to run cells several times, observing what changes, and what stays roughly the same.\n",
      "\n",
      "So the output of the sensor should be a wavering blue line drawn over a dotted red line. The dotted red line shows the actual position of the dog, and the blue line is the noise signal produced by the simulated RFID sensor. Please note that the red dotted line was manually plotted - we do not yet have a filter that recovers that information! \n",
      "\n",
@ -301,7 +303,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-      "The result is either amazing or what you would expect, depending on your state of mind. I must admit I vacillate freely between the two! Note that the result of the multiplation is taller and narrow than the original gaussian. If we think of the gaussians as two measurement, this makes sense. If I measure twice and get the same value, I should be more confident in my answer than if I just measured once. \"Measure twice, cut once\" is a useful saying and practice due to this fact! \n",
+      "The result is either amazing or what you would expect, depending on your state of mind. I must admit I vacillate freely between the two! Note that the result of the multiplication is taller and narrower than the original Gaussian. If we think of the Gaussians as two measurements, this makes sense. If I measure twice and get the same value, I should be more confident in my answer than if I just measured once. \"Measure twice, cut once\" is a useful saying and practice due to this fact! \n",
      "\n",
      "Now let's multiply two Gaussians (or equivalently, two measurements) that are partially separated. What do you think the result will be? Let's find out:"
     ]
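The two cases discussed in this hunk (two identical measurements, then two partially separated ones) can be checked numerically. The following is a minimal sketch, assuming the standard formulas for the renormalized product of two Gaussians; the helper name `multiply` is hypothetical and is not necessarily what the notebook uses.

```python
def multiply(mu1, var1, mu2, var2):
    """Renormalized product of N(mu1, var1) and N(mu2, var2).

    Hypothetical helper for illustration; arguments are means and variances.
    """
    mean = (var2 * mu1 + var1 * mu2) / (var1 + var2)
    variance = 1.0 / (1.0 / var1 + 1.0 / var2)
    return mean, variance


# Two identical measurements: same mean, smaller variance (taller, narrower curve).
same = multiply(10.0, 1.0, 10.0, 1.0)

# Two measurements that are partially separated, with equal variances.
apart = multiply(23.0, 1.0, 25.0, 1.0)

print(same)
print(apart)
```

With equal variances the product's mean lands midway between the two inputs, and its variance is half of either input's, which is the "measure twice" effect described in the text.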
@ -337,11 +339,11 @@
     "source": [
      "Another beautiful result! If I handed you a measuring tape and asked you to measure the distance from a table to a wall, and you got 23m, and then a friend made the same measurement and got 25m, your best guess must be 24m. \n",
      "\n",
-      "That is fairly counter-intuitive, so let's consider it further. Perhaps a more reasonable assumption would be that either you or your coworker just made a mistake, and the true distance is either 23 or 25, but certainly not 24. Surely that is possible. However, suppose the two measurements you reported as 24.01 and 23.99. Surely you would agree that in this case the best guess for the correct value is 24?  Which interpretation we choose depends on the properties of the sensors we are using. Humans make galling mistakes, physical sensors do not. \n",
+      "That is fairly counter-intuitive, so let's consider it further. Perhaps a more reasonable assumption would be that either you or your friend just made a mistake, and the true distance is either 23 or 25, but certainly not 24. Surely that is possible. However, suppose the two measurements were reported as 24.01 and 23.99. In that case, surely you would agree that the best guess for the correct value is 24?  Which interpretation we choose depends on the properties of the sensors we are using. Humans make galling mistakes; physical sensors do not. \n",
      "\n",
-      "This topic is fairly deep, and I will explore it once we have completed our Kalman filter. For now I will merely say that the Kalman filter requires the interpretation that measurements are accurate, with gaussian noise, and that a large error caused by misreading a measuring tape is not gaussian noise. So perhaps you would be justified in thinking that a histogram filter will perform better for the human readings, and the Kalman filter will perform better with sensor readings that have gaussian noise.\n",
+      "This topic is fairly deep, and I will explore it once we have completed our Kalman filter. For now I will merely say that the Kalman filter requires the interpretation that measurements are accurate, with Gaussian noise, and that a large error caused by misreading a measuring tape is not Gaussian noise. So perhaps you would be justified in thinking that a histogram filter will perform better for the human readings, and the Kalman filter will perform better with sensor readings that have Gaussian noise.\n",
      "\n",
-      "For now I ask that you trust me. The math is correct, so we have no choice but to accept it and use it. We will see how the Kalman filter deals with movements vs error very soon. 24 is the correct answer to this problem."
+      "For now I ask that you trust me. The math is correct, so we have no choice but to accept it and use it. We will see how the Kalman filter deals with movements vs error very soon. In the meantime, accept that 24 is the correct answer to this problem."
     ]
    },
    {
@ -352,7 +354,7 @@
      "\n",
      "Recall the histogram filter uses a numpy array to encode our belief about the position of our dog at any time. That array stored our belief that the dog was in any position in the hallway using 10 positions. This was very crude, because with a 100m hallway that corresponded to positions 10m apart. It would have been trivial to expand the number of positions to say 1,000, and that is what we would do if using it for a real problem. But the problem remains that the distribution is discrete and multimodal - it can express strong belief that the dog is in two positions at the same time.\n",
      "\n",
-      "Therefore, we will use a single gaussian to reflect our current belief of the dog's position. Gaussians extend to infinity on both sides of the mean, so the single gaussian will cover the entire hallway. They are unimodal, and seem to reflect the behavior of real-world sensors - most errors are small and clustered around the mean. Here is the entire implementation of the sense function for a Kalman filter:"
+      "Therefore, we will use a single Gaussian to reflect our current belief of the dog's position. Gaussians extend to infinity on both sides of the mean, so the single Gaussian will cover the entire hallway. They are unimodal, and seem to reflect the behavior of real-world sensors - most errors are small and clustered around the mean. Here is the entire implementation of the sense function for a Kalman filter:"
     ]
    },
    {
@ -370,7 +372,7 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-      "Kalman filters are supposed to be hard! But this is very short and straightforward. All we are doing is multiplying the gaussian that reflects our belief of where the dog was with the new measurement. Perhaps this would be clearer if we used more specific names:"
+      "Kalman filters are supposed to be hard! But this is very short and straightforward. All we are doing is multiplying the Gaussian that reflects our belief of where the dog was with the new measurement. Perhaps this would be clearer if we used more specific names:"
     ]
    },
    {
@ -414,14 +416,18 @@
     "source": [
      "Because of the random numbers I do not know the exact values that you see, but the position should have converged very quickly to almost 0 despite the initial error of believing that the position was 2.0. Furthermore, the variance should have quickly converged from the initial value of 5.0 to 0.238.\n",
      "\n",
-      "By now the fact that we converged to a position of 0.0 should not be terribly suprising. All we are doing is computing new_position = old_position &ast; measurement, and the measurement is a normal distribution around 0, so we should get very close to 0 after 20 iterations. But the truly amazing part of this code is how the variance became 0.238 despite every measurement having a variance of 5.0. \n",
+      "By now the fact that we converged to a position of 0.0 should not be terribly surprising. All we are doing is computing $new\\_position = old\\_position * measurement$, and the measurement is a normal distribution around 0, so we should get very close to 0 after 20 iterations. But the truly amazing part of this code is how the variance became 0.238 despite every measurement having a variance of 5.0. \n",
      "\n",
      "If we think about the physical interpretation of this, it should be clear that this is what should happen. If you sent 20 people into the hall with a tape measure to physically measure the position of the dog you would be very confident in the result after 20 measurements - more confident than after 1 or 2 measurements. So it makes sense that as we make more measurements the variance gets smaller.\n",
      "\n",
-      "Mathematically it makes sense as well. Recall the computation for the variance after the multiplication: $\\sigma^2 = \\frac{1}{\\frac{1}{{\\sigma}_1} + \\frac{1}{{\\sigma}_2}}$. We take the reciprocals of the sigma from the measurement and prior belief, add them, and take the reciprocal of the result. Think about that for a moment, and you will see that this will always result in smaller numbers as we proceed.\n",
-      "\n",
-      "\n",
-      "#Implementing Updates\n",
+      "Mathematically it makes sense as well. Recall the computation for the variance after the multiplication: $\\sigma^2 = 1/(\\frac{1}{{{\\sigma}_1}^2} + \\frac{1}{{{\\sigma}_2}^2})$. We take the reciprocals of the variances of the measurement and the prior belief, add them, and take the reciprocal of the result. Think about that for a moment, and you will see that this will always result in a smaller variance as we proceed."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "###Implementing Updates\n",
      "\n",
      "That is a beautiful result, but it is not yet a filter. We assumed that the dog was sitting still, an extremely dubious assumption. Certainly it is a useless one - who would need to write a filter to track nonmoving objects? The histogram filter used a loop of sense and update functions, and we must do the same to accommodate movement.\n",
      "\n",
@ -440,19 +446,17 @@
      "                \n",
      "In a nutshell, we shift the probability vector by the amount we believe the animal moved, and adjust the probability. How do we do that with gaussians?\n",
      "\n",
-      "It turns out that we just add gaussians. Think of the case without gaussians. I think my dog is at 7.3m, and he moves 2.6m to right, where is he now? Obviously, $7.3+2.6=9.9$. He is at 9.9m. Abstractly, the algorithm is *new_pos = old_pos + dist_moved*. It does not matter if we use floating point numbers or gaussians for these values, the algorithm must be the same. \n",
+      "It turns out that we just add Gaussians. Think of the case without Gaussians. I think my dog is at 7.3m, and he moves 2.6m to the right. Where is he now? Obviously, $7.3+2.6=9.9$. He is at 9.9m. Abstractly, the algorithm is $new\\_pos = old\\_pos + dist\\_moved$. It does not matter if we use floating point numbers or Gaussians for these values; the algorithm must be the same. \n",
      "\n",
-      "How is addition for gaussians performed. It turns out to be very simple:\n",
-      "$$ N({\\mu}_1, {{\\sigma}_1}^2)+N({\\mu}_2, {{\\sigma}_2}^2) = N({\\mu}_1 + {\\mu}_2, {\\sigma}_1 + {\\sigma}_2)$$\n",
+      "How is addition for gaussians performed? It turns out to be very simple:\n",
+      "$$ N({\\mu}_1, {{\\sigma}_1}^2)+N({\\mu}_2, {{\\sigma}_2}^2) = N({\\mu}_1 + {\\mu}_2, {{\\sigma}_1}^2 + {{\\sigma}_2}^2)$$\n",
      "\n",
      "All we do is add the means and the variance separately! Does that make sense? Think of the physical representation of this abstract equation.\n",
      "${\\mu}_1$ is the old position, and ${\\mu}_2$ is the distance moved. Surely it makes sense that our new position is ${\\mu}_1 + {\\mu}_2$. What about the variance? It is perhaps harder to form an intuition about this. However, recall that with the *update()* function for the histogram filter we always lost information - our confidence after the update was lower than our confidence before the update. Perhaps this makes sense - we don't really know where the dog is moving, so perhaps the confidence should get smaller (variance gets larger). I assure you that the equation for gaussian addition is correct, and derived by basic algebra. Therefore it is reasonable to expect that if we are using gaussians to model physical events, the results must correctly describe those events.\n",
      "\n",
      "I recognize the amount of hand waving in that argument. Now is a good time to either work through the algebra to convince yourself of the mathematical correctness of the algorithm, or to work through some examples and see that it behaves reasonably. This book will do the latter.\n",
      "\n",
-      "So, here is our implementation of the update function:\n",
-      "\n",
-      "\n"
+      "So, here is our implementation of the update function:"
     ]
    },
    {
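The addition rule stated in this hunk can be sketched in a few lines. This is an illustration under the assumption that both Gaussians are independent and passed as (mean, variance) pairs; the helper name `add` and the variance values are hypothetical and chosen only for the example.

```python
def add(mu1, var1, mu2, var2):
    """Sum of two independent Gaussians N(mu1, var1) + N(mu2, var2).

    Hypothetical helper for illustration: means add and variances add.
    """
    return mu1 + mu2, var1 + var2


# Dog believed to be at 7.3m (variance 0.5) moves 2.6m (variance 2.0).
# The variances are made-up values for this sketch.
new_pos, new_var = add(7.3, 0.5, 2.6, 2.0)

print(new_pos, new_var)
```

The new variance is larger than either input, which matches the observation in the text that a movement update always loses some confidence.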
@ -498,7 +502,7 @@
      "    pos = sense(pos[0], pos[1], Z, sensor_error)\n",
      "    ps.append(pos[0])\n",
      "    \n",
-      "    print('SENSE: %.4f,\\t%.4f' % (pos[0], pos[1]))\n",
+      "    print('SENSE:  %.4f,\\t%.4f' % (pos[0], pos[1]))\n",
      "    print()\n",
      "    \n",
      "p1, = plt.plot(zs,c='r', linestyle='dashed')\n",
@ -519,7 +523,7 @@
      "    movement = 1 \n",
      "    movement_error = 2\n",
      "    \n",
-      "For the moment we are assuming that we have some other sensor that detects how the dog is moving. For example, there could be an inertial sensor clipped onto the dog's collar, and it reports how far the dog moved each time it is triggered. The details don't matter. The upshot is that we have a sensor, it has noise, and so we represent it with a guassian. Later we will learn what to do if we do not have a sensor for the *update()* step.\n",
+      "For the moment we are assuming that we have some other sensor that detects how the dog is moving. For example, there could be an inertial sensor clipped onto the dog's collar, and it reports how far the dog moved each time it is triggered. The details don't matter. The upshot is that we have a sensor, it has noise, and so we represent it with a Gaussian. Later we will learn what to do if we do not have a sensor for the *update()* step.\n",
      "\n",
      "For now let's walk through the code and output bit by bit.\n",
      "\n",
@ -564,6 +568,7 @@
      "Here we sense the dog's position, and store it in our array so we can plot the results later.\n",
      "\n",
      "Finally we call the sense function of our filter, save the result in our *ps* array, and print the updated position belief:\n",
+      "\n",
      "    pos = sense(pos[0], pos[1], Z, sensor_error)\n",
      "    ps.append(pos[0])\n",
      "    print('SENSE:  %.4f,\\t%.4f' % (pos[0], pos[1]))\n",
@ -576,14 +581,24 @@
      "\n",
      "Now the software just loops, calling *update()* and *sense()* in turn. Because of the random sampling I do not know exactly what numbers you are seeing, but the final position is probably between 9 and 11, and the final variance is probably around 3.5. After several runs I did see the final position nearer 7, which would have been the result of several measurements with relatively large errors.\n",
      "\n",
-      "Now look at the plot. The noisy measurements are plotted in with a dotted red line, and the filter results are in the solid blue line. Both are quite noisy, but notice how much noisier the measurements (red line) are. This is your first Kalman filter shown to work!\n",
+      "Now look at the plot. The noisy measurements are plotted with a dotted red line, and the filter results with a solid blue line. Both are quite noisy, but notice how much noisier the measurements (red line) are. This is your first Kalman filter shown to work!"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "###More Examples\n",
      "\n",
-      "\n",
-      "#More Examples\n",
-      "\n",
-      "Before I go on, I want to emphasize that this code fully implements a 1D Kalman filter. If you have tried to read the literatue, you are perhaps surprised, because this looks nothing like the complex, endless pages of math in those books. To be fair, the math gets a bit more complicated in multiple dimensions, but not by much. So long as we worry about *using* the equations rather than *deriving* them we can create Kalman filters without a lot of effort. Moreover, I hope you'll agree that you have a decent intuitive grasp of what is happening. We represent our beliefs with gaussians, and our beliefs get better over time because more measurement means more data to work with. \"Measure twice, cut once!\"\n",
-      "\n",
-      "So I didn't put a lot of noise in the signal, and I also 'correctly guessed' that the dog was at position 0. How does the filter perform in real world conditions? Let's explore and find out. I will start by injecting a lot of noise in the RFID sensor. I will inject an extreme amount of noise - noise that apparently swamps the actual measurement. What does your intution tell about how the filter will perform if the noise is allowed to be anywhere from -300 or 300. In other workds, an actual position of 1.0 might be reported as 287.9, or -189.6, or any other number in that range. Think about it before you scroll down."
+      "Before I go on, I want to emphasize that this code fully implements a 1D Kalman filter. If you have tried to read the literature, you are perhaps surprised, because this looks nothing like the complex, endless pages of math in those books. To be fair, the math gets a bit more complicated in multiple dimensions, but not by much. So long as we worry about *using* the equations rather than *deriving* them we can create Kalman filters without a lot of effort. Moreover, I hope you'll agree that you have a decent intuitive grasp of what is happening. We represent our beliefs with Gaussians, and our beliefs get better over time because more measurements mean more data to work with. \"Measure twice, cut once!\""
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "##### Example: Extreme Amounts of Noise\n",
+      "So I didn't put a lot of noise in the signal, and I also 'correctly guessed' that the dog was at position 0. How does the filter perform in real world conditions? Let's explore and find out. I will start by injecting a lot of noise in the RFID sensor. I will inject an extreme amount of noise - noise that apparently swamps the actual measurement. What does your intuition tell you about how the filter will perform if the noise is allowed to be anywhere from -300 to 300? In other words, an actual position of 1.0 might be reported as 287.9, or -189.6, or any other number in that range. Think about it before you scroll down."
     ]
    },
    {
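The sense/update loop walked through in the hunks above can be sketched end to end. This is a minimal, self-contained version under stated assumptions: the simulated sensor below stands in for the notebook's `DogSensor` (whose implementation is not shown in this diff), the noise arguments are treated as variances, and `run_filter` and its parameters are hypothetical names chosen for illustration.

```python
import math
import random


def sense(pos, var, z, sensor_var):
    """Measurement step: multiply the belief N(pos, var) by N(z, sensor_var)."""
    mean = (sensor_var * pos + var * z) / (var + sensor_var)
    variance = 1.0 / (1.0 / var + 1.0 / sensor_var)
    return mean, variance


def update(pos, var, movement, movement_var):
    """Movement step: add the movement Gaussian to the belief."""
    return pos + movement, var + movement_var


def run_filter(n_steps=200, movement=1.0, sensor_var=30.0,
               movement_var=2.0, start=(0.0, 500.0), seed=1):
    """Run the loop against a simulated dog (an assumed stand-in sensor)."""
    rng = random.Random(seed)
    true_pos = 0.0
    pos, var = start
    estimates, variances = [], []
    for _ in range(n_steps):
        true_pos += movement
        # Noisy position reading with the given sensor variance.
        z = true_pos + rng.gauss(0.0, math.sqrt(sensor_var))
        pos, var = sense(pos, var, z, sensor_var)
        estimates.append(pos)
        variances.append(var)
        pos, var = update(pos, var, movement, movement_var)
    return true_pos, estimates, variances


true_pos, estimates, variances = run_filter()
print(true_pos, estimates[-1], variances[-1])
```

Note that the variance sequence does not depend on the random readings at all: it settles to a fixed value determined only by `sensor_var` and `movement_var`, and after every measurement step it stays below the sensor variance.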
@ -622,8 +637,14 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-      "In this example the noise is extreme yet the filter still outputs a nearly straight line! This is an astonishing result! What do you think might be the cause of this performance? If you are not sure, don't worry, we will discuss it latter.\n",
-      "\n",
+      "In this example the noise is extreme yet the filter still outputs a nearly straight line! This is an astonishing result! What do you think might be the cause of this performance? If you are not sure, don't worry, we will discuss it later."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "#####Example: Bad Initial Estimate\n",
      "Now let's look at the results when we make a bad initial estimate of position. To avoid obscuring the results I'll reduce the sensor variance to 30, but set the initial position to 1000m. Can the filter recover from a 1000m initial error?"
     ]
    },
@ -663,9 +684,15 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-      "Again the answer is yes! Because we are relatively sure about our belief in the sensor ($\\sigma=30$) even after the first step we have changed our belief in the first position from 1000 to somewhere around 60.0 or so. After another 5-10 measurements we have converged to the correct value! So this is how we get around the chicken and egg problem of initial guesses. In practice we would probably just assign the first measurement from the sensor as the initial value, but you can see it doesn't matter much if we wildly guess at the initial conditions - the Kalman filter still converges very quickly.\n",
-      "\n",
-      "What about the worst of both worlds, large noise and a bad initial estimate:"
+      "Again the answer is yes! Because we are relatively sure about our belief in the sensor ($\\sigma^2=30$), even after the first step we have changed our belief in the first position from 1000 to somewhere around 60.0 or so. After another 5-10 measurements we have converged to the correct value! So this is how we get around the chicken and egg problem of initial guesses. In practice we would probably just assign the first measurement from the sensor as the initial value, but you can see it doesn't matter much if we wildly guess at the initial conditions - the Kalman filter still converges very quickly."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "#####Example: Large Noise and Bad Initial Estimate\n",
+      "What about the worst of both worlds, large noise and a bad initial estimate?"
     ]
    },
    {
@ -676,8 +703,8 @@
      "movement_error = 2\n",
      "pos = (1000,500)\n",
      "\n",
-      "dog = DogSensor(0, velocity=movement, noise=sensor_error)\n",
      "\n",
+      "dog = DogSensor(0, velocity=movement, noise=sensor_error) \n",
      "zs = []\n",
      "ps = []\n",
      "\n",
@ -755,18 +782,17 @@
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-      "####Explaining the Results\n",
+      "###Explaining the Results - Multi-Sensor Fusion\n",
      "\n",
-      "So how does the Kalman filter do so well? I have glossed over one aspect of the filter as it becomes confusing to address too many points at the same time. In these example we do not have 1 sensor but 2. The first sensor is the RFID sensor that outputs the position measurement, and the second sensor measures our dog's movement using an intertial tracker. How does our filter perform if that tracker is also noisy? Let's see:\n",
-      "\n"
+      "So how does the Kalman filter do so well? I have glossed over one aspect of the filter as it becomes confusing to address too many points at the same time. In these examples we have two sensors even though we have only been talking about the RFID sensor. The second sensor measures our dog's movement using an inertial tracker. How does our filter perform if that tracker is also noisy? Let's see:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
-      "sensor_error = 30000\n",
-      "movement_sensor = 30000\n",
+      "sensor_error = 30\n",
+      "movement_sensor = 30\n",
      "pos = (0,500)\n",
      "\n",
      "dog = DogSensor(0, velocity=movement, noise=sensor_error)\n",
@ -774,14 +800,14 @@
      "zs = []\n",
      "ps = []\n",
      "\n",
-      "for i in range(1000):\n",
+      "for i in range(100):\n",
      "    Z = dog.sense()\n",
      "    zs.append(Z)\n",
      "    \n",
      "    pos = sense(pos[0], pos[1], Z, sensor_error)\n",
      "    ps.append(pos[0])\n",
      "\n",
-      "    pos = update(pos[0], pos[1], movement, movement_error)\n",
+      "    pos = update(pos[0], pos[1], movement + random.randn()*2, movement_error)\n",
      "\n",
      "p1, = plt.plot(zs,c='r', linestyle='dashed')\n",
      "p2, = plt.plot(ps, c='b')\n",
@ -798,25 +824,119 @@
     "source": [
      "This result is worse than the example where only the measurement sensor was noisy. Instead of being mostly straight, this time the filter's output is distinctly jagged. But, it still mostly tracks the dog. What is happening here?\n",
      "\n",
-      "This illustrates the effects of *multi-sensor fusion*. Suppose the dog is actually at 10.0, and we get subsequent measurement readings of -289.78 and 301.43.  From that information alone it is impossible to tell if the dog is standing still during very noisy measurements, or perhaps sprinting from -289 to 301 and being accurately measured. But we have a second source of information, his velocity. Even when the velocity is also noisy, it constrains what our beliefs might be. For example, suppose with the readings of -289.78 to 301.43 we get a velocity reading of 590. That matches the difference between the two positions quite well, so this will lead us to believe the RFID sensor and the velocity sensor. Now suppose we got a velocity reading of 1.7. This doesn't match our RFID reading very well. Finally, suppose the velocity reading was -678.8. This completely contradicts the RFID reading - we may not be sure from these few values which sensor is most inaccurate, but perhaps by now you will trust that the gaussians expressing our beliefs will correctly handle these cases. It's a bit hard to talk about while working with 1D problems, so we will take this topic up in great detail in the next chapter where we develop multidimensional Kalman filters. Remark\n",
+      "This illustrates the effects of *multi-sensor fusion*. Suppose the dog is actually at 10.0, and we get subsequent measurement readings of -289.78 and 301.43.  From that information alone it is impossible to tell if the dog is standing still during very noisy measurements, or perhaps sprinting from -289 to 301 and being accurately measured. But we have a second source of information, his velocity. Even when the velocity is also noisy, it constrains what our beliefs might be. For example, suppose with the readings of -289.78 to 301.43 we get a velocity reading of 590. That matches the difference between the two positions quite well, so this will lead us to believe the RFID sensor and the velocity sensor. Now suppose we got a velocity reading of 1.7. This doesn't match our RFID reading very well. Finally, suppose the velocity reading was -678.8. This completely contradicts the RFID reading - we may not be sure from these few values which sensor is most inaccurate, but perhaps by now you will trust that the Gaussians expressing our beliefs will correctly handle these cases. It's a bit hard to talk about while working with 1D problems, so we will take this topic up in great detail in the next chapter where we develop multidimensional Kalman filters.\n",
      "\n",
-      "Besides that issue, we are modelling the noise in our sensors using gaussians which model their real world performance. We are multiplying the gaussians (probabilities) when we get a new position measurement, adding the gaussians when we get a movement update. This is algorithmically correct (this is how the histogram filter works) and mathematically correct - why wouldn't it work? \n",
-      "\n",
-      "#### Summary\n",
-      "This takes some time to assimulate. To truly understand this you will probably have to work through this chapter several times. I encourage you to change the various constants and observe the results. Convince yourself that gaussians are a good representation of a unimodal belief of something like the position of a dog in a hallway. Then convince yourself that multiplying gaussians truly does compute a new belief from your prior belief and the new measurement. Finally, convince yourself that if you are measuring movement, that adding the gaussians correctly updates your belief. That is all the Kalman filter does. Even now I alternate between complacency and amazement at the results. \n",
-      "\n",
-      "If you understand this, you will be able to understand multidimensional Kalman filters and the various extensions that have been make on them. If you do not fully understand this, I strongly suggest rereading this chapter until you do understand it. Try implementing the filter from scratch, just by looking at the equations and reading the text. Change the constants. Maybe try to implement a different tracking problem, like tracking stock prices. Experimentation will build your intuition and understanding of how these marvelous filters work.\n",
-      "\n"
+      "Besides that aspect, we are modelling the noise in our sensors using Gaussians which model their real world performance. We are multiplying the Gaussians (probabilities) when we get a new position measurement, adding the Gaussians when we get a movement update. This is algorithmically correct (this is how the histogram filter works) and mathematically correct - why wouldn't it work if our model is correct? Think back to the Discrete Bayes filter that we developed and you'll realize that it is the same logic and algorithm.  "
     ]
    },
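+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "As a sketch of the arithmetic behind those two steps, assume the usual results for the product and the sum of two Gaussians; the symbols $\\mu_1, \\sigma_1^2, \\mu_2, \\sigma_2^2$ are introduced here only for illustration. Multiplying $\\mathcal{N}(\\mu_1, \\sigma_1^2)$ by $\\mathcal{N}(\\mu_2, \\sigma_2^2)$ and renormalizing gives\n",
+      "\n",
+      "$$\\begin{align*}\\mu&=\\frac{\\sigma_2^2\\mu_1+\\sigma_1^2\\mu_2}{\\sigma_1^2+\\sigma_2^2}\\\\\n",
+      "\\sigma^2&=\\frac{\\sigma_1^2\\sigma_2^2}{\\sigma_1^2+\\sigma_2^2}\\end{align*}$$\n",
+      "\n",
+      "while adding them gives\n",
+      "\n",
+      "$$\\begin{align*}\\mu&=\\mu_1+\\mu_2\\\\\n",
+      "\\sigma^2&=\\sigma_1^2+\\sigma_2^2\\end{align*}$$\n",
+      "\n",
+      "For example, with made-up numbers: a prior belief of $\\mathcal{N}(10, 4)$ multiplied by a measurement of $\\mathcal{N}(12, 4)$ yields $\\mu=(4\\cdot 10+4\\cdot 12)/8=11$ and $\\sigma^2=16/8=2$, so the belief becomes more certain. Adding a movement of $\\mathcal{N}(1, 1)$ then yields $\\mu=12$ and $\\sigma^2=3$, so the belief becomes less certain again."
+     ]
+    },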
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
-      "author notes:\n",
-      "    clean up the code - same stuff duplicated over and over - write a 'clean implemntation' at the end.\n",
+      "#####Exercise:\n",
+      "Implement the Kalman filter using IPython Notebook's animation features to allow you to modify the various constants in real time using sliders. Refer to the section **Interactive Gaussians** in the Gaussian chapter to see how to do this. You will use the *interact()* function to call a calculation and plotting function. Each parameter passed into *interact()* automatically gets a slider created for it. I have built the boilerplate for this; just fill in the required code."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "from IPython.html.widgets import interact, interactive, fixed\n",
+      "import IPython.html.widgets as widgets\n",
+      "def plot_kalman_filter(start_pos, sensor_noise, movement, movement_noise, noise_scale):\n",
+      "    # your code goes here\n",
+      "    pass\n",
+      "\n",
+      "interact(plot_kalman_filter,\n",
+      "         start_pos=(-10,10), \n",
+      "         sensor_noise=widgets.IntSliderWidget(value=5,min=0,max=100), \n",
+      "         movement=widgets.FloatSliderWidget(value=1,min=-2.,max=2.), \n",
+      "         movement_noise=widgets.FloatSliderWidget(value=5,min=0,max=100.),\n",
+      "         noise_scale=widgets.FloatSliderWidget(value=1,min=0,max=2.))"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "######Solution\n",
+      "One possible solution follows."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "\n",
+      "zs = np.zeros(100)\n",
+      "ps = np.zeros(100)\n",
+      "def plot_kalman_filter(start_pos, sensor_noise, movement, movement_noise,noise_scale):\n",
+      "    dog = DogSensor(start_pos, velocity=movement, noise=sensor_noise)\n",
+      "    random.seed(303)\n",
+      "    pos = (0,100)\n",
+      "\n",
+      "    for i in range(100):\n",
+      "        Z = dog.sense() + random.randn()*noise_scale\n",
+      "        zs[i] = Z\n",
+      "\n",
+      "        pos = sense(pos[0], pos[1], Z, sensor_noise)\n",
+      "        ps[i] = pos[0]\n",
+      "\n",
+      "        pos = update(pos[0], pos[1], movement + random.randn()*movement_noise, movement_noise)\n",
+      "\n",
+      "    p1, = plt.plot(zs,c='r', linestyle='dashed')\n",
+      "    p2, = plt.plot(ps, c='b')\n",
+      "    plt.legend([p1,p2], ['measurement', 'filter'], 2)\n",
+      "    plt.show()\n",
+      "\n",
+      "interact(plot_kalman_filter,\n",
+      "         start_pos=(-10,10), \n",
+      "         sensor_noise=widgets.IntSliderWidget(value=5,min=0,max=100), \n",
+      "         movement=widgets.FloatSliderWidget(value=1,min=-2.,max=2.), \n",
+      "         movement_noise=widgets.FloatSliderWidget(value=2,min=0,max=100.),\n",
+      "         noise_scale=widgets.FloatSliderWidget(value=1,min=0,max=20.))"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "#####Exercise - Nonlinear Systems\n",
+      "\n",
+      "Our equations are linear: \n",
+      "$$\\begin{align*}new\\_pos&=old\\_pos+dist\\_moved\\\\\n",
+      "new\\_position&=old\\_position*measurement\\end{align*}$$\n",
+      "\n",
+      "Do you suppose that this filter works well or poorly with nonlinear systems?\n",
+      "\n",
+      "Implement a Kalman filter that generates each measurement value from the following *sin()* expression, for i in range(100):\n",
+      "\n",
+      "    Z = math.sin(i/3.) # no noise, perfect data!\n",
      "    \n",
-      "    "
+      "Adjust the variance and initial positions to see the effect. What is, for example, the result of a very bad initial guess?"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "#enter your code here."
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "###### Solution:"
     ]
    },
    {
@ -825,7 +945,7 @@
     "input": [
      "sensor_error = 30\n",
      "movement_error = 2\n",
-      "pos = (1000,500)\n",
+      "pos = (100,500)\n",
      "\n",
      "zs = []\n",
      "ps = []\n",
@ -834,7 +954,7 @@
      "for i in range(100):\n",
      "    pos = update(pos[0], pos[1], movement, movement_error)\n",
      "\n",
-      "    Z = math.sin(i/3.)*5.\n",
+      "    Z = math.sin(i/3.)*2\n",
      "    zs.append(Z)\n",
      "    \n",
      "    pos = sense(pos[0], pos[1], Z, sensor_error)\n",
@ -850,6 +970,98 @@
     "metadata": {},
     "outputs": []
    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "######Discussion\n",
+      "\n",
+      "Here we set a bad initial guess of 100. We can see that the filter never 'acquires' the signal. Note how the peak of the filter output always lags the peak of the signal by a small amount. More clearly, we can see the large gap in height between the measurement and the filter.\n",
+      "\n",
+      "Maybe we just didn't adjust things 'quite right'. After all, the output looks like a sine wave; it is just offset in $x$ and $y$. Let's test this assumption.\n",
+      "\n",
+      "#####Exercise - Noisy Nonlinear Systems\n",
+      "Implement the same system, but add noise to the measurement."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "#enter your code here"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "######Solution"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "sensor_error = 30\n",
+      "movement_error = 2\n",
+      "pos = (100,500)\n",
+      "\n",
+      "zs = []\n",
+      "ps = []\n",
+      "\n",
+      "\n",
+      "for i in range(100):\n",
+      "    pos = update(pos[0], pos[1], movement, movement_error)\n",
+      "\n",
+      "    Z = math.sin(i/3.)*2 + random.randn()*1.2\n",
+      "    zs.append(Z)\n",
+      "    \n",
+      "    pos = sense(pos[0], pos[1], Z, sensor_error)\n",
+      "    ps.append(pos[0])\n",
+      "\n",
+      "\n",
+      "p1, = plt.plot(zs,c='r', linestyle='dashed')\n",
+      "p2, = plt.plot(ps, c='b')\n",
+      "plt.legend([p1,p2], ['measurement', 'filter'], 3)\n",
+      "plt.show()"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "###### Discussion\n",
+      "This is terrible! The output is not at all like a sine wave, except in the grossest way. With linear systems we could add extreme amounts of noise to our signal and still extract a very accurate result, but here even modest noise creates a very bad result.\n",
+      "\n",
+      "Very shortly after practitioners began implementing Kalman filters, they realized how poorly the filters perform on nonlinear systems and began devising ways of dealing with it. Much of this book is devoted to this problem and its various solutions."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "### Summary\n",
+      "The information in this chapter takes some time to assimilate. To truly understand it you will probably have to work through this chapter several times. I encourage you to change the various constants and observe the results. Convince yourself that Gaussians are a good representation of a unimodal belief of something like the position of a dog in a hallway. Then convince yourself that multiplying Gaussians truly does compute a new belief from your prior belief and the new measurement. Finally, convince yourself that if you are measuring movement, adding the Gaussians correctly updates your belief. That is all the Kalman filter does. Even now I alternate between complacency and amazement at the results.\n",
+      "\n",
+      "If you understand this, you will be able to understand multidimensional Kalman filters and the various extensions that have been made to them. If you do not fully understand this, I strongly suggest rereading this chapter. Try implementing the filter from scratch, just by looking at the equations and reading the text. Change the constants. Maybe try to implement a different tracking problem, like tracking stock prices. Experimentation will build your intuition and understanding of how these marvelous filters work."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "author notes:\n",
+      "    clean up the code - same stuff duplicated over and over - write a 'clean implementation' at the end.\n",
+      "    \n",
+      "    "
+     ]
+    },
    {
     "cell_type": "code",
     "collapsed": false,