A lot of new g-h filter code and text added.
This commit is contained in:
parent
3b4c5a547c
commit
b995668db1
587
g-h_filter.ipynb
@@ -1,20 +1,603 @@
|
||||
{
|
||||
"metadata": {
|
||||
"name": "",
|
||||
"signature": "sha256:34395a7b0d80648bd02f314a683ae3bea97a84fdd6f885616cbb80ae0c6681c8"
|
||||
"signature": "sha256:221bed83659f3d65f886890673eb366441306ce17460af258c7b7626ee0dfda6"
|
||||
},
|
||||
"nbformat": 3,
|
||||
"nbformat_minor": 0,
|
||||
"worksheets": [
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Building Intuition via Thought Experiments\n",
|
||||
"\n",
|
||||
"Imagine we live in a world without scales - the devices you stand on to weigh yourself. One day at work a coworker comes running up to you and announces her invention to you. After she explains, you eagerly stand on it and announce the results: \"172 lbs\". You are estatic - for the first time in your life you know what you weigh. More importantly, dollar signs dance in your eyes as you imagine selling this device to weight loss clinics across the world! This is fantastic!\n",
|
||||
"\n",
|
||||
"Another coworker hears the commotion and walks over to find out what has you so excited. You explain the invention and once again step onto the scale, and proudly proclaim the result: \"161 lbs.\" And then you hesitate, confused.\n",
|
||||
"\n",
|
||||
"\"What? It read 172 lbs just a few seconds ago\" you complain to your coworker. \n",
|
||||
"\n",
|
||||
"\"I never said it was accurate,\" she replies.\n",
|
||||
"\n",
|
||||
"Sensors are inaccurate. This is the motivation behind a huge body of work in filtering, and solving this problem is the topic of this book. I could just provide the solutions that have been developed over the last half century, but these solutions developed by asking very basic, fundamental questions into the nature of what we know and how we know it. Before we attempt the math, let's follow that journey of discovery, and see if it does not inform our intuition about filtering. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Try Another Scale\n",
|
||||
"\n",
|
||||
"Is there any way we can improve upon this result? The obvious, first thing to try is get a better sensor. Unfortunately, your co-worker informs you that she has built 10 scales, and they all operate with about the same accuracy. You have her bring out another scale, and you weigh yourself on one, and then on the other. The first scale (A) reads \"160lb\", and the second (B) reads \"170lbs\". What can we conclude about your weight?\n",
|
||||
"\n",
|
||||
"Well, what are our choices?\n",
|
||||
"\n",
|
||||
"* We could choose to only believe A, and assign 160lbs to our weight estimate. \n",
|
||||
"* we could choose to only believe B, and assign 170lbs to our weight.\n",
|
||||
"* We could choose a number less than either A or B\n",
|
||||
"* We could choose a number greater than either A or B\n",
|
||||
"* We could choose a number between A and B\n",
|
||||
"\n",
|
||||
"The first two choices are plausible, but we have no reason to favor one scale over the other. Why would we choose to believe A more than B? We have no reason for such a belief. The third and fourth choices are irrational. The scales are admittedly not very accurate, but there is no reason at all to choose a number outside of the range of what they measure. The final choice is the only reasonable one. If both scales are inaccurate, and as likely to give a result above my actual weight as below it, more often than not probably the answer is somewhere between A and B. \n",
|
||||
"\n",
|
||||
"In mathematics this concept is formalized as *expected value*, and we will cover it in depth later. For now ask youself what would be the 'usual' thing to happen if we made one million separate readings. Some of the times both scales will read too low, sometimes that will both read too high, and the rest of the time they will straddle the actual weight. If they straddle the actual weight then certainly we should choose a number between A and B. If they don't straddle then we don't know if they are both too high or low, but by choosing a number between A and B we at least mitigate the effect of the worst measurement. For example, suppose our actual weight is 180lbs. 160lbs is a big error. But if we choose a weight between 160lbs and 170lbs our estimate will be better than 160lbs. The same argument holds if both scales returned a value greater than the actual weight.\n",
|
||||
"\n",
|
||||
"We will deal with this more formally later, but for now I hope it is clear that our best estimate is just the average of A and B. $\\frac{160+170}{2} = 165$.\n",
|
||||
"\n",
|
||||
"Let's play 'what if' some more. What if we are now told that A is three times more accurate than B? Consider the 5 options we listed above. It still makes no sense to choose a number outside the range of A and B, so we will not consider those. It perhaps seems more compelling to choose A as our estimate - after all, we know it is more accurate, why not just use it instead of B? Can B possibly improve our knowledge over A alone?\n",
|
||||
"\n",
|
||||
"The answer, perhaps counterintuitively, is yes, it can. Consider this case. We know scale A is accurate to 1 lb. In other words, if we weight 170 lbs, it could report 169,170, or 171 lbs. We know that scale B is accurate to 9 lbs. We do a reading, and A=160, and B=170. What should we estimate our weight to be?\n",
|
||||
"\n",
|
||||
"Well, if we say 160 lbs we would be wrong, because B can only be 9 pounds off, and 170 lbs - 160 lbs is 10 lbs. 160 is not a possible measurement for B. In fact, the only number that satifies all of the constraints is 161 lbs. That is 1 lb within the reading of A, and 9 lbs within the reading of B. \n",
|
||||
"\n",
|
||||
"This is an important result. With two relatively inaccurate sensors we were able to deduce an extremely accurate result. Now sure, that was a specially constructed case, but it generalizes. What if A is accurate to 3 lbs, B is accurate to 11 lbs, and we get the measurements of A=160 lbs and B=170 lbs? The result can only be from 159 lbs to 163 lbs, which is better than the range of 157 lbs to 163 lbs that is the range of values that A alone allows.\n",
|
||||
"\n",
|
||||
"> So two sensors, even if one is less accurate than the other, is better than one.\n",
|
||||
"\n",
|
||||
"However, we have strayed from our problem. No customer is going to want to buy multiple scales, and besides, we initially started with an assumption that all scales were equally (in)accurate. \n",
|
||||
"\n",
|
||||
"So, what if I have one scale, but I weigh myself many times? We concluded that if we had two scales of equal accuracy we should average the results of their measurements. What if I weigh myself 1,000,000 times with one scale? We have already stated that the scale is equally likely to return a number too large as it is to return one that is too small. I will not prove it, but it can be proved that the average of a large number of weighings will be extremely close to my actual weight. Consider a simple case - the scale is accurate to within 1 lb. If I weigh 170, it will return one of 169, 170, and 171. The average of a bunch of 170 is 170, so we can exclude those. What is left is measurements of 169 and 171. But we know there will be as many 169s as there are 171s. The average of those will also be 170, and so the average of all must be 170, my true weight. It's not that hard to extend this to any arbitrary accuracy.\n",
|
||||
"\n",
|
||||
"Okay, great! We have an answer, and phone the UB Thin chain to tell them the good news. As you might expect, their CEO does not share your enthusiam. First, no one has the patience to weigh themselves a million, or even a hundred times. Second, people are far less interested in what they weigh than whether they are actually losing weight or not. \n",
|
||||
"\n",
|
||||
"So, let's play 'what if' again. What if you measured your weight once a day, and got the readings 170, 161, and then 169. Did you gain weight, lose weight, or is this all just noisy measurements? \n",
|
||||
"\n",
|
||||
"We really can't say. The first measurement was 170, and the last was 169, implying a 1 lb loss. But if the scale is only accurate to 10 lbs, that is explainable by noise - bad measurements. I could have actually gained weight; maybe my weight on day one was 165 lbs, and on day three it was 172. It is possible to get those weight readings with that weight gain. My scale tells me I am losing weight, and I am actually gaining weight! The CEO of UB Thin will not be interested in buying this.\n",
|
||||
"\n",
|
||||
"Shall we give up? No, let's play 'what if'. Suppose I take a different scale, and I get the following measurements: 169, 170, 169, 171, 170, 171, 169, 170, 169, 170. What does your intuition tell you? It is possible, for example, that you gained 1 lb each day, and the noisy measurements just happens to look like you stayed the same weight. Equally, you could have lost 1 lb a day and gotten the same readings. But is that likely? How likely is it to flip a coin and get 10 heads in a row? Not very likely. We can't prove it, but it seems pretty likely that my weight held steady.\n",
|
||||
"\n",
|
||||
"Another what if: what if the readings were 158.0, 164.2, 160.3, 159.9, 162.1, 164.6, 169.6, 167.4, 166.4, 171.0? Let's look at a chart of that and then answer some questions."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"%matplotlib inline\n",
|
||||
"from __future__ import division, print_function\n",
|
||||
"import matplotlib.pyplot as plt\n",
|
||||
"import matplotlib.pylab as pylab\n",
|
||||
"import numpy as np\n",
|
||||
"\n",
|
||||
"# make all plots a reasonable size\n",
|
||||
"pylab.rcParams['figure.figsize'] = 12,6\n",
|
||||
"\n",
|
||||
"weights = [158.0, 164.2, 160.3, 159.9, 162.1, 164.6, 169.6, 167.4, 166.4, 171.0]\n",
|
||||
"plt.plot(weights)\n",
|
||||
"plt.show()"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Does it 'seem' likely that I lost weight and this is just really noisy data? Not really. Does it seem likely that I held the same weight? Again, no. This data trends upwards over time; not evenly, but definitely upwards. Lets look at that in a chart. We can't be sure, but that surely looks like a weight gain, and a signifant weight gain at that. Let's test this assumption with some more plots. It is easier to 'eyeball' data sometimes.\n",
|
||||
"\n",
|
||||
"So let's look at two hypothesis. First, let's assume we held the same weight. To get that number, we agreed that we should just average all the measurements. Let's look at that."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"import numpy as np\n",
|
||||
"ave = np.sum(weights) / len(weights)\n",
|
||||
"plt.plot([0,9], [ave,ave], c='r')\n",
|
||||
"plt.plot(weights, c='b')\n",
|
||||
"plt.show()"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"That doesn't look very convincing. \n",
|
||||
"\n",
|
||||
"Now, let's assume we we gained weight. How much? I don't know, but numpy does! We just want to draw a line through the measurements that looks 'about' right. numpy has functions that will do this according to a rule called \"least squared fit\". Let's not worry about the details of that computation, and just plot the results."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"xs = range(len(weights))\n",
|
||||
"line = np.poly1d(np.polyfit(xs, weights, 1))\n",
|
||||
"plt.plot (xs, line(xs), c='r')\n",
|
||||
"\n",
|
||||
"plt.plot(weights, c='b')\n",
|
||||
"plt.show()"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"This looks much better, at least to my eyes. It seems far more likely to be true that I gained weight than I didn't gain any weight. Did I actually gain 13 lbs? Who can say? That seems impossible to answer.\n",
|
||||
"\n",
|
||||
"\"But is it impossible?\" pipes up a coworker.\n",
|
||||
"\n",
|
||||
"Let's try something crazy. Let's assume that somehow I know I am gaining about one lb a day. It doesn't matter how I know that, right now, just assume I know it somehow. Maybe I am eating a 6000 calorie a day diet, which would result in such a weight gain. Maybe there is another way. Let's just see if we can make use of such information if it was available.\n",
|
||||
"\n",
|
||||
"The first measurement was 158. We have no way of knowing any different, so let's just accept that as our estimate. If our weight today is 158, what will it be tomorrow? Well, we think we are gaining weight at 1 lb/day, so our prediction is 159, like so:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"ax = plt.axes()\n",
|
||||
"ax.annotate('', xy=[1,159], xytext=[0,158],\n",
|
||||
" arrowprops=dict(arrowstyle='->', ec='r',shrinkA=6, lw=3,shrinkB=5))\n",
|
||||
"plt.scatter ([0], [158], c='b')\n",
|
||||
"plt.scatter ([1], [159], c='r')\n",
|
||||
"plt.show()"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Okay, but what good is this? Sure, we could just assume the 1 lb/day is accurate, and just predict our weight for 10 days, but then why use a scale at all if we don't incorporate its readings? So let's look at the next measurement."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"ax = plt.axes()\n",
|
||||
"ax.annotate('', xy=[1,159], xytext=[0,158],\n",
|
||||
" arrowprops=dict(arrowstyle='->', \n",
|
||||
" ec='r', lw=3, shrinkA=6, shrinkB=5))\n",
|
||||
"plt.scatter ([0,1], [158.0, 164.2], c='b')\n",
|
||||
"plt.scatter ([1], [159], c='r')\n",
|
||||
"plt.text (1.03, 159, \"prediction\")\n",
|
||||
"plt.text (1.03, 164.2, \"measurement\")\n",
|
||||
"plt.show()"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here the measurements are in blue, and the estimates are in red. So we have a problem. Our prediction doesn't match our measurement. But, that is what we expected, right?. If the prediction was always exactly the same as the measurement, what value could it be? \n",
|
||||
"\n",
|
||||
"> The key insight to this entire book follows. Read it carefully!\n",
|
||||
"\n",
|
||||
"So what do we do? If we only take data from the measurement than the prediction will not affect the result. If we only take data from the prediction then the measurement will be ignored. If this is to work we need to take some kind of blend of the prediction and measurement. \n",
|
||||
"\n",
|
||||
"Blending two values - this sounds a lot like the two scale problem earlier. Using the same reasoning as before we can see that the only thing that makes sense is to choose a number between the prediction and the measurement. For example, an estimate of 165 makes no sense, nor does 157. Our estimates should like between 159 and 164.2. \n",
|
||||
"\n",
|
||||
"Should it be half way? Maybe, but in general it seems like we might know that our prediction is very accurate or very inaccurate. Probably the accuracy of our prediction differs from the accuracy of the scale. Recall what we did when A was much more accurate than B - we scaled the answer to be closer to A than B. \n",
|
||||
"\n",
|
||||
"Let's try a randomly chosen number: $\\frac{6}{10}$. Our estimate will be six tenth the prediction and the rest will be from the measurement. Let's just code that up and see the result."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"w = 160\n",
|
||||
"gain = 1\n",
|
||||
"scale = 6/10\n",
|
||||
"\n",
|
||||
"# we are storing the estimates so we can use them in the next problem.\n",
|
||||
"estimates = []\n",
|
||||
"\n",
|
||||
"for i in range (1, len(weights)):\n",
|
||||
" z = weights[i] # most filter literature uses 'z' for measurements\n",
|
||||
" w = (w+gain) * scale + (1-scale) * z\n",
|
||||
" estimates.append(w)\n",
|
||||
" plt.scatter(i, w)\n",
|
||||
"plt.plot(weights, c='r')\n",
|
||||
"plt.plot([0,9],[160,170],c='g')\n",
|
||||
"plt.show()"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"That is pretty good! The blue dots, showing our estimates, are not a straight line, but they are straighter than the measurements and somewhat close to the trend line we created. Also, it seems to get better over time. This may strike you as silly; of course the data will look good if we assume the conclusion, that our weight gain is around 1 lb.\n",
|
||||
"\n",
|
||||
"But, what if? What if instead of just leaving the weight gain at 1 lb, we compute it from our estimate. In other words, our first estimate is 162.28. We started at 158; this implies a weight gain not of 1, but 4.28 lbs. Can we use this information somehow? It kind of seems plausible. After all, the weight measurement itself is based on a real world measurement of our weight, so there is useful information. Our estimate of our weight gain may not be perfect, but it is surely better than just guessing our gain is 1 lb. Data is better than a guess, even if it is noisy.\n",
|
||||
"\n",
|
||||
"So, should we just set the new gain/day to 4.28 lbs? Hmm, sounds like our same problem again. We have two numbers (the gain/day for yesterday and for today), and want to combine them somehow to get tomorrow's gain. Let's use our same tool, and the only tool we have so far - take a combination of the two. This time I will use another arbitrarily chosen number, $\\frac{2}{3}$."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"import numpy.random as random\n",
|
||||
"w = 160\n",
|
||||
"gain = 1\n",
|
||||
"weight_scale = 6/10\n",
|
||||
"gain_scale = 2/3\n",
|
||||
"\n",
|
||||
"for i in range (1, len(weights)):\n",
|
||||
" z = weights[i]\n",
|
||||
" new_w = (w+gain) * weight_scale + z * (1-weight_scale)\n",
|
||||
" new_gain = new_w - w\n",
|
||||
" gain = new_gain * gain_scale + gain * (1-gain_scale)\n",
|
||||
" w = new_w\n",
|
||||
" plt.scatter(i, w,c='b')\n",
|
||||
"\n",
|
||||
"plt.plot(weights, c='r')\n",
|
||||
"plt.plot([0,9],[160,170],c='g')\n",
|
||||
"plt.show()"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"I think this is starting to look really good. Okay, we had no methodology whatsover for choosing our scaling factors of $\\frac{6}{10}$ and $\\frac{2}{3}$, and we 'luckily' choose 1 lb/day as our initial guess for the weight gain, but otherwise all of the reasoning followed from very reasonable assumptions. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### The g-h Filter\n",
|
||||
"\n",
|
||||
"This algorithm is known as the g-h filter. *g* and *h* refer to the two scaling factors that we used. *g* is the scaling we used for the measurement (weight in our example), and *h* is the scaling for the change in measurement over time (lbs/day in our example).\n",
|
||||
"\n",
|
||||
"This filter is the basis for a huge number of filters, including the Kalman filter. In other words, the Kalman filter is a form of the g-h filter. So is the Least Squares filter, which you may have heard of, and so is the Benedict-Bordner filter, which you probably have not heard of. Each filter has a different way of assigning values to *g* and *h*, but otherwise the algorithms are identical. For example, the $\\alpha$-$\\beta$ filter just assigns a constant to *g* and *h*, constained to a certain range of values. Other filters will vary *g* and *h* dynamically, and filters like the Kalman filter will vary them based on the number of dimensions in the problem. \n",
|
||||
"\n",
|
||||
"Let me repeat the key insights as they are so important. If you do not understand these you will not understand the rest of the book. If you do understand them, then the rest of the book will unfold naturally for you, as mathematically elaborations to various 'what if' questions we will ask about *g* and *h*.\n",
|
||||
"\n",
|
||||
"* Multiple measurements are more accurate than one measurement\n",
|
||||
"* Always choose a number part way between two measurements to create a more accurate estimate\n",
|
||||
"* Predict the next measurement based on the current estimate and how much we think it will change\n",
|
||||
"* The new estimate is then chosen as part way between the prediction and next measurement\n",
|
||||
"\n",
|
||||
"Perhaps this all seems quite magical to you. For example, did we get 'lucky' in choosing a weight gain of 1 lb/day? Each filter has it's own way of dealing this problem. I ask you to trust me that there is a very satisfactory answer for this problem in the Kalman filter. And how, exactly, do we get more accurate results by computing weight gain from our estimates? It kind of makes sense, but then again it doesn't. It feels like a magician's trick.\n",
|
||||
"\n",
|
||||
"The rest of this book can be considered an elaboration on those two questions. Now is perhaps a good time to develop the mathematics of *g-h* filters, but I would like to take a different approach. There is another feature of these filters we have barely touched upon - Bayesian statistics. I used terms like \"what are the chances that....\". Well, it turns out we can provide exact mathematical representations of questions like that, and use some startling and very beautiful math to derive extremely powerful filters that we use to fly our planes, navigate our ships, drive our cars, track and predict the economy, and so much more. \n",
|
||||
"\n",
|
||||
"So I will take the next few chapters to develop the probabilistic way of thinking about these filters, with nary a *g* or *h* to be seen. Yet suddenly this same algorithm will appear, this time with a formal mathematical edifice that allows us to create filters from multiple sensors, to accurately estimate the amount of error in our solution, and to control robots.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Exercise: Write Generic Algorithm\n",
|
||||
"\n",
|
||||
"In the example above, I explicitly coded this to solve the weighing problem that we've been discussing throughout the chapter. For example, the variables are named \"weight_scale\", \"gain\", and so on. I did this to make the algorithm easy to follow - you can easily see that we correctly implemented each step. But, this is specialized code. Rewrite it to work with any data. Use this function signature:\n",
|
||||
"\n",
|
||||
" def g_h_filter (data, x0, dx, g, h)\n",
|
||||
" '''blah'''\n",
|
||||
" \n",
|
||||
"Test it by passing in the same weight data as before, plot the results, and visually determine that it works.\n",
|
||||
" \n",
|
||||
"#### Solution and Discussion"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"def g_h_filter (data, x0, dx, g, h):\n",
|
||||
" x = x0\n",
|
||||
" results = []\n",
|
||||
" for z in data:\n",
|
||||
" x_est = (x+dx) * g + z * (1-g)\n",
|
||||
" dx_est = x_est - x\n",
|
||||
" dx = dx_est * gain_scale + dx * (1-gain_scale)\n",
|
||||
" x = x_est\n",
|
||||
" results.append(x_est)\n",
|
||||
" \n",
|
||||
" return results\n",
|
||||
"\n",
|
||||
"def plot_g_h_results (measurements, filtered_data):\n",
|
||||
" \n",
|
||||
" plt.scatter (range(len(filtered_data)), filtered_data)\n",
|
||||
" plt.plot(measurements)\n",
|
||||
" plt.show()\n",
|
||||
" \n",
|
||||
"\n",
|
||||
"data = g_h_filter (weights, 160., 1., 6/10, 2/3)\n",
|
||||
"plot_g_h_results (weights, data)"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Exercise: create measurement function\n",
|
||||
"\n",
|
||||
"Now let's write a function that generates noisy data for us. We want a function that we call with the starting value, the amount of change per step, the number of steps, and the amount of noise we want to add. It should return a list of the data. Use *numpy.random.randn()* to generate white noise.\n",
|
||||
"\n",
|
||||
"#### Solution and Discussion"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"def gen_data (x0, dx, count, noise_factor):\n",
|
||||
" return [x0 + dx*i + random.randn()*noise_factor for i in range (count)]"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"I am using a Python idiom that might be unfamiliar to you. You can skip to the next exercise if you understand this code, otherwise here is a quick description.\n",
|
||||
"\n",
|
||||
"This code uses an idiom called *list comprehensions*. This allows you insert code right inside of your list; when encountered, the code is run and the list will contain the data generated from the code. Let's look at a simple example:\n",
|
||||
"\n",
|
||||
" [i for i in range(10)]\n",
|
||||
" \n",
|
||||
"The result of this code is the list [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. The general structure is\n",
|
||||
" [expr for_loop]\n",
|
||||
" \n",
|
||||
"where 'for_loop' is a for loop that uses some \n",
|
||||
"\n",
|
||||
"**blah blah fill this in**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now write code that uses *gen_data* to create 100 data points that starts at 5, has a derivative of 2, a noise scaling factor of 10, and uses g=0.8 and h=0.02. Set you initial guess for x to be 100.\n",
|
||||
"\n",
|
||||
"#### Solution and Discussion"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"zs = gen_data (x0=5, dx=2, count=100, noise_factor=10)\n",
|
||||
"data = g_h_filter (data=zs, x0=100., dx=2., g=0.8, h=0.02)\n",
|
||||
"plot_g_h_results (measurements=zs, filtered_data=data)"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The filter starts out with estimates that are far from the measured data due to the bad initial guess of 100. You can see that it 'rings' before settling in on the measured data. 'Ringing' means that the signal overshoots and undershoots the data in a sinusodial type pattern. This is a very common phenomena in filters, and a lot of work in filter design is devoted to minimizing ringing. That is a topic that we are not yet prepared to address, but I wanted to show you the phenomenon.\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"####Exercise: Extreme Noise\n",
|
||||
"\n",
|
||||
"Rerun the same test, but this time use a noise factor of 100. Remove the initial condition ringing by changing the initial condition from 100 down to 5.\n",
|
||||
"\n",
|
||||
"##### Solution and Discussion"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"zs = gen_data (x0=5, dx=2, count=100, noise_factor=100)\n",
|
||||
"data = g_h_filter (data=zs, x0=5., dx=2., g=0.8, h=0.02)\n",
|
||||
"plot_g_h_results (measurements=zs, filtered_data=data)"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"This doesn't look so wonderful to me. We can see that perhaps the filtered signal varies less than the noisey signal, but it is far from the straight line. If we were to plot just the filtered result no one would guess that the signal with no noise starts at 5 and increments by 2 at each time step. And while in locations the filter does seem to reduce the noise, in other places it seems to overshoot and undershoot.\n",
|
||||
"\n",
|
||||
"At this poit we don't know enough to really judge this. We added **a lot** of noise; maybe this is as good as filtering can get. However, the existance of the multitude of chapters beyond this one should suggest that we can do much better than this."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Excercise: Acceleration\n",
|
||||
"\n",
|
||||
"Write a new data generation function that adds in a constant acceleration factor to each data point. In other words, increment dx as you compute each data point so that the velocity (dx) is ever increasing. Set the noise to 0 and plot the results. Explain what you see.\n",
|
||||
"\n",
|
||||
"#####Solution and Discussion"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"def gen_data (x0, dx, count, noise_factor, accel=0):\n",
|
||||
" zs = []\n",
|
||||
" for i in range (count):\n",
|
||||
" zs.append (x0 + dx*i + random.randn()*noise_factor)\n",
|
||||
" dx += accel\n",
|
||||
" return zs\n",
|
||||
" \n",
|
||||
"zs = gen_data (x0=0, dx=0, count=20, noise_factor=0, accel = 2)\n",
|
||||
"data = g_h_filter (data=zs, x0=0, dx=0, g=0.8, h=0.02)\n",
|
||||
"plt.xlim([0,20])\n",
|
||||
"plot_g_h_results (measurements=zs, filtered_data=data)"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Each prediction lags behind the signal. If you think about what is happening this makes sense. Our model assumes that velocity is constant. The g-h filter computes the first derivative of $x$ (we use $\\dot{x}$ to denote the derivative) but not the second derivative $\\ddot{x}$. So we are assuming that $\\ddot{x}=0$. At each prediction step we predict the new value of x as $x + \\dot{x}*t$. But because of the acceleration the prediction must necessarily fall behind the actual value. We then try to compute a new value for $\\dot{x}$, but because of the $h$ factor we only partially adjust $\\dot{x}$ to the new velocity. On the next iteration we will again fall short.\n",
|
||||
"\n",
|
||||
"Note that there is no adjustment to $g$ or $h$ that we can make to correct this problem. This is called the *lag error* or *systemic error* of the system. It is a fundamental property of g-h filters. Perhaps your mind is already suggesting solutions or workarounds to this problem. As you might expect, a lot of research has been devoted to this problem, and we will be presenting various solutions to this problem in this book.\n",
|
||||
"> The 'take home' point is that the filter is only as good as the mathematical model used to express the system. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Exercise: Time Step\n",
|
||||
"\n",
|
||||
"Make time step large relative to taylor expansion."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"zs = gen_data (x0=5, dx=2, count=5000, noise_factor=1, accel=1)\n",
|
||||
"large_step = [zs[i] for i in range (0,len(zs), 50)]\n",
|
||||
"data = g_h_filter (data=large_step, x0=large_step[0], \n",
|
||||
" dx=large_step[1]-large_step[0], \n",
|
||||
" g=0.8, h=0.02)\n",
|
||||
"plot_g_h_results (measurements=large_step, filtered_data=data)"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": []
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Exercise: bad initial conditions\n",
|
||||
"\n",
|
||||
"#### Exercise: alter g\n",
|
||||
"\n",
|
||||
"What are effects?\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"#### Exercise: alter h\n",
|
||||
"\n",
|
||||
"What are effects?"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"collapsed": false,
|
||||
"input": [
|
||||
"#format the book\n",
|
||||
"from IPython.core.display import HTML\n",
|
||||
"def css_styling():\n",
|
||||
" styles = open(\"./styles/custom2.css\", \"r\").read()\n",
|
||||
" return HTML(styles)\n",
|
||||
"css_styling()"
|
||||
],
|
||||
"language": "python",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"prompt_number": ""
|
||||
}
|
||||
],
|
||||
"metadata": {}
|
||||
|