Touched upon conditioning of variables.

Pointed out that height of students probably has two means if the
population includes males and females. Did not go into Gaussian
mixtures or conditioning of the data.
Roger Labbe 2016-02-06 15:03:53 -08:00
parent 5587dd0fda
commit b15968e5b1


@ -557,7 +557,7 @@
"Ignoring the squared terms for a moment, you can see that the variance is the *expected value* for how much the sample space ($X$) varies from the mean (squared, of course). We have the formula for the expected value $E[X] = \\sum\\limits_{i=1}^n p_ix_i$, and we will assume that any height is equally probable, so we can substitute that into the equation above to get\n",
"\n",
"$$\\mathit{VAR}(X) = \\frac{1}{n}\\sum_{i=1}^n (x_i - \\mu)^2$$\n",
"\n",
" \n",
"Let's compute the variance of the three classes to see what values we get and to become familiar with this concept.\n",
"\n",
"The mean of $X$ is 1.8 ($\\mu_x = 1.8$) so we compute\n",
@ -793,6 +793,17 @@
"print(np.std(Z))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we continue I need to point out that I'm ignoring the fact that, on average, men are taller than women. In general, the height variance of a class that contains only men or only women will be smaller than that of a class with both sexes. This is true for other factors as well. Well-nourished children are taller than malnourished children. Scandinavians are taller than Italians. When designing experiments, statisticians need to take these factors into account.\n",
"\n",
"I suggested we might be performing this analysis to order desks for a school district. For each age group there are likely to be two different means: one clustered around the mean height of the females, and a second clustered around the mean height of the males. The mean of the entire class will be somewhere between the two. If we bought desks sized for the mean of all students, we would likely end up with desks that fit neither the males nor the females in the school!\n",
"\n",
"It's too early to explain why, but we will not normally be faced with these problems in this book. Consult any standard probability text if you need to learn techniques for dealing with these issues."
]
},
{
"cell_type": "markdown",
"metadata": {},