fixing whitespace in Rmd so diff of errata is cleaner (#46)

* fixing whitespace in Rmd so diff of errata is cleaner

* reapply kwargs fix
Jonathan Taylor
2025-04-03 12:25:17 -07:00
committed by GitHub
parent f132c18a1c
commit 6d7e40588b
12 changed files with 392 additions and 410 deletions


@@ -8,7 +8,7 @@
We include our usual imports seen in earlier labs.
@@ -20,7 +20,7 @@ import statsmodels.api as sm
from ISLP import load_data
```
We also collect the new imports
needed for this lab.
@@ -52,7 +52,7 @@ true_mean = np.array([0.5]*50 + [0]*50)
X += true_mean[None,:]
```
To begin, we use `ttest_1samp()` from the
`scipy.stats` module to test $H_{0}: \mu_1=0$, the null
hypothesis that the first variable has mean zero.
@@ -62,7 +62,7 @@ result = ttest_1samp(X[:,0], 0)
result.pvalue
```
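As a sanity check on what `ttest_1samp()` computes, the statistic and two-sided $p$-value can be reproduced by hand: the $t$-statistic is the sample mean over its estimated standard error, and the $p$-value comes from a $t$-distribution with $n-1$ degrees of freedom. The sample below is synthetic and illustrative, not the lab's `X`.

```{python}
import numpy as np
from scipy.stats import t as t_dist, ttest_1samp

rng = np.random.default_rng(0)
x = rng.normal(0.5, 1, size=100)  # synthetic sample, not the lab's data

# t-statistic: sample mean divided by its estimated standard error
tstat = x.mean() / (x.std(ddof=1) / np.sqrt(x.shape[0]))
# two-sided p-value from the t-distribution with n-1 degrees of freedom
pval = 2 * t_dist.sf(np.abs(tstat), df=x.shape[0] - 1)

result = ttest_1samp(x, 0)  # should agree with the manual computation
```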
The $p$-value comes out to 0.931, which is not low enough to
reject the null hypothesis at level $\alpha=0.05$. In this case,
$\mu_1=0.5$, so the null hypothesis is false. Therefore, we have made
@@ -159,7 +159,7 @@ ax.legend()
ax.axhline(0.05, c='k', ls='--');
```
As discussed previously, even for moderate values of $m$ such as $50$,
the FWER exceeds $0.05$ unless $\alpha$ is set to a very low value,
such as $0.001$. Of course, the problem with setting $\alpha$ to such
@@ -181,7 +181,7 @@ for i in range(5):
fund_mini_pvals
```
The $p$-values are low for Managers One and Three, and high for the
other three managers. However, we cannot simply reject $H_{0,1}$ and
$H_{0,3}$, since this would fail to account for the multiple testing
@@ -211,8 +211,8 @@ reject, bonf = mult_test(fund_mini_pvals, method = "bonferroni")[:2]
reject
```
The $p$-values in `bonf` are simply the entries of `fund_mini_pvals` multiplied by 5 and truncated to be less than
or equal to 1.
@@ -220,7 +220,7 @@ or equal to 1.
bonf, np.minimum(fund_mini_pvals * 5, 1)
```
Therefore, using Bonferroni's method, we are able to reject the null hypothesis only for Manager
One while controlling the FWER at $0.05$.
@@ -232,8 +232,8 @@ hypotheses for Managers One and Three at a FWER of $0.05$.
mult_test(fund_mini_pvals, method = "holm", alpha=0.05)[:2]
```
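Holm's step-down rule is simple enough to implement by hand: compare the $i$-th smallest $p$-value to $\alpha/(m-i+1)$ and reject every hypothesis before the first failure. The sketch below uses made-up $p$-values, not the fund data, and `holm_reject()` is a hypothetical helper, not part of any library.

```{python}
import numpy as np

def holm_reject(pvals, alpha=0.05):
    """Step-down Holm procedure: compare the i-th smallest p-value to
    alpha / (m - i) for zero-based i (alpha / (m - i + 1) one-based),
    rejecting every hypothesis up to the first failure."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)
    reject = np.zeros(m, dtype=bool)
    for i, idx in enumerate(order):
        if pvals[idx] <= alpha / (m - i):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject

# illustrative p-values, not the fund data
pvals = np.array([0.006, 0.9, 0.01, 0.6, 0.8])
holm_reject(pvals)
```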
As discussed previously, Manager One seems to perform particularly
well, whereas Manager Two has poor performance.
@@ -242,8 +242,8 @@ well, whereas Manager Two has poor performance.
fund_mini.mean()
```
Is there evidence of a meaningful difference in performance between
these two managers? We can check this by performing a paired $t$-test using the `ttest_rel()` function
from `scipy.stats`:
@@ -253,7 +253,7 @@ ttest_rel(fund_mini['Manager1'],
fund_mini['Manager2']).pvalue
```
The test results in a $p$-value of 0.038,
suggesting a statistically significant difference.
@@ -278,8 +278,8 @@ tukey = pairwise_tukeyhsd(returns, managers)
print(tukey.summary())
```
The `pairwise_tukeyhsd()` function provides confidence intervals
for the difference between each pair of managers (`lower` and
`upper`), as well as a $p$-value. All of these quantities have
@@ -309,7 +309,7 @@ for i, manager in enumerate(Fund.columns):
fund_pvalues[i] = ttest_1samp(Fund[manager], 0).pvalue
```
There are far too many managers to consider trying to control the FWER.
Instead, we focus on controlling the FDR: that is, the expected fraction of rejected null hypotheses that are actually false positives.
The `multipletests()` function (abbreviated `mult_test()`) can be used to carry out the Benjamini--Hochberg procedure.
@@ -319,7 +319,7 @@ fund_qvalues = mult_test(fund_pvalues, method = "fdr_bh")[1]
fund_qvalues[:10]
```
The *q-values* output by the
Benjamini--Hochberg procedure can be interpreted as the smallest FDR
threshold at which we would reject a particular null hypothesis. For
@@ -346,8 +346,8 @@ null hypotheses!
(fund_pvalues <= 0.1 / 2000).sum()
```
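The BH adjusted $p$-values can also be computed directly: sort the $p$-values, scale $p_{(i)}$ by $m/i$, and enforce monotonicity from the largest down. The helper below is a hypothetical sketch run on made-up $p$-values, not the `Fund` data.

```{python}
import numpy as np

def bh_qvalues(pvals):
    """Benjamini--Hochberg adjusted p-values ("q-values"): scale the
    i-th smallest p-value by m / i, make the sequence monotone from the
    largest down, and map back to the original order."""
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)
    scaled = pvals[order] * m / np.arange(1, m + 1)
    # each q-value can be no larger than any q-value further down the list
    q_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(q_sorted, 1)
    return q

# illustrative p-values, not the Fund data
bh_qvalues(np.array([0.01, 0.04, 0.03, 0.5]))
```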
Figure~\ref{Ch12:fig:BonferroniBenjamini} displays the ordered
$p$-values, $p_{(1)} \leq p_{(2)} \leq \cdots \leq p_{(2000)}$, for
the `Fund` dataset, as well as the threshold for rejection by the
@@ -376,7 +376,7 @@ else:
sorted_set_ = []
```
We now reproduce the middle panel of Figure~\ref{Ch12:fig:BonferroniBenjamini}.
```{python}
@@ -391,7 +391,7 @@ ax.scatter(sorted_set_+1, sorted_[sorted_set_], c='r', s=20)
ax.axline((0, 0), (1,q/m), c='k', ls='--', linewidth=3);
```
## A Re-Sampling Approach
Here, we implement the re-sampling approach to hypothesis testing
@@ -407,8 +407,8 @@ D['Y'] = pd.concat([Khan['ytrain'], Khan['ytest']])
D['Y'].value_counts()
```
There are four classes of cancer. For each gene, we compare the mean
expression in the second class (rhabdomyosarcoma) to the mean
expression in the fourth class (Burkitt's lymphoma). Performing a
@@ -428,8 +428,8 @@ observedT, pvalue = ttest_ind(D2[gene_11],
observedT, pvalue
```
However, this $p$-value relies on the assumption that under the null
hypothesis of no difference between the two groups, the test statistic
follows a $t$-distribution with $29+25-2=52$ degrees of freedom.
@@ -457,8 +457,8 @@ for b in range(B):
(np.abs(Tnull) >= np.abs(observedT)).mean()
```
This fraction, 0.0398,
is our re-sampling-based $p$-value.
It is almost identical to the $p$-value of 0.0412 obtained using the theoretical null distribution.
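The whole re-sampling scheme fits in a few lines. The sketch below uses synthetic two-group data in place of the `Khan` expression values, shuffling the pooled observations to build the null distribution of the two-sample $t$-statistic.

```{python}
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(10)
# synthetic two-group "expression" values, not the Khan data
x = rng.normal(0.6, 1, size=29)
y = rng.normal(0, 1, size=25)
observedT = ttest_ind(x, y).statistic

# permutation null: shuffle the pooled group labels B times
pooled = np.concatenate([x, y])
B = 2000
Tnull = np.empty(B)
for b in range(B):
    perm = rng.permutation(pooled)
    Tnull[b] = ttest_ind(perm[:29], perm[29:]).statistic

# two-sided re-sampling p-value: fraction of null statistics
# at least as extreme as the observed one
pval = (np.abs(Tnull) >= np.abs(observedT)).mean()
```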
@@ -514,7 +514,7 @@ for j in range(m):
Tnull_vals[j,b] = ttest_.statistic
```
Next, we compute the number of rejected null hypotheses $R$, the
estimated number of false positives $\widehat{V}$, and the estimated
FDR, for a range of threshold values $c$ in
@@ -532,7 +532,7 @@ for j in range(m):
FDRs[j] = V / R
```
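The plug-in estimate described above can be sketched end-to-end on synthetic statistics: for each cutoff $c$, $R$ counts the observed statistics at least as large as $c$ in absolute value, and $\widehat{V}$ averages the corresponding exceedances across the permutation null. The data here are simulated, not the `Khan` genes.

```{python}
import numpy as np

rng = np.random.default_rng(3)
m, B = 100, 500
# synthetic observed statistics (10 true signals) and permutation nulls
T_vals = np.concatenate([rng.normal(4, 1, size=10),
                         rng.normal(0, 1, size=90)])
Tnull_vals = rng.normal(0, 1, size=(m, B))

cutoffs = np.sort(np.abs(T_vals))
FDRs = np.empty(m)
for j, c in enumerate(cutoffs):
    R = (np.abs(T_vals) >= c).sum()           # rejections at threshold c
    V = (np.abs(Tnull_vals) >= c).sum() / B   # estimated false positives
    FDRs[j] = min(V / R, 1)                   # plug-in FDR estimate
```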
Now, for any given FDR, we can find the genes that will be
rejected. For example, with FDR controlled at 0.1, we reject 15 of the
100 null hypotheses. On average, we would expect about one or two of
@@ -548,7 +548,7 @@ the genes whose estimated FDR is less than 0.1.
sorted(idx[np.abs(T_vals) >= cutoffs[FDRs < 0.1].min()])
```
At an FDR threshold of 0.2, more genes are selected, at the cost of having a higher expected
proportion of false discoveries.
@@ -556,7 +556,7 @@ proportion of false discoveries.
sorted(idx[np.abs(T_vals) >= cutoffs[FDRs < 0.2].min()])
```
The next line generates Figure~\ref{fig:labfdr}, which is similar
to Figure~\ref{Ch12:fig-plugin-fdr},
except that it is based on only a subset of the genes.