Errata fixes (#45)

* fixes to Rmd from Daniela's errata
This commit is contained in:
Jonathan Taylor
2025-04-03 12:34:12 -07:00
committed by GitHub
parent 6d7e40588b
commit 132bda168d
8 changed files with 25 additions and 23 deletions


@@ -836,7 +836,7 @@ A[1:4:2,0:3:2]
 Why are we able to retrieve a submatrix directly using slices but not using lists?
-Its because they are different `Python` types, and
+It's because they are different `Python` types, and
 are treated differently by `numpy`.
 Slices can be used to extract objects from arbitrary sequences, such as strings, lists, and tuples, while the use of lists for indexing is more limited.
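The distinction fixed above can be sketched as follows (a minimal example, not from the lab itself):

```python
import numpy as np

A = np.arange(16).reshape(4, 4)

# A slice of rows combined with a slice of columns yields a submatrix.
submatrix = A[1:4:2, 0:3:2]          # rows 1,3 and columns 0,2

# Two lists index element-wise instead: this pairs up (1,0) and (3,2).
pairs = A[[1, 3], [0, 2]]            # the entries A[1,0] and A[3,2]

# To get the submatrix with lists, np.ix_ builds an open mesh first.
submatrix_from_lists = A[np.ix_([1, 3], [0, 2])]
```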
@@ -889,7 +889,8 @@ A[np.array([0,1,0,1])]
 ```
-By contrast, `keep_rows` retrieves only the second and fourth rows of `A` --- i.e. the rows for which the Boolean equals `TRUE`.
+By contrast, `keep_rows` retrieves only the second and fourth rows of `A` --- i.e. the rows for which the Boolean equals `True`.
 ```{python}
 A[keep_rows]
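The contrast this erratum is about can be sketched as (illustrative array, same variable names):

```python
import numpy as np

A = np.arange(16).reshape(4, 4)

# A Boolean mask keeps the rows where the mask is True ...
keep_rows = np.array([False, True, False, True])
selected = A[keep_rows]                   # second and fourth rows only

# ... whereas an integer array of 0s and 1s selects rows *by position*,
# here rows 0, 1, 0, 1 (with repeats), not a True/False filter.
by_position = A[np.array([0, 1, 0, 1])]
```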
@@ -1152,7 +1153,7 @@ Auto_re.loc[lambda df: (df['year'] > 80) & (df['mpg'] > 30),
 The symbol `&` computes an element-wise *and* operation.
 As another example, suppose that we want to retrieve all `Ford` and `Datsun`
 cars with `displacement` less than 300. We check whether each `name` entry contains either the string `ford` or `datsun` using the `str.contains()` method of the `index` attribute of
-of the dataframe:
+the dataframe:
 ```{python}
 Auto_re.loc[lambda df: (df['displacement'] < 300)
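The filtering pattern in this hunk can be sketched on a toy dataframe (the values below are made up; the lab uses the `Auto` data):

```python
import pandas as pd

cars = pd.DataFrame({'displacement': [97, 350, 250, 91]},
                    index=['datsun 510', 'chevrolet impala',
                           'ford torino', 'datsun 1200'])

# str.contains() on the index tests each row label for a substring;
# combining masks with & and | gives element-wise "and" / "or".
subset = cars.loc[lambda df: ((df['displacement'] < 300)
                              & (df.index.str.contains('ford')
                                 | df.index.str.contains('datsun')))]
```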


@@ -102,7 +102,7 @@ matrices (also called design matrices) using the `ModelSpec()` transform from `
 We will use the `Boston` housing data set, which is contained in the `ISLP` package. The `Boston` dataset records `medv` (median house value) for $506$ neighborhoods
 around Boston. We will build a regression model to predict `medv` using $13$
-predictors such as `rmvar` (average number of rooms per house),
+predictors such as `rm` (average number of rooms per house),
 `age` (proportion of owner-occupied units built prior to 1940), and `lstat` (percent of
 households with low socioeconomic status). We will use `statsmodels` for this
 task, a `Python` package that implements several commonly used
@@ -252,7 +252,7 @@ We can produce confidence intervals for the predicted values.
 new_predictions.conf_int(alpha=0.05)
 ```
-Prediction intervals are computing by setting `obs=True`:
+Prediction intervals are computed by setting `obs=True`:
 ```{python}
 new_predictions.conf_int(obs=True, alpha=0.05)
@@ -286,7 +286,7 @@ def abline(ax, b, m):
 ```
 A few things are illustrated above. First we see the syntax for defining a function:
 `def funcname(...)`. The function has arguments `ax, b, m`
-where `ax` is an axis object for an exisiting plot, `b` is the intercept and
+where `ax` is an axis object for an existing plot, `b` is the intercept and
 `m` is the slope of the desired line. Other plotting options can be passed on to
 `ax.plot` by including additional optional arguments as follows:
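A sketch of such a helper with the pass-through options (the `'r--'` and `linewidth` values below are illustrative, not the lab's):

```python
import matplotlib
matplotlib.use('Agg')        # non-interactive backend, safe without a display
import matplotlib.pyplot as plt

def abline(ax, b, m, *args, **kwargs):
    "Add a line with intercept b and slope m to ax; extra options go to ax.plot."
    xlim = ax.get_xlim()
    ylim = [m * xlim[0] + b, m * xlim[1] + b]
    ax.plot(xlim, ylim, *args, **kwargs)

fig, ax = plt.subplots()
ax.set_xlim([0, 10])
abline(ax, 1.0, 2.0, 'r--', linewidth=2)
```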
@@ -539,7 +539,7 @@ and `lstat`.
 The function `anova_lm()` can take more than two nested models
 as input, in which case it compares every successive pair of models.
-That also explains why their are `NaN`s in the first row above, since
+That also explains why there are `NaN`s in the first row above, since
 there is no previous model with which to compare the first.


@@ -88,7 +88,7 @@ fit is $23.62$.
 We can also estimate the validation error for
 higher-degree polynomial regressions. We first provide a function `evalMSE()` that takes a model string as well
-as a training and test set and returns the MSE on the test set.
+as training and test sets and returns the MSE on the test set.
 ```{python}
 def evalMSE(terms,
@@ -195,7 +195,7 @@ object with the appropriate `fit()`, `predict()`,
 and `score()` methods, an
 array of features `X` and a response `Y`.
 We also included an additional argument `cv` to `cross_validate()`; specifying an integer
-$K$ results in $K$-fold cross-validation. We have provided a value
+$k$ results in $k$-fold cross-validation. We have provided a value
 corresponding to the total number of observations, which results in
 leave-one-out cross-validation (LOOCV). The `cross_validate()` function produces a dictionary with several components;
 we simply want the cross-validated test score here (MSE), which is estimated to be 24.23.
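A minimal runnable version of this LOOCV pattern (synthetic data standing in for the lab's `Auto` set):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
Y = 1 + 2 * X[:, 0] + rng.normal(size=50)

# cv equal to the number of observations gives leave-one-out CV.
cv_results = cross_validate(LinearRegression(), X, Y,
                            cv=X.shape[0],
                            scoring='neg_mean_squared_error')
loocv_mse = -np.mean(cv_results['test_score'])
```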
@@ -243,8 +243,8 @@ np.add.outer(A, B)
 ```
-In the CV example above, we used $K=n$, but of course we can also use $K<n$. The code is very similar
-to the above (and is significantly faster). Here we use `KFold()` to partition the data into $K=10$ random groups. We use `random_state` to set a random seed and initialize a vector `cv_error` in which we will store the CV errors corresponding to the
+In the CV example above, we used $k=n$, but of course we can also use $k<n$. The code is very similar
+to the above (and is significantly faster). Here we use `KFold()` to partition the data into $k=10$ random groups. We use `random_state` to set a random seed and initialize a vector `cv_error` in which we will store the CV errors corresponding to the
 polynomial fits of degrees one to five.
 ```{python}
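A sketch of the loop the hunk describes, on synthetic data (the quadratic ground truth here is invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=100)
y = x - 2 * x**2 + rng.normal(size=100)

cv = KFold(n_splits=10, shuffle=True, random_state=0)   # k = 10 random folds
cv_error = np.zeros(5)
for i, d in enumerate(range(1, 6)):
    X = np.power.outer(x, np.arange(d + 1))  # columns 1, x, ..., x^d
    M = cross_validate(LinearRegression(fit_intercept=False), X, y,
                       cv=cv, scoring='neg_mean_squared_error')
    cv_error[i] = -np.mean(M['test_score'])
```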
@@ -264,7 +264,7 @@ cv_error
 ```
 Notice that the computation time is much shorter than that of LOOCV.
 (In principle, the computation time for LOOCV for a least squares
-linear model should be faster than for $K$-fold CV, due to the
+linear model should be faster than for $k$-fold CV, due to the
 availability of the formula~(\ref{Ch5:eq:LOOCVform}) for LOOCV;
 however, the generic `cross_validate()` function does not make
 use of this formula.) We still see little evidence that using cubic
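The least squares LOOCV shortcut referred to here is $CV_{(n)} = \frac{1}{n}\sum_i \big(\frac{y_i - \hat y_i}{1 - h_i}\big)^2$, with $h_i$ the leverage. A quick numpy check that it matches brute-force LOOCV (synthetic data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=n)])
y = 2 + 3 * X[:, 1] + rng.normal(size=n)

# Shortcut: a single fit, residuals rescaled by (1 - leverage).
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix; leverages on diagonal
loocv_formula = np.mean((resid / (1 - np.diag(H)))**2)

# Brute force: refit n times, each time holding out one observation.
errors = []
for i in range(n):
    mask = np.arange(n) != i
    b = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
    errors.append((y[i] - X[i] @ b)**2)
loocv_brute = np.mean(errors)
```

The identity is exact for least squares, which is why a single fit suffices in principle.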
@@ -273,8 +273,9 @@ using a quadratic fit.
 The `cross_validate()` function is flexible and can take
-different splitting mechanisms as an argument. For instance, one can use the `ShuffleSplit()` funtion to implement
-the validation set approach just as easily as K-fold cross-validation.
+different splitting mechanisms as an argument. For instance, one can use the `ShuffleSplit()`
+function to implement
+the validation set approach just as easily as $k$-fold cross-validation.
 ```{python}
 validation = ShuffleSplit(n_splits=1,
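The truncated snippet above can be completed as a runnable sketch (synthetic data; the `test_size` and `random_state` values are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ShuffleSplit, cross_validate

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 1 + 2 * X[:, 0] + rng.normal(size=100)

# A single shuffled train/test split reproduces the validation set approach.
validation = ShuffleSplit(n_splits=1, test_size=50, random_state=0)
results = cross_validate(LinearRegression(), X, y, cv=validation,
                         scoring='neg_mean_squared_error')
validation_mse = -results['test_score'][0]
```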
@@ -511,7 +512,7 @@ standard formulas given in
 rely on certain assumptions. For example,
 they depend on the unknown parameter $\sigma^2$, the noise
 variance. We then estimate $\sigma^2$ using the RSS. Now although the
-formula for the standard errors do not rely on the linear model being
+formulas for the standard errors do not rely on the linear model being
 correct, the estimate for $\sigma^2$ does. We see
 {in Figure~\ref{Ch3:polyplot} on page~\pageref{Ch3:polyplot}} that there is
 a non-linear relationship in the data, and so the residuals from a


@@ -334,7 +334,7 @@ The function `fit_path()` returns a list whose values include the fitted coeffic
 path[3]
 ```
-In the example above, we see that at the fourth step in the path, we have two nonzero coefficients in `'B'`, corresponding to the value $0.114$ for the penalty parameter `lambda_0`.
+In the example above, we see that at the fourth step in the path, we have two nonzero coefficients in `'B'`, corresponding to the value $0.0114$ for the penalty parameter `lambda_0`.
 We could make predictions using this sequence of fits on a validation set as a function of `lambda_0`, or with more work using cross-validation.
 ## Ridge Regression and the Lasso
@@ -913,6 +913,6 @@ ax.set_ylim([50000,250000]);
 ```
 CV error is minimized at 12,
-though there is little noticable difference between this point and a much lower number like 2 or 3 components.
+though there is little noticeable difference between this point and a much lower number like 2 or 3 components.


@@ -223,7 +223,7 @@ grid.fit(X_train, High_train)
 grid.best_score_
 ```
-Lets take a look at the pruned true.
+Lets take a look at the pruned tree.
 ```{python}
 ax = subplots(figsize=(12, 12))[1]
@@ -509,7 +509,7 @@ np.mean((y_test - y_hat_boost)**2)
 ```
-In this case, using $\lambda=0.2$ leads to a almost the same test MSE
+In this case, using $\lambda=0.2$ leads to almost the same test MSE
 as when using $\lambda=0.001$.


@@ -42,7 +42,7 @@ roc_curve = RocCurveDisplay.from_estimator # shorthand
 We now use the `SupportVectorClassifier()` function (abbreviated `SVC()`) from `sklearn` to fit the support vector
 classifier for a given value of the parameter `C`. The
 `C` argument allows us to specify the cost of a violation to
-the margin. When the `cost` argument is small, then the margins
+the margin. When the `C` argument is small, then the margins
 will be wide and many support vectors will be on the margin or will
 violate the margin. When the `C` argument is large, then the
 margins will be narrow and there will be few support vectors on the
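The effect of `C` described here can be sketched on toy overlapping classes (the data and the two `C` values are invented for illustration):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, size=(50, 2)),
               rng.normal(1, 1, size=(50, 2))])   # two overlapping classes
y = np.array([0] * 50 + [1] * 50)

# Small C: wide margin, many margin violations, many support vectors.
svm_small_C = SVC(C=0.01, kernel='linear').fit(X, y)
# Large C: narrow margin, typically fewer support vectors.
svm_large_C = SVC(C=100, kernel='linear').fit(X, y)

n_small = svm_small_C.support_vectors_.shape[0]
n_large = svm_large_C.support_vectors_.shape[0]
```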


@@ -1137,7 +1137,7 @@ img_preds = resnet_model(imgs)
 Lets look at the predicted probabilities for each of the top 3 choices. First we compute
 the probabilities by applying the softmax to the logits in `img_preds`. Note that
 we have had to call the `detach()` method on the tensor `img_preds` in order to convert
-it to our a more familiar `ndarray`.
+it to a more familiar `ndarray`.
 ```{python}
 img_probs = np.exp(np.asarray(img_preds.detach()))
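The softmax step can be sketched in plain numpy (a made-up logit vector stands in for the lab's `img_preds` tensor):

```python
import numpy as np

def softmax(logits):
    "Convert a vector of logits to probabilities, numerically stably."
    z = logits - logits.max()        # subtracting the max avoids overflow
    p = np.exp(z)
    return p / p.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
top3 = np.argsort(probs)[::-1][:3]   # indices of the top 3 choices
```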


@@ -10,7 +10,7 @@
 In this lab we demonstrate PCA and clustering on several datasets.
 As in other labs, we import some of our libraries at this top
 level. This makes the code more readable, as scanning the first few
-lines of the notebook tell us what libraries are used in this
+lines of the notebook tells us what libraries are used in this
 notebook.
 ```{python}
@@ -837,7 +837,7 @@ ax.axhline(140, c='r', linewidth=4);
 ```
-The `axhline()` function draws a horizontal line line on top of any
+The `axhline()` function draws a horizontal line on top of any
 existing set of axes. The argument `140` plots a horizontal
 line at height 140 on the dendrogram; this is a height that
 results in four distinct clusters. It is easy to verify that the
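Cutting a dendrogram at a height can also be done programmatically with `fcluster` (toy data here; the cut height 5.0 suits this example, whereas the lab uses 140 on its own data):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Four well-separated toy clusters.
centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10]])
X = np.vstack([c + rng.normal(scale=0.3, size=(25, 2)) for c in centers])

Z = linkage(X, method='complete')
# criterion='distance' cuts all merges above height t,
# the programmatic analogue of drawing a horizontal line on the dendrogram.
labels = fcluster(Z, t=5.0, criterion='distance')
n_clusters = len(np.unique(labels))
```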