Errata fixes (#45)

* fixes to Rmd from Daniela's errata
Jonathan Taylor
2025-04-03 12:34:12 -07:00
committed by GitHub
parent 6d7e40588b
commit 132bda168d
8 changed files with 25 additions and 23 deletions

View File

@@ -836,7 +836,7 @@ A[1:4:2,0:3:2]
Why are we able to retrieve a submatrix directly using slices but not using lists?
Its because they are different `Python` types, and
It's because they are different `Python` types, and
are treated differently by `numpy`.
Slices can be used to extract objects from arbitrary sequences, such as strings, lists, and tuples, while the use of lists for indexing is more limited.
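The slice-versus-list distinction behind this fix can be seen in a short standalone sketch (the array here is invented for illustration, not taken from the lab):

```python
import numpy as np

A = np.arange(16).reshape(4, 4)

# A slice selects a rectangular submatrix: rows 1 and 3, columns 0 and 2.
sub = A[1:4:2, 0:3:2]

# Indexing with two lists instead pairs the indices element-wise,
# returning the entries A[1, 0] and A[3, 2] rather than a submatrix.
pairs = A[[1, 3], [0, 2]]

print(sub.shape)    # a 2x2 submatrix
print(pairs.shape)  # a length-2 vector
```

This is why `numpy` treats the two forms differently: slices describe a grid of rows and columns, while lists trigger "fancy" indexing over matched index pairs.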
@@ -889,7 +889,8 @@ A[np.array([0,1,0,1])]
```
By contrast, `keep_rows` retrieves only the second and fourth rows of `A` --- i.e. the rows for which the Boolean equals `TRUE`.
By contrast, `keep_rows` retrieves only the second and fourth rows of `A` --- i.e. the rows for which the Boolean equals `True`.
```{python}
A[keep_rows]
@@ -1152,7 +1153,7 @@ Auto_re.loc[lambda df: (df['year'] > 80) & (df['mpg'] > 30),
The symbol `&` computes an element-wise *and* operation.
As another example, suppose that we want to retrieve all `Ford` and `Datsun`
cars with `displacement` less than 300. We check whether each `name` entry contains either the string `ford` or `datsun` using the `str.contains()` method of the `index` attribute of
of the dataframe:
the dataframe:
```{python}
Auto_re.loc[lambda df: (df['displacement'] < 300)

View File

@@ -102,7 +102,7 @@ matrices (also called design matrices) using the `ModelSpec()` transform from `
We will use the `Boston` housing data set, which is contained in the `ISLP` package. The `Boston` dataset records `medv` (median house value) for $506$ neighborhoods
around Boston. We will build a regression model to predict `medv` using $13$
predictors such as `rmvar` (average number of rooms per house),
predictors such as `rm` (average number of rooms per house),
`age` (proportion of owner-occupied units built prior to 1940), and `lstat` (percent of
households with low socioeconomic status). We will use `statsmodels` for this
task, a `Python` package that implements several commonly used
@@ -252,7 +252,7 @@ We can produce confidence intervals for the predicted values.
new_predictions.conf_int(alpha=0.05)
```
Prediction intervals are computing by setting `obs=True`:
Prediction intervals are computed by setting `obs=True`:
```{python}
new_predictions.conf_int(obs=True, alpha=0.05)
@@ -286,7 +286,7 @@ def abline(ax, b, m):
```
A few things are illustrated above. First we see the syntax for defining a function:
`def funcname(...)`. The function has arguments `ax, b, m`
where `ax` is an axis object for an exisiting plot, `b` is the intercept and
where `ax` is an axis object for an existing plot, `b` is the intercept and
`m` is the slope of the desired line. Other plotting options can be passed on to
`ax.plot` by including additional optional arguments as follows:
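A self-contained sketch of a helper in this spirit (the body here is a plausible reconstruction, not the lab's exact code):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script
import matplotlib.pyplot as plt

def abline(ax, b, m, *args, **kwargs):
    "Add a line with intercept b and slope m to ax; extra options go to ax.plot."
    xlim = ax.get_xlim()
    ylim = [m * xlim[0] + b, m * xlim[1] + b]
    ax.plot(xlim, ylim, *args, **kwargs)

fig, ax = plt.subplots()
ax.set_xlim(0, 10)
abline(ax, 1.0, 2.0, "r--", linewidth=2)  # dashed red line y = 1 + 2x
```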
@@ -539,7 +539,7 @@ and `lstat`.
The function `anova_lm()` can take more than two nested models
as input, in which case it compares every successive pair of models.
That also explains why their are `NaN`s in the first row above, since
That also explains why there are `NaN`s in the first row above, since
there is no previous model with which to compare the first.
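The `NaN` pattern described here can be reproduced on synthetic data (the formulas and data below are illustrative, not from the lab):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
df = pd.DataFrame({"x": rng.normal(size=200)})
df["y"] = 1 + 2 * df["x"] + 0.5 * df["x"] ** 2 + rng.normal(size=200)

# Three nested polynomial models.
m1 = smf.ols("y ~ x", data=df).fit()
m2 = smf.ols("y ~ x + I(x**2)", data=df).fit()
m3 = smf.ols("y ~ x + I(x**2) + I(x**3)", data=df).fit()

# Each successive pair is compared; the first row contains NaNs
# because m1 has no previous model to compare against.
table = anova_lm(m1, m2, m3)
print(table)
```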

View File

@@ -88,7 +88,7 @@ fit is $23.62$.
We can also estimate the validation error for
higher-degree polynomial regressions. We first provide a function `evalMSE()` that takes a model string as well
as a training and test set and returns the MSE on the test set.
as training and test sets and returns the MSE on the test set.
```{python}
def evalMSE(terms,
@@ -195,7 +195,7 @@ object with the appropriate `fit()`, `predict()`,
and `score()` methods, an
array of features `X` and a response `Y`.
We also included an additional argument `cv` to `cross_validate()`; specifying an integer
$K$ results in $K$-fold cross-validation. We have provided a value
$k$ results in $k$-fold cross-validation. We have provided a value
corresponding to the total number of observations, which results in
leave-one-out cross-validation (LOOCV). The `cross_validate()` function produces a dictionary with several components;
we simply want the cross-validated test score here (MSE), which is estimated to be 24.23.
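Setting `cv` to the number of observations, as described here, can be sketched on synthetic data (the model and data are illustrative stand-ins for the lab's):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1))
Y = 2 * X[:, 0] + rng.normal(size=50)

# cv equal to the number of observations gives leave-one-out CV.
M = LinearRegression()
cv_results = cross_validate(M, X, Y,
                            cv=X.shape[0],
                            scoring="neg_mean_squared_error")
cv_err = -np.mean(cv_results["test_score"])
print(cv_err)  # LOOCV estimate of test MSE
```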
@@ -243,8 +243,8 @@ np.add.outer(A, B)
```
In the CV example above, we used $K=n$, but of course we can also use $K<n$. The code is very similar
to the above (and is significantly faster). Here we use `KFold()` to partition the data into $K=10$ random groups. We use `random_state` to set a random seed and initialize a vector `cv_error` in which we will store the CV errors corresponding to the
In the CV example above, we used $k=n$, but of course we can also use $k<n$. The code is very similar
to the above (and is significantly faster). Here we use `KFold()` to partition the data into $k=10$ random groups. We use `random_state` to set a random seed and initialize a vector `cv_error` in which we will store the CV errors corresponding to the
polynomial fits of degrees one to five.
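A self-contained variant of this $k=10$ loop, with synthetic quadratic data in place of the lab's (`make_pipeline` and `PolynomialFeatures` stand in for the lab's model-building helpers):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 200)
y = x - 2 * x ** 2 + rng.normal(size=200)
X = x[:, None]

# K=10 random folds, with random_state fixing the partition.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
cv_error = np.zeros(5)
for i, d in enumerate(range(1, 6)):
    M = make_pipeline(PolynomialFeatures(d), LinearRegression())
    res = cross_validate(M, X, y, cv=cv, scoring="neg_mean_squared_error")
    cv_error[i] = -np.mean(res["test_score"])
print(cv_error)  # CV error for degrees one to five
```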
```{python}
@@ -264,7 +264,7 @@ cv_error
```
Notice that the computation time is much shorter than that of LOOCV.
(In principle, the computation time for LOOCV for a least squares
linear model should be faster than for $K$-fold CV, due to the
linear model should be faster than for $k$-fold CV, due to the
availability of the formula~(\ref{Ch5:eq:LOOCVform}) for LOOCV;
however, the generic `cross_validate()` function does not make
use of this formula.) We still see little evidence that using cubic
@@ -273,8 +273,9 @@ using a quadratic fit.
The `cross_validate()` function is flexible and can take
different splitting mechanisms as an argument. For instance, one can use the `ShuffleSplit()` funtion to implement
the validation set approach just as easily as K-fold cross-validation.
different splitting mechanisms as an argument. For instance, one can use the `ShuffleSplit()`
function to implement
the validation set approach just as easily as $k$-fold cross-validation.
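A self-contained sketch of this use of `ShuffleSplit()` on synthetic data (a single random split reproduces the validation set approach):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ShuffleSplit, cross_validate

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3 * X[:, 0] + rng.normal(size=100)

# n_splits=1 means one random train/validation split.
validation = ShuffleSplit(n_splits=1, test_size=50, random_state=0)
res = cross_validate(LinearRegression(), X, y,
                     cv=validation,
                     scoring="neg_mean_squared_error")
print(-res["test_score"])  # validation-set estimate of test MSE
```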
```{python}
validation = ShuffleSplit(n_splits=1,
@@ -511,7 +512,7 @@ standard formulas given in
rely on certain assumptions. For example,
they depend on the unknown parameter $\sigma^2$, the noise
variance. We then estimate $\sigma^2$ using the RSS. Now although the
formula for the standard errors do not rely on the linear model being
formulas for the standard errors do not rely on the linear model being
correct, the estimate for $\sigma^2$ does. We see
{in Figure~\ref{Ch3:polyplot} on page~\pageref{Ch3:polyplot}} that there is
a non-linear relationship in the data, and so the residuals from a

View File

@@ -334,7 +334,7 @@ The function `fit_path()` returns a list whose values include the fitted coeffic
path[3]
```
In the example above, we see that at the fourth step in the path, we have two nonzero coefficients in `'B'`, corresponding to the value $0.114$ for the penalty parameter `lambda_0`.
In the example above, we see that at the fourth step in the path, we have two nonzero coefficients in `'B'`, corresponding to the value $0.0114$ for the penalty parameter `lambda_0`.
We could make predictions using this sequence of fits on a validation set as a function of `lambda_0`, or with more work using cross-validation.
## Ridge Regression and the Lasso
@@ -913,6 +913,6 @@ ax.set_ylim([50000,250000]);
```
CV error is minimized at 12,
though there is little noticable difference between this point and a much lower number like 2 or 3 components.
though there is little noticeable difference between this point and a much lower number like 2 or 3 components.

View File

@@ -223,7 +223,7 @@ grid.fit(X_train, High_train)
grid.best_score_
```
Lets take a look at the pruned true.
Lets take a look at the pruned tree.
```{python}
ax = subplots(figsize=(12, 12))[1]
@@ -509,7 +509,7 @@ np.mean((y_test - y_hat_boost)**2)
```
In this case, using $\lambda=0.2$ leads to a almost the same test MSE
In this case, using $\lambda=0.2$ leads to almost the same test MSE
as when using $\lambda=0.001$.
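The comparison of shrinkage values can be sketched with `sklearn`'s boosting implementation on synthetic data (the data and the `learning_rate`/tree settings below are illustrative, not the lab's):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.5, size=400)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# learning_rate plays the role of the shrinkage parameter lambda.
mses = {}
for lam in (0.2, 0.001):
    boost = GradientBoostingRegressor(n_estimators=500,
                                      learning_rate=lam,
                                      max_depth=3,
                                      random_state=0)
    boost.fit(X_train, y_train)
    y_hat = boost.predict(X_test)
    mses[lam] = np.mean((y_test - y_hat) ** 2)
print(mses)  # test MSE for each shrinkage value
```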

View File

@@ -42,7 +42,7 @@ roc_curve = RocCurveDisplay.from_estimator # shorthand
We now use the `SupportVectorClassifier()` function (abbreviated `SVC()`) from `sklearn` to fit the support vector
classifier for a given value of the parameter `C`. The
`C` argument allows us to specify the cost of a violation to
the margin. When the `cost` argument is small, then the margins
the margin. When the `C` argument is small, then the margins
will be wide and many support vectors will be on the margin or will
violate the margin. When the `C` argument is large, then the
margins will be narrow and there will be few support vectors on the
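The effect of `C` on the number of support vectors can be seen in a small sketch (the two overlapping Gaussian classes here are invented for illustration):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(25, 2)) - 1,
               rng.normal(size=(25, 2)) + 1])
y = np.array([-1] * 25 + [1] * 25)

# Small C: wide margin, many support vectors.
# Large C: narrow margin, fewer support vectors.
svm_small = SVC(C=0.01, kernel="linear").fit(X, y)
svm_large = SVC(C=100.0, kernel="linear").fit(X, y)
print(len(svm_small.support_), len(svm_large.support_))
```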

View File

@@ -1137,7 +1137,7 @@ img_preds = resnet_model(imgs)
Lets look at the predicted probabilities for each of the top 3 choices. First we compute
the probabilities by applying the softmax to the logits in `img_preds`. Note that
we have had to call the `detach()` method on the tensor `img_preds` in order to convert
it to our a more familiar `ndarray`.
it to a more familiar `ndarray`.
```{python}
img_probs = np.exp(np.asarray(img_preds.detach()))
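The softmax step described here can be sketched with invented logits (no model or tensor involved; a normalizing division turns exponentiated logits into probabilities):

```python
import numpy as np

# Hypothetical logits for 5 classes.
logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5])

# Softmax: exponentiate, then normalize so the entries sum to one.
probs = np.exp(logits) / np.exp(logits).sum()

# Indices of the top 3 predicted classes, most probable first.
top3 = np.argsort(probs)[::-1][:3]
print(top3, probs[top3])
```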

View File

@@ -10,7 +10,7 @@
In this lab we demonstrate PCA and clustering on several datasets.
As in other labs, we import some of our libraries at this top
level. This makes the code more readable, as scanning the first few
lines of the notebook tell us what libraries are used in this
lines of the notebook tells us what libraries are used in this
notebook.
```{python}
@@ -837,7 +837,7 @@ ax.axhline(140, c='r', linewidth=4);
```
The `axhline()` function draws a horizontal line line on top of any
The `axhline()` function draws a horizontal line on top of any
existing set of axes. The argument `140` plots a horizontal
line at height 140 on the dendrogram; this is a height that
results in four distinct clusters. It is easy to verify that the