Errata fixes (#45)

* fixes to Rmd from Daniela's errata
Jonathan Taylor
2025-04-03 12:34:12 -07:00
committed by GitHub
parent 6d7e40588b
commit 132bda168d
8 changed files with 25 additions and 23 deletions

View File

@@ -836,7 +836,7 @@ A[1:4:2,0:3:2]
Why are we able to retrieve a submatrix directly using slices but not using lists?
Its because they are different `Python` types, and
It's because they are different `Python` types, and
are treated differently by `numpy`.
Slices can be used to extract objects from arbitrary sequences, such as strings, lists, and tuples, while the use of lists for indexing is more limited.
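The slice-versus-list distinction behind this fix can be seen in a short standalone sketch (the array here is invented for illustration, not taken from the lab):

```python
import numpy as np

A = np.arange(16).reshape(4, 4)

# A slice selects a rectangular submatrix: rows 1 and 3, columns 0 and 2.
sub = A[1:4:2, 0:3:2]

# Indexing with two lists instead pairs the indices element-wise,
# returning the entries A[1, 0] and A[3, 2] rather than a submatrix.
pairs = A[[1, 3], [0, 2]]

print(sub.shape)    # a 2x2 submatrix
print(pairs.shape)  # a length-2 vector
```

This is why `numpy` treats the two forms differently: slices describe a grid of rows and columns, while lists trigger "fancy" indexing over matched index pairs.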
@@ -889,7 +889,8 @@ A[np.array([0,1,0,1])]
```
By contrast, `keep_rows` retrieves only the second and fourth rows of `A` --- i.e. the rows for which the Boolean equals `TRUE`.
By contrast, `keep_rows` retrieves only the second and fourth rows of `A` --- i.e. the rows for which the Boolean equals `True`.
```{python}
A[keep_rows]
@@ -1152,7 +1153,7 @@ Auto_re.loc[lambda df: (df['year'] > 80) & (df['mpg'] > 30),
The symbol `&` computes an element-wise *and* operation.
As another example, suppose that we want to retrieve all `Ford` and `Datsun`
cars with `displacement` less than 300. We check whether each `name` entry contains either the string `ford` or `datsun` using the `str.contains()` method of the `index` attribute of
of the dataframe:
the dataframe:
```{python}
Auto_re.loc[lambda df: (df['displacement'] < 300)

View File

@@ -102,7 +102,7 @@ matrices (also called design matrices) using the `ModelSpec()` transform from `
We will use the `Boston` housing data set, which is contained in the `ISLP` package. The `Boston` dataset records `medv` (median house value) for $506$ neighborhoods
around Boston. We will build a regression model to predict `medv` using $13$
predictors such as `rmvar` (average number of rooms per house),
predictors such as `rm` (average number of rooms per house),
`age` (proportion of owner-occupied units built prior to 1940), and `lstat` (percent of
households with low socioeconomic status). We will use `statsmodels` for this
task, a `Python` package that implements several commonly used
@@ -252,7 +252,7 @@ We can produce confidence intervals for the predicted values.
new_predictions.conf_int(alpha=0.05)
```
Prediction intervals are computing by setting `obs=True`:
Prediction intervals are computed by setting `obs=True`:
```{python}
new_predictions.conf_int(obs=True, alpha=0.05)
@@ -286,7 +286,7 @@ def abline(ax, b, m):
```
A few things are illustrated above. First we see the syntax for defining a function:
`def funcname(...)`. The function has arguments `ax, b, m`
where `ax` is an axis object for an exisiting plot, `b` is the intercept and
where `ax` is an axis object for an existing plot, `b` is the intercept and
`m` is the slope of the desired line. Other plotting options can be passed on to
`ax.plot` by including additional optional arguments as follows:
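A self-contained sketch of a helper in this spirit (the body here is a plausible reconstruction, not the lab's exact code):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs as a script
import matplotlib.pyplot as plt

def abline(ax, b, m, *args, **kwargs):
    "Add a line with intercept b and slope m to ax; extra options go to ax.plot."
    xlim = ax.get_xlim()
    ylim = [m * xlim[0] + b, m * xlim[1] + b]
    ax.plot(xlim, ylim, *args, **kwargs)

fig, ax = plt.subplots()
ax.set_xlim(0, 10)
abline(ax, 1.0, 2.0, "r--", linewidth=2)  # dashed red line y = 1 + 2x
```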
@@ -539,7 +539,7 @@ and `lstat`.
The function `anova_lm()` can take more than two nested models
as input, in which case it compares every successive pair of models.
That also explains why their are `NaN`s in the first row above, since
That also explains why there are `NaN`s in the first row above, since
there is no previous model with which to compare the first.
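The `NaN` pattern described here can be reproduced on synthetic data (the formulas and data below are illustrative, not from the lab):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
df = pd.DataFrame({"x": rng.normal(size=200)})
df["y"] = 1 + 2 * df["x"] + 0.5 * df["x"] ** 2 + rng.normal(size=200)

# Three nested polynomial models.
m1 = smf.ols("y ~ x", data=df).fit()
m2 = smf.ols("y ~ x + I(x**2)", data=df).fit()
m3 = smf.ols("y ~ x + I(x**2) + I(x**3)", data=df).fit()

# Each successive pair is compared; the first row contains NaNs
# because m1 has no previous model to compare against.
table = anova_lm(m1, m2, m3)
print(table)
```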

View File

@@ -88,7 +88,7 @@ fit is $23.62$.
We can also estimate the validation error for
higher-degree polynomial regressions. We first provide a function `evalMSE()` that takes a model string as well
as a training and test set and returns the MSE on the test set.
as training and test sets and returns the MSE on the test set.
```{python}
def evalMSE(terms,
@@ -195,7 +195,7 @@ object with the appropriate `fit()`, `predict()`,
and `score()` methods, an
array of features `X` and a response `Y`.
We also included an additional argument `cv` to `cross_validate()`; specifying an integer
$K$ results in $K$-fold cross-validation. We have provided a value
$k$ results in $k$-fold cross-validation. We have provided a value
corresponding to the total number of observations, which results in
leave-one-out cross-validation (LOOCV). The `cross_validate()` function produces a dictionary with several components;
we simply want the cross-validated test score here (MSE), which is estimated to be 24.23.
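Setting `cv` to the number of observations, as described here, can be sketched on synthetic data (the model and data are illustrative stand-ins for the lab's):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1))
Y = 2 * X[:, 0] + rng.normal(size=50)

# cv equal to the number of observations gives leave-one-out CV.
M = LinearRegression()
cv_results = cross_validate(M, X, Y,
                            cv=X.shape[0],
                            scoring="neg_mean_squared_error")
cv_err = -np.mean(cv_results["test_score"])
print(cv_err)  # LOOCV estimate of test MSE
```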
@@ -243,8 +243,8 @@ np.add.outer(A, B)
```
In the CV example above, we used $K=n$, but of course we can also use $K<n$. The code is very similar
to the above (and is significantly faster). Here we use `KFold()` to partition the data into $K=10$ random groups. We use `random_state` to set a random seed and initialize a vector `cv_error` in which we will store the CV errors corresponding to the
In the CV example above, we used $k=n$, but of course we can also use $k<n$. The code is very similar
to the above (and is significantly faster). Here we use `KFold()` to partition the data into $k=10$ random groups. We use `random_state` to set a random seed and initialize a vector `cv_error` in which we will store the CV errors corresponding to the
polynomial fits of degrees one to five.
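A self-contained variant of this $k=10$ loop, with synthetic quadratic data in place of the lab's (`make_pipeline` and `PolynomialFeatures` stand in for the lab's model-building helpers):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 200)
y = x - 2 * x ** 2 + rng.normal(size=200)
X = x[:, None]

# K=10 random folds, with random_state fixing the partition.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
cv_error = np.zeros(5)
for i, d in enumerate(range(1, 6)):
    M = make_pipeline(PolynomialFeatures(d), LinearRegression())
    res = cross_validate(M, X, y, cv=cv, scoring="neg_mean_squared_error")
    cv_error[i] = -np.mean(res["test_score"])
print(cv_error)  # CV error for degrees one to five
```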
```{python}
@@ -264,7 +264,7 @@ cv_error
```
Notice that the computation time is much shorter than that of LOOCV.
(In principle, the computation time for LOOCV for a least squares
linear model should be faster than for $K$-fold CV, due to the
linear model should be faster than for $k$-fold CV, due to the
availability of the formula~(\ref{Ch5:eq:LOOCVform}) for LOOCV;
however, the generic `cross_validate()` function does not make
use of this formula.) We still see little evidence that using cubic
@@ -273,8 +273,9 @@ using a quadratic fit.
The `cross_validate()` function is flexible and can take
different splitting mechanisms as an argument. For instance, one can use the `ShuffleSplit()` funtion to implement
the validation set approach just as easily as K-fold cross-validation.
different splitting mechanisms as an argument. For instance, one can use the `ShuffleSplit()`
function to implement
the validation set approach just as easily as $k$-fold cross-validation.
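A self-contained sketch of this use of `ShuffleSplit()` on synthetic data (a single random split reproduces the validation set approach):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import ShuffleSplit, cross_validate

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3 * X[:, 0] + rng.normal(size=100)

# n_splits=1 means one random train/validation split.
validation = ShuffleSplit(n_splits=1, test_size=50, random_state=0)
res = cross_validate(LinearRegression(), X, y,
                     cv=validation,
                     scoring="neg_mean_squared_error")
print(-res["test_score"])  # validation-set estimate of test MSE
```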
```{python}
validation = ShuffleSplit(n_splits=1,
@@ -511,7 +512,7 @@ standard formulas given in
rely on certain assumptions. For example,
they depend on the unknown parameter $\sigma^2$, the noise
variance. We then estimate $\sigma^2$ using the RSS. Now although the
formula for the standard errors do not rely on the linear model being
formulas for the standard errors do not rely on the linear model being
correct, the estimate for $\sigma^2$ does. We see
{in Figure~\ref{Ch3:polyplot} on page~\pageref{Ch3:polyplot}} that there is
a non-linear relationship in the data, and so the residuals from a

View File

@@ -334,7 +334,7 @@ The function `fit_path()` returns a list whose values include the fitted coeffic
path[3]
```
In the example above, we see that at the fourth step in the path, we have two nonzero coefficients in `'B'`, corresponding to the value $0.114$ for the penalty parameter `lambda_0`.
In the example above, we see that at the fourth step in the path, we have two nonzero coefficients in `'B'`, corresponding to the value $0.0114$ for the penalty parameter `lambda_0`.
We could make predictions using this sequence of fits on a validation set as a function of `lambda_0`, or with more work using cross-validation.
## Ridge Regression and the Lasso
@@ -913,6 +913,6 @@ ax.set_ylim([50000,250000]);
```
CV error is minimized at 12,
though there is little noticable difference between this point and a much lower number like 2 or 3 components.
though there is little noticeable difference between this point and a much lower number like 2 or 3 components.

View File

@@ -223,7 +223,7 @@ grid.fit(X_train, High_train)
grid.best_score_
```
Lets take a look at the pruned true.
Lets take a look at the pruned tree.
```{python}
ax = subplots(figsize=(12, 12))[1]
@@ -509,7 +509,7 @@ np.mean((y_test - y_hat_boost)**2)
```
In this case, using $\lambda=0.2$ leads to a almost the same test MSE
In this case, using $\lambda=0.2$ leads to almost the same test MSE
as when using $\lambda=0.001$.
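The comparison of shrinkage values can be sketched with `sklearn`'s boosting implementation on synthetic data (the data and the `learning_rate`/tree settings below are illustrative, not the lab's):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.5, size=400)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# learning_rate plays the role of the shrinkage parameter lambda.
mses = {}
for lam in (0.2, 0.001):
    boost = GradientBoostingRegressor(n_estimators=500,
                                      learning_rate=lam,
                                      max_depth=3,
                                      random_state=0)
    boost.fit(X_train, y_train)
    y_hat = boost.predict(X_test)
    mses[lam] = np.mean((y_test - y_hat) ** 2)
print(mses)  # test MSE for each shrinkage value
```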

View File

@@ -42,7 +42,7 @@ roc_curve = RocCurveDisplay.from_estimator # shorthand
We now use the `SupportVectorClassifier()` function (abbreviated `SVC()`) from `sklearn` to fit the support vector
classifier for a given value of the parameter `C`. The
`C` argument allows us to specify the cost of a violation to
the margin. When the `cost` argument is small, then the margins
the margin. When the `C` argument is small, then the margins
will be wide and many support vectors will be on the margin or will
violate the margin. When the `C` argument is large, then the
margins will be narrow and there will be few support vectors on the
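The effect of `C` on the number of support vectors can be seen in a small sketch (the two overlapping Gaussian classes here are invented for illustration):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(25, 2)) - 1,
               rng.normal(size=(25, 2)) + 1])
y = np.array([-1] * 25 + [1] * 25)

# Small C: wide margin, many support vectors.
# Large C: narrow margin, fewer support vectors.
svm_small = SVC(C=0.01, kernel="linear").fit(X, y)
svm_large = SVC(C=100.0, kernel="linear").fit(X, y)
print(len(svm_small.support_), len(svm_large.support_))
```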

View File

@@ -1137,7 +1137,7 @@ img_preds = resnet_model(imgs)
Lets look at the predicted probabilities for each of the top 3 choices. First we compute
the probabilities by applying the softmax to the logits in `img_preds`. Note that
we have had to call the `detach()` method on the tensor `img_preds` in order to convert
it to our a more familiar `ndarray`.
it to a more familiar `ndarray`.
```{python}
img_probs = np.exp(np.asarray(img_preds.detach()))
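The softmax step described here can be sketched with invented logits (no model or tensor involved; a normalizing division turns exponentiated logits into probabilities):

```python
import numpy as np

# Hypothetical logits for 5 classes.
logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5])

# Softmax: exponentiate, then normalize so the entries sum to one.
probs = np.exp(logits) / np.exp(logits).sum()

# Indices of the top 3 predicted classes, most probable first.
top3 = np.argsort(probs)[::-1][:3]
print(top3, probs[top3])
```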

View File

@@ -10,7 +10,7 @@
In this lab we demonstrate PCA and clustering on several datasets.
As in other labs, we import some of our libraries at this top
level. This makes the code more readable, as scanning the first few
lines of the notebook tell us what libraries are used in this
lines of the notebook tells us what libraries are used in this
notebook.
```{python}
@@ -837,7 +837,7 @@ ax.axhline(140, c='r', linewidth=4);
```
The `axhline()` function draws a horizontal line line on top of any
The `axhline()` function draws a horizontal line on top of any
existing set of axes. The argument `140` plots a horizontal
line at height 140 on the dendrogram; this is a height that
results in four distinct clusters. It is easy to verify that the