Elem Stat Learn corrections

Some Additional Corrections for Hastie, Tibshirani, and Friedman

Seminar participants: The web site maintained by the authors lists many corrections (some of which we spotted, and some of which we didn't). They do a good job of updating it, so be sure to check back there from time to time. I'm only going to list items here that haven't been reported on the authors' web site (although since I'll inform them of what I find, these may show up there as well).

Trevor & Rob: I own the first printing of the book. But unless you made some changes that are not reported on your web site, I suppose that any errors identified below exist in the second printing as well.
The first set of items I posted on April 26 are in brown.
3 items posted on April 28 are in green.
Items posted on August 23 are in red.

p. 7, 4th line of How this Book is Organized: It should be Chapters 3 and 4 instead of 2 and 4.
p. 202: Seminar participants: I was wrong in my comment below --- perhaps the other errors on the page threw me off (you can see some other mistakes identified on the authors' web site). Anyway, the authors were kind enough to send me a corrected version of this page, and I distributed it at one of our meetings. I've seen the corrections for p. 202 that are on the authors' web site, but something still seems wrong. One correction indicated there is to include E_y in (7.15), presumably on the LHS. If this is the case, wouldn't one also have to include E_y in (7.16), in front of the 1st term on the RHS, and on the LHS of (7.20)? But I don't see the point of the correction on the web site that indicates to include E_y in (7.15), since it seems as though Err_in is defined as an expectation. It seems as though if one doesn't include another E_y in (7.15), then the other expressions on the page are okay, provided one makes the change in (7.20) that is already noted on the web site. So, in summary, I question the 1st correction given on the authors' web site for p. 202, pertaining to (7.15).
p. 206, expression (7.31): A factor of sigma²_epsilon is missing from the 2nd term within the brackets.
p. 210, 2nd paragraph of Sec. 7.9: Comment: As it is, the paragraph is a bit confusing. Is the indicator function f meant to be a function from R^p to {0, 1} × {0, 1} × ... × {0, 1}, and is alpha₁ meant to be p-dimensional? Or is the indicator function f meant to be a function from R^p to {0, 1}, with the argument of the form alpha₀ + alpha₁ x₁ + ... + alpha_p x_p? It seems as though consideration switches back and forth between the general case and the special case of p = 1. Maybe the idea was to comment on the general case while having a running example of a p = 1 case, but this makes the first 3 sentences of the paragraph a bit hard to follow.
p. 212, line +8: Suggestion: To help the reader, and to be consistent with similar parts of the book, it would be good to put (SRM) after structural risk minimization.
p. 223: In Ex. 7.4, I think it should be f hat instead of f in the training error.
p. 226, line -2: Comment: Although in other places in the book N is used instead of N-1 in variance estimates, in the example being considered, with N rather small and with 7 parameters being estimated, it can be noted that using N-7 in the denominator would result in a slightly fatter prediction band and thus perhaps capture a few of the points slightly outside the band shown in Figure 8.2.
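To put a rough size on the effect described in the p. 226 comment, here is a minimal sketch in Python. It assumes the only change is the denominator of the noise variance estimate (N versus N - 7), so the half-width of the band scales by sqrt(N / (N - 7)). The function name and the sample size N = 50 are illustrative assumptions, not taken from the comment above.

```python
import math

def band_widening_factor(n, p):
    """Ratio of band half-widths when the noise variance estimate
    divides by n - p instead of n (everything else held fixed)."""
    if n <= p:
        raise ValueError("need more observations than parameters")
    return math.sqrt(n / (n - p))

# Hypothetical sample size; 7 is the number of parameters mentioned above.
factor = band_widening_factor(50, 7)
print(f"band is wider by about {100 * (factor - 1):.1f}%")
```

Under these assumptions the band widens by roughly 8%, which fits the description "slightly fatter".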
p. 244, line below Algorithm 8.4 display: Suggestion: The line ends in a period, but I think a colon would be better.
p. 250, line -5: The book has "Viewed in the way", whereas it would make more sense to have "Viewed in a similar way", "Viewed in the same way", or "Viewed in this way".
p. 272, line -15: I suspect that the subscripts of L should be kk' and k'k. In the book the 2nd one is k'k'.
p. 295, line -15: The book has "M is a reasonable fraction of M." I suspect the 2nd M should be N.
p. 301, line +8 (not counting box for Algorithm 10.1): It should be line 2d (instead of line 2c).
p. 302, line +5: I don't see where "a factor of eight" comes from.
p. 314, line -12: The beginning of the sentence that starts "The left panel of The frequencies" seems a bit odd --- surely the 2nd The should not be capitalized, but it needs to be wordsmithed some more.
p. 314, line -9: I don't think spam should be listed with the other character strings, since spam is one of the possible values for the response instead of being a possible predictor (plus spam isn't included along the left in Figure 10.6).
p. 318, line +4 (expression (10.26)): Has the L tilde notation been defined?
p. 318, line +5: Has the R tilde notation been defined?
p. 318, line +10: Would it be better to refer to the boosted tree model as a weighted sum, instead of just a sum?
p. 326, line -6: It should be Figure 10.11 instead of "Figure 10.12.1."
p. 335, line +11: It seems as though StatLib is preferred over "STATLIB" (but this is truly of little importance I suppose).
p. 349: There is an extra ) in expression (11.3).
p. 392, line +1: Should be restriction instead of "restrictions."
p. 396, line -8: Instead of "the bibliographic notes" should it be the Computational Considerations section on p. 405?
p. 401, line 6: Instead of "the bibliographic notes" should it be the Computational Considerations section on p. 405?
p. 403: In the caption, "kower" should be lower.
p. 414: Should it be decreased towards zero, instead of "descreased to zero"?
p. 417, line +21: You claim that since the CV errors are averages, a standard error can be estimated. But doesn't the lack of independence complicate matters? It can be noted from Figure 13.4 that all of the test error values are more than one estimated standard error above the CV estimates, so it's a little unclear what the significance of the standard error estimates is. It seems clear that the CV estimates have a negative bias, and so (again) it's not clear what you're getting at even if the estimated standard errors are okay despite the lack of independence.
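For reference, the kind of standard error at issue in the p. 417 comment is commonly computed from the per-fold errors as sketched below. This is a minimal Python sketch, not the authors' code; the function name and the fold error values are made up for illustration. Note that the formula treats the per-fold errors as independent draws, which is exactly the assumption questioned above.

```python
import math
import statistics

def cv_mean_and_se(fold_errors):
    """Return the CV estimate (mean of the fold errors) and its naive
    standard error, which assumes the folds are independent."""
    k = len(fold_errors)
    if k < 2:
        raise ValueError("need at least two folds")
    mean = statistics.mean(fold_errors)
    # Sample standard deviation (divides by k - 1), scaled by sqrt(k).
    se = statistics.stdev(fold_errors) / math.sqrt(k)
    return mean, se

# Hypothetical misclassification rates from five folds.
mean, se = cv_mean_and_se([0.20, 0.30, 0.25, 0.35, 0.40])
print(f"CV error = {mean:.3f}, naive SE = {se:.3f}")
```

Because the training sets of different folds overlap, the fold errors are correlated, so this naive value may understate the true variability.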
p. 438, line +1: It doesn't seem necessary to refer to Bayes' theorem.
p. 440: A small matter, but I would have put commas before and after the probability in expression (14.2).
p. 443: I hate to be picky again, but the sentence including expression (14.9) could be improved. Right before (14.9) I would put , that is, rules of the form, and right after (14.9) I would put a comma.
p. 447, line +14: Should be subsection instead of "section."
p. 453, line -15: Instead of "apartment" I think it should be that type of home is a categorical input (since apartment is just one of the possible values for type of home, and since you have "dummy variable for each level" I think type of home needs to be there instead of apartment).
p. 459, line -2: Don't some make a distinction between mode seeking and bump hunting? I've noticed in at least one other place that you seem to take the two terms as being equivalent, but isn't it true that one can have a bump that's not a mode?
p. 478, line +5: Isn't the claim that "(14.41) approaches zero" dependent on the supports having a nonempty intersection? (That is, if the densities don't overlap, there won't be convergence to zero.)
p. 512, p. 513, & p. 521: You have "(to appear)" instead of the page numbers for 2001 articles. Now I guess you could give the page numbers, but if you wanted to leave out the page numbers since they weren't known at the time when you wrote the book, you can at least fix the "to appear" on p. 513 to make it look like the others.
p. 514: The page numbers for the Friedman, Hastie, and Tibshirani (2000) reference should be 337-407 (instead of 337-307).