Some Notes Pertaining to Ch. 2 of E&T



After the first paragraph (in which I don't like the first sentence because it seems to suggest that bootstrapping is limited to assigning measures of accuracy to estimates), the results of a small experiment involving 16 mice are described. This R code can be used to obtain some of the results presented in the book. (Note that the value of Welch's test statistic produced by R, about 1.06, closely matches the 1.05 value given in the first new paragraph on p. 12 of E&T. The fact that they don't match exactly is due to sloppiness of the authors.) I find it a bit odd that E&T chose a two-sample problem to serve as an example in this chapter because the main focus of the chapter is on the accuracy of the sample mean as an estimate of a distribution mean. (However, the two-sample experiment will be used later in the book, and so I wanted to show how to obtain some of the simple results given about it in Ch. 2 now.)
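A minimal sketch of what such R code might look like (the 16 survival times are copied from Table 2.1 of E&T; the variable names are my own):

```r
# Survival times (in days) for the 16 mice, as given in Table 2.1 of E&T
treatment <- c(94, 197, 16, 38, 99, 141, 23)
control   <- c(52, 104, 146, 10, 51, 30, 40, 27, 46)

# Difference in sample means (about 30.63)
mean(treatment) - mean(control)

# Welch's two-sample t test; the test statistic is about 1.06
t.test(treatment, control)$statistic
```

Note that t.test() uses the Welch (unequal-variances) version by default; setting var.equal = TRUE would give the pooled-variance statistic instead.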

The standard error of an estimator (which is a random variable) is just the standard deviation of the estimator. If an estimator is approximately normal (as many, but not all, estimators are if the sample size isn't too small), then knowing the value of the estimator's standard error lets us note that with probability of about 0.95 the estimator will assume a value within two standard errors of its mean. If the estimator is unbiased, or nearly unbiased, then the estimator will assume a value within two standard errors of the estimand. Looking at it another way, we can be highly confident that the unknown value of the estimand is within two standard errors of the observed estimate. (I'll use equivalent events to show this on the board.)
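The two-standard-error statement can be checked by simulation. Here is a sketch using an example of my own choosing: the sample mean of n = 25 exponential draws, whose true standard error is known.

```r
# For an approximately normal, (nearly) unbiased estimator, the estimate
# should fall within two standard errors of the estimand about 95% of the time.
set.seed(1)
n  <- 25
mu <- 1                 # Exponential(rate = 1) has mean 1 and sd 1
se <- 1 / sqrt(n)       # true standard error of the sample mean

xbar  <- replicate(10000, mean(rexp(n, rate = 1)))
cover <- mean(abs(xbar - mu) <= 2 * se)
cover                   # close to 0.95
```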

We often don't know the value of an estimator's standard error, but if we can estimate it well then we can still use the observed estimate plus and minus two estimated standard errors as the endpoints of a (perhaps crude) approximate 95% confidence interval for the estimand.
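A sketch of such a crude interval, using a made-up sample and the usual estimated standard error of a sample mean:

```r
# Crude approximate 95% CI: estimate plus and minus two estimated standard errors
# (x is a made-up sample just for illustration)
set.seed(2)
x <- rnorm(30, mean = 5, sd = 2)

se.hat <- sd(x) / sqrt(length(x))   # estimated standard error of the sample mean
ci <- c(mean(x) - 2 * se.hat, mean(x) + 2 * se.hat)
ci
```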

It should be noted that by itself the estimated standard error isn't necessarily a decent measure of an estimator's accuracy unless the estimator's bias is negligible. If we have nonnegligible bias then we should perhaps focus on the mean squared error as a measure of an estimator's accuracy, and we should take the bias into account when obtaining a confidence interval for the estimand. Even for unbiased estimators, perhaps due to the nonnormality of the estimator and/or the inaccuracy of the estimated standard error, we can often improve on the simple interval estimation method described above. Nevertheless, estimated standard errors are often used to provide a simple measure of an estimator's accuracy and produce "back of the envelope" approximate confidence intervals. It is therefore disappointing that estimates of standard errors are often hard to obtain unless the estimator is a sample mean. (Notes: (1) If we want to assume a specific parametric model then we can sometimes use methods covered in STAT 652 / CSI 672 to obtain estimated standard errors for estimators that aren't sample means, but when working with real data we should perhaps be hesitant to assume a specific parametric model, particularly when the sample size is small. (2) In class I'll discuss how to estimate standard errors for the sample mean and sample median, and discuss the estimation of the distribution median in a parametric model.)
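To make the bias point concrete, here is a small simulation of my own showing the decomposition MSE = variance + bias^2 for a biased estimator (the divisor-n variance estimate from a normal sample):

```r
# Bias, variance, and MSE of the divisor-n variance estimate for N(0, 1) data.
# The estimand is the distribution variance, which equals 1 here.
set.seed(3)
n <- 10
vhat <- replicate(20000, { x <- rnorm(n); mean((x - mean(x))^2) })

bias <- mean(vhat) - 1              # roughly (n-1)/n - 1 = -0.1
mse  <- mean((vhat - 1)^2)
all.equal(mse, var(vhat) + bias^2, tolerance = 0.01)  # MSE = variance + bias^2
```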

Pages 12-13 of E&T describe how to draw bootstrap samples, use them to obtain bootstrap replicates of a statistic, and use the replicates to form a bootstrap estimate of standard error. (I'll go over this material in class.)
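The recipe can be sketched in a few lines of R (my own bare-bones version; E&T's algorithm divides by B - 1, which is what sd() does):

```r
# Bootstrap estimate of standard error: draw B samples of size n with
# replacement from the data, compute the statistic on each bootstrap
# sample, and take the standard deviation of the B replicates.
boot.se <- function(x, statistic, B = 200) {
  replicates <- replicate(B, statistic(sample(x, replace = TRUE)))
  sd(replicates)
}

set.seed(4)
x <- rnorm(50)
boot.se(x, mean)     # should be close to sd(x)/sqrt(50)
boot.se(x, median)   # works just as easily for the sample median
```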

Table 2.2 on p. 14 of E&T shows how the bootstrap estimate of standard error can vary as the number of bootstrap samples, B, used is changed. E&T suggest (see p. 13 and the last line on p. 14) that for bootstrap estimates of standard error, typical values used for B range from 50 to 200. But an examination of Table 2.2 suggests that if it isn't too much bother, perhaps using as many as 500 bootstrap samples would be good. (I'll discuss this in class. Also, I'll discuss how the values in the last column of Table 2.2 are obtained.) This R code can be used to obtain some results similar to those presented in Table 2.2, and also produce some similar results for a random sample of size 1,600 from the standard normal distribution.
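A small analogue of the Table 2.2 phenomenon can be produced as follows (the data here are made up; the point is that the variability of the standard error estimate shrinks as B grows):

```r
# For each value of B, compute the bootstrap se estimate 50 times and
# look at how much the estimate varies from run to run.
set.seed(5)
x <- rexp(25)

one.sehat <- function(B) sd(replicate(B, mean(sample(x, replace = TRUE))))
spread <- sapply(c(25, 100, 500),
                 function(B) sd(replicate(50, one.sehat(B))))
spread   # run-to-run variability of the se estimate decreases with B
```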

Note that the last complete paragraph on p. 14 suggests how one might do a test of the null hypothesis that two medians are equal against the alternative that they are not equal. One could assume that the difference in sample medians is approximately normal and compare the standardized difference in sample medians to the quantiles of the standard normal distribution. (Notes: (1) One might want to use the Harrell-Davis estimator to estimate the medians instead of using the sample median. (2) We might revisit this situation later when I cover bootstrap hypothesis testing.) Such a test about two distribution medians is something that traditional statistical methods don't address unless one makes some assumptions about the underlying distributions that may not be realistic.
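A sketch of such a test for the mouse data (the survival times are from Table 2.1 of E&T; the bootstrap details, such as B and resampling each group independently, are my own choices):

```r
# Standardize the difference in sample medians by a bootstrap estimate of
# its standard error, and compare to standard normal quantiles.
treatment <- c(94, 197, 16, 38, 99, 141, 23)
control   <- c(52, 104, 146, 10, 51, 30, 40, 27, 46)

set.seed(6)
B <- 1000
diffs <- replicate(B, median(sample(treatment, replace = TRUE)) -
                      median(sample(control,   replace = TRUE)))

z <- (median(treatment) - median(control)) / sd(diffs)
2 * pnorm(-abs(z))   # approximate two-sided p-value
```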