Some Notes Pertaining to Ch. 2 of E&T
After the first paragraph (in which I don't like the first sentence because it seems to suggest that
bootstrapping is limited to assigning measures of accuracy to estimates), the results of a small experiment
involving 16 mice are described.
This R code can be used to obtain some of the results presented in the book.
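A minimal sketch of such code (the data are the mouse survival times, in days, from Table 2.1 of E&T: treatment group of size 7, control group of size 9):

```r
# Mouse survival times (days) from Table 2.1 of E&T
treatment <- c(94, 197, 16, 38, 99, 141, 23)
control   <- c(52, 104, 146, 10, 51, 30, 40, 27, 46)

# Difference in sample means and the estimated standard error of the difference
mean(treatment) - mean(control)
sqrt(var(treatment)/length(treatment) + var(control)/length(control))

# Welch's test statistic (R gives about 1.06; E&T report 1.05 on p. 12)
t.test(treatment, control, var.equal = FALSE)$statistic
```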
(Note that the value of Welch's test statistic produced by R, about 1.06, closely matches the 1.05 value given
in the first new paragraph on p. 12 of E&T. The fact that they don't match exactly is due to sloppiness of the
authors.)
I find it a bit odd that E&T chose a two-sample problem to serve as an example in this chapter because the main focus
of the chapter is on the accuracy of the sample mean as an estimate of a distribution mean.
(However, the two-sample experiment will be used later in the book, and so I wanted to show how to obtain some
of the simple results given about it in Ch. 2 now.)
The standard error of an estimator (which is a random variable) is just the standard deviation of
the estimator. If an estimator is approximately normal (as many, but not all, estimators are if the sample size
isn't too small), and we know the value of its standard error, then we can note that with a
probability of around 0.95 the estimator will assume a value within two standard errors of its mean. If the
estimator is unbiased, or nearly unbiased, then we have that with probability of around 0.95 the
estimator will assume a value within two standard errors of the estimand.
Looking at it another way, we can be highly confident that the unknown value of the estimand is within two
standard errors of the observed estimate. (I'll use equivalent events to show this on the board.)
We often don't know the value of an estimator's standard error, but if we can estimate it well then we can still
use the observed estimate plus and minus two estimated standard errors as the endpoints of a (perhaps crude)
approximate 95% confidence interval for the estimand.
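A quick sketch of such an interval, using the sample mean as the estimator (the use of the treatment-group data from Table 2.1 is just my choice of illustration):

```r
# Crude approximate 95% CI for a distribution mean:
# estimate plus/minus two estimated standard errors
x <- c(94, 197, 16, 38, 99, 141, 23)   # e.g., the treatment group times
est <- mean(x)
se.hat <- sd(x) / sqrt(length(x))      # estimated standard error of the sample mean
c(est - 2*se.hat, est + 2*se.hat)
```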
It should be noted that by itself the estimated standard error isn't necessarily a decent measure of an estimator's
accuracy unless the estimator's bias is negligible. If we have nonnegligible bias then we should perhaps focus on
the mean squared error as a measure of an estimator's accuracy, and we should take the bias into
account when obtaining a confidence interval for the estimand. Even for unbiased estimators, perhaps due
to the nonnormality of the estimator and/or the inaccuracy of the estimated standard error, we can often improve
on the simple interval estimation method described above. Nevertheless, estimated standard errors are often used to
provide a simple measure of an estimator's accuracy and produce "back of the envelope" approximate confidence
intervals. It is therefore disappointing that estimates of standard errors are often hard to obtain unless the
estimator is a sample mean. (Notes: (1) If we want to assume a specific parametric model then we can
sometimes use methods covered in STAT 652 / CSI 672 to obtain estimated standard errors for estimators that aren't
sample means, but when working with real data we should perhaps be hesitant to assume a specific parametric model,
particularly when the sample size is small. (2) In class I'll discuss how to estimate standard errors for the
sample mean and sample median, and discuss the estimation of the distribution median in a parametric model.)
Pages 12-13 of E&T describe how to obtain
bootstrap samples, and use them to obtain
bootstrap replicates of a statistic, and use them to obtain a
bootstrap estimate of standard error. (I'll go over this material in class.)
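The procedure described there can be sketched as follows (a generic function of my own construction, not code from the book: draw B bootstrap samples with replacement from the data, compute the statistic on each, and take the standard deviation of the B replicates):

```r
# Bootstrap estimate of the standard error of a statistic
boot.se <- function(x, statistic, B = 200) {
  replicates <- replicate(B, statistic(sample(x, replace = TRUE)))
  sd(replicates)   # standard deviation of the B bootstrap replicates
}

set.seed(1)
x <- c(94, 197, 16, 38, 99, 141, 23)   # treatment group from Table 2.1 of E&T
boot.se(x, mean, B = 200)              # compare to sd(x)/sqrt(length(x))
```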
Table 2.2 on p. 14 of E&T shows how the bootstrap estimate of standard error can vary as the
number of bootstrap samples, B, used is changed. E&T suggest (see p. 13 and the last line on p. 14)
that for bootstrap estimates of standard error,
typical values used for B range from 50 to 200. But an examination of Table 2.2 suggests that if it isn't
too much bother, perhaps using as many as 500 bootstrap samples would be good. (I'll discuss this in class. Also,
I'll discuss how the values in the last column of Table 2.2 are obtained.)
This R code can be used to obtain some results similar to those
presented in Table 2.2,
and also produce some similar results for a random sample of size 1,600 from the standard normal distribution.
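A sketch of such an experiment, varying B (the seed, the choice of B values, and the use of the treatment-group data are my choices, not taken from the book):

```r
set.seed(632)
x <- c(94, 197, 16, 38, 99, 141, 23)   # treatment group from Table 2.1 of E&T
for (B in c(25, 50, 100, 200, 500, 1000)) {
  reps <- replicate(B, mean(sample(x, replace = TRUE)))
  cat("B =", B, "  bootstrap SE estimate:", round(sd(reps), 2), "\n")
}

# The same experiment for a size-1600 sample from the standard normal
# distribution can be run by replacing x with rnorm(1600).
```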
Note that the last complete paragraph on p. 14 suggests how one might do a test of the null hypothesis that two
medians are equal against the alternative that they are not equal. One could assume that the difference in sample
medians is approximately normal and compare the standardized difference in sample medians to the quantiles of the
standard normal distribution. (Notes: (1) One might want to use the Harrell-Davis estimator to estimate the medians
instead of using the sample median. (2) We might revisit this situation later when I cover bootstrap hypothesis
testing.) Such a test about two distribution medians is something that traditional statistical methods don't address
unless one makes some assumptions about the underlying distributions that may not be realistic.
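One way such a test might be sketched (my illustration, not code from E&T: standardize the difference in sample medians using bootstrap estimates of the two standard errors, and compare to the standard normal):

```r
treatment <- c(94, 197, 16, 38, 99, 141, 23)
control   <- c(52, 104, 146, 10, 51, 30, 40, 27, 46)

set.seed(1)
B <- 500
se.t <- sd(replicate(B, median(sample(treatment, replace = TRUE))))
se.c <- sd(replicate(B, median(sample(control,   replace = TRUE))))

# Approximately standard normal under the null of equal medians
z <- (median(treatment) - median(control)) / sqrt(se.t^2 + se.c^2)
2 * pnorm(-abs(z))   # approximate two-sided p-value
```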