Some Notes Pertaining to Ch. 12 of E&T
Ch. 12 is the first of three consecutive chapters dealing with confidence intervals. (Ch. 22 also deals
with confidence intervals.)
In Sec. 12.1, we can see another use for estimated standard errors. (Previously in E&T they have been used primarily
as a measure of accuracy for point estimators.) Estimated standard errors can be used in approximate confidence
intervals for approximately normal estimators.
If an estimator is normally distributed and unbiased, then the probability that it takes a value within 1.6449 standard errors
of the estimand is 0.9 (and the
probability that it takes a value within 1.9600 standard errors
of the estimand is 0.95).
If an estimator is normally distributed and unbiased, and the sample size is large enough so that the estimator's
standard error can be estimated very well, then the probability that it takes a value within 1.6449
estimated standard errors
of the estimand is approximately 0.9 (and the
probability that it takes a value within 1.9600 estimated standard errors
of the estimand is approximately 0.95). This can be shown using Slutsky's theorem.
Also,
if an estimator is approximately normally distributed and is unbiased or has a negligible bias (relative to its
standard error), then the probability that it takes a value within 1.6449 standard errors
of the estimand is approximately 0.9 (and the
probability that it takes a value within 1.9600 standard errors
of the estimand is approximately 0.95).
Finally,
if an estimator is approximately normally distributed and is unbiased or has a negligible bias (relative to its
standard error), and the sample size is large enough so that the estimator's standard error can be estimated very
well, then the probability that it takes a value within 1.6449 estimated standard errors
of the estimand is approximately 0.9 (and the
probability that it takes a value within 1.9600 estimated standard errors
of the estimand is approximately 0.95).
While it may be rare to work with a normally distributed estimator in practice, many estimators are approximately normal,
and so the last two results are quite useful --- the method of (approximate) pivots leads to approximate confidence
intervals of the form given by (12.6). E&T refer to such intervals as standard confidence intervals. (Notes:
(1) I will write zα instead of
z(1-α).
(2) I typically use α/2 instead of α to get a confidence interval having approximate
coverage probability
1 - α instead of 1 - 2α.)
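For concreteness, here is a short Python sketch (the code linked in these notes is in R; this is just an illustration with made-up data) of a standard interval of the form (12.6), using α/2 on each side per note (2):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=100)   # made-up data; estimand: the mean

theta_hat = x.mean()                       # point estimate
se_hat = x.std(ddof=1) / np.sqrt(len(x))   # estimated standard error

alpha = 0.05                               # alpha/2 on each side, per note (2)
z = norm.ppf(1 - alpha / 2)                # 1.9600
interval = (theta_hat - z * se_hat, theta_hat + z * se_hat)
```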
Estimators which are asymptotically normal are only approximately normal with a finite sample size, and for sample
sizes which aren't large enough, the approximation suffers.
In such cases it is often possible to use bootstrapping
(in a variety of ways) to obtain better approximate confidence intervals. Bootstrapping can also be used effectively
in some cases in which the estimator is not approximately normal. (Some estimators aren't even asymptotically
normal.)
There's not a lot of importance in Sec. 12.2. The main focus is on what E&T refer to as accurate confidence
intervals. They designate a confidence interval as accurate if the probability that it misses covering the estimand
because the upper confidence bound falls below the estimand equals the probability that it misses
because the lower confidence bound falls above the estimand, with both of these probabilities being (about)
α for an (approximate) confidence interval having a nominal coverage probability of 1 - 2α.
E&T seem to put a lot of emphasis on having what they refer to as an accurate confidence interval, but a lot of
others do not
do this. Some strive to have a confidence interval procedure which is valid (meaning that the actual coverage
probability is not less than the nominal coverage probability), and which produces intervals which are on the average
as short as possible.
Sec. 12.3 isn't too important as far as learning about bootstrapping is concerned. It establishes the viewpoint that
a confidence interval represents the plausible values for an unknown estimand by using aspects of hypothesis testing.
It may be a bit hard to follow upon an initial reading, but in class I can easily explain what they are getting at by
claiming that values above the interval and values below the interval are implausible.
(Basically, values below the lower confidence bound are implausible because if such a value was the true value of
θ, then it would be unlikely to obtain an estimate as large as the one observed,
and values above the upper confidence bound are implausible because if such a value was the true value of
θ, then it would be unlikely to obtain an estimate as small as the one observed.)
The second to the last sentence in the section indicates that a test of hypotheses can be carried out using a confidence
interval. I'll briefly discuss this in class.
(12.19) on p. 159 gives the formula for Student's one-sample t confidence interval. (It should be noted that
on p. 158 the estimator was specified to be the sample mean.) Especially for the case of normal random variables,
the t interval represents an improvement over (12.16) for the case of the estimator being the sample mean
(sometimes referred to as the z interval) because the t interval takes into account that an estimated
standard error is being used rather than the actual standard error.
While the t interval adjusts for the fact that the standard error is unknown, it doesn't adjust for
nonnormality, and thus is only an approximate interval when the underlying distribution of the observations is
nonnormal. Also, the t interval given in this section is for estimating a distribution mean, and we don't
have a lot of similar intervals for estimating other distribution measures except in a relatively small number of
special cases. Using bootstrapping we can adjust for nonnormality and improve upon the t interval for
estimating a distribution mean, and we can also develop confidence intervals for other distribution measures without
assuming parametric models which may not accurately model the phenomenon of interest.
Sec. 12.5 is a relatively short, relatively simple, and somewhat important section. It describes the first of the
bootstrap confidence interval methods covered by E&T.
The bootstrap t method is based on the same pivot used for standard intervals (the pivot of (12.17) and
(12.18)), but instead of assuming a standard normal or T distribution, the sampling distribution of the pivot
--- or rather just the two quantiles of it that are needed to obtain a confidence interval by inverting the pivot ---
is approximated using bootstrapping. Bootstrap replicates of the pivot (see (12.20)) are created, and they are used
to obtain the estimates of the sampling distribution's quantiles that are needed for the confidence interval given by
(12.22). (In class I'll show how (12.22) can be obtained.)
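Here is a Python sketch of the bootstrap-t interval for a sample mean, where s*/sqrt(n) serves as the estimated standard error in each replicate of the pivot (made-up data; the quantile indexing follows the (B + 1)α convention discussed below):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(scale=1.0, size=40)   # made-up skewed data
n = len(x)
theta_hat = x.mean()
se_hat = x.std(ddof=1) / np.sqrt(n)

B = 999
z_star = np.empty(B)
for b in range(B):
    xb = rng.choice(x, size=n, replace=True)      # bootstrap sample
    se_b = xb.std(ddof=1) / np.sqrt(n)            # s*/sqrt(n) works for the mean
    z_star[b] = (xb.mean() - theta_hat) / se_b    # replicate of the pivot (12.20)

z_star.sort()
alpha = 0.05
k = round((B + 1) * alpha / 2)                    # 25 when B = 999
lo_q = z_star[k - 1]                              # estimated alpha/2 quantile
hi_q = z_star[B + 1 - k - 1]                      # estimated 1 - alpha/2 quantile
ci = (theta_hat - hi_q * se_hat, theta_hat - lo_q * se_hat)  # interval (12.22)
```

Note that the upper quantile sets the lower confidence bound and vice versa, which is how (12.22) inverts the pivot.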
I don't like to use (12.21) to determine the estimated percentiles that are needed --- doing so is to use a method which lacks
symmetry. E.g., if B = 1000, there are 49 replicates below the estimated 5th percentile and 50 replicates
above the estimated 95th percentile. I like to use (B + 1)α with B = 999 instead of
Bα with B = 1000. Using
(B + 1)α with B = 999 puts 49 replicates below the estimated 5th percentile and 49
replicates above the estimated 95th percentile. (Note: 100 or 200 (or 99 or 199) is not nearly large enough for
B --- one needs B to be at least 10 times larger in order to be able to estimate the extreme
percentiles accurately enough.)
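The counting can be checked directly (a trivial sketch of the (B + 1)α rule with B = 999):

```python
B, alpha = 999, 0.05

k_lo = round((B + 1) * alpha)   # the 50th smallest replicate estimates the 5th percentile
k_hi = B + 1 - k_lo             # the 950th smallest estimates the 95th percentile

below = k_lo - 1                # replicates strictly below the 5th percentile estimate
above = B - k_hi                # replicates strictly above the 95th percentile estimate
print(below, above)             # 49 49
```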
(12.20) is for the case of the estimator the pivot is based on being the plug-in estimator. However, one could use
some other estimate on the left in the numerator. But in that case the plug-in estimate still needs to be used on
the right in the numerator (because the plug-in estimate is the true value of the population measure of interest in
the bootstrap world). It should also be noted that unless the estimator the pivot is based on is the sample mean, so
that s*/sqrt(n) can be used as the estimated standard error, or
a linear estimator, so that the jackknife estimate of standard error obtained from the bootstrap sample will be good
to use, nested bootstrapping may be needed
in order to get the estimated standard error to use in (12.20). (We cannot use the bootstrap estimate of standard
error based on the original data in the replicates and still maintain the proper correspondence between things in the
bootstrap world and things in the real world. Estimating the standard error from each bootstrap sample
allows us to properly account for the variability in the estimated standard error --- to make the pivot similar to a
t statistic instead of a z statistic.)
The bootstrap t method is particularly good to use with location statistics, and in general, for large sample
sizes, its coverage probability tends to be closer to the nominal level than is the case for standard confidence
intervals. A bootstrap t confidence interval can have confidence bounds which are not the same distance below
and above the point estimate. This is a big factor in why they can work appreciably better than standard intervals
when the sampling distribution of the estimator is not symmetric.
Here is
R code to obtain a bootstrap-t confidence interval for a situation
addressed in Sec. 12.5.
The first part of
this R code
can be used to obtain a bootstrap-t confidence interval for a situation
addressed in Sec. 12.6 but using the general method described in Sec. 12.5.
The first page of Sec. 12.6 indicates that there are two problems associated with the use of the bootstrap-t
confidence procedure: (1) typically there is no simple way to estimate the standard error of an estimator, and so
nested bootstrapping (which adds greatly to the run time) may be needed; and (2) sometimes the
bootstrap-t procedure behaves poorly (especially with small sample sizes). It turns out that using a
variance stabilizing transformation along with the
bootstrap-t confidence interval procedure can simultaneously address both of these concerns. However, since
the proper transformation is seldom easy to identify, one may have to rely on a data-driven computer-intensive
numerical method in order to carry out the transformation ploy. (Note: Typically trying to use a small amount of
data to fit a highly flexible method is somewhat dangerous. So my guess is that if the sample size is rather small
this complex alternative to a straightforward application of the bootstrap-t may not be trustworthy.)
To make a case for the use of variance stabilizing transformations, E&T consider the previously used law school data
for which the estimand of interest is the correlation coefficient. If the underlying distribution is a bivariate
normal distribution, then one can get decent performance using (12.24) and (12.25). (I'll work through in class how
(12.24) and (12.25) may be used to obtain a confidence interval for the correlation without using bootstrapping.
Here is a .pdf file that shows the main steps of employing the Fisher transformation to obtain a
confidence interval for ρ.)
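In case it's helpful, here is a short Python sketch of the no-bootstrap Fisher-transformation interval just described, using the law school sample correlation (r = .776 with n = 15):

```python
import numpy as np

# Fisher-transformation interval for rho, assuming bivariate normality:
# (12.24) is z = (1/2) log((1 + r)/(1 - r)) = atanh(r), and (12.25) treats z
# as approximately N(atanh(rho), 1/(n - 3)).
r, n = 0.776, 15                 # law school sample correlation and sample size
alpha = 0.05
z = np.arctanh(r)                # the transformation (12.24)
half = 1.96 / np.sqrt(n - 3)     # normal critical value times the (known) standard error
lo, hi = np.tanh(z - half), np.tanh(z + half)   # back-transform to the rho scale
```

The back-transformed bounds necessarily stay inside (-1, 1), so the interval cannot contain impossible values for ρ.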
But if the underlying distribution is not a bivariate normal distribution, then (12.25) may not hold, and the
method of using (12.24) and (12.25) may not perform well. It turns out that a good solution is to use the
bootstrap-t procedure
with (12.24), which allows us to not have to assume that (12.25) holds (since the
bootstrap-t procedure
estimates the needed aspects of the sampling distribution of the "studentized" statistic based on the transformation
(12.24)). The
bootstrap-t intervals
based on the transformation are shorter than the
bootstrap-t intervals
which resulted from not using the transformation, and they didn't contain impossible values for the estimand, unlike
the intervals resulting from not using the transformation. (Note: While short confidence intervals are preferred,
no results are presented to indicate that the intervals based on the transformation are not too short --- we don't
know that the overall confidence interval procedure produces intervals having the correct coverage probability.
(Later, we will cover results in E&T that show that in some cases bootstrap confidence intervals really do
perform appreciably better than other
confidence intervals.))
In most situations, unlike the law school correlation example, one wouldn't know what transformation would be good to
use. Ideally one might seek a transformation which results in a pivot that is approximately normal and doesn't
involve an estimated standard error, since estimating a standard error may necessitate nested bootstrapping. But in most cases that is too
much to ask for, and it turns out to be better to focus on finding a transformation for which the transformed
estimator's variance does not depend on the value of the estimand, since the
bootstrap-t procedure
can deal with the nonnormality but suffers if the variance of the transformed estimator depends on the estimand.
Even with the relaxed goal of using a variance-stabilizing transformation, in most cases it still won't be clear what
transformation to use. In class I'll show you how a first-order Taylor series approximation of g(x)
leads to (12.29) on p. 167, and how if one uses a g which satisfies (12.26) on p. 164, the result would be
that if X is a random variable with mean
θ and standard deviation
s(θ), then
g(X)
has a variance which is close to 1 no matter what value
θ is.
(Note:
These answers for Problem 12.4 on p. 167 of E&T show part of my classroom
presentation.)
So while X's standard deviation depends on
θ,
g(X)'s standard deviation does not.
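To see numerically what variance stabilization buys, here is a small Monte Carlo sketch of my own (not an E&T example): for a Poisson(θ) count X, s(θ) = sqrt(θ), so applying (12.26)/(12.27) gives g(x) = 2·sqrt(x), and the variance of g(X) should stay near 1 as θ changes:

```python
import numpy as np

# My own illustration: X ~ Poisson(theta) has standard deviation sqrt(theta),
# so g'(x) = 1/sqrt(x) and g(x) = 2*sqrt(x) should stabilize the variance near 1.
rng = np.random.default_rng(4)
thetas = (5.0, 50.0, 500.0)
raw_vars, stab_vars = [], []
for theta in thetas:
    x = rng.poisson(theta, size=200_000).astype(float)
    raw_vars.append(x.var())                     # grows like theta
    stab_vars.append((2.0 * np.sqrt(x)).var())   # stays near 1 for every theta
```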
Letting the estimator of the estimand of interest assume the role of X, and assuming that the estimator has
negligible bias, if we knew how the standard deviation of the estimator depended on the value of the estimand, we
could apply (12.26) (or equivalently (12.27)) to determine a transformation to use that would approximately stabilize
the variance of the transformed estimator, allowing us to use a pivot that does not contain an estimated
standard error and thus eliminating the need for nested bootstrapping!
Since we seldom will know how the standard error of an estimator varies with the value of the estimand, we don't have
the function s in (12.26) and (12.27), and so we can't use s to obtain the desired
transformation g. But we can use bootstrapping to approximate s, and ultimately obtain a confidence
interval for
θ, using the following steps.
- B1
bootstrap samples can be drawn from the original data. (Typically it's okay to use a number around 100 or 200 for
B1.)
- From each of the
B1
bootstrap samples,
B2 second-level bootstrap samples (a smaller number, say 50 or 100) can be drawn in order to get an estimate of the standard error
associated with the value of the bootstrap replicate of the estimate of θ.
- The collection of the
B1
(replicate of the estimate of θ, estimated standard error) pairs can be used to estimate the function s, and from the estimated
s, numerical integration can be used to obtain an estimate of g, using (12.27).
- Finally,
B3 (being at least 1000) bootstrap samples are drawn from the original data, and from these
replicates of the g-transformed estimate of
θ
are obtained. These are used to estimate the needed percentiles of the simplified pivot with the standard error set
equal to 1; the estimated quantiles are used to obtain a confidence interval for g(θ); and then
the inverse transformation is applied to
g(θ)'s confidence bounds to obtain confidence bounds for
θ. (Note: Since s is a positive function, from (12.27) it is clear that g will be a
continuous monotone increasing function, and so its inverse g^(-1) will exist.)
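To make the scheme concrete, here is a rough Python sketch of the four steps (my own illustration, not code from E&T), using the mean of some made-up skewed data as the estimand. A crude piecewise-linear estimate of s and left-endpoint numerical integration stand in for the smoothing E&T suggest, and np.interp clamps g and g^(-1) at the ends of the grid. (For a mean one wouldn't really need the nested bootstrap, since s*/sqrt(n) is available; a mean is used only to keep the sketch short.)

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.exponential(scale=1.0, size=40)   # made-up data; estimand: the mean
n = len(x)
theta_hat = x.mean()

# Steps 1-2: small nested bootstrap to learn how the standard error varies
# with the value of the estimate.
B1, B2 = 100, 50
est, se = np.empty(B1), np.empty(B1)
for i in range(B1):
    xb = rng.choice(x, size=n, replace=True)
    est[i] = xb.mean()
    inner = np.array([rng.choice(xb, size=n, replace=True).mean() for _ in range(B2)])
    se[i] = inner.std(ddof=1)

# Step 3: crude estimate of s(.) from the (estimate, standard error) pairs,
# then g from (12.27) by left-endpoint numerical integration of 1/s.
order = np.argsort(est)
grid = est[order]
s_grid = np.maximum(se[order], 1e-8)
g_grid = np.concatenate(([0.0], np.cumsum(np.diff(grid) / s_grid[:-1])))

def g(t):       # piecewise-linear g, clamped outside the grid
    return np.interp(t, grid, g_grid)

def g_inv(u):   # inverse exists since g is increasing
    return np.interp(u, g_grid, grid)

# Step 4: bootstrap-t on the g scale, with the standard error set equal to 1.
B3 = 999
piv = np.sort([g(rng.choice(x, size=n, replace=True).mean()) - g(theta_hat)
               for _ in range(B3)])
alpha = 0.05
k = round((B3 + 1) * alpha / 2)          # 25
lo_g = g(theta_hat) - piv[B3 - k]        # subtract the upper pivot quantile
hi_g = g(theta_hat) - piv[k - 1]         # subtract the lower pivot quantile
ci = (float(g_inv(lo_g)), float(g_inv(hi_g)))   # back-transform to the theta scale
```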
Obviously, using transformations with the
bootstrap-t procedure
is a rather involved method for obtaining confidence intervals. However, not only can performance be improved over
the basic
bootstrap-t procedure, but the total number of bootstrap samples needed can be greatly reduced due to the fact
that a variance-stabilizing transformation eliminates the necessity for large-scale nested bootstrapping. (The first
two steps indicated above constitute small-scale nested bootstrapping. For a sense of the large-scale alternative,
consider carrying out the method of Sec. 12.5 when we don't have a formula for the estimated standard error: if we
need 999 replicates of the pivot, and 100 replicates of the estimate from each of the 999 bootstrap samples in order
to estimate the standard error for the denominators of the pivot replicates, a total of 999 + 999*100 = 100,899
bootstrap samples would be needed. But making use of the transformation scheme described above, one could use a total
of only 200*100 + 999 = 20,999 bootstrap samples (about one fifth as many as before), or perhaps as few as
100*100 + 999 = 10,999 bootstrap samples, and possibly obtain a better confidence interval.)
The last portion of
this R code can be used to obtain a variance-stabilized bootstrap-t confidence interval for a situation
addressed in Sec. 12.6, working with the law school data. (It also includes code for obtaining confidence intervals
based on the Fisher transformation of the correlation.)