Some Notes Pertaining to Ch. 14 of E&T



Introduction

Efron and Tibshirani indicate that none of the bootstrap confidence intervals covered so far consistently performs well. This chapter covers an improvement of the percentile interval, called the bias-corrected and accelerated interval (aka the BCA interval). Also introduced is the approximate bootstrap confidence interval (aka the ABC interval), an approximation of the BCA interval that can be obtained without doing any computer-intensive bootstrap resampling.

We will find that even the BCA and ABC intervals fail to perform well in some situations. Despite the fact that none of the bootstrap confidence intervals consistently performs well, you should not be too discouraged from using them! Some of the intervals work great in many situations, and even when they only work decently, they may still be better than your best alternative. For important estimation problems for which there is no clear-cut best interval estimator, I recommend inventing specific estimation settings for which you know the value of the estimand and which are hopefully similar to the real-world situation you are involved with. These invented settings can be used in Monte Carlo studies to determine which of several (or many) interval estimators performs best. The best performer can then be selected for use in the real-world setting of interest, and the Monte Carlo results will provide some indication of how reliable the chosen estimator may be.



Description of the BCA Method

The BCA interval is based on the plug-in estimator for the estimand. The confidence bounds of the BCA interval, like those of the percentile interval, are percentiles of the distribution of the bootstrap replicates of the point estimator. For a 90% percentile interval, the 5th and 95th percentiles of the bootstrap replicates' distribution are used, but the BCA interval may use different percentiles.
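As a concrete illustration (a Python sketch with made-up data, rather than the R code used elsewhere in these notes), the percentile interval is just a pair of order statistics of the bootstrap replicates:

```python
import random
import statistics

def percentile_interval(replicates, alpha=0.05):
    """Central 1 - 2*alpha percentile interval: roughly the alpha and
    1 - alpha empirical quantiles of the bootstrap replicates."""
    reps = sorted(replicates)
    B = len(reps)
    return reps[int(alpha * B)], reps[min(int((1 - alpha) * B), B - 1)]

random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(25)]

# B bootstrap replicates of the sample mean
B = 2000
reps = [statistics.mean(random.choices(data, k=len(data))) for _ in range(B)]
lo, hi = percentile_interval(reps)   # a 90% percentile interval
```

The BCA interval will use the same sorted replicates, but possibly with different (adjusted) percentile levels.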

The quantiles used by the BCA interval (which is given by (14.9) on p. 185) are given by (14.10). They depend on two values, referred to as the acceleration and the bias correction. The bias correction is given by (14.14), and is a function of the proportion of bootstrap replicates less than the original point estimate. (Note that the bias correction is more closely related to median bias than it is to the usual definition of bias.) If half of the replicates are less than the original estimate, and half are greater, then the bias correction is 0. If more than half of the replicates are less than the original point estimate, the bias correction is positive. (This seems sensible: more than half of the replicates being less than the original point estimate, which is the value of the estimand in the bootstrap world, suggests that the estimator tends to underestimate, and so the bias correction should be positive.) A positive bias correction will tend to make the percentiles used larger than those used for the percentile interval. For example, instead of using the 5th and 95th percentiles for a 90% confidence interval as the percentile interval does, if the bias correction is positive the BCA interval might use the 6th and 96th percentiles. (An example given in the book is more severe: there a positive bias correction combined with a positive acceleration (the values given by (14.12)) leads to (as indicated by (14.13)) the 11th and 98.5th percentiles being used for a 90% confidence interval.) Similarly, if fewer than half of the replicates are less than the original point estimate, the bias correction is negative (which makes sense because the estimator seems to overestimate), and the percentiles used for the BCA interval will tend to be smaller than those used for the percentile interval.
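A sketch of the bias correction computation (Python, with the inverse standard normal cdf supplied by the standard library's NormalDist):

```python
from statistics import NormalDist

def bias_correction(replicates, theta_hat):
    """z0-hat of (14.14): the standard normal quantile of the
    proportion of bootstrap replicates below the original estimate."""
    prop = sum(r < theta_hat for r in replicates) / len(replicates)
    return NormalDist().inv_cdf(prop)

# Half the replicates below theta-hat: no correction.
z0_balanced = bias_correction([1, 2, 3, 4], 2.5)   # 0.0
# Three quarters below theta-hat: positive correction.
z0_positive = bias_correction([1, 2, 3, 4], 3.5)   # about 0.674
```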

One expression for the acceleration is given by (14.15) on p. 186. This expression, based on jackknife replicates of the estimator, somewhat resembles an estimated skewness of the estimator's sampling distribution. But E&T describe the acceleration as being related to the rate of change of the standard error of the estimator with respect to the true estimand value (measured on a standardized scale). It makes an allowance for the fact that in some interval estimation settings, the variance of the estimator may be a function of the value being estimated. As an example, if the underlying distribution of the observations is normal, the standard deviation of the sample variance is directly proportional to the true variance. (In class, I can show you how this result can be easily obtained.) In contrast, the variance of the sample mean does not obviously depend upon the true value of the mean. (One could consider a situation where the variance of a normal distribution is a function of the distribution's mean, but unlike the case of the sample variance, for the mean we don't necessarily have that the variance of the estimator is a function of the value being estimated.)
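A sketch of (14.15) computed from jackknife replicates (Python, with statistics.mean as the example estimator):

```python
import statistics

def acceleration(data, stat):
    """a-hat of (14.15): a skewness-like quantity computed from the
    jackknife (leave-one-out) replicates of the estimator."""
    n = len(data)
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jbar = sum(jack) / n
    num = sum((jbar - j) ** 3 for j in jack)
    den = 6.0 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    return num / den

# Symmetric data, sample mean: the jackknife replicates are symmetric
# too, so the estimated acceleration is zero.
a_sym = acceleration([1.0, 2.0, 3.0, 4.0, 5.0], statistics.mean)
```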

E&T indicate that it's not obvious why (14.15) is associated with the acceleration of the standard error. (While some of the grubby details of the BCA method are addressed in later chapters of the book, for this issue E&T refer the reader to a journal article. This being the case, it's perhaps best that we don't dwell on this issue. To me it seems better to make the main goals of the moment learning how to compute the BCA interval, understanding some of its properties, and gaining some insight into how it performs in practical situations. (In statistics, it definitely makes sense to "pick your battles" --- if one were to try to understand all of the details of each method before moving on to something else, not as many methods could be covered.)) Some may wonder why this measure is referred to as the acceleration, and not the velocity. I can't address that --- it seems more like a velocity to me.
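Putting the two pieces together, here is a minimal Python sketch of how (14.10) converts the bias correction and the acceleration into the adjusted percentile levels (NormalDist standing in for Φ; the z0 value 0.11 below is just an invented example):

```python
from statistics import NormalDist

def bca_levels(z0, a, alpha=0.05):
    """The adjusted percentile levels of (14.10).  With z0 = a = 0
    this reduces to (alpha, 1 - alpha), i.e. the percentile interval."""
    nd = NormalDist()

    def adjust(z_alpha):
        return nd.cdf(z0 + (z0 + z_alpha) / (1.0 - a * (z0 + z_alpha)))

    return adjust(nd.inv_cdf(alpha)), adjust(nd.inv_cdf(1.0 - alpha))

plain = bca_levels(0.0, 0.0)      # (0.05, 0.95): no adjustment
shifted = bca_levels(0.11, 0.0)   # positive z0 pushes both levels upward
```

With both a positive bias correction and a positive acceleration, the levels can shift much more, as in the book's 11th/98.5th percentile example.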



The ABC Method

Sec. 14.4 introduces approximate bootstrap confidence intervals (aka ABC intervals). The ABC method approximates the endpoints of BCA intervals using Taylor series expansions --- no bootstrap resampling is actually done! (Note: E&T delay a detailed explanation of the ABC method until Ch. 22.) The approximation is typically pretty good --- differences between ABC and BCA intervals are often largely due to the randomness involved in the BCA method, and if B is increased the differences may largely disappear. The ABC method requires that the point estimator that the interval is based on be smooth.

In order to use R's abcnon function to compute an ABC interval, the estimator must be represented in resampling form; i.e., expressed in terms of the resampling vector (which is the vector of the proportions of the occurrences of the original sample values in a bootstrap sample --- see p. 189 for more information and an example).
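For instance (a Python sketch with a made-up sample; abcnon itself is an R function), the sample mean in resampling form is just a weighted sum of the original observations, with weights given by the resampling vector:

```python
from collections import Counter

def resampling_vector(original, boot_sample):
    """P* (p. 189): the proportion of the bootstrap sample occupied by
    each original observation (assumes distinct original values)."""
    counts = Counter(boot_sample)
    n = len(boot_sample)
    return [counts[x] / n for x in original]

def mean_resampling_form(P, x):
    """The sample mean written as a function of the resampling vector."""
    return sum(p * xi for p, xi in zip(P, x))

x = [3.0, 5.0, 8.0, 1.0]                 # original sample
boot = [5.0, 5.0, 8.0, 3.0]              # one bootstrap sample
P = resampling_vector(x, boot)           # [0.25, 0.5, 0.25, 0.0]
theta_star = mean_resampling_form(P, x)  # 5.25, the bootstrap-sample mean
```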



Properties

Like the percentile interval, the BCA interval is transformation respecting, and so BCA intervals for θ, logθ, and 1/θ will all be consistent with one another. This is not the case for all confidence interval methods. For example, if one uses the bootstrap t method to obtain a confidence interval for g(θ) and then applies the inverse transformation to the endpoints to obtain a confidence interval for θ, the result can be quite a bit different from what is obtained if one applies the bootstrap t method to θ directly. With bootstrap t intervals, performance may be improved by working with a transformed estimator, and then applying the inverse transformation to obtain the desired confidence interval. With the BCA method, we don't have to worry about finding a good transformation --- the resulting interval is invariant to the transformation chosen for the estimator.

The fact that BCA intervals are transformation respecting means that if you want a confidence interval for g(θ), you can choose to obtain a BCA interval for θ, and then properly transform the endpoints.
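Since percentile-type endpoints commute with monotone transformations, the property is easy to see numerically; here is a Python sketch (made-up exponential data) using plain percentile endpoints and the log transformation:

```python
import math
import random
import statistics

# Percentile-type endpoints commute with monotone transformations:
# the interval for log(theta) is the log of the interval for theta.
random.seed(2)
data = [random.expovariate(1.0) for _ in range(25)]
B = 1000
reps = sorted(statistics.mean(random.choices(data, k=25)) for _ in range(B))
lo, hi = reps[50], reps[949]                  # 90% percentile interval for theta

log_reps = sorted(math.log(r) for r in reps)  # replicates of log(theta-hat)
log_lo, log_hi = log_reps[50], log_reps[949]  # interval for log(theta)
# log_lo == log(lo) and log_hi == log(hi): transforming the endpoints
# gives the same interval as bootstrapping the transformed estimator.
```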

Another property of the BCA method concerns its accuracy. A central 1 - 2α confidence interval for θ with perfect accuracy has confidence bounds which satisfy (14.16) on p. 187 of E&T. But in most cases for which a specific parametric model cannot be assumed, we have to settle for approximate confidence intervals which don't satisfy (14.16) exactly. If a method results in confidence bounds which satisfy (14.18), it is said to be first-order accurate, and if a method results in confidence bounds which satisfy (14.17), it is said to be second-order accurate. (Note that in some situations, for some sample sizes, a first-order accurate method can have better accuracy than a second-order accurate method: the constants differ between methods, so if n = 25, one method's c_lo/sqrt(n) and c_up/sqrt(n) may be smaller in magnitude than another method's c_lo/n and c_up/n. But if n is made sufficiently large, the second-order accurate method will be superior to the first-order accurate method.)
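To make the first-order vs. second-order comparison concrete, here is a tiny Python illustration with hypothetical constants (the values 0.5 and 4.0 are invented for the example; in practice the constants depend on the method and the problem):

```python
import math

def err_first(c, n):
    return c / math.sqrt(n)   # first-order: error shrinks like 1/sqrt(n)

def err_second(c, n):
    return c / n              # second-order: error shrinks like 1/n

# At n = 25, a first-order method with a small constant (0.5) beats a
# second-order method with a large constant (4.0) ...
small_n_first, small_n_second = err_first(0.5, 25), err_second(4.0, 25)
# ... but for large enough n the second-order method must win.
large_n_first, large_n_second = err_first(0.5, 10000), err_second(4.0, 10000)
```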

The table below compares properties of various methods for bootstrap confidence intervals. While it may seem as though the BCA method is superior to the others, for smallish sample sizes, one of the other methods may do better in certain situations. Also, it should be noted that an interval may have an overall coverage probability very close to the nominal coverage probability, but not be second-order accurate because it doesn't balance missing to the left with missing to the right well.

              2nd-order accurate   transformation respecting
standard      no                   no
percentile    no                   yes
BCA           yes                  yes
ABC           yes                  yes
bootstrap t   yes                  no

While E&T seem to favor the BCA method, it seems to me (based on results displayed below) that the bootstrap t method can work very well in some settings. E&T do warn that BCA intervals can often be too short, and they also point out that ABC intervals can be shorter still if the resampling distribution of the replicates is a heavy-tailed distribution.



Comparisons

Four different settings will be considered here: (1) estimation of the mean of a normal distribution, (2) estimation of the mean of an exponential distribution, (3) estimation of the variance of a normal distribution, (4) estimation of exp(μ), where μ is the mean of a normal distribution.

estimation of distribution mean (normal distribution case)

Here, Student's t interval is the gold standard. The percentile method ought to work as well as the BCA method, because there is no need for bias correction or acceleration adjustments. (The percentile interval may even work better than the BCA method (and the ABC method), since sometimes performance suffers when too many adjustments are attempted.)

This R code can be used to obtain results that allow a comparison of the endpoints of various confidence intervals based on the same specific sample of 25 observations. (One should feel good about a method that produces an interval close to Student's t interval.)

              LCB     UCB
Student's t   -0.12   0.59
percentile    -0.09   0.58
BCA           -0.08   0.60
ABC           -0.07   0.60
bootstrap t   -0.11   0.63
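A rough Python analogue of this kind of single-sample endpoint comparison (not the linked R code; made-up data, and a normal quantile standing in for Student's t, since the Python standard library has no t quantile):

```python
import random
import statistics
from statistics import NormalDist

random.seed(3)
data = [random.gauss(0.2, 1.0) for _ in range(25)]
n = len(data)
xbar = statistics.mean(data)
se = statistics.stdev(data) / n ** 0.5

# "Standard"-style 90% interval (normal quantile in place of Student's t)
z = NormalDist().inv_cdf(0.95)
standard = (xbar - z * se, xbar + z * se)

# 90% percentile interval from B bootstrap replicates of the mean
B = 2000
reps = sorted(statistics.mean(random.choices(data, k=n)) for _ in range(B))
percentile = (reps[100], reps[1899])
```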

This R code can be used to obtain Monte Carlo estimates of the coverage probabilities of Student's t interval, the percentile interval, the BCA interval, the ABC interval, and the bootstrap-t interval for estimating the mean of a normal distribution from a sample of 16 observations. The table below shows estimates of the probability of missing to the left and the probability of missing to the right, and the estimated coverage probability for each of the interval estimation methods considered.

              miss left   miss right   coverage prob.
Student's t   0.0220      0.0224       0.9556
percentile    0.0376      0.0404       0.9220
BCA           0.0360      0.0424       0.9216
ABC           0.0348      0.0416       0.9236
bootstrap t   0.0232      0.0260       0.9508
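A sketch of how such Monte Carlo coverage estimates can be produced (Python, not the linked R code; for brevity only a simple normal-approximation interval is examined, whereas the actual study loops over all five methods):

```python
import random
import statistics
from statistics import NormalDist

def coverage_estimate(n=16, trials=500, alpha=0.05, true_mean=0.0, seed=4):
    """Monte Carlo estimate of the two one-sided miss probabilities and
    the coverage probability of a simple normal-approximation interval."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - alpha)
    miss_below = miss_above = 0   # true value below / above the interval
    for _ in range(trials):
        x = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        m = statistics.mean(x)
        se = statistics.stdev(x) / n ** 0.5
        if true_mean < m - z * se:
            miss_below += 1
        elif true_mean > m + z * se:
            miss_above += 1
    return (miss_below / trials, miss_above / trials,
            1.0 - (miss_below + miss_above) / trials)

p_below, p_above, cov = coverage_estimate()
```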

Here are some similar results for n = 36. (Here's the R code.)

              miss left   miss right   coverage prob.
Student's t   0.0228      0.0216       0.9556
percentile    0.0280      0.0284       0.9436
BCA           0.0308      0.0296       0.9394
bootstrap t   0.0232      0.0224       0.9544


estimation of distribution mean (exponential distribution case)

Here, Student's t interval is not the gold standard. But included in this comparison is an exact interval derived using the exponential parametric model (using a general method for parametric models covered in STAT 652 / CSI 672).
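The exact interval is not derived in these notes, but one standard pivotal construction for the exponential mean can be sketched as follows (Python; the Wilson-Hilferty formula is used as a stand-in for exact chi-square quantiles, since the Python standard library provides none):

```python
from statistics import NormalDist

def chi2_quantile(p, k):
    """Wilson-Hilferty approximation to the chi-square quantile."""
    z = NormalDist().inv_cdf(p)
    return k * (1.0 - 2.0 / (9.0 * k) + z * (2.0 / (9.0 * k)) ** 0.5) ** 3

def exp_mean_interval(data, alpha=0.05):
    """If the X_i are exponential with mean theta, then
    2 * sum(X) / theta is chi-square with 2n degrees of freedom;
    inverting this pivot gives an exact interval for theta."""
    n = len(data)
    s2 = 2.0 * sum(data)
    return (s2 / chi2_quantile(1.0 - alpha, 2 * n),
            s2 / chi2_quantile(alpha, 2 * n))

lo, hi = exp_mean_interval([1.0] * 16)   # sample with mean exactly 1
```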

This R code can be used to obtain results that allow a comparison of the endpoints of various confidence intervals based on the same specific sample of 25 observations. (One should feel good about a method that produces an interval close to the exact interval.)

              LCB    UCB
Student's t   0.52   1.33
percentile    0.58   1.33
BCA           0.63   1.45
ABC           0.63   1.44
bootstrap t   0.60   1.61
exact         0.65   1.43

This R code can be used to obtain Monte Carlo estimates of the coverage probabilities of Student's t interval, the percentile interval, the BCA interval, the ABC interval, and the bootstrap-t interval for estimating the mean of an exponential distribution from a sample of 16 observations. The table below shows estimates of the probability of missing to the left and the probability of missing to the right, and the estimated coverage probability for each of the interval estimation methods considered.

              miss left   miss right   coverage prob.
Student's t   0.0840      0.0040       0.9120
percentile    0.0908      0.0108       0.8984
BCA           0.0676      0.0212       0.9112
ABC           0.0676      0.0208       0.9116
bootstrap t   0.0376      0.0088       0.9536
exact         0.0268      0.0232       0.9500

Here are some similar results for n = 36. (Here's the R code.)

              miss left   miss right   coverage prob.
Student's t   0.0596      0.0084       0.9320
percentile    0.0608      0.0148       0.9244
BCA           0.0416      0.0260       0.9324
bootstrap t   0.0304      0.0188       0.9508
exact         0.0192      0.0268       0.9540


estimation of distribution variance (normal distribution case)

Here, an exact interval based on normal theory is the gold standard. The BCA interval ought to work better than the percentile interval, because there is a need for adjustments.
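A sketch of the normal-theory exact interval (Python; again using the Wilson-Hilferty approximation in place of exact chi-square quantiles, which the Python standard library does not provide):

```python
import statistics
from statistics import NormalDist

def chi2_quantile(p, k):
    """Wilson-Hilferty approximation to the chi-square quantile."""
    z = NormalDist().inv_cdf(p)
    return k * (1.0 - 2.0 / (9.0 * k) + z * (2.0 / (9.0 * k)) ** 0.5) ** 3

def normal_var_interval(data, alpha=0.05):
    """Under normality, (n - 1) S^2 / sigma^2 is chi-square with
    n - 1 degrees of freedom; inverting gives the exact interval."""
    n = len(data)
    q = (n - 1) * statistics.variance(data)   # variance() is the n-1 version
    return (q / chi2_quantile(1.0 - alpha, n - 1),
            q / chi2_quantile(alpha, n - 1))

sample = [float(i) for i in range(10)]
lo, hi = normal_var_interval(sample)
```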

This R code can be used to obtain results that allow a comparison of the endpoints of various confidence intervals based on the same specific sample of size 25. (One should feel good about a bootstrap method that produces an interval close to the exact interval.)

              LCB    UCB
percentile    0.33   1.15
BCA           0.42   1.40
ABC           0.43   1.36
bootstrap t   0.34   1.69
exact         0.46   1.45

This R code can be used to obtain Monte Carlo estimates of the coverage probabilities of the exact interval, the percentile interval, the BCA interval, the ABC interval, and the bootstrap-t interval for estimating the variance of a normal distribution from a sample of 16 observations. The table below shows estimates of the probability of missing to the left and the probability of missing to the right, and the estimated coverage probability for each of the interval estimation methods considered.

              miss left   miss right   coverage prob.
exact         0.0212      0.0252       0.9536
percentile    0.1984      0.0016       0.8000
BCA           0.1072      0.0216       0.8712
ABC           0.1052      0.0244       0.8704
bootstrap t   0.0456      0.0016       0.9528

Here are some similar results for n = 36. (Here's the R code.)

              miss left   miss right   coverage prob.
exact         0.0216      0.0284       0.9500
percentile    0.1252      0.0076       0.8672
BCA           0.0580      0.0272       0.9148
bootstrap t   0.0432      0.0124       0.9444


estimation of exp(μ) (normal distribution case)

The setting considered is described on p. 171 of E&T, only here different sample sizes are used. (E&T give some results based on samples of size 10 in Table 13.3 on p. 175.) Here, an exact interval based on normal theory is the gold standard. The BCA interval ought to work better than the percentile interval, because there is a need for adjustments.
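Because exp() is monotone, an interval for exp(μ) can always be obtained by exponentiating the endpoints of an interval for μ; a Python sketch (made-up data, with a normal quantile as a large-sample stand-in for the Student's t quantile the exact interval would use):

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(5)
data = [random.gauss(0.0, 1.0) for _ in range(16)]
n = len(data)
m = statistics.mean(data)
se = statistics.stdev(data) / n ** 0.5

# 90% interval for mu (normal quantile in place of Student's t)
z = NormalDist().inv_cdf(0.95)
mu_lo, mu_hi = m - z * se, m + z * se

# exp() is increasing, so exponentiating the endpoints gives an
# interval for exp(mu) with the same coverage.
exp_lo, exp_hi = math.exp(mu_lo), math.exp(mu_hi)
```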

This R code can be used to obtain results that allow a comparison of the endpoints of various confidence intervals based on the same specific sample of size 16. (One should feel good about a bootstrap method that produces an interval close to the exact interval.) Two different ways of using bootstrap t intervals were employed: below, (N) denotes nested bootstrapping, and (AVS) denotes automatic variance stabilization.

                    LCB    UCB
standard            0.67   1.90
percentile          0.84   2.08
BCA                 0.87   2.23
ABC                 0.86   2.18
bootstrap t (N)     0.81   2.34
bootstrap t (AVS)   0.87   2.17
exact               0.77   2.14

This R code can be used to obtain the results, based on samples of size 16, displayed below. Automatic variance stabilization was used with the bootstrap t method.

              miss left   miss right   coverage prob.
exact         0.0304      0.0260       0.9436
standard      0.0724      0.0112       0.9164
percentile    0.0444      0.0400       0.9156
BCA           0.0484      0.0396       0.9120
ABC           0.0464      0.0412       0.9124
bootstrap t   0.0492      0.0400       0.9108

Here are some similar results for n = 36. (Here's the R code.) Note that while the coverage was bad for all of the bootstrap methods for n = 16, performance greatly improved when the sample size was increased to 36.

              miss left   miss right   coverage prob.
exact         0.0200      0.0212       0.9588
standard      0.0448      0.0100       0.9452
percentile    0.0280      0.0272       0.9448
BCA           0.0308      0.0252       0.9440
ABC           0.0276      0.0268       0.9456
bootstrap t   0.0280      0.0264       0.9456