Some Notes Pertaining to Ch. 8 of E&T



Sec. 8.1 suggests that bootstrapping is simple as long as the observed data can be viewed as coming from iid random variables, even if they are not univariate random variables. For example, the law school data correlation example of Sec. 6.3 is based on observations of two-dimensional random variables, and the multivariate statistics example of Sec. 7.2 is based on observations of five-dimensional random variables. (As far as the programming goes, things get a bit more complicated when we don't have univariate observations, but it's really not a big deal. We can put the data in a matrix with an ID column, and resample the ID numbers to identify the bootstrap samples. But we still just create a set of B bootstrap replicates, as we do with univariate data.)
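
(Here is a rough sketch of the ID trick in R, with made-up bivariate data standing in for the law school data; for a data matrix, resampling IDs amounts to resampling row indices so that each pair stays together.)

  # Sketch: bootstrap replicates of a correlation from bivariate data,
  # resampling row IDs so that each (x, y) pair stays intact.
  set.seed(1)
  dat <- cbind(rnorm(15), rnorm(15))   # stand-in data; use the law school data in practice
  n <- nrow(dat)
  B <- 2000
  thetastar <- numeric(B)
  for (b in 1:B) {
    ids <- sample(1:n, n, replace = TRUE)   # resample IDs, not the two columns separately
    thetastar[b] <- cor(dat[ids, 1], dat[ids, 2])
  }
  sd(thetastar)   # bootstrap estimate of the standard error of the sample correlation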

Chapter 8 of E&T indicates how to adapt bootstrapping for more complicated data structures.



Sec. 8.2 deals with one-sample situations. E&T point out that a key (but simple) step is estimating the underlying distribution with the empirical distribution to create the foundation for the bootstrap world, in which everything is done analogously to what occurred in the real world. The beauty of the bootstrap world is that nature can be replicated many, many times in order to study aspects of the sampling distribution of a statistic based on randomly drawn observations from the empirical distribution. If the original data's sample size is sufficiently large, we can feel safe in assuming that the bootstrap results can be applied to the real world.
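
(To make the recipe concrete, here is a minimal one-sample sketch in R; the data and the statistic are just placeholders.)

  # Sketch: B bootstrap replicates of a statistic from one iid sample.
  set.seed(1)
  x <- rexp(25)      # placeholder data
  B <- 1000
  thetastar <- replicate(B, median(sample(x, length(x), replace = TRUE)))
  sd(thetastar)      # bootstrap estimate of the standard error of the sample median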

(8.2) on p. 88 of E&T introduces some notation that will be useful with more complicated data structures.



The first part of Sec. 8.3 (on p. 88) shows how the notation introduced by (8.2) applies to the two-sample problem. The bootstrapping is carried out in an obvious way. The last sentence of the section points out that even though the resampling is done using two empirical distributions, the B bootstrap replicates can still be computed and the usual formula for the bootstrap estimate of the standard error (which is repeated just before the start of Sec. 8.4 on p. 90) can be applied.

It can be noted that if the statistic is a difference of two sample means, two sample medians, etc., one could alternatively get an estimated standard error of the difference by creating a separate set of bootstrap samples from each of the two original samples, estimating a variance from each set of bootstrap samples, summing the two estimated variances, and taking the square root. This alternative estimate (which is hinted at in Problem 8.1 on p. 103 of E&T) could differ from the one described in E&T, but the difference should be slight if B is large. (The difference between the two estimated standard errors is that the one described in E&T at the end of Sec. 8.3 on pp. 89-90 has an estimated covariance involved, whereas the alternative I've described here does not. If we assume independence, then the true covariance should be zero, and so I think the alternative is better. Also, if we consider the difference in two sample means, the alternative should be closer to the usual estimated standard error.) Perhaps E&T favor the method of Sec. 8.3 because of its simplicity. This R code pertains to the two-sample example considered in Sec. 8.3, and allows for an investigation of the difference in the two ways of using bootstrapping to estimate the standard error.
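
(In case the linked code isn't handy, here is a rough sketch of the comparison, with placeholder samples standing in for the data of Sec. 8.3.)

  # Sketch: two bootstrap estimates of the se of a difference in sample means.
  set.seed(1)
  y <- rnorm(7, 100, 50)    # placeholder sample 1
  z <- rnorm(9, 60, 40)     # placeholder sample 2
  B <- 5000

  # Method of Sec. 8.3: resample both samples on each iteration and take
  # the standard deviation of the B replicates of the difference.
  diffstar <- replicate(B,
    mean(sample(y, length(y), replace = TRUE)) -
      mean(sample(z, length(z), replace = TRUE)))
  se.ET <- sd(diffstar)

  # Alternative: bootstrap each sample separately, sum the two estimated
  # variances, and take the square root (no estimated covariance involved).
  ystar <- replicate(B, mean(sample(y, length(y), replace = TRUE)))
  zstar <- replicate(B, mean(sample(z, length(z), replace = TRUE)))
  se.alt <- sqrt(var(ystar) + var(zstar))

  c(se.ET, se.alt)   # the two estimates should be close for large B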



Sec. 8.5 and Sec. 8.6 pertain to two different ways of dealing with time series data. The method presented in Sec. 8.5 is based on the first-order autoregressive model, which may or may not be a good approximation of the actual situation that gave rise to the observed data. This model is described on p. 93 of E&T.
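
(For concreteness, with zt = yt − μ denoting the centered series, the model is

  zt = β zt−1 + εt,   t = 2, ..., n,

where the disturbances εt are iid draws from an unknown distribution with mean 0; this is, if I have the numbering right, (8.15) of E&T.)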

The unknown aspects of the model are μ, β, and the distribution of the disturbances. μ can be estimated using the sample mean of the observed data. E&T indicate that to estimate β, (8.20) should be minimized. But I don't think it's easy to understand why that is true. So in class I will give a likelihood-based explanation, and show how the estimate of β given by (8.22) is obtained. (Notes: (1) Towards the top of p. 95, E&T state that the statistic of interest is given implicitly by (8.21). However, starting with (8.20) and using simple calculus, it's possible to obtain a closed-form expression for the estimator of interest. This .pdf file explains how maximum likelihood can be used to estimate β. (2) With regard to learning about bootstrapping, it is perhaps a bit unfortunate that E&T give so many examples that make use of statistical methods that some of you have not been exposed to, since we have to take the time to go over these methods in class. On the other hand, it is perhaps good to go over some of these (nonbootstrapping) methods in order to learn more about statistics in general.)
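
(At the risk of spoiling part of the in-class derivation: taking (8.20) to be the residual sum of squares

  RSE(β) = Σt=2,...,n (zt − β zt−1)²,

differentiating with respect to β and setting the derivative equal to zero gives

  −2 Σt=2,...,n zt−1 (zt − β zt−1) = 0,   i.e.,   β̂ = [ Σt=2,...,n zt zt−1 ] / [ Σt=2,...,n (zt−1)² ],

which is the closed-form expression referred to above.)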

Since the estimate of β corresponding to the material on p. 94 of E&T is a function of the zt, to use bootstrapping to make inferences pertaining to β, we need to generate bootstrap samples from which replicates of the estimate of β can be obtained. Since the zt are not iid random variables, the replicates have to be obtained differently than usual. It can be noted that the disturbance terms of (8.15) and (8.16) are iid random variables. If we had the observed disturbances, we could resample those in the usual way to create bootstrap samples of disturbances, and then from those create samples of zt values using (8.15) with the estimate of β (based on the original data) taking the place of β (which is unknown). Since we don't have the observed disturbances, we can use the approximate disturbances given by (8.23) instead. (Note: To obtain the zt values to use with (8.23), we can use (8.19).) Once the approximate disturbances are obtained, we can resample them in the usual way to obtain bootstrap samples, and from these bootstrap samples, the estimate of β, and the value of z1 obtained from the original data using (8.19), we can obtain samples of zt* values from which the desired replicates can be obtained (using the simple formula that I'll derive in class). This R code can be used to obtain bootstrap time series (as described above, and on pp. 95-96 of E&T) from the luteinizing hormone data given on p. 92 of E&T, and from the bootstrap time series obtain an estimated standard error (of the estimator of β) to compare with the one given on p. 97 of E&T. (Notes: (1) Since the random numbers produced by my code may be different from those used by E&T, we shouldn't expect the estimated standard error to exactly match the value of 0.116 given near the top of p. 97 of E&T. (2) On p. 92 (just above the start of Sec. 8.5), E&T suggest that some care in programming may improve the efficiency of creating the desired bootstrap data, and Problem 8.6 on p. 104 pertains to this issue with regard to the luteinizing hormone time series analysis. Note in my R code how I use the columns of the matrix of bootstrap samples of the approximate disturbances to recursively create all of the bootstrap time series zt* values with a single loop, as opposed to using a loop for each of the bootstrap samples. (3) Here are the luteinizing hormone levels from Table 8.1 on p. 92 of E&T.)
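
(My code isn't reproduced here, but the following sketch shows the main steps, including the single-loop trick of note (2). A simulated series stands in for the hormone data, and the AR parameter 0.6 and B = 200 are arbitrary choices.)

  # Sketch of the AR(1) bootstrap of Sec. 8.5, with a stand-in series.
  set.seed(1)
  y <- arima.sim(list(ar = 0.6), n = 48) + 2.4   # substitute the hormone data here
  n <- length(y)
  z <- y - mean(y)                     # centered series

  # Least-squares estimate of beta (the closed form derived above).
  betahat <- sum(z[-1] * z[-n]) / sum(z[-n]^2)

  # Approximate disturbances (n - 1 of them).
  eps <- z[-1] - betahat * z[-n]

  # B bootstrap series, built recursively with a single loop over time:
  B <- 200
  epsstar <- matrix(sample(eps, B * (n - 1), replace = TRUE), n - 1, B)
  zstar <- matrix(0, n, B)
  zstar[1, ] <- z[1]                   # start each series at the observed z1
  for (t in 2:n) zstar[t, ] <- betahat * zstar[t - 1, ] + epsstar[t - 1, ]

  # A replicate of the estimate of beta from each bootstrap series:
  betastar <- colSums(zstar[-1, ] * zstar[-n, ]) / colSums(zstar[-n, ]^2)
  sd(betastar)                         # estimated standard error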

In class I'm not going to cover the material on pp. 97-98 of E&T pertaining to the second-order autoregressive model.



Sec. 8.6 of E&T pertains to another way to use bootstrapping with time series data. Rather than assume a model that may not be correct (as was done in Sec. 8.5), here bootstrap time series are obtained in a manner more similar to the simple method used for single samples from iid random variables. Blocks of l consecutive observations are drawn from the original data and joined to create the bootstrap time series. (Note: We need not restrict ourselves to blocks created by dividing the original data up into nonoverlapping contiguous segments of length l. Rather, all possible sets of l consecutive observations are candidates for being drawn.) By creating the bootstrap time series this way, one isn't relying on a model which may be incorrect, and some of the correlation present in the original time series is retained in the bootstrap time series. However, the original correlation is not retained where the blocks are joined.
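
(A rough sketch of the block-drawing step in R, with arbitrary values of n and l:)

  # Sketch: one moving-blocks bootstrap series.
  set.seed(1)
  y <- arima.sim(list(ar = 0.6), n = 48)   # stand-in series
  n <- length(y); l <- 3                   # l = 3 is an arbitrary choice
  k <- ceiling(n / l)                      # number of blocks needed
  starts <- sample(1:(n - l + 1), k, replace = TRUE)   # all n - l + 1 blocks are candidates
  ystar <- as.vector(sapply(starts, function(s) y[s:(s + l - 1)]))[1:n]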

It's not clear how the value of l should be chosen. If l is set to 1, then the time series created by bootstrapping will consist of independent observations and may be little like the original time series. If a rather large value is used for l, then the bootstrap time series will be quite similar to the original time series --- but maybe too similar to one another to be used to obtain replicates which can be used to model the true sampling distribution of the statistic of interest. (One would expect the estimated standard error to decrease as l is increased --- it's just not clear how to choose l in order to get the best estimate of the desired standard error.)

In this section, E&T use the moving blocks method to create bootstrap time series, but then assume an AR(1) model is appropriate and estimate the standard error of the estimator of β. This R code can be used to obtain bootstrap time series of the luteinizing hormone data using moving blocks, and to estimate the standard error of the estimator of β.
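
(Again, my code isn't reproduced here, but wrapping the block-drawing step sketched above in a loop gives something like the following; the stand-in series and the choices l = 3 and B = 200 are arbitrary.)

  # Sketch: moving-blocks bootstrap estimate of the se of the estimator of beta.
  set.seed(1)
  y <- arima.sim(list(ar = 0.6), n = 48) + 2.4   # substitute the hormone data here
  n <- length(y); l <- 3; B <- 200
  k <- ceiling(n / l)
  betastar <- numeric(B)
  for (b in 1:B) {
    starts <- sample(1:(n - l + 1), k, replace = TRUE)
    ystar <- as.vector(sapply(starts, function(s) y[s:(s + l - 1)]))[1:n]
    zstar <- ystar - mean(ystar)               # recenter each bootstrap series
    betastar[b] <- sum(zstar[-1] * zstar[-n]) / sum(zstar[-n]^2)
  }
  sd(betastar)   # moving-blocks estimate of the standard error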