Reading Guide
After a brief introduction, during the first class I'll begin discussing
bootstrapping. Two articles that I want you to read (but not
necessarily prior to the first lecture) are:
- Statistical data analysis in the computer age
- Efron, B., and Tibshirani, R.
- Science, Vol. 253 (1991), pp. 390-395
- A leisurely look at the bootstrap, the jackknife, and
cross-validation
- Efron, B. and Gong, G.
- The American Statistician, Vol. 37 (1983), pp. 36-48
I tried to put a link to an e-journal version of the first article on this web
page, but I couldn't get it to work. However, I think you can easily
read this article (and can print it) by going to the
JSTOR web site, and doing a
search. (From the home page of the
JSTOR site, click on SEARCH.
Then go down to item 2 of that page and click on Expand the journal
list, and check the box in front of the listing of the journal
Science under the General Science heading. Now go down
to item 3 of the same page and type 1991 into both of the Date
Range boxes. Next, go back up to item 1 and type Efron into the
author box (3rd box down) and computer age into the title box (4th box
down). Finally, click the Search button, and the desired article should
come up as the only one that matches the search constraints. Upon
clicking the View Article link, you should see the text of the
article.)
The second article doesn't appear to be available in e-journal format,
so I'll plan to give you a photocopy at the first class meeting.
Also, I'll give you some handouts of my lecture notes on bootstrapping.
Since bootstrapping and jackknifing will be two of the topics covered
this summer (with bootstrapping being a topic that will get more
emphasis than most topics), some of you may want to buy or borrow a book
that covers these subjects.
If so, I strongly recommend
- An Introduction to the
Bootstrap,
- by Efron and Tibshirani
- (Chapman & Hall, 1993),
-
(fairly easy to read, with an emphasis on understanding and applications,
rather than theory).
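To give you a first taste of the main topic, the basic bootstrap idea fits in a few lines of code. This is a minimal generic sketch (the data values and the number of resamples B = 2000 are arbitrary choices of mine, not from any of the books):

```python
import random

def bootstrap_se(data, stat, B=2000, seed=0):
    """Estimate the standard error of stat(data) by drawing B
    resamples of size n with replacement and taking the standard
    deviation of the replicated statistic values."""
    rng = random.Random(seed)
    n = len(data)
    reps = []
    for _ in range(B):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(stat(resample))
    mean_rep = sum(reps) / B
    return (sum((r - mean_rep) ** 2 for r in reps) / (B - 1)) ** 0.5

# Example: bootstrap standard error of the sample mean
data = [4.1, 5.6, 3.8, 7.2, 5.0, 6.3, 4.9, 5.5]
se = bootstrap_se(data, lambda x: sum(x) / len(x))
```

For the sample mean the answer can be checked against the textbook formula s/sqrt(n); the payoff of the bootstrap is that the same resampling loop works for statistics with no simple standard-error formula.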
For future reference,
here are some other books
on bootstrapping (I don't recommend that you read them for this
course):
- Data Analysis by Resampling: Concepts and Applications,
- C. Lunneborg,
- Duxbury (2000),
- (chock full of nonstandard notation/terminology, but includes
information about software);
- Bootstrap Methods: A Practitioner's Guide,
- M. Chernick,
- Wiley (1999),
- (provides a good summary of the relevant literature and
contains a lot of references);
- Bootstrap Methods and their Applications,
- Davison and Hinkley,
- Cambridge (1997),
- (gives more mathematical details than the other books; more
advanced and harder to read).
- Computational Statistics Handbook with MATLAB,
- Martinez and Martinez (both GMU graduates),
- Chapman & Hall/CRC (2002),
- (contains information on bootstrapping, jackknifing, and other
computer-intensive statistical methods, along with appropriate MATLAB
code; I haven't had time to look at this book much, but the
MATLAB information may be of interest to some).
A more advanced article on jackknifing, which I'll list here primarily
for your further reference (and which is available through
JSTOR), is:
- The jackknife --- a review
- Rupert G. Miller, Jr.
- Biometrika, Vol. 61 (1974), pp. 1-15
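The leave-one-out idea that Miller reviews is also easy to sketch. Below is a generic jackknife standard-error estimate (my own minimal illustration, not anything taken from the article):

```python
def jackknife_se(data, stat):
    """Leave-one-out jackknife estimate of the standard error of stat:
    recompute the statistic n times, each time omitting one observation."""
    n = len(data)
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    return ((n - 1) / n * sum((v - mean_loo) ** 2 for v in loo)) ** 0.5

data = [4.1, 5.6, 3.8, 7.2, 5.0, 6.3, 4.9, 5.5]
se = jackknife_se(data, lambda x: sum(x) / len(x))
```

For the sample mean the jackknife reproduces the usual s/sqrt(n) exactly; for other statistics it gives a cheap (n recomputations rather than B) alternative to the bootstrap.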
During the first 2 or 3 weeks of the summer session, I may also
briefly discuss several topics from the first
chapter of the Miller book. So, when you can find the time, you should
read this chapter,
and also read my related
comments about
the Miller book. Among topics that I may address
(see the
June 10 announcement for
further information) are:
- the Shapiro-Francia test (p. 15 of Miller),
- transformations and tests about the mean (pp. 16-18 of Miller),
- trimmed means (pp. 29-31 of Miller),
- dealing with correlated observations (pp. 34-36 of Miller).
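To preview the trimmed-means topic: a trimmed mean simply averages the data after discarding a fixed fraction from each tail. Here is a bare-bones sketch (the 10% trimming fraction and the data are only an example, not from Miller):

```python
def trimmed_mean(data, alpha=0.10):
    """Mean of the remaining observations after removing the
    floor(alpha * n) smallest and floor(alpha * n) largest values."""
    xs = sorted(data)
    g = int(alpha * len(xs))
    kept = xs[g:len(xs) - g] if g > 0 else xs
    return sum(kept) / len(kept)

data = [3, 4, 5, 5, 6, 6, 7, 8, 9, 40]   # one gross outlier
tm = trimmed_mean(data, alpha=0.10)       # discards the 3 and the 40
```

With the outlier discarded the trimmed mean is 6.25, while the ordinary mean is pulled up to 9.3, which is the basic motivation for trimming.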
During the 4th week of the course, and part of the 5th week, I plan to cover some material related
to Chapter 3 of Miller's book.
- I'll start with Sec. 3.3, but present some methods for dealing with
heteroscedasticity different from the transformation approach that
Miller describes.
- Next, I will cover monotone alternatives (subsection 3.1.3 of
Miller).
- Then I'll turn attention to the random effects model, and consider
point estimates and confidence intervals for the variance components,
including a confidence interval based on jackknifing. (Pertinent pages
from Miller's book are pages 95-98, the bottom portion of p. 100 and
the top portion
of p. 101, and pages 108 and 109.)
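As background for the variance-components discussion, the standard ANOVA (method-of-moments) point estimates for a balanced one-way random effects model can be sketched as follows. This is a generic illustration of the estimators, not material taken from Miller's book:

```python
def variance_components(groups):
    """ANOVA estimates of sigma^2_a (between-group) and sigma^2_e
    (within-group) for a balanced one-way random effects model.
    groups: list of k equal-length lists, one per random level."""
    k = len(groups)          # number of groups
    n = len(groups[0])       # observations per group
    grand = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]
    msa = n * sum((m - grand) ** 2 for m in means) / (k - 1)   # between-group MS
    mse = sum((y - m) ** 2
              for g, m in zip(groups, means) for y in g) / (k * (n - 1))  # within MS
    sigma2_e = mse
    sigma2_a = max((msa - mse) / n, 0.0)   # truncate a negative estimate at 0
    return sigma2_a, sigma2_e

groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
s2a, s2e = variance_components(groups)
```

Note the truncation at zero: (MSA - MSE)/n can come out negative, which is one of the awkward features of these estimators that motivates looking at alternative intervals, including the jackknife-based one.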
For the class on Thursday, July 11, I'll start by giving a brief
description of multiple regression modelling, and then I'll explain
tree-structured regression, using the last several overheads of my CART
presentation. Then I'll cover the 6 pages of the class notes about
trimmed means. I'll next start a presentation about robust regression.
It would be good to read over my paper, A Comparison of Regression
Methods, and also look at the small section of notes I gave you on
M-estimators. (I don't think I'll have time to go through everything I
have planned on this subject on 7/11, but I'll continue on 7/16. I
think you'll find my presentation easier to follow if you read over the
photocopied material first.)
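To get oriented before reading the M-estimator notes, here is a minimal sketch of Huber M-estimation of a location parameter, computed by iteratively reweighted averaging. This is a generic illustration, not the method from my notes; the tuning constant c = 1.345 and the crude MAD scale estimate are common but arbitrary choices:

```python
def huber_location(data, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means:
    observations within c * scale of the current estimate get full
    weight, more distant ones are downweighted proportionally."""
    mu = sorted(data)[len(data) // 2]                    # start at the median
    mad = sorted(abs(x - mu) for x in data)[len(data) // 2]  # crude scale
    s = mad if mad > 0 else 1.0
    for _ in range(max_iter):
        w = [1.0 if abs(x - mu) <= c * s else c * s / abs(x - mu)
             for x in data]
        new_mu = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
        if abs(new_mu - mu) < tol:
            break
        mu = new_mu
    return mu

data = [4.9, 5.1, 5.0, 5.2, 4.8, 20.0]   # one gross outlier
mu = huber_location(data)
```

The Huber estimate stays near 5 despite the outlier, while the ordinary mean is dragged to 7.5; the same downweighting idea, applied to residuals, is what underlies the robust regression methods we'll discuss.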