Final exam review
There will be an in-class final exam on July 28. You should be prepared to take the exam on that date --- I don't
anticipate giving any Incompletes this summer.
The exam will be an open book (and notes) exam --- you can use
whatever books and notes you wish to bring with you. Although the exam won't involve a lot of mathematics ---
it will be mostly discussion and multiple-choice questions --- you may bring a calculator and/or
computer if you wish.
The exam has twelve 10-point sections, with some sections being just a single multiple-choice question and
other sections having multiple parts. You are to eliminate two of the twelve sections from consideration and
submit answers for the other ten.
Roughly half of the exam will pertain to classification, and half to regression. The best way to prepare for the
exam will be to keep up with the assigned reading, focusing on the material that I emphasize during lectures.
Here is a description of each of the twelve sections. (Reference formulas for a few of these topics are given after the list.)
- A multiple-choice question based on MSPE and/or irreducible error.
- A multiple-choice question dealing with basis function expressions and model expressions related to MARS.
- A multiple-choice question about identifying a good or bad regression method for a certain type of situation.
(It will be good to know the strengths and weaknesses of OLS polynomial models, OLS models using principal
components, local regression using loess,
CART, MARS, forward stagewise additive modeling using TreeNet, and projection pursuit regression.)
- A multiple-choice question pertaining to C_p, bias, and model selection.
- Two (short answer) discussion questions pertaining to regression, comparing either regression using principal
components or ridge regression to some other method of regression.
- A (short answer) discussion question about bagging.
- A (short answer) discussion question pertaining to CART (for classification).
- Another (short answer) discussion question pertaining to CART (for classification).
- Two (short answer) discussion questions pertaining to classification methods based on density estimation.
- A four-part section pertaining to a specific classification setting. Three of the parts are (short answer)
discussion questions pertaining to the effectiveness of various methods for classification.
(It will be good to be somewhat familiar with LDA, QDA, nearest neighbor methods, logistic regression,
and CART.) The fourth part
will deal with the Bayes classifier for the setting under consideration.
- Two (short answer) discussion questions pertaining to the strengths and weaknesses of three classification methods.
(It will be good to be somewhat familiar with LDA, QDA, nearest neighbor methods, logistic regression,
and CART.)
- A two-part problem dealing with the Bayes classifier for a specific situation.
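For reference, here are standard forms of a few of the quantities mentioned above. The notation is generic and
may differ slightly from the notation used in the text and in lecture, so treat these as reminders rather than
as the exact expressions from class.

MSPE and irreducible error: for a fitted regression function \hat{f} and a new observation at X = x_0,
with Y = f(x_0) + \epsilon, E(\epsilon) = 0, and Var(\epsilon) = \sigma^2,
    E[(Y - \hat{f}(x_0))^2] = \sigma^2 + [\mathrm{Bias}(\hat{f}(x_0))]^2 + \mathrm{Var}(\hat{f}(x_0)),
and \sigma^2 is the irreducible error, the part of the MSPE that no method can remove.

MARS basis functions: MARS builds its model expression from hinge functions of the form
    (x_j - t)_+ = \max(0, x_j - t)   and   (t - x_j)_+ = \max(0, t - x_j),
so a fitted MARS model has the form f(x) = \beta_0 + \sum_m \beta_m h_m(x), where each h_m(x) is a hinge
function or a product of hinge functions.

Mallows' C_p (in one common form): for a candidate model with p coefficients,
    C_p = \mathrm{RSS}_p / \hat{\sigma}^2 - n + 2p,
where \hat{\sigma}^2 is estimated from the largest model considered; a candidate model with little bias
should have C_p close to p.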
In addition to the specific descriptions above, I'll add these general comments.
- Understand the Bayes classifier. (A definition is given at the end of this review.)
- As you read over material while preparing for the exam, pay attention to the bias-variance tradeoff.
- As you read over material while preparing for the exam, think about the role of sample sizes.
(Do some methods that may be inferior to another method with large sample sizes become relatively
more attractive with smaller sample sizes?)
- As you read over material while preparing for the exam, pay attention to the general differences between local
and global methods. (How does increasing the sample size lead to improved performance for both types of methods?
How does sample size affect the bias and the variance of each type of method?)
- Among the classification methods covered, CART and nearest neighbor methods get the most attention. (But
LDA comes up in more than one place.)
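Since the Bayes classifier shows up in several places above, here is its standard definition (again in generic
notation): it assigns x to the class with the largest posterior probability,
    G_{\mathrm{Bayes}}(x) = \arg\max_k P(G = k \mid X = x).
It minimizes the expected 0-1 loss, and its error rate (the Bayes error rate) is the best any classifier can
achieve for the problem at hand. LDA, QDA, nearest neighbor methods, and logistic regression can all be viewed
as different ways of approximating these posterior probabilities (or the decision boundary they imply). As a
concrete instance of the bias-variance and sample size comments above, consider k-nearest neighbors: small k
gives low bias but high variance, large k does the reverse, and a larger sample lets the method average over
many neighbors that are still genuinely close to x, which helps both the bias and the variance.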