STAT 789: Advanced Topics in Statistics
Below is an outline of what the course covered. (I am writing this
after the course was completed.) The outline does not show the order in
which the topics were covered, and keep in mind that some of the topics
listed received considerably more attention than others.
computationally intensive methods
resampling methods
- bootstrapping
- bootstrap basics (including empirical distributions, plug-in
estimates, software (S-PLUS), and bootstrap history)
- estimate of bias (and improved estimate of bias)
- estimates of standard error, variance, and mean squared error
- confidence intervals
- "standard" intervals (based on asymptotic normality)
- bootstrap-t intervals
- percentile intervals
- BCa (bias-corrected and accelerated) intervals
- ABC (approximate bootstrap confidence) intervals
- estimate of prediction error (more than one method)
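To make the resampling idea concrete, here is a minimal sketch (in Python rather than the S-PLUS used in the course; function names and data are my own) of the bootstrap estimates of bias and standard error, plus a simple percentile interval, for a sample mean:

```python
import random
import statistics

def bootstrap_reps(data, stat, B=2000, seed=1):
    """Return B bootstrap replications of the statistic."""
    rng = random.Random(seed)
    n = len(data)
    return [stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(B)]

data = [2.1, 3.4, 1.8, 5.0, 4.2, 2.9, 3.7, 4.8, 2.2, 3.1]
reps = bootstrap_reps(data, statistics.mean)

theta_hat = statistics.mean(data)
bias_hat = statistics.mean(reps) - theta_hat   # bootstrap estimate of bias
se_hat = statistics.stdev(reps)                # bootstrap estimate of standard error

# Simple 95% percentile interval from the ordered replications.
reps_sorted = sorted(reps)
ci = (reps_sorted[int(0.025 * len(reps))], reps_sorted[int(0.975 * len(reps))])
```

The same resampling loop underlies the fancier intervals (bootstrap-t, BCa, ABC); they differ only in how the replications are turned into endpoints.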
- jackknifing
- estimate of bias
- estimates of standard error and variance
- confidence intervals (based on pseudo-values)
- for variance of the main effect in a one-way random effects model
- for variance of the random effect in a two-way mixed effects model
- hypothesis testing (based on pseudo-values)
- for inequality of two distribution variances
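As an illustration of the pseudo-value machinery, a minimal Python sketch (names are my own) of the jackknife point and standard error estimates:

```python
import statistics

def pseudo_values(data, stat):
    """Pseudo-value i: n*stat(all data) - (n-1)*stat(data with point i deleted)."""
    n = len(data)
    full = stat(data)
    return [n * full - (n - 1) * stat(data[:i] + data[i + 1:]) for i in range(n)]

data = [4.0, 7.0, 2.0, 9.0, 5.0, 6.0, 3.0, 8.0]
pv = pseudo_values(data, statistics.mean)

theta_jack = statistics.mean(pv)                  # bias-corrected point estimate
se_jack = statistics.stdev(pv) / len(pv) ** 0.5   # jackknife standard error
```

Treating the pseudo-values as approximately i.i.d. is what justifies the t-based intervals and tests; for the mean, the pseudo-values are just the observations themselves, which makes a handy sanity check.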
cross-validation
- estimate of prediction error (for regression and classification)
- unbiased CV method for histogram bin width selection
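A sketch of K-fold cross-validated prediction error for a simple least squares fit (Python; the data and function names are purely illustrative):

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def kfold_mse(xs, ys, k=5, seed=0):
    """Average squared prediction error estimated by K-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    errs = []
    for f in range(k):
        held_out = idx[f::k]
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errs += [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
    return sum(errs) / len(errs)

rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
cv_err = kfold_mse(xs, ys)
```

With the true noise variance set to 1, the cross-validated error typically lands a little above 1, since each fold is fit on only part of the data.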
density estimation
- criteria for measuring goodness
- methods
- histograms
- basics (including variance and bias)
- selection of bin width
- normal bin width reference rule
- Freedman-Diaconis rule
- oversmoothed bin width
- Sturges' rule
- unbiased cross-validation
- two or more dimensions
- adaptive histograms
- average shifted histograms
- frequency polygons
- kernel density estimators
- uniform kernel
- triangle kernel
- Epanechnikov kernel
- Gaussian kernel
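The bin-width rules and the Gaussian kernel estimator can be sketched in a few lines of Python (constants follow the usual textbook forms; variable names are my own):

```python
import math
import random
import statistics

def bin_width_rules(data):
    """Three common histogram rules; Sturges gives a bin count, not a width."""
    n = len(data)
    s = statistics.stdev(data)
    xs = sorted(data)
    q1, q3 = xs[int(0.25 * (n - 1))], xs[int(0.75 * (n - 1))]  # crude quartiles
    return {
        "normal_reference_width": 3.49 * s * n ** (-1 / 3),
        "freedman_diaconis_width": 2 * (q3 - q1) * n ** (-1 / 3),
        "sturges_bin_count": 1 + math.ceil(math.log2(n)),
    }

def gauss_kde(data, x, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    n = len(data)
    z = math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / (n * h * z)

rng = random.Random(7)
data = [rng.gauss(0, 1) for _ in range(500)]
rules = bin_width_rules(data)
density_at_zero = gauss_kde(data, 0.0, 0.3)
```

For a standard normal sample the estimate near zero should sit in the vicinity of 1/sqrt(2*pi), about 0.399, with the usual bias-variance trade-off governed by the bandwidth h.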
classification, regression, and model/rule assessment and selection
classification
- basics and the optimal Bayes classifier
- methods
- discriminant analysis
- k nearest neighbors
- density estimation methods
- CART
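A toy k-nearest-neighbors classifier in Python (one predictor, majority vote; purely illustrative):

```python
def knn_predict(train, x, k=3):
    """Classify x by majority vote among the k nearest training points.
    train is a list of (value, label) pairs."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    labels = [lab for _, lab in neighbors]
    return max(set(labels), key=labels.count)

train = [(0.1, "a"), (0.3, "a"), (0.5, "a"), (2.0, "b"), (2.2, "b"), (2.4, "b")]
label_low = knn_predict(train, 0.2)    # nearest neighbors are all "a"
label_high = knn_predict(train, 2.1)   # nearest neighbors are all "b"
```

As k grows the rule smooths toward the majority class, the same bias-variance trade-off that shows up in density estimation.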
regression
- simple regression
- ordinary least squares (OLS)
- least absolute deviations (LAD)
- M-regression (with an emphasis on asymptotic relative efficiency)
- Huber
- Tukey (bisquare/biweight)
- Andrews
- Hampel
- least median of squares (LMS)
- least trimmed squares (LTS)
- multiple regression
- polynomial approximation modelling (using OLS)
- tree-structured regression (using CART)
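To illustrate the M-estimation idea in the simplest setting, a sketch (Python; my own names and data) of the Huber M-estimate of location computed by iteratively reweighted least squares with a MAD scale:

```python
def huber_location(data, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted least squares."""
    xs = sorted(data)
    mu = xs[len(xs) // 2]                      # start at the sample median
    mad = sorted(abs(x - mu) for x in data)[len(data) // 2]
    s = 1.4826 * mad                           # robust scale, calibrated to normal
    for _ in range(max_iter):
        w = [1.0 if abs(x - mu) <= c * s else c * s / abs(x - mu) for x in data]
        mu_new = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 50.0]   # one gross outlier
mu_huber = huber_location(data)
mean_plain = sum(data) / len(data)                # pulled far upward by 50.0
```

The same reweighting scheme extends to regression by replacing the weighted average with a weighted least squares fit; the Tukey, Andrews, and Hampel estimators substitute different weight functions.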
model/rule assessment and selection (emphasis on prediction
accuracy)
- resubstitution estimate
- Mallows' statistic (Cp)
- Akaike information criterion (AIC)
- Bayesian information criterion (BIC)
- test sample estimate
- cross-validation (including a discussion of bias)
- bootstrap methods
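A small numerical sketch (Python; illustrative names and data) comparing AIC and BIC for an intercept-only model against a straight-line model:

```python
import math
import random

def rss_pair(xs, ys):
    """Residual sums of squares for the intercept-only and straight-line fits."""
    n = len(xs)
    ybar = sum(ys) / n
    rss0 = sum((y - ybar) ** 2 for y in ys)
    mx = sum(xs) / n
    b = sum((x - mx) * (y - ybar) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = ybar - b * mx
    rss1 = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return rss0, rss1

def aic(rss, n, p):
    """Gaussian AIC up to an additive constant; p counts mean parameters,
    with one more parameter added for the error variance."""
    return n * math.log(rss / n) + 2 * (p + 1)

def bic(rss, n, p):
    return n * math.log(rss / n) + math.log(n) * (p + 1)

rng = random.Random(3)
xs = [rng.uniform(0, 5) for _ in range(80)]
ys = [2.0 + 1.5 * x + rng.gauss(0, 1) for x in xs]
rss0, rss1 = rss_pair(xs, ys)
n = len(xs)
aic0, aic1 = aic(rss0, n, 1), aic(rss1, n, 2)
bic0, bic1 = bic(rss0, n, 1), bic(rss1, n, 2)
```

Since the data are generated with a genuine slope, both criteria should prefer the straight-line model; BIC's log(n) penalty is simply harsher on extra parameters than AIC's constant 2.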
some topics in applied statistics that typically receive
insufficient coverage in a basic course
dealing with dependent observations
- using a proper design to adjust for a blocking effect (and what
happens if you don't)
- adjusting for a sequence effect without a full time series treatment
(and what happens if you don't)
- (although time series analysis was not emphasized, the AR(1) model
received a fair amount of coverage, and the MA(1) model was also
introduced)
dealing with heteroscedasticity
- one factor designs for continuous distributions
- Welch's test
- Alexander-Govern test
- Welch-Šidák simultaneous confidence intervals (Dunnett's T3
intervals are a special case)
- Games-Howell simultaneous confidence intervals
- nonrobustness of the standard F test (and the Tukey
studentized-range test)
- discussion of the transformation method (and the associated danger)
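Welch's statistic with the Satterthwaite degrees of freedom is easy to write down directly (Python sketch; the data are made up):

```python
import math
import statistics

def welch_t(x, y):
    """Welch's t statistic and its Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    se2 = vx / nx + vy / ny
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df

x = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]           # small, stable group
y = [10.2, 13.9, 8.7, 12.5, 9.8, 14.4, 11.1, 7.9]  # larger, more variable group
t, df = welch_t(x, y)
```

The degrees of freedom fall between min(nx - 1, ny - 1) and nx + ny - 2, shrinking as the variance imbalance grows, which is exactly the adjustment the pooled-variance t test lacks.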
dealing with monotone alternatives
- one factor designs
- normal theory procedures
- Abelson-Tukey test
- modification of Abelson-Tukey test (incorporating a Satterthwaite
approximation) that makes adjustments for
heteroscedasticity
- nonparametric procedures
- Jonckheere-Terpstra test
- one factor test similar to Page's test
- two factor designs
- normal theory procedure: modification of Abelson-Tukey test
- nonparametric procedure: Page's test
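The Jonckheere-Terpstra statistic is a sum of pairwise Mann-Whitney counts over ordered groups; a brute-force Python sketch (illustrative data) with its normal approximation:

```python
import math

def jonckheere(groups):
    """J-T statistic with its null mean, variance (no-ties formula), and z score."""
    J = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    J += 1.0 if x < y else (0.5 if x == y else 0.0)
    ns = [len(g) for g in groups]
    N = sum(ns)
    mean = (N * N - sum(n * n for n in ns)) / 4
    var = (N * N * (2 * N + 3) - sum(n * n * (2 * n + 3) for n in ns)) / 72
    return J, (J - mean) / math.sqrt(var)

# Three ordered groups with an increasing trend.
groups = [[3.1, 2.8, 3.4, 2.9], [3.6, 4.0, 3.3, 3.8], [4.5, 4.2, 4.8, 4.1]]
J, z = jonckheere(groups)
```

A large positive z supports the monotone (increasing) alternative; a decreasing or two-sided version follows by symmetry.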
estimation of variance components in ANOVA models
- estimation of variance components in a one factor random effects model
- point estimators
- standard unbiased method of moments estimators
- slightly improved estimators (including Hodges-Lehmann estimator)
- maximum likelihood estimators
- Klotz-Milton-Zacks estimators
- Stein estimators
- interval estimators (for main effect variance)
- approximate method assuming normality (and incorporating a
Satterthwaite approximation for the degrees of freedom)
- jackknife method (using pseudo-values)
- interval estimation of variance component for the random effect in
a two factor mixed effects model
- approximate method assuming normality (and incorporating a
Satterthwaite approximation for the degrees of freedom)
- jackknife method (using pseudo-values)
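A sketch of the standard method-of-moments (ANOVA) estimators for a balanced one-way random effects model (Python; simulated data with known variance components):

```python
import random
import statistics

def variance_components(groups):
    """ANOVA estimates for a balanced one-way random effects model:
    sigma2_e = MSE, sigma2_a = (MSA - MSE) / n, truncated at zero."""
    a, n = len(groups), len(groups[0])
    grand = statistics.mean([x for g in groups for x in g])
    msa = n * sum((statistics.mean(g) - grand) ** 2 for g in groups) / (a - 1)
    mse = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g) / (a * (n - 1))
    return max((msa - mse) / n, 0.0), mse

rng = random.Random(11)
true_a, true_e = 4.0, 1.0          # between-group and within-group variances
groups = []
for _ in range(30):
    effect = rng.gauss(0, true_a ** 0.5)
    groups.append([effect + rng.gauss(0, true_e ** 0.5) for _ in range(5)])
s2a, s2e = variance_components(groups)
```

The truncation at zero is the wrinkle that the improved point estimators try to handle more gracefully; the jackknife interval applies the pseudo-value recipe to (MSA - MSE)/n.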
two sample tests about variances
- standard (nonrobust) F test
- robust alternatives
- Levene's test
- Shorack's APF test (a minor variation on the approach used by Box
and Anderson)
- jackknife test (using pseudo-values)
trimmed means (with an emphasis on asymptotic relative efficiency
and selection of trimming proportion)
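A trimmed mean is worth seeing once in code (Python; the 20% trimming proportion is just for illustration):

```python
def trimmed_mean(data, prop=0.2):
    """Mean after deleting the floor(n*prop) smallest and largest observations."""
    xs = sorted(data)
    g = int(len(xs) * prop)
    kept = xs[g:len(xs) - g]
    return sum(kept) / len(kept)

data = [9.9, 10.1, 10.0, 9.8, 10.2, 10.0, 35.0, -20.0, 10.1, 9.9]
tm = trimmed_mean(data)       # unaffected by the two extreme values
```

The asymptotic relative efficiency question is which trimming proportion best trades a small loss under normality for protection under heavy tails.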
foundation material
measures of performance
- bias
- variance
- mean squared error (MSE) and mean absolute error (MAE)
- bias-variance decomposition of MSE (and bias-variance trade-off)
- coverage probability and expected length for confidence intervals
- prediction error
- for regression (average squared error)
- for classification (misclassification rate)
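The bias-variance decomposition can be checked numerically; here is a Monte Carlo sketch using the sample maximum as a (downward-biased) estimator of the upper endpoint of a Uniform(0, theta) distribution:

```python
import random

def mc_bias_var_mse(reps=4000, n=10, theta=1.0, seed=5):
    """Monte Carlo estimates of bias, variance, and MSE for the sample max."""
    rng = random.Random(seed)
    ests = [max(rng.uniform(0, theta) for _ in range(n)) for _ in range(reps)]
    mean_est = sum(ests) / reps
    bias = mean_est - theta
    var = sum((e - mean_est) ** 2 for e in ests) / reps
    mse = sum((e - theta) ** 2 for e in ests) / reps
    return bias, var, mse

bias, var, mse = mc_bias_var_mse()
# With matching denominators, mse equals bias**2 + var exactly (by algebra);
# the Monte Carlo loop only supplies the expectations being decomposed.
```

The sample maximum always falls below theta, so its bias is negative, while the identity MSE = bias^2 + variance holds regardless of the estimator.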
Monte Carlo methods