STAT 789: Advanced Topics in Statistics
Below is an outline of what the course covered. (I am writing this
after the course was completed.) The outline does not show the order in
which the topics were covered, and keep in mind that some of the topics
listed received considerably more attention than others.
computationally intensive methods
resampling methods
- bootstrapping
- bootstrap basics (including empirical distributions, plug-in
estimates, software (S-PLUS), and bootstrap history)
- estimate of bias (and improved estimate of bias)
- estimates of standard error, variance, and mean squared error
- confidence intervals
- "standard" intervals (based on asymptotic normality)
- bootstrap-t intervals
- percentile intervals
- BCa (bias-corrected and accelerated) intervals
- ABC (approximate bootstrap confidence) intervals
- estimate of prediction error (more than one method)
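To make the resampling idea concrete, here is a minimal sketch (in Python rather than the S-PLUS used in the course; function names and data are my own) of the bootstrap estimates of bias and standard error, plus a simple percentile interval, for a sample mean:

```python
import random
import statistics

def bootstrap_reps(data, stat, B=2000, seed=1):
    """Return B bootstrap replications of the statistic."""
    rng = random.Random(seed)
    n = len(data)
    return [stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(B)]

data = [2.1, 3.4, 1.8, 5.0, 4.2, 2.9, 3.7, 4.8, 2.2, 3.1]
reps = bootstrap_reps(data, statistics.mean)

theta_hat = statistics.mean(data)
bias_hat = statistics.mean(reps) - theta_hat   # bootstrap estimate of bias
se_hat = statistics.stdev(reps)                # bootstrap estimate of standard error

# Simple 95% percentile interval from the ordered replications.
reps_sorted = sorted(reps)
ci = (reps_sorted[int(0.025 * len(reps))], reps_sorted[int(0.975 * len(reps))])
```

The same resampling loop underlies the fancier intervals (bootstrap-t, BCa, ABC); they differ only in how the replications are turned into endpoints.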
- jackknifing
- estimate of bias
- estimates of standard error and variance
- confidence intervals (based on pseudo-values)
- for variance of the main effect in a one-way random effects model
- for variance of the random effect in a two-way mixed effects model
- hypothesis testing (based on pseudo-values)
- for inequality of two distribution variances
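As an illustration of the pseudo-value machinery, a minimal Python sketch (names are my own) of the jackknife point and standard error estimates:

```python
import statistics

def pseudo_values(data, stat):
    """Pseudo-value i: n*stat(all data) - (n-1)*stat(data with point i deleted)."""
    n = len(data)
    full = stat(data)
    return [n * full - (n - 1) * stat(data[:i] + data[i + 1:]) for i in range(n)]

data = [4.0, 7.0, 2.0, 9.0, 5.0, 6.0, 3.0, 8.0]
pv = pseudo_values(data, statistics.mean)

theta_jack = statistics.mean(pv)                  # bias-corrected point estimate
se_jack = statistics.stdev(pv) / len(pv) ** 0.5   # jackknife standard error
```

Treating the pseudo-values as approximately i.i.d. is what justifies the t-based intervals and tests; for the mean, the pseudo-values are just the observations themselves, which makes a handy sanity check.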
cross-validation
- estimate of prediction error (for regression and classification)
- unbiased CV method for histogram bin width selection
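A sketch of K-fold cross-validated prediction error for a simple least squares fit (Python; the data and function names are purely illustrative):

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def kfold_mse(xs, ys, k=5, seed=0):
    """Average squared prediction error estimated by K-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    errs = []
    for f in range(k):
        held_out = idx[f::k]
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errs += [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
    return sum(errs) / len(errs)

rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
cv_err = kfold_mse(xs, ys)
```

With the true noise variance set to 1, the cross-validated error typically lands a little above 1, since each fold is fit on only part of the data.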
density estimation
- criteria for measuring goodness
- methods
- histograms
- basics (including variance and bias)
- selection of bin width
- normal bin width reference rule
- Freedman-Diaconis rule
- oversmoothed bin width
- Sturges' rule
- unbiased cross-validation
- two or more dimensions
- adaptive histograms
- average shifted histograms
- frequency polygons
- kernel density estimators
- uniform kernel
- triangle kernel
- Epanechnikov kernel
- Gaussian kernel
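The bin-width rules and the Gaussian kernel estimator can be sketched in a few lines of Python (constants follow the usual textbook forms; variable names are my own):

```python
import math
import random
import statistics

def bin_width_rules(data):
    """Three common histogram rules; Sturges gives a bin count, not a width."""
    n = len(data)
    s = statistics.stdev(data)
    xs = sorted(data)
    q1, q3 = xs[int(0.25 * (n - 1))], xs[int(0.75 * (n - 1))]  # crude quartiles
    return {
        "normal_reference_width": 3.49 * s * n ** (-1 / 3),
        "freedman_diaconis_width": 2 * (q3 - q1) * n ** (-1 / 3),
        "sturges_bin_count": 1 + math.ceil(math.log2(n)),
    }

def gauss_kde(data, x, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    n = len(data)
    z = math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / (n * h * z)

rng = random.Random(7)
data = [rng.gauss(0, 1) for _ in range(500)]
rules = bin_width_rules(data)
density_at_zero = gauss_kde(data, 0.0, 0.3)
```

For a standard normal sample the estimate near zero should sit in the vicinity of 1/sqrt(2*pi), about 0.399, with the usual bias-variance trade-off governed by the bandwidth h.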
classification, regression, and model/rule assessment and selection
classification
- basics and the optimal Bayes classifier
- methods
- discriminant analysis
- k nearest neighbors
- density estimation methods
- CART
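A toy k-nearest-neighbors classifier in Python (one predictor, majority vote; purely illustrative):

```python
def knn_predict(train, x, k=3):
    """Classify x by majority vote among the k nearest training points.
    train is a list of (value, label) pairs."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    labels = [lab for _, lab in neighbors]
    return max(set(labels), key=labels.count)

train = [(0.1, "a"), (0.3, "a"), (0.5, "a"), (2.0, "b"), (2.2, "b"), (2.4, "b")]
label_low = knn_predict(train, 0.2)    # nearest neighbors are all "a"
label_high = knn_predict(train, 2.1)   # nearest neighbors are all "b"
```

As k grows the rule smooths toward the majority class, the same bias-variance trade-off that shows up in density estimation.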
regression
- simple regression
- ordinary least squares (OLS)
- least absolute deviations (LAD)
- M-regression (with an emphasis on asymptotic relative efficiency)
- Huber
- Tukey (bisquare/biweight)
- Andrews
- Hampel
- least median of squares (LMS)
- least trimmed squares (LTS)
- multiple regression
- polynomial approximation modelling (using OLS)
- tree-structured regression (using CART)
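To illustrate the M-estimation idea in the simplest setting, a sketch (Python; my own names and data) of the Huber M-estimate of location computed by iteratively reweighted least squares with a MAD scale:

```python
def huber_location(data, c=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted least squares."""
    xs = sorted(data)
    mu = xs[len(xs) // 2]                      # start at the sample median
    mad = sorted(abs(x - mu) for x in data)[len(data) // 2]
    s = 1.4826 * mad                           # robust scale, calibrated to normal
    for _ in range(max_iter):
        w = [1.0 if abs(x - mu) <= c * s else c * s / abs(x - mu) for x in data]
        mu_new = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 50.0]   # one gross outlier
mu_huber = huber_location(data)
mean_plain = sum(data) / len(data)                # pulled far upward by 50.0
```

The same reweighting scheme extends to regression by replacing the weighted average with a weighted least squares fit; the Tukey, Andrews, and Hampel estimators substitute different weight functions.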
model/rule assessment and selection (emphasis on prediction
accuracy)
- resubstitution estimate
- Mallows' statistic (Cp)
- Akaike information criterion (AIC)
- Bayesian information criterion (BIC)
- test sample estimate
- cross-validation (including a discussion of bias)
- bootstrap methods
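A small numerical sketch (Python; illustrative names and data) comparing AIC and BIC for an intercept-only model against a straight-line model:

```python
import math
import random

def rss_pair(xs, ys):
    """Residual sums of squares for the intercept-only and straight-line fits."""
    n = len(xs)
    ybar = sum(ys) / n
    rss0 = sum((y - ybar) ** 2 for y in ys)
    mx = sum(xs) / n
    b = sum((x - mx) * (y - ybar) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = ybar - b * mx
    rss1 = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return rss0, rss1

def aic(rss, n, p):
    """Gaussian AIC up to an additive constant; p counts mean parameters,
    with one more parameter added for the error variance."""
    return n * math.log(rss / n) + 2 * (p + 1)

def bic(rss, n, p):
    return n * math.log(rss / n) + math.log(n) * (p + 1)

rng = random.Random(3)
xs = [rng.uniform(0, 5) for _ in range(80)]
ys = [2.0 + 1.5 * x + rng.gauss(0, 1) for x in xs]
rss0, rss1 = rss_pair(xs, ys)
n = len(xs)
aic0, aic1 = aic(rss0, n, 1), aic(rss1, n, 2)
bic0, bic1 = bic(rss0, n, 1), bic(rss1, n, 2)
```

Since the data are generated with a genuine slope, both criteria should prefer the straight-line model; BIC's log(n) penalty is simply harsher on extra parameters than AIC's constant 2.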
some topics in applied statistics that typically receive
insufficient coverage in a basic course
dealing with dependent observations
- using a proper design to adjust for a blocking effect (and what
happens if you don't)
- adjusting for a sequence effect without a full time series treatment
(and what happens if you don't)
- (although time series analysis was not emphasized, the AR(1) model
received a fair amount of coverage, and the MA(1) model was also
introduced)
dealing with heteroscedasticity
- one factor designs for continuous distributions
- Welch's test
- Alexander-Govern test
- Welch-Šidák simultaneous confidence intervals (Dunnett's T3
intervals are a special case)
- Games-Howell simultaneous confidence intervals
- nonrobustness of the standard F test (and the Tukey
studentized-range test)
- discussion of the transformation method (and the associated danger)
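Welch's statistic with the Satterthwaite degrees of freedom is easy to write down directly (Python sketch; the data are made up):

```python
import math
import statistics

def welch_t(x, y):
    """Welch's t statistic and its Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    se2 = vx / nx + vy / ny
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df

x = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]           # small, stable group
y = [10.2, 13.9, 8.7, 12.5, 9.8, 14.4, 11.1, 7.9]  # larger, more variable group
t, df = welch_t(x, y)
```

The degrees of freedom fall between min(nx - 1, ny - 1) and nx + ny - 2, shrinking as the variance imbalance grows, which is exactly the adjustment the pooled-variance t test lacks.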
dealing with monotone alternatives
- one factor designs
- normal theory procedures
- Abelson-Tukey test
- modification of Abelson-Tukey test (incorporating a Satterthwaite
approximation) that makes adjustments for
heteroscedasticity
- nonparametric procedures
- Jonckheere-Terpstra test
- one factor test similar to Page's test
- two factor designs
- normal theory procedure: modification of Abelson-Tukey test
- nonparametric procedure: Page's test
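The Jonckheere-Terpstra statistic is a sum of pairwise Mann-Whitney counts over ordered groups; a brute-force Python sketch (illustrative data) with its normal approximation:

```python
import math

def jonckheere(groups):
    """J-T statistic with its null mean, variance (no-ties formula), and z score."""
    J = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    J += 1.0 if x < y else (0.5 if x == y else 0.0)
    ns = [len(g) for g in groups]
    N = sum(ns)
    mean = (N * N - sum(n * n for n in ns)) / 4
    var = (N * N * (2 * N + 3) - sum(n * n * (2 * n + 3) for n in ns)) / 72
    return J, (J - mean) / math.sqrt(var)

# Three ordered groups with an increasing trend.
groups = [[3.1, 2.8, 3.4, 2.9], [3.6, 4.0, 3.3, 3.8], [4.5, 4.2, 4.8, 4.1]]
J, z = jonckheere(groups)
```

A large positive z supports the monotone (increasing) alternative; a decreasing or two-sided version follows by symmetry.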
estimation of variance components in ANOVA models
- estimation of variance components in a one factor random effects model
- point estimators
- standard unbiased method of moments estimators
- slightly improved estimators (including Hodges-Lehmann estimator)
- maximum likelihood estimators
- Klotz-Milton-Zacks estimators
- Stein estimators
- interval estimators (for main effect variance)
- approximate method assuming normality (and incorporating a
Satterthwaite approximation for the degrees of freedom)
- jackknife method (using pseudo-values)
- interval estimation of variance component for the random effect in
a two factor mixed effects model
- approximate method assuming normality (and incorporating a
Satterthwaite approximation for the degrees of freedom)
- jackknife method (using pseudo-values)
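A sketch of the standard method-of-moments (ANOVA) estimators for a balanced one-way random effects model (Python; simulated data with known variance components):

```python
import random
import statistics

def variance_components(groups):
    """ANOVA estimates for a balanced one-way random effects model:
    sigma2_e = MSE, sigma2_a = (MSA - MSE) / n, truncated at zero."""
    a, n = len(groups), len(groups[0])
    grand = statistics.mean([x for g in groups for x in g])
    msa = n * sum((statistics.mean(g) - grand) ** 2 for g in groups) / (a - 1)
    mse = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g) / (a * (n - 1))
    return max((msa - mse) / n, 0.0), mse

rng = random.Random(11)
true_a, true_e = 4.0, 1.0          # between-group and within-group variances
groups = []
for _ in range(30):
    effect = rng.gauss(0, true_a ** 0.5)
    groups.append([effect + rng.gauss(0, true_e ** 0.5) for _ in range(5)])
s2a, s2e = variance_components(groups)
```

The truncation at zero is the wrinkle that the improved point estimators try to handle more gracefully; the jackknife interval applies the pseudo-value recipe to (MSA - MSE)/n.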
two sample tests about variances
- standard (nonrobust) F test
- robust alternatives
- Levene's test
- Shorack's APF test (a minor variation on the approach used by Box
and Anderson)
- jackknife test (using pseudo-values)
trimmed means (with an emphasis on asymptotic relative efficiency
and selection of trimming proportion)
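A trimmed mean is worth seeing once in code (Python; the 20% trimming proportion is just for illustration):

```python
def trimmed_mean(data, prop=0.2):
    """Mean after deleting the floor(n*prop) smallest and largest observations."""
    xs = sorted(data)
    g = int(len(xs) * prop)
    kept = xs[g:len(xs) - g]
    return sum(kept) / len(kept)

data = [9.9, 10.1, 10.0, 9.8, 10.2, 10.0, 35.0, -20.0, 10.1, 9.9]
tm = trimmed_mean(data)       # unaffected by the two extreme values
```

The asymptotic relative efficiency question is which trimming proportion best trades a small loss under normality for protection under heavy tails.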
foundation material
measures of performance
- bias
- variance
- mean squared error (MSE) and mean absolute error (MAE)
- bias-variance decomposition of MSE (and bias-variance trade-off)
- coverage probability and expected length for confidence intervals
- prediction error
- for regression (average squared error)
- for classification (misclassification rate)
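The bias-variance decomposition can be checked numerically; here is a Monte Carlo sketch using the sample maximum as a (downward-biased) estimator of the upper endpoint of a Uniform(0, theta) distribution:

```python
import random

def mc_bias_var_mse(reps=4000, n=10, theta=1.0, seed=5):
    """Monte Carlo estimates of bias, variance, and MSE for the sample max."""
    rng = random.Random(seed)
    ests = [max(rng.uniform(0, theta) for _ in range(n)) for _ in range(reps)]
    mean_est = sum(ests) / reps
    bias = mean_est - theta
    var = sum((e - mean_est) ** 2 for e in ests) / reps
    mse = sum((e - theta) ** 2 for e in ests) / reps
    return bias, var, mse

bias, var, mse = mc_bias_var_mse()
# With matching denominators, mse equals bias**2 + var exactly (by algebra);
# the Monte Carlo loop only supplies the expectations being decomposed.
```

The sample maximum always falls below theta, so its bias is negative, while the identity MSE = bias^2 + variance holds regardless of the estimator.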
Monte Carlo methods