Welcome to CSI 972 / STAT 972 and CSI 973 / STAT 973

Mathematical Statistics I and II

Instructor: James Gentle

If you send email to the instructor, please put "CSI 972" or "CSI 973" in the subject line.


This two-course sequence covers topics in statistical theory essential for advanced work in statistics.

Course Objectives:
At the end of this two-course sequence the student should be very familiar with the concepts of mathematical statistics, and should have the ability to read the advanced literature in the area. The student should learn a set of tools for doctoral research and should have the confidence to embark on such research.

The prerequisites for the first course include a course in mathematical statistics at the advanced calculus level, for example, at George Mason, CSI 672 / STAT 652, "Statistical Inference", and a measure-theory-based course in probability, for example, at George Mason, CSI 971 / STAT 971, "Probability Theory".

The first course begins with a brief overview of concepts and results in measure-theoretic probability theory that are useful in statistics. This is followed by a discussion of some fundamental concepts in statistical decision theory and inference. The basic approaches and principles of estimation are explored, including minimum risk methods under various restrictions such as unbiasedness or equivariance, maximum likelihood, and functional methods such as the method of moments and other plug-in methods. Bayesian decision rules are then considered in some detail, and the methods of minimum variance unbiased estimation are covered thoroughly. Other topics include sufficiency and completeness of statistics, Fisher information, bounds on variances of estimators, asymptotic properties, and minimax decision rules.

The second course begins where the first leaves off, covering the principles of hypothesis testing and confidence sets in more detail. We consider characterization of the decision process, the Neyman-Pearson lemma and uniformly most powerful tests, confidence sets, and unbiasedness in inference procedures. Additional topics include equivariance, robustness, and estimation of functions.

In addition to the classical results in mathematical statistics, the theory underlying Markov chain Monte Carlo, quasi-likelihood, empirical likelihood, statistical functionals, generalized estimating equations, the jackknife, and the bootstrap is addressed.

The use of computer software for symbolic computations and for Monte Carlo simulations is encouraged throughout the courses.
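
As a small illustration of the kind of Monte Carlo experiment that can complement the theory (this sketch is not part of the course materials, and it assumes Python with NumPy rather than one of the symbolic packages discussed later on this page), the following code compares the bias and mean squared error of two standard estimators of a normal variance.

    # Monte Carlo comparison of two estimators of the variance of a normal
    # distribution: the unbiased estimator (divisor n-1) and the MLE (divisor n).
    import numpy as np

    rng = np.random.default_rng(42)
    n, true_var, n_reps = 20, 4.0, 100_000

    # n observations from N(0, true_var) in each of n_reps replications
    samples = rng.normal(loc=0.0, scale=np.sqrt(true_var), size=(n_reps, n))

    s2_unbiased = samples.var(axis=1, ddof=1)   # divisor n - 1
    s2_mle = samples.var(axis=1, ddof=0)        # divisor n

    for name, est in [("unbiased (n-1)", s2_unbiased), ("MLE (n)", s2_mle)]:
        bias = est.mean() - true_var
        mse = np.mean((est - true_var) ** 2)
        print(f"{name:>15}: bias = {bias:+.4f}, MSE = {mse:.4f}")

The simulation illustrates a fact treated in the first course: a biased estimator can have smaller mean squared error than the corresponding unbiased one.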

I have put together a set of notes to supplement the material in the text and the lectures. These notes are in the form of a book, with a subject index that should be useful. (I am continually working on these notes, so they may change from week to week.)


The main ingredient for success in a course in mathematical statistics is the ability to work problems. (It's harder to identify the "main ingredient" for success in the field of statistics, but even for that, the ability to work problems is an important component.) The only way to enhance one's ability to work problems is to work problems. It is not sufficient to read, to watch, or to hear solutions to problems. One of the most serious mistakes students make in courses in mathematical statistics is to work through a solution that somebody else has done and to think they have worked the problem.

Some problems, proofs, counterexamples, and derivations should become "easy pieces" (see my other comments). An easy piece is something that is important in its own right, but also may serve as a model or template for many other problems. A student should attempt to accumulate a large bag of easy pieces. If development of this bag involves some memorization, that is OK, but things should just naturally get into the bag in the process of working problems and observing similarities among problems --- and by seeing the same problem over and over.


Student work in each course will consist of

  • a number of homework assignments
  • a midterm exam consisting of an in-class component and, possibly, a take-home component
  • a final exam consisting of an in-class component and, possibly, a take-home component


    Instantiations

    Fall, 2012: CSI 972.

    Fall, 2011: CSI 972; Spring, 2012: CSI 973.

    Fall, 2010: CSI 972; Spring, 2011: CSI 973.

    Fall, 2009: CSI 972; Spring, 2010: CSI 973.

    Fall, 2008: CSI 972.

    Fall, 2007: CSI 972; Spring, 2008: CSI 973.

    Fall, 2005: CSI 972; Spring, 2006: CSI 973.

    Fall, 2003: CSI 972; Spring, 2004: CSI 973.

    Fall, 2001: CSI 972; Spring, 2002: CSI 973.


    Some Useful References for Mathematical Statistics

    Texts on general mathematical statistics at the level of this course, more or less
    Texts in probability and measure theory and linear spaces roughly at the level of this course
    Texts that provide good background for this course
    Interesting monographs
    Interesting compendia of counterexamples
    An interesting kind of book is one with the word "counterexamples" in its title. Counterexamples provide useful limits on mathematical facts. As Gelbaum and Olmsted observed in the preface to their 1964 book, which was the first in this genre, "At the risk of oversimplification, we might say that (aside from definitions, statements, and hard work), mathematics consists of two classes --- proofs and counterexamples, and that mathematical discovery is directed toward two major goals --- the formulation of proofs and the construction of counterexamples."
    Interesting set of essays
    Good compendium on standard probability distributions

    There is also a multi-volume, multi-edition set of books by Norman Johnson and Sam Kotz and co-authors, published by Wiley. The books have titles like "Discrete Multivariate Distributions". (The series began with four volumes in the 1970s by Johnson and Kotz. I have those, but over the years they have been revised, co-authors have been added, and volumes have been subdivided. I am not sure what comprises the current set, but any or all of the books are useful.)

    Software for symbolic computation
    There is, of course, no substitute for the ability to understand and work through mathematical derivations and proofs, but just as data analysis software aids in understanding statistical methods, software for symbolic manipulation can aid in working through mathematical arguments. Data analysis software is not only an aid to understanding applied statistics; it is a major tool of professionals who do data analysis. Likewise, software for symbolic manipulation is becoming a major tool for professionals working in mathematical statistics.

    Some steps in work in "higher mathematics" depend on recognition of an expression as a particular form of some well-known object. This recognition is essentially a data-retrieval problem. The data may be stored in one's brain, in a table of integrals, or in some other place. It is a stretch to think of the recognition problem as one that requires a "higher intelligence", although certainly the ability to do it easily is an important component of general mathematical ability. Software packages for symbolic computation can sometimes help in the mechanical processes of solving mathematical problems.
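
    As a small illustration of this mechanical role, the following sketch uses SymPy, a free Python library that is not among the packages named below but serves the same purpose here; it lets the software do the "table look-up" for a gamma-type moment integral and a Fisher information calculation.

        # Symbolic evaluation of two quantities that arise in the first course,
        # using SymPy (chosen only for illustration; Mathematica or Maple would
        # serve equally well).
        import sympy as sp

        x, lam, k = sp.symbols('x lambda k', positive=True)

        # kth raw moment of an Exponential(lambda) distribution; the software
        # "recognizes" the gamma integral:  E[X^k] = Gamma(k+1)/lambda^k
        moment = sp.integrate(x**k * lam * sp.exp(-lam * x), (x, 0, sp.oo))
        print(sp.simplify(moment))            # gamma(k + 1)/lambda**k

        # Fisher information for lambda in the same family:  I(lambda) = 1/lambda^2
        loglik = sp.log(lam) - lam * x        # log density, written out directly
        d2 = sp.diff(loglik, lam, 2)          # second derivative wrt lambda
        fisher = -sp.integrate(d2 * lam * sp.exp(-lam * x), (x, 0, sp.oo))
        print(sp.simplify(fisher))            # 1/lambda**2

    The point is not the particular answers, which one should be able to produce by hand, but the relief from the mechanical steps in a longer argument.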

    The main software packages for symbolic computation are Mathematica, Maple, Macsyma, and Reduce. There are relatively inexpensive student versions of all of these. Some University computer labs have one or more of the packages installed. The SCS science cluster has Mathematica. Most of the books listed below provide introductions to the software using relatively low-level applications for illustration.