Lectures: Thursday, 4:30-7:10pm; Robinson Hall A349
Some of the lectures will be based on notes posted on this website. Some lectures will be accompanied only by notes written on the board.
This course is about modern, computationally intensive methods in statistics. It emphasizes the role of computation as a fundamental tool for discovery in data analysis, for statistical inference, and for the development of statistical theory and methods.
The general description of the course is available at mason.gmu.edu/~jgentle/csi771/
Prerequisites:
Text: Elements of Computational Statistics ISBN 978-1441930248.
List of probability distributions.
Computational Software:
The main computational software that I use is R.
R is free and open source. It is installed on some GMU computers, but binary installers for the common platforms are available at the main R website, and it is best to install it on your own computer.
A good way to learn R is just to use it for progressively more complicated problems. While there are many books on R, the various PDF manuals that come with the installation (use "Help" on the GUI) should be sufficient.
Document Development Software:
The main document development software that I use is TeX.
TeX is a trademark of the American Mathematical Society. The software is free. There are various implementations, and it is installed on some GMU computers. One implementation is MiKTeX. It is available at miktex.org, and it is best to install it on your own computer.
There are many books on TeX, but a good way to learn TeX is just to use it for progressively more complicated writing projects.
The primary means of communication outside of class will be by email.
Students must use their Mason email accounts to receive important University information, including messages related to this class. (You may, of course, forward email from your Mason email account to one that you check regularly.)
If you send email to the instructor, please put "CSI 771" or "STAT 751" in the subject line.
Although the likelihood of "getting caught" should not influence your ethical standards, you should be aware of the fact that web searches can often identify plagiarism, and that there is even specialized software to facilitate such searches. Whenever I encounter phrases in a student's work that seem to be inconsistent with the usual language that the student uses, I routinely search the web for documents containing those phrases.
Some good guidelines are here:
http://ori.dhhs.gov/education/products/plagiarism/
See especially the entry "26 Guidelines at a Glance".
Representing a rehash or restatement of earlier work as original work is wrong. Such self-plagiarism becomes a breach of academic honor, for example, when a paper submitted for credit in one instance is subsequently submitted for credit in another instance.
All academic accommodations must be arranged through the ODS.
Brief introduction to R.
I will rarely post lecture slides.
R functions.
Random number generation in R.
Saving graphics files in R.
Assignments:
(1) Read Appendix A in text (pages 337-350).
(2) Choose two articles in JASA from 2010 to the present that report
Monte Carlo studies, and write brief descriptions of them (about a page
for each), telling specifically what questions were studied by Monte Carlo.
Assignments: Read revised Chapter 1.
Work problems 1.3, 1.4, 1.11, 1.17, and 1.22 in revised Chapter 1
to turn in (as hardcopies). (These problem numbers correspond to the
current version of the revised Chapter 1. Although the numbers have changed,
the problems themselves are the same as the ones originally assigned.)
Assignments: Read Chapter 2 in text.
Work problems 2.1, 2.2, 2.4, 2.13, and 2.14 (here)
to turn in (as hardcopies).
(For the "test" in exercise 2.4, you may just use a q-q plot (which
provides only a visual assessment; see pages 158ff), a KS test, or any
other test you know of. R has functions for q-q plots (qqnorm), the KS test
(ks.test), the Shapiro-Wilk test (shapiro.test), and so on.)
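As a minimal sketch of the checks mentioned in the note above, the following R code applies a q-q plot, a KS test, and the Shapiro-Wilk test to a simulated sample. It is an illustration only, not part of the exercise statement. The sample x, its size, the seed, and the temporary output file are assumptions made for illustration; for exercise 2.4 you would substitute the numbers produced by your own generator and the distribution the exercise calls for.

```r
# Simulated standard normal sample; the seed and n are arbitrary choices.
set.seed(771)
n <- 200
x <- rnorm(n)

# Visual assessment: points close to the reference line suggest normality.
# The plot is written to a temporary PDF so the script also runs in batch mode.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
qqnorm(x)
qqline(x)
dev.off()

# Kolmogorov-Smirnov test against the fully specified N(0, 1) distribution.
ks_result <- ks.test(x, "pnorm", mean = 0, sd = 1)

# Shapiro-Wilk test of normality (mean and variance left unspecified).
sw_result <- shapiro.test(x)

# A small p-value is evidence against the hypothesized distribution.
print(ks_result$p.value)
print(sw_result$p.value)
```

Note that the KS p-value reported here assumes the mean and standard deviation are specified in advance; if they are estimated from the same sample, the reported p-value is no longer valid.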
Assignments: Read Chapters 3, 4.
Work problems 2.9, 3.3, 3.4, 4.1, and 4.2.
Assignment, due October 17: Prepare and write up your plan for your project.
This includes a brief description of the Monte Carlo study in the paper.
(What are the statistical methods being evaluated? What scenarios
were studied? What are the "treatments" (that is, methods) you will study?
What scenarios (that is, blocks in your experiment) will you study?)
I expect this write-up to be about 5 to 10 pages long.
Assignments: Read Chapter 10.
Work problem 5.7 to turn in October 31.
Assignment:
Work problems 10.1, 10.2, 10.4, 10.5, and 10.12 to turn in November 7.
Assignments: Read Chapter 6.
Work problems 6.6, 6.7, 6.9, and 6.10 to turn in November 14.
Assignments: Read Chapter 9.
Work problems 9.1, 9.2, and 9.9 to turn in November 21.
Nonparametric probability density function estimation (Chapter 9).
Assignment:
Work problem 9.12 to turn in December 5.
Ten-Minute Presentations of Projects.
Each student will make a brief presentation on the project.
Computer slides are preferred for the presentation.