Note: I advise that you first read this web page from top to bottom without clicking on any of the links. After you've gone through this page once in a linear fashion, you may find the various links useful to jump to the pertinent parts of this page for more information, and to go to other web pages for additional information.

Is STAT 554 the right course for you?

This question and others (e.g., about prerequisites, texts, and how to get a head start) are addressed below.
Some students may initially find it difficult to decide if STAT 554 is the proper course for them to take during the coming/current semester. STAT 554 is a bit unusual in that it is a graduate level course in applied statistics that doesn't assume that you've necessarily had a statistics class before, while at the same time presenting students with material that goes far beyond the level that is typically covered in undergraduate statistics classes. Because of this, some students may be wondering if the course will be too hard or wondering if the course will be too easy. Or some students may be wondering if they should take the course now or wait until later, or wondering if they need to take the course in order to pass the STAT 554 qualifying exam (this pertains to students in the IT Ph.D. program).

What is the prerequisite, and what if I haven't had such a course?

STAT 554 is a graduate level course in applied statistics that:

Formally, the prerequisite is STAT 344 or MATH 351. (You can visit my STAT 344 web site to learn more about the course.) Among the most important things you need to know from elementary probability in order to follow along well in STAT 554 are:

Note: In the past, some students who lacked a formal probability background did very well in this course. For example, my guess is that somebody with a biology background who either had a two semester introductory statistics sequence as an undergraduate, or had a good one semester statistics course and has used statistical methods in lab work, can earn an A in the course if they are a good student who is willing to work extremely hard (for one thing, by self-learning some of the prerequisite probability material during the first couple of weeks). Such a person might find some of the homework problems rather difficult, but since the bulk of one's grade depends on how well one learns the applied material that will be presented during the course of the semester, earning an A should not be impossible.

(Actually, many students working in biology or environmental science have earned an A in this course. I would guess that anyone who will need to do statistical analysis in their thesis or dissertation should choose this course over a less ambitious statistics course, because an easier course may not adequately prepare one to do their own analysis. In fact, many students may need to use more advanced methods that we don't have time to cover in STAT 554. However, it may be the case that this course provides such students with a sufficient background for them to learn additional methods on their own, take a more advanced course, or effectively work with a more experienced statistician acting as a consultant.)

If you don't have the prerequisite, it comes down to you making a choice based on what you know about your abilities and how much time you'll devote to the course. Some without the prerequisite have earned an A, but others have failed.

Will this course be too hard for me? Also, how does it compare to other statistics courses?

If someone took one of the prerequisite courses and did well, then he/she should feel comfortable in STAT 554, knowing that a lot of the other students have a similar background. (If one got an A in STAT 344 then that person should be quite ready for STAT 554. If one got an A- or lower in STAT 344 then it may be wise to spend several weeks reviewing the most important parts of elementary probability that pertain to STAT 554.) But students should keep in mind that, even if they have met the prerequisite, STAT 554 is primarily a course for students intending to work as statisticians and is not a survey course for nonmajors, and so the expectation is that students will put a lot of time into the course (15 to 20 hours a week would not be an unreasonable amount of time to spend on the course).

If you haven't had a probability course before, then please carefully read my comments above. Many students not meeting the formal prerequisite have earned a grade of A, while many others have dropped out or suffered a bad grade because they were completely overwhelmed by the material.


400-, 500-, and 600-level one semester statistics courses typically are one of four different types.
  1. Similar to a 100- or 200-level statistics class (but with a higher course number). Such courses are pretty much (but not completely, in all cases) a waste of time and do not adequately prepare you to do good statistical analyses (more info about this here).
  2. Consisting of a combination of the material covered in both STAT 554 and STAT 652 (GMU's course which focuses on the theory and methods of statistical inference for parametric statistical models --- you can visit my STAT 652 web site). While such a course may provide one with a wonderful overview of statistics, there is insufficient time in one semester for students in such a course to gain a working knowledge of applied statistics. (There wouldn't be time to cover any methods beyond the most common ones, and even for those there wouldn't be time for the "fine points" to be presented.)
  3. Similar to what is described immediately above, except that the course also attempts to present elementary probability theory in the first half of the semester.
  4. Consisting of a combination of the material covered in STAT 554, STAT 655 (GMU's design of experiments/analysis of variance course), and STAT 656 (GMU's regression analysis course). While it is true that STAT 554 doesn't cover some of the advanced methods that some researchers need to use, it is also true that a course in applied statistics which attempts to cover too much rarely covers everything well. Since it may be the case that I'm already guilty of trying to cover too much in STAT 554, it seems foolhardy to try to cram twice as much material into the course. With STAT 554 I've chosen to attempt to provide students with an understanding of what statistics is about, to thoroughly train them to do a proper analysis in several common settings, and to provide them with an overview of how to handle more complex situations (stressing that it is important to have an understanding of the performance and limitations of each procedure employed and to know how to check to determine whether or not the assumptions of the procedures are adequately satisfied). Courses which attempt to cover too much material tend to be too "cook bookish" and students who take such a course may obtain a dangerously weak understanding of how to accurately perform the methods and interpret the results.

Will this course be too easy for me?

Most one semester statistics courses focus on methods that were developed assuming that the data arose from normal distributions. While some such methods are robust enough to work satisfactorily in some nonnormal situations, most statistics books contain scant accurate information about the robustness of common techniques. Books also do little with regard to comparing the performance of classical methods with nonparametric and robust statistical procedures. If nonparametric methods are covered at all, they tend to be in a chapter by themselves towards the end of the book, and little information is presented pertaining to guidelines for choosing the best possible procedure in a given situation.

In this course we will investigate a variety of data analysis situations, and for each one we will typically cover several different statistical procedures that may be appropriate. I'll discuss strengths and weaknesses of each procedure, give guidelines for their effective use, and teach you how to obtain clues from the data that will allow you to make a good choice. I'll give warnings about when to avoid common approximations and discuss exact inference procedures to use instead.
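As a small illustration of the difference between a common approximation and an exact procedure, here is a minimal Python sketch (not taken from the course notes) that compares the exact binomial p-value of a two-sided sign test with the usual normal approximation. The function names and the example counts (9 positive signs out of 10) are hypothetical and chosen only for illustration, and the approximation shown deliberately omits a continuity correction.

```python
from math import comb, sqrt
from statistics import NormalDist


def exact_sign_test_p(k: int, n: int) -> float:
    """Exact two-sided p-value for k positive signs out of n under H0: p = 1/2."""
    lower = sum(comb(n, i) for i in range(0, k + 1)) / 2**n  # P(X <= k)
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2**n  # P(X >= k)
    # Double the smaller tail (valid because the null distribution is symmetric).
    return min(1.0, 2 * min(lower, upper))


def normal_approx_p(k: int, n: int) -> float:
    """Two-sided p-value from the normal approximation, no continuity correction."""
    z = (k - n / 2) / sqrt(n / 4)  # mean n/2 and variance n/4 under H0
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data: 9 of 10 paired differences are positive.
print(f"exact:  {exact_sign_test_p(9, 10):.4f}")
print(f"approx: {normal_approx_p(9, 10):.4f}")
```

For these counts the exact p-value is 22/1024 (about 0.021), while the uncorrected normal approximation gives roughly 0.011, so with a sample this small the approximation overstates the strength of the evidence.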

The emphasis on alternative (nonparametric and robust) inference procedures is important since many normal theory methods can be quite inaccurate in small sample size situations. Even in large sample size situations (for which normal theory procedures may be reasonably robust) it is still the case that an alternative inference procedure may be a better choice. Some examples are given below.

Unless you have already mastered the material in several good graduate-level courses in applied statistics, my guess is that you can get a lot out of this course. Even though some M.S. students may enter the program having had several 400- or 500-level courses in applied statistics, unless you have had a good course based on Miller's book, I would recommend that you not skip this course. If early on in the semester you find that some of the material repeats what you've studied previously, challenge yourself to develop a deeper understanding of the material, and read ahead in order to spend more time with the more advanced material. If you buy all of the recommended books, there should be plenty for you to read and learn.
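The point about normal theory methods in small samples can be checked with a short simulation. The following Python sketch (an illustration under stated assumptions, not material from the course notes) estimates how often a nominal 95% Student's t confidence interval for the mean actually covers the true mean when n = 10, once for normal data and once for skewed (exponential) data. The function name, the choice of distributions, the seed, and the number of replications are all arbitrary assumptions.

```python
import random
from math import sqrt

N = 10               # sample size (small on purpose)
T_CRIT_DF9 = 2.262   # 97.5th percentile of Student's t with N - 1 = 9 degrees of freedom


def t_interval_coverage(draw, true_mean, reps=20000, seed=1):
    """Estimate the fraction of nominal 95% t intervals that contain true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(N)]
        m = sum(sample) / N
        s = sqrt(sum((x - m) ** 2 for x in sample) / (N - 1))  # sample std. dev.
        half_width = T_CRIT_DF9 * s / sqrt(N)
        if abs(m - true_mean) <= half_width:
            hits += 1
    return hits / reps


# Normal data: the interval's assumptions hold.
normal_cov = t_interval_coverage(lambda rng: rng.gauss(0.0, 1.0), true_mean=0.0)
# Exponential data with mean 1: strongly skewed, so the assumptions fail.
expo_cov = t_interval_coverage(lambda rng: rng.expovariate(1.0), true_mean=1.0)

print(f"normal data:      estimated coverage {normal_cov:.3f}")
print(f"exponential data: estimated coverage {expo_cov:.3f}")
```

In a run of this kind one would expect the normal case to land close to the nominal 95% and the exponential case to fall noticeably short of it, which is the sort of inaccuracy described above.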

What books should I buy, and what reading can I do prior to the beginning of the semester in order to get a head start?

Although the idea of learning about a variety of techniques for each type of problem and learning how to use the data to diagnose the situation and select a good procedure hopefully seems sensible to you (particularly if you know that some of the normal theory procedures perform horribly in some nonnormal situations), you may be surprised to learn that very few books contain adequate information that allows you to easily perform statistical analyses in this manner.

My lecture notes are based fairly heavily on Miller's book, Beyond ANOVA: Basics of Applied Statistics, Reissue edition, which does present a variety of methods for each data analysis situation covered and gives guidelines for their use. A strength of Miller's book is that it gives good references that address issues that we won't have time to cover. However, Miller's book assumes a much stronger background than what is needed to meet the prerequisites for this course, and so you can really consider the lecture notes that I've produced to be the main text for the course. (I'll encourage you to at least start looking over Miller's book once we get to the 4th week of the semester, but don't worry if you find it difficult to follow. When the course is over, you will hopefully think of the book as a valuable addition to your bookshelf. Also, since I may not always present my lectures in the same order as the class notes are written, you may find it much easier to follow in class if you read ahead in the class notes before each lecture.)

Since Miller's book assumes that the reader has some knowledge of statistics, it starts at a point too advanced for what would be appropriate in this course. You may find one or both of the following books useful. In addition to using these books to supplement the book by Miller and my course notes throughout the semester, doing some reading in these more elementary books may be a good way to better prepare yourself for the beginning of the course.

Another book which complements the material that I cover in the course is Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy by Wilcox. It is newer than the other books and covers some material that the older books don't cover. Furthermore, it is written at a more elementary level than the Miller book (Wilcox's book seems to be aimed at educating social scientists who may have a rather limited knowledge of statistics about some of the latest and greatest techniques), and so one can get a fair amount of knowledge from it without expending a huge effort. (Note: In some places the Wilcox book is sloppy, perhaps because he gave up too much accuracy when he tried to simplify his presentation, and so readers should check out my comments about the Wilcox book.)

Click here for information about what is required and what is optional, and more advice about which books to buy. Basically I recommend that you buy (and read) as many of the books as you can afford (except some may wish to pass on the more elementary book by Zar).
Click here for information about software and the Minitab books in the bookstore.
Read this prior to going to the GMU bookstore to buy any books and/or software for STAT 554.

Is this a good time for me to take STAT 554?

If you are going to eventually take both STAT 544 (Applied Probability) and this course, STAT 554, then if you don't take them the same semester (which full-time students may do), it makes more sense to take 544 prior to 554. As long as you know enough probability, it's okay to take 554 before 544 --- but I think the better you know probability, the easier 554 will be (and the more you'll get out of the course).

(This pertains to students in IT and Statistics Ph.D. programs.) Is it necessary to take the course in order to pass the qualifying exam on Applied Statistics?

No, but my guess is that it'll be much easier to pass the qualifying exam once you've completed the course.

How do STAT 510 and STAT 535 compare to STAT 554?

I believe I'm accurate in claiming that STAT 510, which hasn't been offered recently, falls in between STAT 250 and STAT 554 in terms of difficulty and sophistication, but that STAT 510 is closer to STAT 250 than it is to STAT 554.

STAT 554 covers a very wide variety of methods that can be used to analyze experimental data (and so it's good for students in biology and environmental science), while STAT 510 caters more to students in public policy and the social sciences, giving them a few simple methods to use, and introducing them to the SPSS software. STAT 510 has only a minimal prerequisite (MATH 108 or permission of instructor), and it goes at a much slower pace.

My advice is that if you need to know how to do serious data analysis, then take STAT 554 if you think you can handle it. But do be aware that STAT 554 can consume a lot of hours (many more than STAT 510 requires), and this is even more the case if you enter STAT 554 without the formal prerequisite. In the end you have to be the one to make an honest assessment of your ability and your willingness to work hard (but it should be noted that some programs do not allow STAT 510 to be taken in place of STAT 554).

Each fall my department also offers another course dealing with applied statistics, STAT 535, which was specially designed for students in the sciences. If you are a graduate student in biology, environmental science, chemistry, or some other field of science, you may be better off taking this course, especially if you lack the formal prerequisite for STAT 554 (although I will point out that many students who have lacked the prerequisite have done fine in STAT 554). STAT 535 does not go into as much detail as STAT 554 does, and will make much less use of probability theory. Also, it will have more time allocated to some useful advanced topics (presented at a reasonable level) that are not covered in STAT 554.