Some Comments about Chapter 4 of Samuels & Witmer
Note that even though the chapter is titled The Normal
Distribution, there is not just one normal distribution, but rather
a whole family of them.
I plan to cover this chapter rather quickly in class, and in particular
I don't want to spend a lot of time explaining the use of Table 3
on pp. 675-676 (since you should be able to figure it out on your own
given the examples in the book, and since software can be used to help
obtain such probabilities). In order to have more time to discuss more
advanced material, we'll have to avoid spending too much time on the
simpler stuff.
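If you want to use software instead of Table 3, here's a minimal sketch
in Python (assuming SciPy is available; the input values below are just
for illustration):

    from scipy.stats import norm

    # P(Z <= 1.53) for a standard normal Z (the kind of value Table 3 gives)
    print(norm.cdf(1.53))                      # about 0.9370

    # P(90 < X < 110) for X normal with mean 100 and standard deviation 5
    print(norm.cdf(110, loc=100, scale=5)
          - norm.cdf(90, loc=100, scale=5))    # about 0.9545

R users can get the same values with pnorm.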
Section 4.1
- (p. 119, 2nd paragraph) Normal distributions are often (too often,
actually) used as a model for many phenomena. For example, observations
in a randomly selected sample of otter weights might be approximately
modeled as a sample from an appropriate normal distribution (i.e., one having
a suitable
mean and variance). (Also see Example 4.1 on p. 119 and Fig.
4.1 on p. 120.) If an assumption of (approximate) normality is
justified, an inference method derived by
assuming a sample from a normal distribution ought to be a good one to
use. (However, it should be kept in mind that even though statistical
procedures based on assumptions of normality are commonly used,
such an inference method
might not be so good to use if a normal model is not a good one in a
particular situation. We
don't expect to have natural phenomena follow normal distributions
exactly, but we need to be aware that if the parent distribution of the
sample differs too much from a normal distribution, inferences
based on an assumption of (approximate) normality may be appreciably
nonoptimal, especially if the sample size is rather small.) A
particular normal distribution, called the standard normal
distribution (which is just the normal distribution having mean 0
and variance 1), is also useful for approximating the sampling
distribution of an estimator or test statistic in many cases. (Sampling
distributions are the subject of Ch. 5.)
- (p. 121) If one takes many measurements of the same object, then
any variation in the values can be attributed to measurement
error. (See
Example 4.4 on p. 121.)
If one has a very precise way of measuring and takes one
measurement of many different objects, then the observed variation is
almost entirely due to differences among individuals in the population.
Often variation among observations in a sample is due to a combination
of measurement error and population differences. (See
Example 2.14 on pp. 23-24.)
In such a case,
one way to get a good idea about the relative contributions of the two
sources of variation is to make more than one measurement on the
individuals selected from the population.
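Here's a simulated sketch of that idea (all the numbers, the population
mean and standard deviation, the measurement-error standard deviation,
and the sample sizes, are made up for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    n_indiv, n_meas = 50, 3

    # each individual has a true value; each measurement adds error
    true_values = rng.normal(100, 10, size=n_indiv)
    data = true_values[:, None] + rng.normal(0, 2, size=(n_indiv, n_meas))

    # within-individual variance estimates the measurement-error variance
    within = data.var(axis=1, ddof=1).mean()            # roughly 4

    # variance of the individual means is (population variance) + within/n_meas
    between = data.mean(axis=1).var(ddof=1)
    print(within, between - within / n_meas)            # roughly 4 and 100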
Section 4.2
- (p. 122) While some books do use the notation indicated on the 4th
line of Sec. 4.2, I think more commonly the value of the variance is
given as the 2nd parameter value instead of the value of the standard
deviation. That is, in a lot of books, N(100, 25) is used to
denote the normal distribution with mean 100 and variance 25, while in
S&W this would indicate a standard deviation of 25.
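To see the difference concretely, here's a short Python sketch (SciPy,
like most software, takes the standard deviation, via its scale
argument):

    from scipy.stats import norm

    # N(100, 25) in the variance convention: mean 100, sd = sqrt(25) = 5
    X = norm(loc=100, scale=5)

    # N(100, 25) in S&W's convention: mean 100, sd = 25
    Y = norm(loc=100, scale=25)

    print(X.std()**2, Y.std()**2)   # 25.0 and 625.0, quite different distributions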
Section 4.3
- (p. 125) More precisely, 95% of the probability mass is
within about 1.96 (as opposed to 2) standard deviations of the mean (which
means that a normal
random variable assumes a value within about 1.96 standard deviations of its
mean with probability 0.95). (I put "about" here because the
precise value corresponding to 95% of the probability mass isn't exactly
1.96 --- but rounded to 4 decimal places, it's 1.9600.)
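You can check this with software; e.g., in Python with SciPy:

    from scipy.stats import norm

    # the z with P(-z < Z < z) = 0.95, i.e., the 0.975 quantile
    print(norm.ppf(0.975))   # 1.959963984540054 (1.9600 to 4 decimal places)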
- (p. 129) Most books use z_0.05 instead of Z_0.05 (and in fact I cannot
recall seeing an upper case letter used like this in any other book). I
plan to use lower case for this, and I want you to use lower case too.
Some books use z_0.95 to indicate the value that separates the upper
5% of the probability mass from the lower 95%, but z_0.05 is more
commonly used for this purpose.
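Software gives such values directly; a quick sketch:

    from scipy.stats import norm

    # z_0.05: the point with probability 0.05 above it (0.95 below it)
    print(norm.ppf(0.95))   # about 1.6449
    print(norm.isf(0.05))   # the same value, via the inverse survival function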
Section 4.4
- (p. 133) The facts referred to on the bottom of p. 133 are
sometimes collectively called "the empirical rule." I've never
bothered to use the "empirical rule" to assess approximate normality,
since there are better methods to employ.
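For the record, the percentages behind the rule are easy to verify with
software (though, again, I don't recommend the rule for assessing
normality):

    from scipy.stats import norm

    for k in (1, 2, 3):
        # P(mu - k*sigma < X < mu + k*sigma) for any normal X
        print(k, norm.cdf(k) - norm.cdf(-k))   # about 0.683, 0.954, 0.997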
- (pp. 134-135, Normal Probability Plots) These plots are
called many different things. I tend to call them probit plots.
They are a special case of q-q plots (quantile-quantile plots).
They should be your chief method for assessing approximate normality.
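In Python, such a plot takes one call (a minimal sketch with made-up
data; note that SciPy's probplot puts the normal scores on the
horizontal axis and the ordered data on the vertical axis, since the
axis orientation varies by software, as I discuss under the computer
note below):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    rng = np.random.default_rng(1)
    sample = rng.normal(50, 8, size=20)   # made-up data, for illustration only

    # normal probability plot: ordered data against normal scores
    stats.probplot(sample, dist="norm", plot=plt)
    plt.show()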
- (p. 134) The last sentence on this page is a good one ---
using a histogram is not a good way to assess approximate normality.
- (p. 135) Even if the parent distribution of the data is normal, the
points in the plot are not expected to be on a perfectly straight line.
This point is made on p. 136.
- (p. 136, Fig. 4.26) The plots shown are all based on a
sample of size 20. For larger sample sizes, the straight line pattern
should be more prominent, but for smaller sample sizes we can generally
expect more wiggle. Plots made from samples for which the parent
distribution is approximately normal are hard to distinguish from those
made from samples for which the parent distribution is exactly normal.
Fortunately, in applied statistics, we only have to determine if the
parent distribution is approximately normal (we never expect to have
data from a normal distribution).
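It's worth simulating this for yourself: the sketch below draws several
samples of size 20 from an exactly normal distribution and plots each,
so you can see how much wiggle is typical.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    rng = np.random.default_rng(2)
    fig, axes = plt.subplots(2, 3, figsize=(10, 6))
    for ax in axes.flat:
        # every sample is exactly normal, yet each plot wiggles a bit
        stats.probplot(rng.normal(size=20), dist="norm", plot=ax)
    plt.tight_layout()
    plt.show()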
- (p. 138, Computer note) Before the widespread use of
computers, statisticians used special graph paper to make the kind of
plots which are now so easy to make using software. SPSS, Minitab, and
some other commonly used statistical software packages plot the data
values using the horizontal axis and the normal scores using the
vertical axis, and this is the way I'm used to doing it. (I'll show you
in class how the plots on p. 137 would look if they were redone the way
that SPSS makes such plots. The shapes indicated on p. 137 should not
be used to interpret the plots made using SPSS.)
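Here's a sketch of the SPSS-style orientation in Python, with the data
on the horizontal axis and the normal scores on the vertical axis (I
use Blom's formula for the normal scores, which is one common choice,
though not the only one):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import norm, rankdata

    def normal_scores(x):
        # Blom's approximation to the expected normal order statistics
        n = len(x)
        return norm.ppf((rankdata(x) - 0.375) / (n + 0.25))

    rng = np.random.default_rng(3)
    sample = rng.normal(50, 8, size=20)          # made-up data

    plt.scatter(sample, normal_scores(sample))   # data horizontal, scores vertical
    plt.xlabel("observed value")
    plt.ylabel("normal score")
    plt.show()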
- (pp. 138-139, Transformations for Nonnormal Data) While such
transformations are good for some applications, for certain kinds of
analyses (that I'll discuss later) they can lead to misleading results.
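A quick sketch of the sort of transformation meant here (simulated
right-skewed data; whether transforming is wise depends on the
analysis, as I said):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    skewed = rng.lognormal(mean=3, sigma=0.6, size=100)   # right-skewed data

    print(stats.skew(skewed))           # clearly positive
    print(stats.skew(np.log(skewed)))   # near 0; the log roughly symmetrizes it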
Section 4.5
- (p. 142 & p. 144) The last paragraph of the section on p. 144
explains when to use the continuity correction, and part (a) on p. 142
shows a typical application.
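Here's a sketch of the correction with made-up numbers (n = 50 trials,
success probability 0.4, wanting P(X <= 25)):

    from math import sqrt
    from scipy.stats import binom, norm

    n, p = 50, 0.4
    mu, sd = n * p, sqrt(n * p * (1 - p))

    exact = binom.cdf(25, n, p)              # exact binomial probability
    no_cc = norm.cdf((25 - mu) / sd)         # normal approximation, no correction
    cc = norm.cdf((25.5 - mu) / sd)          # with the continuity correction
    print(exact, no_cc, cc)                  # the corrected value is closer to exact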
Section 4.6
- (p. 145) The second to the last paragraph of the section
states that methods developed specifically for samples from normal
distributions can work well even if the parent distribution of
the data is not normal. It should also be added that in some situations
other methods will perform much better than those based on an assumption
of normality --- the type and degree of nonnormality affect the
robustness of normal theory procedures to nonnormality.
- (p. 145) The last paragraph of the section suggests that skewed
distributions are better models for some phenomena than are normal
distributions.