The following comments have been posted by Allen Keesee re the indicated chapters in Rand R. Wilcox's "Fundamentals of Modern Statistical Methods":


 
Chapter 3:
 

32 -- 1st line -- should read, "The equation for the family of normal
curves" to be consistent with Wilcox's later decisions (e.g., p 118, 2 places)
on whether the family is of equations or of curves.
 

32 -- 12th line from bottom (and p 42, 7th line fr bottom, and p 55, 17 fr
bott, and other places) -- the practice of using "then" to start
the second clause of a sentence that has begun with "if" is, I believe
(from an English grammar class pt of view), incorrect syntax. Nor (from
the examples I am familiar with) can it be written off as an attempt at
consistency with "prevalent" programming language structure, since while
SAS uses the "if ... then" structure, neither C++ (just "if ... else if ...
else ...", no "then" involved) nor S+ ("if" and "if ... else") does.
Also NB that Wilcox does not always use "then" after "if" (cf p 65,
7th and 8th lines fr top).

[DR G cites New Am Heritage Dict to contrary]
 

36 -- Eqn at top this pg should read 2 - 8 and 4 - 8, not 2 - 7 and 4 - 7; some of the other numbers are wrong because of this 7 vs 8 mix-up.
[OK]
 

37 -- 5th and 6th lines down, and box-plot method of defining outliers (also p 37) -- Seems any across-the-board rule re outlier characterization is inappropriate. The term "outlier" seems to carry the implication that any such observation is a candidate for down-weighting (e.g., via GLS, in the sense that outliers in a series of Y observations for a given X value are down-weighted when the mean-Y they are associated with is down-weighted) or outright exclusion from consideration (e.g., via trimmed means). Would think, therefore, that what constitutes an "outlier" should depend on the nature of the data being looked at.

There are many situations, that is, where what might by any formula such as those ref'd above (2 std devs, or "outer quartile + 1.5 * IQR") be called an "outlier" is nevertheless of central import and, far from being up for trimming or de-emphasis, might be at the heart of a certain calculation or judgment.
Flood plain boundaries, it is my understanding, are based on "100 yr high level" estimates -- and affect real estate values significantly. Admiral Bull Halsey, despite his theretofore brilliant record, had his career cut short (and was almost court martialed) because of having let the (I think it was) Third Fleet run afoul (with significant losses of men and ships) of two major typhoons (one off Leyte, the other off Okinawa) in the space of about 6 months in 1944-45 -- i.e., there, the fact that being in the path of two such storms in 6 months might have been an "outlier-ish" event was in no way viewed by the Navy as grounds for exoneration from blame.
And Long Term Cap Mgmt, from one account I have read, owed its demise most directly (although with other contributing factors) to the fact that the main interest rate spread model put together by its chief model builder ruled out the possibility of any event which could be more than 6 std devs away from the mean.
 
 

38 -- first 2 lines -- not clear why having "extreme" values in far less than 25% of observations would not a) alter position of the 25th and 75th percentiles (the quartiles) and thus b) alter the definition of "outlier", since this def depends on the value of the IQR, and hence c) cause "masking".
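
A minimal sketch of the point at issue, assuming the box-plot rule flags anything more than 1.5 * IQR beyond the quartiles and assuming the default quartile convention of Python's statistics.quantiles (Wilcox may well use a different one). The function names and toy data are illustrative only, not from the book. With 10% of the values extreme, the quartiles rest on order statistics the extremes never reach, so the fences stay put and both extremes are flagged, whereas the 2-std-dev rule masks one of them. Once the extremes exceed 25% of the sample, the upper quartile itself moves and the box-plot rule masks as well.

```python
import statistics


def boxplot_outliers(data):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]


def two_sd_outliers(data):
    """Values more than 2 sample std devs from the sample mean."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mean) > 2 * sd]


# Toy data (hypothetical): 20 observations each.
clean = list(range(1, 21))
few_extremes = list(range(1, 19)) + [1000, 2000]   # 10% extreme
many_extremes = list(range(1, 15)) + [1000] * 6    # 30% extreme

for label, data in [("clean", clean),
                    ("10% extreme", few_extremes),
                    ("30% extreme", many_extremes)]:
    q1, _, q3 = statistics.quantiles(data, n=4)
    print(label, "Q1 =", q1, "Q3 =", q3,
          "boxplot:", boxplot_outliers(data),
          "2 sd:", two_sd_outliers(data))
```
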

 

38-44  --  Although it seems clear CLT deals only with size of "n", Wilcox repeatedly implies the number of samples (not clear whether "in addition to" or "instead of" their common individual size, "n") is what brings the result curve close to normal, Cf, p 51: "... if we could repeat an experiment billions of times, we would get fairly good agreement between the plot of weighted means and the normal curve."
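
A rough simulation sketch of the distinction drawn above, under assumptions that are mine rather than Wilcox's: an exponential population, skewness as the yardstick of non-normality, and hypothetical function names. The sample size n controls the shape of the sampling distribution of the mean; the number of repetitions only controls how precisely a plot or summary of the simulated means pins that shape down.

```python
import random
import statistics


def skewness(values):
    """Moment-based skewness; roughly 0 for a symmetric (e.g., normal) shape."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def simulated_sample_means(n, replications, rng):
    """Means of `replications` independent samples, each of size n,
    drawn from an exponential population (an assumed, skewed example)."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(replications)]


rng = random.Random(1)
results = {}
for n in (2, 50):
    for replications in (500, 10000):
        means = simulated_sample_means(n, replications, rng)
        results[(n, replications)] = skewness(means)
        print(f"n={n:3d} replications={replications:6d} "
              f"skewness of sample means={results[(n, replications)]:.2f}")
```

In runs of this kind, raising the number of repetitions steadies the skewness estimate but does not pull it toward zero; raising n does.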
 
 

45 -- 9th line fr bott -- "... convergence to normality is quicker when using medians." – Thus, "quicker" as used here actually means "with fewer observations".



 

Chapter 4:
 

53 -- 8th and 7th from bott -- Stmt re the mean being the "optimal" estimator "for any probability curve we might consider" implies that normalcy is a more prevalent condition than nonnormalcy, an assumption unproven/dubious/unlikely-to-be-true.

 

56 -- 12th fr top -- Fig 4.2 line more like y = x - 10 than y = x + 10.

  

58 -- 13 fr bott -- "frequentist approach" undefined/unexplained.

Per Everitt's Stat Dict (cite available upon request), 132-33,

"Frequentist inference -- an approach to statistics based on a frequency view of probability in which it is assumed that it is possible to consider an infinite sequence of independent repetitions of the same statistical experiment.

Significance tests, hypothesis tests and likelihood are the main tools associated with this form of inference...."

 

"Significance test -- a statistical procedure that when applied to a set of observations results in a p-value relative to some hypothesis. Examples include Student's t-test, z-test, and Wilcoxon's signed rank test." 305 [i.e., tests of significance, i.e., "TOS"s; comparison of a computed test stat (or p-val) to a table test stat like t or z or w.]

"Hypothesis testing -- A general term for the procedure of assessing whether sample data is consistent or otherwise with statements made about the population." 159 [i.e., confidence interval testing]



   

Chapter 5:

 

72 -- 2d para, line 10 -- shld be larger (area), not higher
 
 

74 -- 9 and 8 fr bott -- "Norm curves are bell-shaped, but there are infinitely many bell-shaped that are not normal." – Would be good to indicate here (rather than later as he does) what distinguishes.
 

 



 
 

Chapter 6:

 

95 -- 13 and 12 fr bott -- "... if we could repeat a study billions of times... "; still not clear why this concept of extensive repetition keeps being introduced since per CLT it is the size of the sample not the number of samples that induces normalcy.
 
 

94ff -- generally, using "percentile bootstrap" and "percentile t bootstrap" for the two methods is a poor choice of labels, not least since both, not just one, involve the T distribution, not to mention the fact that they are used inconsistently. Exs:
bott 100: "... the percentile t bootstrap [versus] assuming normality" (instead of "[versus] the percentile bootstrap");
top 103: "... discrepancy between the bootstrap [should be percentile t bootstrap] [and] what we get assuming normality [should be percentile bootstrap]";
top 115: "the percentile t bootstrap beats our reliance on the central limit theorem..." (not clear if CLT is being used as code for Student's T or for "percentile [w/o t] bootstrap");
115, 9th fr top: refers (apparently) to percentile t bootstrap as "modified bootstrap".
The three methods discussed in Chapter 6 to assess the accuracy of hypotheses about means -- which itself is simply a means to the end of assessing the sameness or not of populations -- are

a)      the Student approach to calculation of the t statistic;

b)      the Bootstrap approach to calculation of the t statistic, with normalcy of the source population assumed; and

c)      the Bootstrap approach to calculation of the t statistic, with normalcy of the source population not assumed.


Reasonable "abbreviative" labels for the three approaches could thus be

a)       Student's t;

b)       Bootstrap norm-assump t; and

c)       Bootstrap no-norm-assump t.
 
 

98 -- middle of page -- In noting the suggestion that increasing the number of observations increases the likelihood that bootstrapping will yield an accurate result, one should also keep in mind that the larger the sample size, the more likely CLT is (as repeated resamplings are done from the one original sample) to suggest that the underlying population is normal even when it isn't.
 
 
 

102 -- Discussion of why lower end value should be used to define upper end cut-off not clear.
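
A sketch of what may lie behind this, assuming the p 102 discussion concerns the percentile t (bootstrap-t) interval for the mean as it is usually defined. If T*_lo and T*_hi are the lower and upper bootstrap quantiles of T* = (mean* - mean) / (s* / sqrt(n)), then T*_lo <= (mean - mu) / (s / sqrt(n)) <= T*_hi rearranges to mean - T*_hi * s / sqrt(n) <= mu <= mean - T*_lo * s / sqrt(n). Solving for mu flips the inequalities, which is why the lower quantile ends up defining the upper end of the interval. The function name, the defaults and the toy data below are hypothetical, not Wilcox's.

```python
import random
import statistics


def percentile_t_ci(data, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap-t confidence interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(data)
    mean = statistics.fmean(data)
    se = statistics.stdev(data) / n ** 0.5

    t_stars = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=n)       # sample with replacement
        s_star = statistics.stdev(resample)
        if s_star == 0:
            continue                            # skip degenerate resamples
        t_stars.append((statistics.fmean(resample) - mean)
                       / (s_star / n ** 0.5))
    t_stars.sort()

    # Approximate positions of the alpha/2 and 1 - alpha/2 quantiles of T*.
    t_lo = t_stars[int(alpha / 2 * len(t_stars))]
    t_hi = t_stars[int((1 - alpha / 2) * len(t_stars)) - 1]

    # Note the swap: the UPPER quantile sets the LOWER end, and vice versa.
    return mean - t_hi * se, mean - t_lo * se


# Hypothetical right-skewed sample.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 9, 12, 18, 40]
lower, upper = percentile_t_ci(sample)
print("mean:", statistics.fmean(sample), "interval:", (lower, upper))
```
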
 
 

104 -- "... when using the percentile t bootstrap, the discrepancy between the actual and nominal Type I error probabilities goes to zero at the rate 1/sqrt(n) -- it goes to zero more quickly than when using the Student's T." -- It would have been helpful to have had the rate for Student's T specified as well.
 
 

107 -- 15-18 from top -- "... Type I errors are never made when there is heteroscedasticity ... because ... a zero slope is virtually impossible"; would have been better to have explained the basis of this belief, whose motivation is otherwise not clear, i.e., why heteroscedasticity and zero slope cannot coincide. The view apparently is that there is unlikely to be a treatment that affects the variance but does not affect the slope as well.
  

110-13 -- In all his extended discussion of Pearson and why zero correlation does not mean independence, Wilcox never once mentions the simple, summary rule that "r" and rho are measures of linear association (aka, dependence) only.
 
 

112 -- 14 from top -- dependence does not have to mean variance of Y changes with value of X; with a quadratic shape, for example, var Y can stay tight (even the same for all X), but there still can be strict X vs. Y dependence.
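
A small illustration of the two notes above (pp 110-13 and p 112), using made-up data and a hand-rolled Pearson r so nothing depends on a particular library version. Y = X squared is completely determined by X (so the spread of Y at any given X is zero, the same for every X), yet r is 0 because r picks up only the linear part of an association.

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


xs = [-3, -2, -1, 0, 1, 2, 3]        # symmetric about 0 (hypothetical data)
quadratic = [x * x for x in xs]      # strict dependence, no linear component
linear = [2 * x + 1 for x in xs]     # strict linear dependence, for contrast

r_quadratic = pearson_r(xs, quadratic)
r_linear = pearson_r(xs, linear)
print("r for y = x^2:   ", r_quadratic)
print("r for y = 2x + 1:", r_linear)
```
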
 
 

113-4 -- Fig 6.6 suggestive not merely of "perhaps a" non-linear rel but quite possibly and specifically of a quadratic one, something that Wilcox doesn't mention.
 
 

115*** -- 10th and 15th from top -- "Section 6.5" and "Section 6.6" do not exist.

 



 

Chapter 7:

117 -- 1 and 2 --  No indication of relationship between mixing and contamination; i.e., that a mixed normal is one but not the only kind of contaminated.

 

117 last 2 lines, 118 top 2 lines -- This sentence a good example of lack of clarity that can result from failure to use parallel structure.
 

118 -- line 8 -- After phrase "... 90 percent chance that..." the words "it will appear that" should be inserted. Ditto, line 9 after phrase "10 percent chance that".
 
 

118 -- 5th and 4th fr bott -- "little diff tween normal and mixed normal" is awkward; better is "between true normal and mixed normal".
 
 

121 -- 1st line -- after words "the means" should be inserted the words "if in fact they are different"; cf p 69 where Wilcox does in fact define "power" this way (as he should).
 
 
 

121 -- 4 fr bott -- 10% change is hardly "arbitrarily small".
 

121 -- last 2 at bott -- the ref to power being inversely related to population variance is in Chapter 5 (at p 71), not Chapter 4.
 
 

122 -- He uses the general phrases "departures from normality" (3 fr top) and "normality assumption is violated" (3d line of caption of Fig 7.3) when what he really is referring to is heavy-tailedness, not light-tailedness or any other kind of deviation from normality, such as skew of any sort.
 
 
 

123 -- 8 fr bott -- refers to normal as light tailed; per Dr. Gentle comment 2 wks ago that normal should be the standard of light and heavy, Wilcox' ref raises issue of what distribution it is that HE regards as the standard.
 
 
 

123 -- last sentence -- syntax completely wrong.
 

 

126 -- lines 8-11 fr top -- Fig 7.7 in no way "indicates" that "the prob of an obs being within one std dev of the mean is .999..."; similarly, the caption of Fig 7.7, "as indicated here" is quite wrong.
 
 

127 --  7 fr top -- "effect size" undefined.

[Everitt says "effect" "generally [refers to]... the change in a response variable produced by a change in one or more explanatory or factor variables".]

[Vogt’s Stat Dict (cite avail upon request), at 94, says "effect size (ES)" is

"a) any of several measures of association or of the strength of a relation, such as Pearson's r or eta. ES often is thought of as a measure of practical significance.

"b) A statistic, often abbreviated D or delta, indicating the difference in outcome for [i] the average subject who received a treatment from [ii] the average subject who did not (or who received a different level of the treatment). This statistic is often used in meta-analysis. It is calculated by taking the difference between the control and experimental groups and dividing that by the standard deviation of the control group's scores -- or by the standard deviation of the scores of both groups combined.

"c) In statistical power analysis, ES is the degree to which the null hypothesis is false."]

Thus what VOGT calls effect size, version (b), is what Wilcox calls the standardized difference, a measure of effect size.
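
A minimal sketch of Vogt's version (b) as quoted above: the difference between group means divided by a standard deviation. Two choices here are assumptions on my part and not taken from Vogt or Wilcox: the sign convention (treated minus control), and reading "the scores of both groups combined" as the usual pooled standard deviation. The function name and data are hypothetical.

```python
import statistics


def standardized_difference(treated, control, pooled=False):
    """Mean difference (treated - control) in standard-deviation units.

    pooled=False: divide by the control group's sample std dev.
    pooled=True:  divide by the pooled std dev of the two groups
                  (an assumed reading of "both groups combined").
    """
    diff = statistics.fmean(treated) - statistics.fmean(control)
    if pooled:
        n1, n2 = len(treated), len(control)
        v1 = statistics.variance(treated)
        v2 = statistics.variance(control)
        sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    else:
        sd = statistics.stdev(control)
    return diff / sd


# Hypothetical scores: same spread, means 2 units apart.
control_scores = [1, 2, 3, 4, 5]
treated_scores = [3, 4, 5, 6, 7]
d_control = standardized_difference(treated_scores, control_scores)
d_pooled = standardized_difference(treated_scores, control_scores, pooled=True)
print("using control sd:", d_control)
print("using pooled sd: ", d_pooled)
```
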
 

127 --  8 fr top -- Because effect size undefined, significance/relevance of "standardized difference" unclear. Like "common measure of the bletz is the cravis."

 

129 -- 2d and 3d fr top -- words "of a probability curve" repetitive and should be deleted.

 

130 -- 13 fr top --  He is now calling a "regression outlier" what up to this page he has simply called an outlier.
 

130 -- Saying the slope is not 0 just because the non-0 value was generated by an "outlier" is a) an arbitrary denigration of the value of outliers, which in fact deserve to be so treated only if they can be shown to arise from measurement errors or certifiably unique circumstances; and b) inconsistent with Wilcox' own remarks about outliers on p 113.
 

134 -- 2d "key point" --  He is using  "probability curve" to mean at various places in the text both

a)      density curve/density function/distribution curve/distribution function, and

b)       population/distribution.

I.e., he is using the one term “prob curve” to mean two diff things. See on same issue, comment re p 139 below.
 
 


 

 

Chapter 8:

139  --  "... arbitrarily small departures from normality can have devastating consequences on [sic] the population mean, particularly the population variance.... nonnormality can result in very low power..."

            1) consequences for or consequences re, never consequences on;

            2) why is an effect on the variance a particular sort of effect on the mean?? Syntax akin to "I like apples, particularly carrots"??

            3) Nonnormality can result in any kind (more, less, unchanged) of power; only heavy tailed non norm can result in lowered power. Hence this statement misleading.

 

139  --  as at 134 (and 140, 143, 146, etc.), he is using "probability curve" apparently as a synonym for density curve, distribution curve, density function, distribution function, population, and distribution, which is OK except that he has used the terms pop and dist heretofore; no explanation of reason for change in nomenclature. And at 141, ln 7, he reverts to "distributions", and at 141, ln 21, to "population". The fact that the terms "prob curve" and "distrib" and "pop" are being used to refer to the same thing should be noted.

 

 

139  --  1st full para confusedly laid out; what the phrase "first criterion" at ln 17 refers to is not clear.

 

144  --  ln 10  --  Comment that "outliers" can cause the sample mean to be "inaccurate" is imprecise (i.e., "inaccurate" in what respect, and judged against what standard ?) and to the extent interpretable, conclusory.

 

146  --  lns 6 and 7  --  "... unlikely to provide accurate information about the population mean..."  but (unless the product of a measurement error) likely to provide accurate information about the mechanism generating the values that in turn generate the mean itself; so whether the outliers should be in or out depends on what one is interested in, typicality or mechanism, or something else.

 

146  --  15 and 16  --  "... a normal... curve OR a curve with relatively light tails...": contradicts his statement on p 123 (8 fr bott) that a normal curve IS a curve with light tails.

 

151  --  Fig 8.4  -- 

            a) Box upper left: is inappropriate illustration of 1/1 slope;

            b) Box lower right: hardly any point in offering biweight as example with zero explanation here (or on pp 152 or 211 where it is mentioned again) of what it is/how it is calculated (Vogt simply says "BW" is a system that down-weights outliers; Everitt doesn't mention it at all).

 

152  --  "... we need to limit the influence of extreme values...": as usual no discussion of why, of need to distinguish between virtually-certainly measurement-error or other one-time-aberrational "extremes" vs. reasonably-possible-to-repeat extremes, purpose of analysis, whether interested in typicality or mechanism, etc.

 

156  --  He is not adding "an" outlier; he is adding 8 outliers, i.e., converting 40% of the entire sample to outlier status;

 

156-7  --  Re whole discussion here re 20% trim vs. M-estimator, given that in order to have 20% outperform M, 

a)      seems must have situation where 15-20% of sample are "outliers" but

b)      with that high a percent of the sample in a single "far-from-the-mean" category, it is oxymoronic to call the values outliers.

            Thus, it would appear that where there is a "true" outlier situation -- say up to 5%, 7%, perhaps a maximum of 10%, of the values at an "extreme" -- M is clearly the better choice because it will have a smaller standard error in those circumstances. Where there is a larger percentage of values "far" from the mean, classification (controlling for some other variable at work) would be called for. The question thus comes down to, is there any circumstance when the 20% would be preferred over M in the small-percent-far-from-mean case or classification in the large-percent-far-from-mean case?
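
A sketch of the 20% trimmed mean only (no M-estimator is attempted here), assuming the usual definition: sort, drop the lowest and highest 20% of the observations, average the rest. The function name and data are hypothetical. It shows where the trimmed mean's protection runs out: extremes in one tail are ignored so long as they make up no more than 20% of the sample, which is the 15-20% range at issue in (a) above.

```python
import math


def trimmed_mean(data, proportion=0.2):
    """Mean after dropping floor(proportion * n) values from EACH tail,
    so 2 * proportion of the observations are left out in total."""
    ordered = sorted(data)
    g = math.floor(proportion * len(ordered))
    kept = ordered[g:len(ordered) - g]
    return sum(kept) / len(kept)


# Hypothetical samples of 20; with 20% trimming, 4 values go from each end.
clean = list(range(1, 21))
three_extremes = list(range(1, 18)) + [1000] * 3   # 15% extreme, one tail
five_extremes = list(range(1, 16)) + [1000] * 5    # 25% extreme, one tail

for label, data in [("clean", clean),
                    ("15% extreme", three_extremes),
                    ("25% extreme", five_extremes)]:
    print(label, "mean =", sum(data) / len(data),
          "20% trimmed mean =", trimmed_mean(data))
```
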

 

 

157  --  idea that it might be OK to ignore 20% + of the data is bizarre unless it was known absolutely to be measurement error or otherwise one-time aberrational.

 

158  --  " an arbit small change in curve can cause arbit large change in mean val...": is this correct??
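
One way to check the statement, sketched under an assumption about what "small change in curve" means: take the contamination model (1 - eps) * N(0, 1) + eps * N(shift, contam_var), whose distribution function differs from the standard normal's by at most eps everywhere. Its mean is eps * shift, so for any eps, however small, a large enough shift produces any mean one likes; the variance is moved even more. The function name and the numbers are illustrative, not from the book.

```python
def contaminated_mean_var(eps, shift, contam_var=1.0):
    """Mean and variance of (1 - eps) * N(0, 1) + eps * N(shift, contam_var)."""
    mean = eps * shift
    # E[X^2] of a mixture is the weighted sum of the components' E[X^2].
    second_moment = (1 - eps) * 1.0 + eps * (contam_var + shift ** 2)
    return mean, second_moment - mean ** 2


target_mean = 100.0   # arbitrary; any target works
results = {}
for eps in (1e-2, 1e-4, 1e-6):
    shift = target_mean / eps   # push the contaminating component far out
    results[eps] = contaminated_mean_var(eps, shift)
    print(f"eps={eps:g} mean={results[eps][0]:.6g} variance={results[eps][1]:.6g}")

# Symmetric case: 90% N(0, 1) plus 10% N(0, sd = 10); mean stays 0,
# variance goes from 1 to 0.9 * 1 + 0.1 * 100.
mixed_normal = contaminated_mean_var(0.1, 0.0, contam_var=100.0)
print("mixed normal mean, variance:", mixed_normal)
```

Under that reading the statement holds for the population mean, and a fortiori for the population variance.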