diabetic mice data


 MTB > # Let's first examine the data a bit.

 MTB > dotplot c1-c3;
 SUBC> same.
                    :
           .:  . . .:...  .    .  ..   .        .                .
          +---------+---------+---------+---------+---------+-------Normals 
 
           .  :... . :. .:       :         .     .  .
          +---------+---------+---------+---------+---------+-------Alloxan 
                . .
            ::: :.:  : .     ..                  .
          +---------+---------+---------+---------+---------+-------Allox+in
          0       120       240       360       480       600
 
 MTB > desc c1-c3
 
                 N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
 Normals        20    186.1    124.5    169.6    158.8     35.5
 Alloxan        18    181.8    139.5    172.6    144.8     34.1
 Allox+in       19    112.9     82.0     97.8    105.8     24.3
 
               MIN      MAX       Q1       Q3
 Normals      14.0    655.0     92.0    274.7
 Alloxan      13.0    499.0     70.3    276.0
 Allox+in     18.0    465.0     44.0    133.0
 
 MTB > let k90 = 1
 MTB > exec 'qqnorm'
 Executing from file: qqnorm.MTB
 
  C92     -                                                        *
          -
          -                                       *
       1.2+                              *
          -                          *
          -                         *
          -                 *    *
          -             **
       0.0+           **
          -           2
          -          **
          -        *
          -      *
      -1.2+   *
          -   *
          -
          -  *
            +---------+---------+---------+---------+---------+------C90     
            0       120       240       360       480       600
 
 MTB > exec 'skku'
 Executing from file: skku.MTB
 
 skewness 1.62836
 kurtosis 2.94431

 MTB > let k90 = 2
 MTB > exec 'qqnorm'
 Executing from file: qqnorm.MTB
 
  C92     -                                                   *
          -
          -                                                *
       1.2+
          -                             *          *
          -                             *
          -                   *
          -                  2
       0.0+              * *
          -            * *
          -          *
          -        *
          -      **
      -1.2+
          -      *
          -
          -  *
            +---------+---------+---------+---------+---------+------C90     
            0       100       200       300       400       500
 
 MTB > exec 'skku'
 Executing from file: skku.MTB
 
 skewness 1.13609
 kurtosis 0.371562

 MTB > let k90 = 3
 MTB > exec 'qqnorm'
 Executing from file: qqnorm.MTB
 
  C92     -                                                *
          -
          -                         *
       1.2+                        *
          -                *
          -              *
          -           *  *
          -           2
       0.0+         *
          -        2
          -      * *
          -     *
          -     *
      -1.2+    *
          -   *
          -
          -   *
            +---------+---------+---------+---------+---------+------C90     
            0       100       200       300       400       500
 
 MTB > exec 'skku'
 Executing from file: skku.MTB
 
 skewness 2.30892
 kurtosis 6.40979

 MTB > # Because the sample sizes are nearly the same, the normal theory
 MTB > # procedures ought to be fairly accurate for tests of the general
 MTB > # k sample problem.  (If the null hypothesis is true, the distributions
 MTB > # are identical, and in addition to the variances being equal, there
 MTB > # would be tremendous cancellation of skewness.  So if the null
 MTB > # hypothesis is true, the test statistic's actual sampling distribution
 MTB > # shouldn't be too different from what it is in the case of iid normal
 MTB > # random variables.  If differences in skewness and variance lead to a
 MTB > # rejection, then fine ... if there are differences in the distributions
 MTB > # we want to get a rejection of the null hypothesis of identical dist'ns.)
 MTB > # Still, although the normal theory procedures are fairly robust for
 MTB > # validity, it could be that some of the nonparametric procedures are
 MTB > # more powerful.

 MTB > # (Note: From a previous session, I've already got the three samples stacked
 MTB > # into c5, with the groups indicated by c6.  Also, I have already named the
 MTB > # Minitab columns.  (I'm using a Minitab worksheet I had saved previously.))

 MTB > oneway c5 c6;
 SUBC> tukey 0.1.
 
 ANALYSIS OF VARIANCE ON albumen 
 SOURCE     DF        SS        MS        F        p
 tr group    2     64357     32178     1.67    0.197
 ERROR      54   1037470     19212
 TOTAL      56   1101827
                                    INDIVIDUAL 95 PCT CI'S FOR MEAN
                                    BASED ON POOLED STDEV
  LEVEL      N      MEAN     STDEV  --+---------+---------+---------+----
      1     20     186.1     158.8               (---------*---------)
      2     18     181.8     144.8             (----------*----------)
      3     19     112.9     105.8  (----------*---------)
                                    --+---------+---------+---------+----
 POOLED STDEV =    138.6             60       120       180       240
 
 Tukey's pairwise comparisons
 
     Family error rate = 0.100
 Individual error rate = 0.0407
 
 Critical value = 2.97
 
 Intervals for (column level mean) - (row level mean)
 
                1         2
 
      2       -90
               99
 
      3       -20       -27
              166       165
 
 
 MTB > # Since all of the confidence intervals indicated above contain 0,
 MTB > # we can conclude that the p-value for the Tukey-Kramer test exceeds
 MTB > # 0.10.  After trying various values with the tukey subcommand, I
 MTB > # concluded that the p-value is about 0.23 or 0.24.
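
The Tukey-Kramer intervals above can be rechecked from the summary statistics alone.  The
Python sketch below is an illustration only (it is not part of the Minitab session).  It
assumes the usual Tukey-Kramer half-width, q/sqrt(2) * (pooled stdev) * sqrt(1/n_i + 1/n_j),
with the critical value 2.97 and the pooled stdev 138.6 copied from the output above.  The
names used (tukey_kramer_interval, intervals) are made up for this example.

```python
import math

# Summary values copied from the Minitab output above.
n = {1: 20, 2: 18, 3: 19}
mean = {1: 186.1, 2: 181.8, 3: 112.9}
pooled_sd = 138.6
crit = 2.97  # studentized range critical value reported by Minitab

def tukey_kramer_interval(i, j):
    """Interval for (mean of level i) - (mean of level j)."""
    diff = mean[i] - mean[j]
    half = crit / math.sqrt(2) * pooled_sd * math.sqrt(1 / n[i] + 1 / n[j])
    return diff - half, diff + half

pairs = [(1, 2), (1, 3), (2, 3)]
intervals = {pair: tukey_kramer_interval(*pair) for pair in pairs}

for pair, (lower, upper) in intervals.items():
    print(pair, round(lower), round(upper))
```

Rounded to whole numbers, the endpoints should agree with the table of intervals above.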

*******************************************************************************
                       *** StatXact ***

To do this test in StatXact use
   Basic_Statistics > ANOVA...
Then click Var1 into the Value box (assuming the 57 observations from the three
samples are stacked into Var1) and click Var2 into the Factor 1 box (assuming 
Var2 has 20 1s, followed by 18 2s, followed by 19 3s).  Finally, click OK.  The
resulting p-value is in agreement with Minitab's p-value.
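
As a rough cross-check on either program, the F statistic can be rebuilt from the group
sizes, means, and standard deviations reported by desc.  The Python sketch below is only an
illustration.  Because the inputs are rounded, the sums of squares differ slightly from
Minitab's, but F and the p-value should still come out near 1.67 and 0.197.  The p-value
line relies on the closed form for the upper tail of an F distribution with 2 numerator
degrees of freedom, so it would not apply to other designs.

```python
# Summary values copied from the desc output (rounded as printed).
n = [20, 18, 19]
mean = [186.1, 181.8, 112.9]
sd = [158.8, 144.8, 105.8]

N = sum(n)
k = len(n)
grand_mean = sum(ni * mi for ni, mi in zip(n, mean)) / N

# Between-group and within-group sums of squares.
ss_treat = sum(ni * (mi - grand_mean) ** 2 for ni, mi in zip(n, mean))
ss_error = sum((ni - 1) * si ** 2 for ni, si in zip(n, sd))
df_treat, df_error = k - 1, N - k

f_stat = (ss_treat / df_treat) / (ss_error / df_error)

# With 2 numerator df the F upper tail has a closed form:
# P(F > x) = (1 + 2x/df_error) ** (-df_error/2).
assert df_treat == 2
p_value = (1 + 2 * f_stat / df_error) ** (-df_error / 2)

print(round(f_stat, 2), round(p_value, 3))
```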

********************************************************************************

 MTB > mood c5 c6
 
 Mood median test of albumen 
 
 Chisquare = 3.32   df = 2   p = 0.191
 
                                         Individual 95.0% CI's
 tr group   N<=    N>   Median    Q3-Q1  ---+---------+---------+---------+---
        1    10    10      125      183              (-+------------------)
        2     7    11      139      206        (---------+-------------)
        3    13     6       82       89   (-----+-------)
                                         ---+---------+---------+---------+---
                                           60       120       180       240
 Overall median = 122
 
 MTB > # If one incorporates the adjustment factor given near the middle of p. 10-6
 MTB > # of the class notes, the value of the adjusted statistic is about 3.26 and
 MTB > # the corresponding p-value is 0.196.
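
The chi-square value reported by mood can be recomputed from the N<= and N> counts.  The
Python sketch below is an illustration only.  It ASSUMES that the adjustment factor in the
class notes is (N - 1)/N; that form is not stated here, but it does reproduce the 3.26 and
0.196 quoted above.  With 2 degrees of freedom the chi-square upper tail is exp(-x/2), so
no statistical library is needed.  Note that the unrounded statistic gives an unadjusted
p-value that rounds to 0.190.

```python
import math

# Counts (N<=, N>) relative to the overall median, from the output above.
counts = [(10, 10), (7, 11), (13, 6)]

N = sum(a + b for a, b in counts)
col_totals = [sum(row[c] for row in counts) for c in range(2)]

# Pearson chi-square statistic for the 3 x 2 table.
chi_sq = 0.0
for row in counts:
    row_total = sum(row)
    for c, observed in enumerate(row):
        expected = row_total * col_totals[c] / N
        chi_sq += (observed - expected) ** 2 / expected

def chi2_upper_tail_df2(x):
    """Upper tail of a chi-square distribution with 2 df."""
    return math.exp(-x / 2)

p_value = chi2_upper_tail_df2(chi_sq)

# Assumed form of the adjustment factor: (N - 1)/N.
adjusted = chi_sq * (N - 1) / N
p_adjusted = chi2_upper_tail_df2(adjusted)

print(round(chi_sq, 2), round(adjusted, 2), round(p_adjusted, 3))
```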

***********************************************************************************
                         *** StatXact ***

To do this test in StatXact use
   Nonparametrics > K Independent Samples > Median...
Then click Var1 into the Response box (assuming the 57 observations from the three
samples are stacked into Var1) and click Var2 into the Population box (assuming 
Var2 has 20 1s, followed by 18 2s, followed by 19 3s).  Next, click to select Exact
under Compute, and then click OK.  The resulting asymptotic p-value is in close
agreement with Minitab's p-value (0.190 for StatXact vs. 0.191 for Minitab).  But
the preferred exact p-value is about 0.202.

***********************************************************************************

 MTB > krus c5 c6
 
 LEVEL    NOBS    MEDIAN  AVE. RANK   Z VALUE
     1      20    124.50       32.2      1.05
     2      18    139.50       32.4      1.06
     3      19     82.00       22.4     -2.11
 OVERALL    57                 29.0
 
 H = 4.44  d.f. = 2  p = 0.109
 H = 4.45  d.f. = 2  p = 0.109 (adj. for ties)

****************************************************************************************
                            *** StatXact ***

To do this test in StatXact use
   Nonparametrics > K Independent Samples > Kruskal-Wallis...
Then click Var1 into the Response box (assuming the 57 observations from the three
samples are stacked into Var1) and click Var2 into the Population box (assuming 
Var2 has 20 1s, followed by 18 2s, followed by 19 3s).  The sample sizes are too large 
for an exact p-value to be obtained, so the Monte Carlo option will be used.  Click to
select Exact using Monte Carlo under Compute, and then click Options.  Next click the 
Monte Carlo tab (when the Options box appears).  Change the Crude Monte Carlo Sample Size
from 10000 to 1000000.  Change the Random Number Seed from Clock to Fixed, and use the 
default seed of 23456.  Then click OK to close the Options box, and finally click OK to
run the K-W test routine.  The resulting asymptotic p-value is in close agreement with
Minitab's p-value (0.108 for StatXact vs. 0.109 for Minitab).  The Monte Carlo estimate of 
the exact p-value is about 0.108, and since the interval estimate for the exact p-value
is about (0.107, 0.109) we can be fairly confident that the p-value rounds to 0.11.
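
The width of that interval can be understood from the binomial standard error of a Monte
Carlo estimate.  The Python sketch below is an illustration only.  It ASSUMES a 99%
confidence level and a normal approximation (the confidence level is not stated above), and
it uses the rounded estimate 0.108.

```python
import math
from statistics import NormalDist

p_hat = 0.108        # Monte Carlo estimate quoted above (rounded)
trials = 1_000_000   # Monte Carlo sample size used above
level = 0.99         # assumed confidence level

# Normal-approximation interval for a binomial proportion.
z = NormalDist().inv_cdf(1 - (1 - level) / 2)
half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials)
lower, upper = p_hat - half_width, p_hat + half_width

print(round(lower, 3), round(upper, 3))
```

Under these assumptions the half-width is a bit less than 0.001, which is why one million
trials are enough to pin the p-value down to two decimal places.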

****************************************************************************************

 MTB > # The average ranks for the three groups are about 32.15, 32.42, and 22.45.
 MTB > # Using these values it can be determined that the value of the rank analog
 MTB > # of the Tukey-Kramer test statistic is about 2.582.  It can be concluded
 MTB > # that the (approx.) p-value exceeds 0.10.
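
Both the K-W statistic and the rank analog of the Tukey-Kramer statistic can be computed
from the average ranks.  The Python sketch below is an illustration only.  It assumes the
rank analog uses the standard error sqrt( N(N+1)/24 * (1/n_i + 1/n_j) ), with no correction
for ties; this reproduces the value of about 2.582.  The critical value 2.902 (upper 10%
point of the studentized range for 3 groups and infinite degrees of freedom) comes from
standard tables and is not given above.

```python
import math
from itertools import combinations

n = {1: 20, 2: 18, 3: 19}
avg_rank = {1: 32.15, 2: 32.42, 3: 22.45}  # average ranks quoted above
N = sum(n.values())

# Kruskal-Wallis H (no tie correction) and its chi-square (2 df) p-value.
h_stat = 12 / (N * (N + 1)) * sum(
    n[g] * (avg_rank[g] - (N + 1) / 2) ** 2 for g in n
)
p_asymptotic = math.exp(-h_stat / 2)

def rank_tukey_kramer(i, j):
    """Studentized difference in average ranks for groups i and j."""
    se = math.sqrt(N * (N + 1) / 24 * (1 / n[i] + 1 / n[j]))
    return abs(avg_rank[i] - avg_rank[j]) / se

q_stat = max(rank_tukey_kramer(i, j) for i, j in combinations(n, 2))

crit_010 = 2.902  # from standard studentized range tables; not in the source
print(round(h_stat, 2), round(q_stat, 3), q_stat < crit_010)
```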

****************************************************************************************
                           *** StatXact ***
Although StatXact doesn't do this test, it can be of some help.  If the data values are
in Var1 and the indicators of group/population are in Var2, one can put the ranks of the
data values in Var3 using 
   DataEditor > Compute Scores...
Click Var1 into the Response box, type Var3 in the Target Variable box.  Make sure that 
Wilcoxon (Mid-Rank) is selected under Score, and click OK.  Now
  Basic_Statistics > Descriptive Statistics...
can be used to obtain the average ranks.  Click Var3 into the Selected Variables box, and
click Var2 into the By Variable 1 box.  (This will result in summaries for each of the 3
groups.)  Under Central Tendency click to select Mean, and under Summary click to select Sum.
Then click OK.  Using a calculator to divide the sums by the sample sizes avoids rounding
error in the reported means (which correspond to the average ranks).
                              
****************************************************************************************

 MTB > # Doing M-W tests on all pairs of the three samples results in
 MTB > # the following values of u_ij:
 MTB > #   u_12 = 179,
 MTB > #   u_13 = 254,
 MTB > # & u_23 = 231.5.
 MTB > # From these it can be obtained that the value of the Steel-Dwass
 MTB > # test statistic is about 2.600.  It can be concluded (using tables
 MTB > # of critical values from studentized range distributions) that the
 MTB > # (approx.) p-value exceeds 0.10 (since the test statistic value is
 MTB > # less than the 0.10 critical value).
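
The Steel-Dwass value can be recomputed from the three u_ij values.  The Python sketch below
is an illustration only.  It assumes the statistic is the largest value of
sqrt(2) * |u_ij - n_i n_j / 2| / sqrt( n_i n_j (n_i + n_j + 1) / 12 ), with no correction for
ties; under that assumption the result agrees with the 2.600 quoted above.

```python
import math

n = {1: 20, 2: 18, 3: 19}
u = {(1, 2): 179, (1, 3): 254, (2, 3): 231.5}  # M-W statistics quoted above

def steel_dwass_term(i, j):
    """Standardized M-W statistic for samples i and j, scaled by sqrt(2)."""
    mean_u = n[i] * n[j] / 2
    var_u = n[i] * n[j] * (n[i] + n[j] + 1) / 12  # no tie correction
    return math.sqrt(2) * abs(u[(i, j)] - mean_u) / math.sqrt(var_u)

terms = {pair: steel_dwass_term(*pair) for pair in u}
sd_stat = max(terms.values())
largest_pair = max(terms, key=terms.get)

print(largest_pair, round(sd_stat, 3))
```

The largest term comes from samples 2 and 3, which matches the pattern in the average ranks.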

**********************************************************************************************
                               *** StatXact ***

Although StatXact doesn't do this test, it can be of some help.  In the CaseData editor, copy
and paste values in Var1 (assuming that's where the data values are) and Var2 (assuming that's
where the indicators of group/population are) to create 6 new columns (Vars).  In the first two
new columns, copy and paste the data values and group indicators for samples 1 and 2.  In the
next two columns, copy and paste the data values and group indicators for samples 1 and 3.  And
then in the last two new columns, copy and paste the data values and group indicators for samples
2 and 3.  Then use
   Nonparametrics > Two Independent Samples > Wilcoxon-Mann-Whitney...
to do a Wilcoxon rank sum test on samples 1 and 2.  In the output, the value under Observed is the
sum of the ranks for sample 1.  To get the value of the M-W U statistic, subtract off n_1(n_1 + 1)/2.
Proceed in a similar manner to get the other two M-W statistic values needed for the Steel-Dwass test,
remembering to subtract off n_2(n_2 + 1)/2 instead of n_1(n_1 + 1)/2 when you do the test on samples  
2 and 3.
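
The conversion from a rank sum to a M-W U statistic is easy to check on a small example.
The Python sketch below is an illustration only; the data in toy_x and toy_y are made up
(they are NOT the mice data), and the function names are hypothetical.  It computes U both
by subtracting m(m + 1)/2 from the midrank sum of the first sample and by direct counting.

```python
def midranks(values):
    """Return midranks (average ranks for ties) in the original order."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # average of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def mann_whitney_u(first, second):
    """U for the first sample: its rank sum minus m(m + 1)/2."""
    ranks = midranks(list(first) + list(second))
    m = len(first)
    return sum(ranks[:m]) - m * (m + 1) / 2

def mann_whitney_u_by_counting(first, second):
    """Count pairs with x > y, scoring ties as 1/2."""
    return sum((x > y) + 0.5 * (x == y) for x in first for y in second)

toy_x = [1.5, 3.0, 4.2]        # made-up values
toy_y = [2.0, 3.0, 5.1, 6.3]   # made-up values
u_toy = mann_whitney_u(toy_x, toy_y)
print(u_toy, mann_whitney_u_by_counting(toy_x, toy_y))
```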

**********************************************************************************************
                               *** StatXact ***

Other tests can be done with StatXact.  One can do the normal scores test described in 
the class notes (using van der Waerden scores) using
   Nonparametrics > K Independent Samples > Normal-Scores...
Using the Monte Carlo option (with 1000000 Monte Carlo trials) one gets a p-value of about 
0.15.  (The distribution skewness makes the K-W test a better performer.  But the normal 
scores test yields a smaller p-value than the one-way ANOVA F test.  (The extreme observations
resulting from the skewness make the F test a bit conservative.))

One can also use Savage scores, using
   Nonparametrics > K Independent Samples > Savage...
Using the Monte Carlo option (with 1000000 Monte Carlo trials) one gets a p-value of about 
0.15.  (I'll confess that I had expected the Savage scores test to yield a smaller p-value
since it typically does well for moderately skewed distributions.)

One can do a k-sample permutation test, using
   Nonparametrics > K Independent Samples > ANOVA with General Scores...
Using the Monte Carlo option (with 1000000 Monte Carlo trials) one gets a p-value of about 
0.20 (close to the ANOVA F test result).
 
To do a percentile modified rank test one first needs to create the desired scores.  Suppose
we want to use 
    -24, -23, -22,  ..., -3, -2, -1, 0, 0, 0, ..., 0, 0, 1, 2, 3, ..., 22, 23, 24.
To create these scores, first bring the CaseData editor to the front, and then use
   DataEditor > Compute Scores...
Click Var1 into the Response box and type Var3 in the Target Variable box.  Select Wilcoxon
under Score, and then click OK.  Now in the next column of the CaseData editor (Var4) type
-24 in Var4 next to 1 in Var3, type -23 in Var4 next to 2 in Var3, and so on, eventually
typing -1 in Var4 next to 24 in Var3.  Next type 24 in Var4 next to 57 in Var3, type 23 in
Var4 next to 56 in Var3, ..., and type 1 in Var4 next to 34 in Var3.  Fill in the other Var4 
spots with nine 0s.  (Note: If noninteger midranks are encountered in Var3, also use midranks 
appropriately in Var4.)  Now use
   Nonparametrics > K Independent Samples > ANOVA with General Scores...
Click Var4 (the percentile modified rank scores) in as the Response.  Using the Monte Carlo 
option (with 1000000 Monte Carlo trials) one gets a p-value of about 0.12.
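
The hand-typed scores can be checked against a simple rule.  The Python sketch below is an
illustration only; the function name and its parameters are made up.  It handles integer
ranks only, so any tied observations (noninteger midranks) would still need to be dealt with
by hand as described in the note above.

```python
def percentile_modified_score(rank, n_total=57, n_tail=24):
    """Score for an integer rank: negative in the lower tail, 0 in the middle,
    positive in the upper tail."""
    if rank <= n_tail:
        return rank - (n_tail + 1)            # ranks 1..24 -> -24..-1
    if rank > n_total - n_tail:
        return rank - (n_total - n_tail)      # ranks 34..57 -> 1..24
    return 0                                  # ranks 25..33 -> 0

scores = [percentile_modified_score(r) for r in range(1, 58)]
print(scores[:3], scores.count(0), scores[-3:])
```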

Finally, one can use StatXact to do the extension of the median test described on p. 10-9 of
the class notes.  One can use m = 3, with t_1 = t_2 = t_3 = N/3 = 19.  The midranks in Var3
of the CaseData editor can be used to determine the observations in the lower third, middle
third, and upper third of the ordered combined sample.  These lead to the following 3 by 3
table.

                                lower middle upper
                                ------------------
                   sample 1     |  4 |   8  |  8 |   20
                                ------------------
                   sample 2     |  5 |   5  |  8 |   18
                                ------------------
                   sample 3     | 10 |   6  |  3 |   19
                                ------------------

                                  19    19    19

Use
   File > New...
and then select Table Data from the available file types and click OK.  Then request 1 table
with 3 rows and 3 columns, and click OK.  Next fill in the counts shown above into the 3 by 3
table.  Then use
   Nonparametrics > Unordered R x C Table > Pearson's Chi-square...
Select Exact under Compute and click OK.  The exact p-value is about 0.175 (whereas the chi-square
approximation results in an approximate p-value of about 0.165).
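
The chi-square approximation can be checked directly from the 3 by 3 table.  The Python
sketch below is an illustration only.  It computes Pearson's statistic and uses the closed
form for the upper tail of a chi-square distribution with 4 degrees of freedom,
exp(-x/2) * (1 + x/2).  It does not attempt the exact p-value, which requires enumerating
tables with the same margins.

```python
import math

# Rows are samples 1-3; columns are lower, middle, upper thirds.
table = [
    [4, 8, 8],
    [5, 5, 8],
    [10, 6, 3],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
N = sum(row_totals)

# Pearson chi-square statistic.
chi_sq = 0.0
for r, row in enumerate(table):
    for c, observed in enumerate(row):
        expected = row_totals[r] * col_totals[c] / N
        chi_sq += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)
assert df == 4  # the closed form below is specific to 4 df
p_asymptotic = math.exp(-chi_sq / 2) * (1 + chi_sq / 2)

print(round(chi_sq, 3), round(p_asymptotic, 3))
```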