eye data analysis


 *** Note: StatXact info is inserted below.


 MTB > # I'll enter the data.

 MTB > set c1
 DATA> 4.6 4.9 5.0 5.7 6.3 6.8 7.4 7.9
 DATA> end
 MTB > set c2
 DATA> 4.7 5.0 5.1 5.8 6.4 6.6 7.1 8.3
 DATA> end
 MTB > set c3
 DATA> 5.6 5.9 6.6 6.7 6.8 7.4 8.3 9.6
 DATA> end
 MTB > set c4
 DATA> 6.0 6.8 8.1 8.4 8.6 8.9 9.8 11.5
 DATA> end
 MTB > name c1 'Age 15' c2 'Age 20' c3 'Age 25' c4 'Age 30'

 MTB > # Now let's look at the data and some summary statistics.

 MTB > print c1-c4
 
  ROW  Age 15  Age 20  Age 25  Age 30
 
    1     4.6     4.7     5.6     6.0
    2     4.9     5.0     5.9     6.8
    3     5.0     5.1     6.6     8.1
    4     5.7     5.8     6.7     8.4
    5     6.3     6.4     6.8     8.6
    6     6.8     6.6     7.4     8.9
    7     7.4     7.1     8.3     9.8
    8     7.9     8.3     9.6    11.5
 
 MTB > dotplot c1-c4;
 SUBC> same.
 
               . :    .   .  .   .   .
           ---+---------+---------+---------+---------+---------+---Age 15  
 
               . ..    .   ..  .       .
           ---+---------+---------+---------+---------+---------+---Age 20  
 
                     . .    .:   .     .        .
           ---+---------+---------+---------+---------+---------+---Age 25  
 
                        .    .        . .. .     .           .
           ---+---------+---------+---------+---------+---------+---Age 30  
            4.5       6.0       7.5       9.0      10.5      12.0
 
 MTB > desc c1-c4
 
                 N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
 Age 15          8    6.075    6.000    6.075    1.226    0.433
 Age 20          8    6.125    6.100    6.125    1.221    0.432
 Age 25          8    7.113    6.750    7.113    1.308    0.462
 Age 30          8    8.512    8.500    8.512    1.697    0.600
 
               MIN      MAX       Q1       Q3
 Age 15      4.600    7.900    4.925    7.250
 Age 20      4.700    8.300    5.025    6.975
 Age 25      5.600    9.600    6.075    8.075
 Age 30      6.000   11.500    7.125    9.575
 
 MTB > # Before trying the J-T test and the rank test in the spirit of the
 MTB > # Abelson-Tukey test (described on p. 88 of Beyond ANOVA), I'm curious
 MTB > # as to what the ANOVA F test and the K-W test will give as p-values.

 MTB > stack c1-c4 c5;
 SUBC> subs c6.
 MTB > name c5 'min dist' c6 'age gr'
 MTB > oneway c5 c6
 
 ANALYSIS OF VARIANCE ON min dist
 SOURCE     DF        SS        MS        F        p
 age gr      3     31.31     10.44     5.50    0.004
 ERROR      28     53.09      1.90
 TOTAL      31     84.40
                                    INDIVIDUAL 95 PCT CI'S FOR MEAN
                                    BASED ON POOLED STDEV
  LEVEL      N      MEAN     STDEV  --------+---------+---------+--------
      1      8     6.075     1.226  (--------*-------)
      2      8     6.125     1.221   (-------*-------)
      3      8     7.113     1.308           (-------*--------)
      4      8     8.512     1.697                       (-------*-------)
                                    --------+---------+---------+--------
 POOLED STDEV =    1.377                  6.0       7.2       8.4

 MTB > krus c5 c6
 
 LEVEL    NOBS    MEDIAN  AVE. RANK   Z VALUE
     1       8     6.000       11.4     -1.78
     2       8     6.100       11.8     -1.63
     3       8     6.750       17.8      0.46
     4       8     8.500       25.0      2.96
 OVERALL    32                 16.5
 
 H = 11.11  d.f. = 3  p = 0.011
 H = 11.13  d.f. = 3  p = 0.011 (adj. for ties)
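
As a cross-check on the K-W output above, here is a minimal Python sketch that recomputes H from the midranks. It is not Minitab's code; it assumes the usual formula H = 12/(N(N+1)) * sum of n_i*(Rbar_i - (N+1)/2)^2, the usual tie correction (dividing by 1 - sum(t^3 - t)/(N^3 - N)), and the closed form for the upper tail of a chi-square distribution with 3 d.f.  The names (groups, rank_of, chi2_sf_3df, and so on) are mine.

```python
import math
from collections import Counter

# Data as entered in the session (Age 15, 20, 25, 30).
groups = [
    [4.6, 4.9, 5.0, 5.7, 6.3, 6.8, 7.4, 7.9],
    [4.7, 5.0, 5.1, 5.8, 6.4, 6.6, 7.1, 8.3],
    [5.6, 5.9, 6.6, 6.7, 6.8, 7.4, 8.3, 9.6],
    [6.0, 6.8, 8.1, 8.4, 8.6, 8.9, 9.8, 11.5],
]
pooled = [x for g in groups for x in g]
N = len(pooled)

# Midrank of each distinct value: tied values share the average rank.
order = sorted(pooled)
rank_of = {v: order.index(v) + (order.count(v) + 1) / 2 for v in set(pooled)}

# Unadjusted Kruskal-Wallis statistic.
h = 12 / (N * (N + 1)) * sum(
    len(g) * (sum(rank_of[x] for x in g) / len(g) - (N + 1) / 2) ** 2
    for g in groups
)

# Tie adjustment: t is the size of each group of tied values.
tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
h_adj = h / (1 - tie_term / (N ** 3 - N))


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 d.f."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_adj = chi2_sf_3df(h_adj)
print(round(h, 2), round(h_adj, 2), round(p_adj, 3))
```

The closed-form tail only works for 3 d.f. (4 groups); with a different number of groups a general chi-square routine would be needed.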
 
 MTB > # To get the M-W U statistic values needed for the J-T test, I can do a
 MTB > # bunch of M-W tests.  The output gives W, the sum of the ranks for the 1st
 MTB > # sample.  Subtracting 1+2+3+4+5+6+7+8 = 36 from W gives the number of
 MTB > # comparisons (out of 8 times 8 = 64) for which the observation in the 1st
 MTB > # sample is larger than the observation in the 2nd sample, with each tie
 MTB > # counting as 1/2.
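
The relationship between the rank sum and the count of "larger" comparisons can be checked directly.  The following is a small Python sketch (not part of the Minitab session) that computes U both ways for the Age 20 vs. Age 15 comparison; the function names are my own, and ties are counted as 1/2.

```python
# Data as entered in the session above.
age15 = [4.6, 4.9, 5.0, 5.7, 6.3, 6.8, 7.4, 7.9]
age20 = [4.7, 5.0, 5.1, 5.8, 6.4, 6.6, 7.1, 8.3]


def midranks(values):
    """Midranks of `values`: tied values get the average of their ranks."""
    order = sorted(values)
    return [order.index(v) + (order.count(v) + 1) / 2 for v in values]


def u_from_rank_sum(first, second):
    """Rank sum W of `first` in the combined sample, minus n(n+1)/2."""
    ranks = midranks(first + second)
    n = len(first)
    w = sum(ranks[:n])
    return w - n * (n + 1) / 2


def u_by_counting(first, second):
    """Number of pairs with first > second; each tie counts 1/2."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in first
        for y in second
    )


u_w = u_from_rank_sum(age20, age15)   # uses W = 70.5 and subtracts 36
u_c = u_by_counting(age20, age15)     # direct count over the 64 pairs
print(u_w, u_c)
```

Both routes should agree, which is why subtracting 36 from each W in the output that follows is enough to get the U values.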

 MTB > mann c2 c1
 
 Mann-Whitney Confidence Interval and Test
 
 Age 20     N =   8     Median =       6.100
 Age 15     N =   8     Median =       6.000
 Point estimate for ETA1-ETA2 is       0.100
 95.9 pct c.i. for ETA1-ETA2 is (-1.500,1.500)
 W = 70.5
 Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.8336
 The test is significant at 0.8335 (adjusted for ties)
 
 Cannot reject at alpha = 0.05
 
 MTB > mann c3 c1
 
 Mann-Whitney Confidence Interval and Test
 
 Age 25     N =   8     Median =       6.750
 Age 15     N =   8     Median =       6.000
 Point estimate for ETA1-ETA2 is       1.000
 95.9 pct c.i. for ETA1-ETA2 is (-0.600,2.400)
 W = 81.0
 Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.1893
 The test is significant at 0.1886 (adjusted for ties)
 
 Cannot reject at alpha = 0.05
 
 MTB > mann c4 c1
 
 Mann-Whitney Confidence Interval and Test
 
 Age 30     N =   8     Median =       8.500
 Age 15     N =   8     Median =       6.000
 Point estimate for ETA1-ETA2 is       2.350
 95.9 pct c.i. for ETA1-ETA2 is (0.700,4.001)
 W = 93.5
 Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.0087
 The test is significant at 0.0086 (adjusted for ties)
 
 MTB > mann c3 c2
 
 Mann-Whitney Confidence Interval and Test
 
 Age 25     N =   8     Median =       6.750
 Age 20     N =   8     Median =       6.100
 Point estimate for ETA1-ETA2 is       0.900
 95.9 pct c.i. for ETA1-ETA2 is (-0.500,2.400)
 W = 83.0
 Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.1278
 The test is significant at 0.1272 (adjusted for ties)
 
 Cannot reject at alpha = 0.05
 
 MTB > mann c4 c2
 
 Mann-Whitney Confidence Interval and Test
 
 Age 30     N =   8     Median =       8.500
 Age 20     N =   8     Median =       6.100
 Point estimate for ETA1-ETA2 is       2.300
 95.9 pct c.i. for ETA1-ETA2 is (0.600,3.900)
 W = 93.0
 Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.0101
 
 MTB > mann c4 c3
 
 Mann-Whitney Confidence Interval and Test
 
 Age 30     N =   8     Median =       8.500
 Age 25     N =   8     Median =       6.750
 Point estimate for ETA1-ETA2 is       1.500
 95.9 pct c.i. for ETA1-ETA2 is (-0.600,3.000)
 W = 85.5
 Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.0742
 The test is significant at 0.0740 (adjusted for ties)
 
 Cannot reject at alpha = 0.05
 
 MTB > # We have that
 MTB > #    u_21 = 34.5,   u_31 = 45,     u_41 = 57.5,
 MTB > #                   u_32 = 47,     u_42 = 57,
 MTB > #                                  u_43 = 49.5,
 MTB > # which gives us that b = 290.5 (where b is the observed
 MTB > # value of the JT test statistic).  Since the table in
 MTB > # G&C doesn't have values for 4 samples of size 8, I'll
 MTB > # use the normal approximation to get a p-value.  Under
 MTB > # the null hypothesis of identical distributions, the
 MTB > # expected value is 192 and the variance is 2656/3.
 MTB > # Since we reject for large values of b, using a continuity
 MTB > # correction, the approximate p-value is
 MTB > #    1 - Phi( ( 290.5 + 1/2 - 192 )/sqrt( 2656/3 ) )
 MTB > #  = Phi( (192 - 1/2 - 290.5 )/sqrt( 2656/3 ) ).

 MTB > let k1 = (192 - 1/2 - 290.5)/sqrt(2656/3)
 MTB > cdf k1 k2;
 SUBC> norm 0 1.
 MTB > name k1 'z' k2 'p-value'
 MTB > print k1 k2
 
 z        -3.32722
 p-value  0.000438631

 MTB > # So the (approx) p-value is about 0.0004 (about 10 times smaller
 MTB > # than the p-value from the nondirectional F test).
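
The whole J-T calculation can be reproduced in a few lines.  This is a Python sketch under the same assumptions as the session: no tie adjustment to the variance, and the continuity correction applied in the same form as in the k1 formula above.  The null mean and variance use the standard no-ties formulas (N^2 - sum n_i^2)/4 and (N^2(2N+3) - sum n_i^2(2n_i+3))/72; the variable names are mine.

```python
import math

# Data as entered in the session (Age 15, 20, 25, 30).
groups = [
    [4.6, 4.9, 5.0, 5.7, 6.3, 6.8, 7.4, 7.9],
    [4.7, 5.0, 5.1, 5.8, 6.4, 6.6, 7.1, 8.3],
    [5.6, 5.9, 6.6, 6.7, 6.8, 7.4, 8.3, 9.6],
    [6.0, 6.8, 8.1, 8.4, 8.6, 8.9, 9.8, 11.5],
]


def u_count(later, earlier):
    """Pairs with later > earlier; each tie counts 1/2."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in later
        for y in earlier
    )


k = len(groups)
# J-T statistic: sum of u_ji over all pairs of groups with i < j.
b = sum(u_count(groups[j], groups[i]) for i in range(k) for j in range(i + 1, k))

sizes = [len(g) for g in groups]
N = sum(sizes)
null_mean = (N ** 2 - sum(n ** 2 for n in sizes)) / 4
null_var = (N ** 2 * (2 * N + 3) - sum(n ** 2 * (2 * n + 3) for n in sizes)) / 72

# Same form as the session's k1, then Phi(z) via the complementary error function.
z = (null_mean - 0.5 - b) / math.sqrt(null_var)
p_value = 0.5 * math.erfc(-z / math.sqrt(2))
print(b, null_mean, null_var, z, p_value)
```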

*************************************************************************************
                           *** StatXact info ***

Having put the data values in Var1 (of the CaseData editor) and the group indicators
(8 1s, followed by 8 2s, followed by 8 3s, followed by 8 4s) in Var2, use
   Nonparametrics > K Independent Samples > Jonckheere-Terpstra...
Click Var1 into the Response box, Var2 into the Population box, select Exact under
Compute, and click OK.

The test statistic value and mean match what I have above, but StatXact's standard
deviation includes an adjustment for ties that I didn't bother to make, and StatXact
doesn't use a continuity correction.  StatXact's asymptotic p-value is about 0.00046,
which is close to what I got with Minitab.  But the preferable exact p-value is about
0.00034.
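
Without StatXact, one rough way to approximate the exact p-value is Monte Carlo: shuffle the 32 observations among the four groups many times and see how often the J-T statistic is at least as large as the observed 290.5.  The sketch below is only that, a simulation sketch in Python; it is not StatXact's algorithm, and the seed, the number of shuffles, and the (hits + 1)/(n_perm + 1) estimator are my own choices.  With only a few thousand shuffles the estimate of such a small p-value is quite noisy.

```python
import random

# All 32 observations in group order (Age 15, 20, 25, 30), 8 per group.
pooled = [
    4.6, 4.9, 5.0, 5.7, 6.3, 6.8, 7.4, 7.9,
    4.7, 5.0, 5.1, 5.8, 6.4, 6.6, 7.1, 8.3,
    5.6, 5.9, 6.6, 6.7, 6.8, 7.4, 8.3, 9.6,
    6.0, 6.8, 8.1, 8.4, 8.6, 8.9, 9.8, 11.5,
]
sizes = [8, 8, 8, 8]


def jt_statistic(values, sizes):
    """J-T statistic for `values` split into consecutive groups of `sizes`."""
    groups, start = [], 0
    for n in sizes:
        groups.append(values[start:start + n])
        start += n
    total = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[j]:
                for y in groups[i]:
                    if x > y:
                        total += 1.0
                    elif x == y:
                        total += 0.5   # ties count 1/2
    return total


observed = jt_statistic(pooled, sizes)

rng = random.Random(12345)   # fixed seed so the run is repeatable
n_perm = 5000
hits = 0
shuffled = pooled[:]
for _ in range(n_perm):
    rng.shuffle(shuffled)
    if jt_statistic(shuffled, sizes) >= observed:
        hits += 1

# Add-one estimator so the estimate is never exactly 0.
p_mc = (hits + 1) / (n_perm + 1)
print(observed, hits, p_mc)
```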

*************************************************************************************

 MTB > # Now I'll do an approximate version of the rank test in the spirit of the 
 MTB > # Abelson-Tukey test (described on p. 88 of Beyond ANOVA).
 MTB > rank c5 c7
 MTB > unstack c7 c11-c14;
 SUBC> subs c6.
 MTB > print c11-c14
 
  ROW    C11    C12    C13   C14
 
    1    1.0    2.0    7.0    11
    2    3.0    4.5   10.0    18
    3    4.5    6.0   14.5    24
    4    8.0    9.0   16.0    27
    5   12.0   13.0   18.0    28
    6   18.0   14.5   21.5    29
    7   21.5   20.0   25.5    31
    8   23.0   25.5   30.0    32
 
 MTB > # These columns contain the midranks for the four age groups.
 MTB > let k11 = mean(c11)
 MTB > let k12 = mean(c12)
 MTB > let k13 = mean(c13)
 MTB > let k14 = mean(c14)
 MTB > let k15 = k11 + 2*k12 + 3*k13 + 4*k14
 MTB > # k15 contains the value of the L statistic.
 MTB > print k11-k15
 
 K11      11.3750
 K12      11.8125
 K13      17.8125
 K14      25.0000
 K15      188.437

 MTB > # Under the null hypothesis, the mean of the statistic is 165
 MTB > # and the variance is 55.  So the approximate p-value from
 MTB > # an upper-tailed test is
 MTB > #    1 - Phi( ( 188.437 - 165 )/sqrt(55) )
 MTB > #  = Phi( (165 - 188.437)/sqrt(55) ).

 MTB > let k1 = (165 - k15)/sqrt(55)
 MTB > cdf k1 k2;
 SUBC> norm 0 1.
 MTB > print k1 k2
 
 z        -3.16031
 p-value  0.000788093
 
 MTB > # So the approximate p-value is about 0.0008.
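
Here is a Python sketch of the same rank-test calculation, including where the null mean of 165 and variance of 55 come from.  It assumes the no-ties permutation variance N(N+1)/12 times the sum of (a_j - abar)^2, where a_j = i/n_i is the score attached to each observation in group i (so that the sum of a_j times rank equals L).  The names are mine, and no tie adjustment is made, matching what I did in Minitab.

```python
import math

# Data as entered in the session (Age 15, 20, 25, 30).
groups = [
    [4.6, 4.9, 5.0, 5.7, 6.3, 6.8, 7.4, 7.9],
    [4.7, 5.0, 5.1, 5.8, 6.4, 6.6, 7.1, 8.3],
    [5.6, 5.9, 6.6, 6.7, 6.8, 7.4, 8.3, 9.6],
    [6.0, 6.8, 8.1, 8.4, 8.6, 8.9, 9.8, 11.5],
]
pooled = [x for g in groups for x in g]
N = len(pooled)
k = len(groups)

# Midranks in the pooled sample (tied values share the average rank).
order = sorted(pooled)
rank_of = {v: order.index(v) + (order.count(v) + 1) / 2 for v in set(pooled)}

# L = 1*Rbar_1 + 2*Rbar_2 + 3*Rbar_3 + 4*Rbar_4
mean_ranks = [sum(rank_of[x] for x in g) / len(g) for g in groups]
L = sum((i + 1) * r for i, r in enumerate(mean_ranks))

# Null mean: every Rbar_i has expectation (N+1)/2.
null_mean = (N + 1) / 2 * sum(range(1, k + 1))

# Null variance (no ties): N(N+1)/12 * sum (a_j - abar)^2 with a_j = i/n_i.
scores = [(i + 1) / len(g) for i, g in enumerate(groups) for _ in g]
abar = sum(scores) / N
null_var = N * (N + 1) / 12 * sum((a - abar) ** 2 for a in scores)

z = (null_mean - L) / math.sqrt(null_var)
p_value = 0.5 * math.erfc(-z / math.sqrt(2))   # Phi(z)
print(mean_ranks, L, null_mean, null_var, z, p_value)
```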

 MTB > save 'eyes'
 Saving worksheet in file: eyes.MTW

*************************************************************************************
                           *** StatXact info ***

This test isn't on StatXact's menus.  But we can "trick" StatXact into doing the test.

Having put the data values in Var1 (of the CaseData editor) and the group indicators
(8 1s, followed by 8 2s, followed by 8 3s, followed by 8 4s) in Var2, we can set up
two additional columns.  Use 
   DataEditor > Compute Scores...
to put the Wilcoxon ranks/midranks into Var3.  Then use
   DataEditor > Transform Variable...
Type Var4 into the Target Variable box.  In the big box after the = sign, type Var2/8
and click OK.  (Note: In general, the value in Var4 for each observation in group i needs
to be i/n_i, where n_i is the size of group i.  E.g., if the 1st group had 8 observations,
the 2nd group 5 observations, and the 3rd (and last) group 4 observations, Var4 should
have 8 values of 1/8 = 0.125, followed by 5 values of 2/5 = 0.2, followed by 4 values
of 3/4 = 0.75.)

Now use
   Nonparametrics > K Independent Samples > Linear-by-linear Association...
Click Var3 into the Response box, Var4 into the Population box, select Exact under
Compute, and click OK.

The test statistic value and mean match what I have above, but StatXact's standard
deviation includes an adjustment for ties that I didn't bother to make.  StatXact's
asymptotic p-value is about 0.00078, which is close to what I got using Minitab.  But the
preferable exact p-value is about 0.00045.

*************************************************************************************