*** Note: StatXact info is inserted below.

MTB > # I'll enter the data.
MTB > set c1
DATA> 4.6 4.9 5.0 5.7 6.3 6.8 7.4 7.9
DATA> end
MTB > set c2
DATA> 4.7 5.0 5.1 5.8 6.4 6.6 7.1 8.3
DATA> end
MTB > set c3
DATA> 5.6 5.9 6.6 6.7 6.8 7.4 8.3 9.6
DATA> end
MTB > set c4
DATA> 6.0 6.8 8.1 8.4 8.6 8.9 9.8 11.5
DATA> end
MTB > name c1 'Age 15' c2 'Age 20' c3 'Age 25' c4 'Age 30'
MTB > # Now let's look at the data and some summary statistics.
MTB > print c1-c4

 ROW   Age 15   Age 20   Age 25   Age 30

   1      4.6      4.7      5.6      6.0
   2      4.9      5.0      5.9      6.8
   3      5.0      5.1      6.6      8.1
   4      5.7      5.8      6.7      8.4
   5      6.3      6.4      6.8      8.6
   6      6.8      6.6      7.4      8.9
   7      7.4      7.1      8.3      9.8
   8      7.9      8.3      9.6     11.5

MTB > dotplot c1-c4;
SUBC> same.

        . :    .   .  .   .   .
    ---+---------+---------+---------+---------+---------+---Age 15
        . ..    .   ..  .       .
    ---+---------+---------+---------+---------+---------+---Age 20
              . .    .:   .     .        .
    ---+---------+---------+---------+---------+---------+---Age 25
                 .    .        . .. .     .           .
    ---+---------+---------+---------+---------+---------+---Age 30
      4.5       6.0       7.5       9.0      10.5      12.0

MTB > desc c1-c4

              N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
Age 15        8    6.075    6.000    6.075    1.226    0.433
Age 20        8    6.125    6.100    6.125    1.221    0.432
Age 25        8    7.113    6.750    7.113    1.308    0.462
Age 30        8    8.512    8.500    8.512    1.697    0.600

            MIN      MAX       Q1       Q3
Age 15    4.600    7.900    4.925    7.250
Age 20    4.700    8.300    5.025    6.975
Age 25    5.600    9.600    6.075    8.075
Age 30    6.000   11.500    7.125    9.575

MTB > # Before trying the J-T test and the rank test in the spirit of the
MTB > # Abelson-Tukey test (described on p. 88 of Beyond ANOVA), I'm curious
MTB > # as to what the ANOVA F test and the K-W test will give as p-values.
MTB > stack c1-c4 c5;
SUBC> subs c6.
MTB > name c5 'min dist' c6 'age gr'
MTB > oneway c5 c6

ANALYSIS OF VARIANCE ON min dist
SOURCE     DF        SS        MS        F        p
age gr      3     31.31     10.44     5.50    0.004
ERROR      28     53.09      1.90
TOTAL      31     84.40
                                   INDIVIDUAL 95 PCT CI'S FOR MEAN
                                   BASED ON POOLED STDEV
 LEVEL      N      MEAN     STDEV  --------+---------+---------+--------
     1      8     6.075     1.226  (--------*-------)
     2      8     6.125     1.221   (-------*-------)
     3      8     7.113     1.308           (-------*--------)
     4      8     8.512     1.697                       (-------*-------)
                                   --------+---------+---------+--------
POOLED STDEV =    1.377                   6.0       7.2       8.4

MTB > krus c5 c6

LEVEL    NOBS    MEDIAN  AVE. RANK   Z VALUE
    1       8     6.000       11.4     -1.78
    2       8     6.100       11.8     -1.63
    3       8     6.750       17.8      0.46
    4       8     8.500       25.0      2.96
OVERALL    32                 16.5

H = 11.11  d.f. = 3  p = 0.011
H = 11.13  d.f. = 3  p = 0.011 (adj. for ties)

MTB > # To get the M-W U statistic values needed for the J-T test, I can do a
MTB > # bunch of M-W tests. The output gives the sum of the ranks for the 1st
MTB > # sample, but by subtracting 1+2+3+4+5+6+7+8 = 36 from the sum of the ranks,
MTB > # I'll get the number of comparisons (out of 8 times 8 = 64) for which the
MTB > # observation in the 1st sample is larger than the observation in the 2nd
MTB > # sample.
MTB > mann c2 c1

Mann-Whitney Confidence Interval and Test

Age 20     N =   8     Median =       6.100
Age 15     N =   8     Median =       6.000
Point estimate for ETA1-ETA2 is       0.100
95.9 pct c.i. for ETA1-ETA2 is (-1.500,1.500)
W = 70.5
Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.8336
The test is significant at 0.8335 (adjusted for ties)

Cannot reject at alpha = 0.05

MTB > mann c3 c1

Mann-Whitney Confidence Interval and Test

Age 25     N =   8     Median =       6.750
Age 15     N =   8     Median =       6.000
Point estimate for ETA1-ETA2 is       1.000
95.9 pct c.i. for ETA1-ETA2 is (-0.600,2.400)
W = 81.0
Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.1893
The test is significant at 0.1886 (adjusted for ties)

Cannot reject at alpha = 0.05

MTB > mann c4 c1

Mann-Whitney Confidence Interval and Test

Age 30     N =   8     Median =       8.500
Age 15     N =   8     Median =       6.000
Point estimate for ETA1-ETA2 is       2.350
95.9 pct c.i. for ETA1-ETA2 is (0.700,4.001)
W = 93.5
Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.0087
The test is significant at 0.0086 (adjusted for ties)

MTB > mann c3 c2

Mann-Whitney Confidence Interval and Test

Age 25     N =   8     Median =       6.750
Age 20     N =   8     Median =       6.100
Point estimate for ETA1-ETA2 is       0.900
95.9 pct c.i. for ETA1-ETA2 is (-0.500,2.400)
W = 83.0
Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.1278
The test is significant at 0.1272 (adjusted for ties)

Cannot reject at alpha = 0.05

MTB > mann c4 c2

Mann-Whitney Confidence Interval and Test

Age 30     N =   8     Median =       8.500
Age 20     N =   8     Median =       6.100
Point estimate for ETA1-ETA2 is       2.300
95.9 pct c.i. for ETA1-ETA2 is (0.600,3.900)
W = 93.0
Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.0101

MTB > mann c4 c3

Mann-Whitney Confidence Interval and Test

Age 30     N =   8     Median =       8.500
Age 25     N =   8     Median =       6.750
Point estimate for ETA1-ETA2 is       1.500
95.9 pct c.i. for ETA1-ETA2 is (-0.600,3.000)
W = 85.5
Test of ETA1 = ETA2  vs.  ETA1 n.e. ETA2 is significant at 0.0742
The test is significant at 0.0740 (adjusted for ties)

Cannot reject at alpha = 0.05

MTB > # We have that
MTB > #   u_21 = 34.5,  u_31 = 45,  u_41 = 57.5,
MTB > #                 u_32 = 47,  u_42 = 57,
MTB > #                             u_43 = 49.5,
MTB > # which gives us that b = 290.5 (where b is the observed
MTB > # value of the JT test statistic). Since the table in
MTB > # G&C doesn't have values for 4 samples of size 8, I'll
MTB > # use the normal approximation to get a p-value. Under
MTB > # the null hypothesis of identical distributions, the
MTB > # expected value is 192 and the variance is 2656/3.
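
As a cross-check outside the Minitab session, the six pairwise counts, the value of b, and the null mean and variance quoted above can be recomputed with a short Python sketch. The helper names (`mw_count`, `jt_statistic`, `jt_null_moments`) are made up for illustration and are not part of the session. Ties are counted as 1/2, and the variance formula is the usual no-ties one, which is an assumption consistent with the 2656/3 used above.

```python
from itertools import combinations

# Minimum-distance data as entered in the session, ordered by age group.
groups = [
    [4.6, 4.9, 5.0, 5.7, 6.3, 6.8, 7.4, 7.9],    # Age 15
    [4.7, 5.0, 5.1, 5.8, 6.4, 6.6, 7.1, 8.3],    # Age 20
    [5.6, 5.9, 6.6, 6.7, 6.8, 7.4, 8.3, 9.6],    # Age 25
    [6.0, 6.8, 8.1, 8.4, 8.6, 8.9, 9.8, 11.5],   # Age 30
]


def mw_count(later, earlier):
    """Count (later, earlier) pairs with later > earlier; a tie counts 1/2."""
    return sum(1.0 if y > x else 0.5 if y == x else 0.0
               for y in later for x in earlier)


def jt_statistic(samples):
    """Sum the pairwise counts over all pairs of groups i < j."""
    return sum(mw_count(samples[j], samples[i])
               for i, j in combinations(range(len(samples)), 2))


def jt_null_moments(sizes):
    """Null mean and variance of the J-T statistic (no adjustment for ties)."""
    n_total = sum(sizes)
    mean = (n_total ** 2 - sum(n ** 2 for n in sizes)) / 4
    var = (n_total ** 2 * (2 * n_total + 3)
           - sum(n ** 2 * (2 * n + 3) for n in sizes)) / 72
    return mean, var


u_21 = mw_count(groups[1], groups[0])
b = jt_statistic(groups)
mean_b, var_b = jt_null_moments([len(g) for g in groups])
print(u_21, b, mean_b, var_b)
```

With the data above, this should give u_21 = 34.5, b = 290.5, a mean of 192, and a variance of 2656/3 (about 885.33), matching the values taken from the Mann-Whitney output.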
MTB > # Since we reject for large values of b, using a continuity
MTB > # correction, the approximate p-value is
MTB > #   1 - Phi( ( 290.5 + 1/2 - 192 )/sqrt( 2656/3 ) )
MTB > #     = Phi( ( 192 - 1/2 - 290.5 )/sqrt( 2656/3 ) ).
MTB > let k1 = (192 - 1/2 - 290.5)/sqrt(2656/3)
MTB > cdf k1 k2;
SUBC> norm 0 1.
MTB > name k1 'z' k2 'p-value'
MTB > print k1 k2

z          -3.32722
p-value    0.000438631

MTB > # So the (approx) p-value is about 0.0004 (about 10 times smaller
MTB > # than the p-value from the nondirectional F test).

*************************************************************************************
*** StatXact info ***

Having put the data values in Var1 (of the CaseData editor) and the group
indicators (8 1s, followed by 8 2s, followed by 8 3s, followed by 8 4s) in
Var2, use
  Nonparametrics > K Independent Samples > Jonckheere-Terpstra...
Click Var1 into the Response box, Var2 into the Population box, select Exact
under Compute, and click OK.

The test statistic value and mean match what I have above, but the standard
deviation makes an adjustment for ties that I didn't bother to do ... and
StatXact doesn't use a continuity correction. StatXact's asymptotic p-value is
about 0.00046, which is close to what I got with Minitab. But the preferable
exact p-value is about 0.00034.
*************************************************************************************

MTB > # Now I'll do an approximate version of the rank test in the spirit of the
MTB > # Abelson-Tukey test (described on p. 88 of Beyond ANOVA).
MTB > rank c5 c7
MTB > unstack c7 c11-c14;
SUBC> subs c6.
MTB > print c11-c14

 ROW    C11    C12    C13    C14

   1    1.0    2.0    7.0     11
   2    3.0    4.5   10.0     18
   3    4.5    6.0   14.5     24
   4    8.0    9.0   16.0     27
   5   12.0   13.0   18.0     28
   6   18.0   14.5   21.5     29
   7   21.5   20.0   25.5     31
   8   23.0   25.5   30.0     32

MTB > # These columns contain the midranks for the four age groups.
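
For readers without Minitab, the midranks printed above can be reproduced with a small Python sketch that mirrors the stack, rank, and unstack steps. The function name `midranks` and the variable names are illustrative only, and the sketch assumes equal group sizes of 8, as in this data set.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1   # average of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


groups = [
    [4.6, 4.9, 5.0, 5.7, 6.3, 6.8, 7.4, 7.9],    # Age 15
    [4.7, 5.0, 5.1, 5.8, 6.4, 6.6, 7.1, 8.3],    # Age 20
    [5.6, 5.9, 6.6, 6.7, 6.8, 7.4, 8.3, 9.6],    # Age 25
    [6.0, 6.8, 8.1, 8.4, 8.6, 8.9, 9.8, 11.5],   # Age 30
]

stacked = [v for g in groups for v in g]      # plays the role of `stack`
ranks = midranks(stacked)                     # plays the role of `rank`
rank_groups = [ranks[i:i + 8] for i in range(0, len(ranks), 8)]  # `unstack`

for row in zip(*rank_groups):
    print(*row)
```

Each printed row should correspond to a row of C11-C14 in the session output.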
MTB > let k11 = mean(c11)
MTB > let k12 = mean(c12)
MTB > let k13 = mean(c13)
MTB > let k14 = mean(c14)
MTB > let k15 = k11 + 2*k12 + 3*k13 + 4*k14
MTB > # k15 contains the value of the L statistic.
MTB > print k11-k15

K11      11.3750
K12      11.8125
K13      17.8125
K14      25.0000
K15      188.437

MTB > # Under the null hypothesis, the mean of the statistic is 165
MTB > # and the variance is 55. So the approximate p-value from
MTB > # an upper-tailed test is
MTB > #   1 - Phi( ( 188.437 - 165 )/sqrt(55) )
MTB > #     = Phi( ( 165 - 188.437 )/sqrt(55) ).
MTB > let k1 = (165 - k15)/sqrt(55)
MTB > cdf k1 k2;
SUBC> norm 0 1.
MTB > print k1 k2

z          -3.16031
p-value    0.000788093

MTB > # So the approximate p-value is about 0.0008.
MTB > save 'eyes'

Saving worksheet in file: eyes.MTW

*************************************************************************************
*** StatXact info ***

This test isn't on StatXact's menus. But we can "trick" StatXact into doing
the test.

Having put the data values in Var1 (of the CaseData editor) and the group
indicators (8 1s, followed by 8 2s, followed by 8 3s, followed by 8 4s) in
Var2, we can set up two additional columns. Use
  DataEditor > Compute Scores...
to put the Wilcoxon ranks/midranks into Var3. Then use
  DataEditor > Transform Variable...
Type Var4 into the Target Variable box. In the big box after the = sign, type
Var2/8 and click OK.

(Note: In general, the values in Var4 need to be i/n_i. E.g., if the 1st group
had 8 observations, the 2nd group 5 observations, and the 3rd (and last) group
4 observations, Var4 should have 8 values of 1/8 = 0.125, followed by 5 values
of 2/5 = 0.4, followed by 4 values of 3/4 = 0.75.)

Now use
  Nonparametrics > K Independent Samples > Linear-by-linear Association...
Click Var3 into the Response box, Var4 into the Population box, select Exact
under Compute, and click OK.

The test statistic value and mean match what I have above, but the standard
deviation makes an adjustment for ties that I didn't bother to do. StatXact's
asymptotic p-value is about 0.00078, which is close to what I got using
Minitab. But the preferable exact p-value is about 0.00045.
*************************************************************************************
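
The L statistic and its normal-approximation p-value from the Minitab session can also be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, the midranks are copied from the printed C11-C14 columns, and the variance expression is the standard no-ties formula for a weighted sum of mean ranks, which is an assumption here since the session only quotes the resulting values 165 and 55.

```python
from math import sqrt
from statistics import NormalDist

# Midranks copied from columns C11-C14 of the session output.
rank_groups = [
    [1.0, 3.0, 4.5, 8.0, 12.0, 18.0, 21.5, 23.0],      # Age 15
    [2.0, 4.5, 6.0, 9.0, 13.0, 14.5, 20.0, 25.5],      # Age 20
    [7.0, 10.0, 14.5, 16.0, 18.0, 21.5, 25.5, 30.0],   # Age 25
    [11.0, 18.0, 24.0, 27.0, 28.0, 29.0, 31.0, 32.0],  # Age 30
]
weights = [1, 2, 3, 4]   # coefficients used for k15 in the session

sizes = [len(g) for g in rank_groups]
n_total = sum(sizes)

# L = 1*Rbar_1 + 2*Rbar_2 + 3*Rbar_3 + 4*Rbar_4
mean_ranks = [sum(g) / len(g) for g in rank_groups]
l_stat = sum(w * r for w, r in zip(weights, mean_ranks))

# Null mean and (no-ties) variance of L.
null_mean = (n_total + 1) / 2 * sum(weights)
null_var = (n_total * (n_total + 1) / 12
            * (sum(w * w / n for w, n in zip(weights, sizes))
               - sum(weights) ** 2 / n_total))

# Upper-tailed p-value, written as a lower-tail probability as in the session.
z = (null_mean - l_stat) / sqrt(null_var)
p_value = NormalDist().cdf(z)
print(l_stat, null_mean, null_var, z, p_value)
```

Like the Minitab calculation, this ignores the tie adjustment that StatXact applies to the standard deviation, so it should agree with the session's z of about -3.16 and p-value of about 0.0008 rather than with StatXact's asymptotic value.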