tuna data


 MTB > # Analysis of Hunter L Values and Consumer Panel Scores for Nine Lots
 MTB > # of Canned Tuna



 MTB > # First I'll input the data.

 MTB > set c1
 DATA> 44.4 45.9 41.9 53.3 44.7 44.1 50.7 45.2 60.1
 DATA> end
 MTB > set c2
 DATA> 2.6 3.1 2.5 5.0 3.6 4.0 5.2 2.8 3.8
 DATA> end
 MTB > name c1 'Hunter L' c2 'c. score'
 MTB > print c1 c2
 
  ROW  Hunter L  c. score
 
    1      44.4       2.6
    2      45.9       3.1
    3      41.9       2.5
    4      53.3       5.0
    5      44.7       3.6
    6      44.1       4.0
    7      50.7       5.2
    8      45.2       2.8
    9      60.1       3.8
 

 MTB > # I'll produce a scatter plot that will give us some idea about the nature
 MTB > # of the relationship between Hunter L value and consumer score.

 MTB > plot c2 c1
          -
          -                              *
       5.0+                                     *
          -
  c. score-
          -
          -
       4.0+           *
          -                                                         *
          -             *
          -
          -                *
       3.0+
          -              *
          -     *      *
          -
          -
            ----+---------+---------+---------+---------+---------+--Hunter L
             42.0      45.5      49.0      52.5      56.0      59.5
 
 MTB > # There seems to be a positive association, but not a really strong one.


 MTB > # Now I'll do a test for association using Kendall's tau.  Because of the small
 MTB > # sample size, one can easily determine by hand that t = 4/9 (about 0.444).
 MTB > # Table L in the Appendix of G&C then gives a p-value of about 0.12 for a
 MTB > # two-tailed test.  (StatXact gives an exact p-value of about 0.119.)
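
For anyone who wants to check t = 4/9 and the exact p-value without the table, here
is a rough Python sketch.  It counts concordant and discordant pairs directly, then
builds the null distribution of the number of discordant pairs by a small recurrence
over inversion counts.  The recurrence and the "double the smaller tail" rule are my
own choices for illustration (not necessarily how Table L or StatXact works), and the
code assumes there are no ties, which is true for these data.

```python
from itertools import combinations
from math import factorial

hunter_l = [44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1]
score = [2.6, 3.1, 2.5, 5.0, 3.6, 4.0, 5.2, 2.8, 3.8]
n = len(hunter_l)

# Count concordant and discordant pairs (assumes no ties).
concordant = discordant = 0
for i, j in combinations(range(n), 2):
    s = (hunter_l[i] - hunter_l[j]) * (score[i] - score[j])
    if s > 0:
        concordant += 1
    elif s < 0:
        discordant += 1

n_pairs = n * (n - 1) // 2
tau = (concordant - discordant) / n_pairs

# Under independence every ordering is equally likely, so the number of
# discordant pairs is distributed like the number of inversions in a
# random permutation of n items.  counts[k] = permutations with k inversions.
counts = [1]
for m in range(2, n + 1):
    new = [0] * (len(counts) + m - 1)
    for k, c in enumerate(counts):
        for shift in range(m):  # the m-th item can add 0..m-1 inversions
            new[k + shift] += c
    counts = new

# Two-tailed p-value: double the tail at least as extreme as observed.
tail = sum(counts[: min(discordant, n_pairs - discordant) + 1])
p_exact = min(1.0, 2 * tail / factorial(n))

print(concordant, discordant, tau, p_exact)
```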

 MTB > # Just for fun I'll see how a normal approximation works.  First I'll use 
 MTB > # the z statistic given near the middle of p. 11-6 of the class notes.

 MTB > let k1 = 3*sqrt(9*8/(2*23))*(4/9)
 MTB > cdf k1 k2;
 SUBC> norm 0 1.
 MTB > let k2 = 2*(1 - k2)
 MTB > name k1 'z' k2 'p-value'
 MTB > print k1 k2
 
 z        1.66812
 p-value  0.0952928

 MTB > # Now I'll apply the continuity correction indicated right below the 
 MTB > # z formula on p. 11-6 of the class notes.

 MTB > let k1 = 3*sqrt(9*8/(2*23))*(4/9 - 1/36)
 MTB > cdf k1 k2
 MTB > let k2 = 2*(1 - k2)
 MTB > print k1 k2
 
 z        1.56386
 p-value  0.117851
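
The two Minitab calculations above can be mirrored in Python as a quick check.  This
is only a sketch: the general-n form of the multiplier is my reading of the session's
3*sqrt(9*8/(2*23)) with n = 9, and writing the correction 1/36 as one over the number
of pairs is likewise an assumption on my part.

```python
from math import sqrt
from statistics import NormalDist

n = 9
t = 4 / 9  # sample value of Kendall's tau from the session

# Multiplier from the session, written for general n (2n + 5 = 23 when n = 9).
scale = 3 * sqrt(n * (n - 1) / (2 * (2 * n + 5)))

# Continuity correction used in the session: 1/36 = 1 / (number of pairs).
correction = 1 / (n * (n - 1) / 2)


def two_tailed_p(z):
    """Two-tailed p-value from a standard normal z statistic."""
    return 2 * (1 - NormalDist().cdf(z))


z_plain = scale * t
z_corrected = scale * (t - correction)
p_plain = two_tailed_p(z_plain)
p_corrected = two_tailed_p(z_corrected)

print(z_plain, p_plain)
print(z_corrected, p_corrected)
```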

 MTB > # This approximate p-value of about 0.118 is pretty close to the exact p-value
 MTB > # of about 0.119, even though the sample size is rather small.  StatXact gives
 MTB > # an approximate p-value of about 0.0425, which is only about one third of the
 MTB > # correct value.  (StatXact's approximate value is based on a normal approximation
 MTB > # without a continuity correction, but it uses a different formula for the
 MTB > # variance.  It isn't clear to me why, since for these data the formula from G&C
 MTB > # works a lot better, particularly when a continuity correction is applied.)

 MTB > # The estimate of the variance given a bit below the top of p. 11-7 of the class
 MTB > # notes leads to an estimated standard error of about 0.160375.  Using the confidence
 MTB > # interval formula given at the bottom of p. 11-7 produces (0.130, 0.759) as an
 MTB > # approximate 95% c.i. for tau.  This differs from the interval supplied by StatXact
 MTB > # which results from using their ASE1 value in place of the estimated standard error 
 MTB > # given by G&C (and given on p. 11-7 of the class notes).
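
As a small arithmetic check on the interval, the sketch below assumes the formula is
the usual estimate plus or minus a normal quantile times the standard error, which
does reproduce the endpoints quoted above.  The standard error 0.160375 is copied
from the comment above rather than recomputed.

```python
from statistics import NormalDist

t = 4 / 9          # sample value of Kendall's tau
se = 0.160375      # estimated standard error, taken from the text as given
z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval

lower = t - z * se
upper = t + z * se
print(round(lower, 3), round(upper, 3))
```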

 
-------------------------------------------------------------------------------------
                          *** StatXact information ***

To do the test based on Kendall's tau, use
   Nonparametrics > Ordinal Response > Kendall's tau and Somers' D...
Assuming Var1 and Var2 contain the two columns of data, click them into the boxes for
Variable 1 and Variable 2.  Then select Exact under Compute, and click OK.

-------------------------------------------------------------------------------------



 MTB > # Although Minitab doesn't do the test based on Spearman's rho, it can be 
 MTB > # used as shown below to obtain the value of the statistic.

 MTB > rank c1 c11
 MTB > rank c2 c12
 MTB > corr c11 c12
 
 Correlation of C11 and C12 = 0.600
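
The same rank-then-correlate idea can be sketched in Python.  The helper name ranks
is mine, and the shortcut formula based on squared rank differences assumes no ties
(true here); with ties one would correlate midranks instead, as Minitab's rank and
corr commands effectively do.

```python
hunter_l = [44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1]
score = [2.6, 3.1, 2.5, 5.0, 3.6, 4.0, 5.2, 2.8, 3.8]
n = len(hunter_l)


def ranks(values):
    """Ranks 1..n of the values (hypothetical helper; assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


rx = ranks(hunter_l)
ry = ranks(score)

# Sum of squared rank differences, then the no-ties formula for rho.
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
rho = 1 - 6 * d2 / (n * (n * n - 1))

print(d2, rho)
```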
 
 MTB > # From Table M of G&C it can be determined that the p-value for a two-tailed test is
 MTB > # about 0.096.  Using StatXact, an exact p-value of about 0.097 is obtained.  (The
 MTB > # slight discrepancy is due to the book rounding the one-tailed value of about 0.0484
 MTB > # to 0.048.  If the table in the book had carried one more digit and given 0.0484,
 MTB > # doubling that value would give about 0.097 for the two-tailed p-value.)
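
With only nine observations the exact null distribution can be enumerated by brute
force.  The sketch below runs through all 9! rankings and counts those at least as
extreme as the observed rho = 0.6 in either direction.  The value 48 for the observed
sum of squared rank differences is derived from rho = 0.6 with n = 9, and counting
both tails by |rho| is my assumption about what "two-tailed" means here; I make no
claim that StatXact computes it this way.

```python
from itertools import permutations
from math import factorial

n = 9
d2_obs = 48                      # rho = 1 - 6*d2/(n*(n*n - 1)) = 0.6
d2_max = n * (n * n - 1) // 3    # 240, the value of d2 when rho = -1

base = range(1, n + 1)
extreme = 0
for perm in permutations(base):
    d2 = sum((i - p) ** 2 for i, p in zip(base, perm))
    # rho >= 0.6 means d2 <= 48; rho <= -0.6 means d2 >= 240 - 48.
    if d2 <= d2_obs or d2 >= d2_max - d2_obs:
        extreme += 1

p_exact = extreme / factorial(n)
print(extreme, p_exact)
```

Working with the integer d2 rather than rho itself avoids any floating-point trouble
when comparing against the observed value.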

 MTB > # I'll try the normal and t approximations given on p. 11-13 of the class notes. 

 MTB > # First the normal approx. given near the middle of p. 11-13 of notes.

 MTB > let k1 = sqrt(8)*0.6
 MTB > cdf k1 k2;
 SUBC> norm 0 1.
 MTB > let k2 = 2*(1 - k2)

 MTB > # Now the t approx. given at the bottom of p. 11-12.

 MTB > let k3 = sqrt(7)*0.6/sqrt( 1 - 0.6**2 )
 MTB > cdf k3 k4;
 SUBC> t 7.
 MTB > let k4 = 2*(1 - k4)
 MTB > name k3 't' k4 't p-v'
 MTB > print k1-k4
 
 z        1.69706
 p-value  0.0896859
 t        1.98431
 t p-v    0.0875434
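
Here is a rough Python version of the two approximations.  The general-n forms
sqrt(n-1)*r and sqrt(n-2)*r/sqrt(1-r^2) are my reading of the session's sqrt(8) and
sqrt(7) with n = 9.  The standard library has no t distribution, so the hypothetical
helper t_cdf below integrates the t density with Simpson's rule; treat it as a sketch
and expect small differences from Minitab in the later decimal places.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist

n, r = 9, 0.6  # sample size and Spearman's rho from the session


def t_cdf(x, df, steps=2000):
    """Student t CDF via Simpson's rule on the density over [0, x]."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))

    def density(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    h = x / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return 0.5 + total * h / 3


# Normal approximation.
z = sqrt(n - 1) * r
p_z = 2 * (1 - NormalDist().cdf(z))

# t approximation with n - 2 degrees of freedom.
t = sqrt(n - 2) * r / sqrt(1 - r ** 2)
p_t = 2 * (1 - t_cdf(t, n - 2))

print(z, p_z)
print(t, p_t)
```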

 MTB > # The t approx. p-value of about 0.088 agrees closely with
 MTB > # StatXact's asymptotic p-value.  (StatXact uses the same t
 MTB > # approximation.  The slight difference in values is due to
 MTB > # one (or both) of the software packages being a bit inaccurate
 MTB > # in obtaining t distribution probabilities.)

 
-------------------------------------------------------------------------------------
                          *** StatXact information ***

To do the test based on Spearman's coefficient, use
   Nonparametrics > Ordinal Response > Spearman's Correlation...
Assuming Var1 and Var2 contain the two columns of data, click them into the boxes for
Variable 1 and Variable 2.  Then select Exact under Compute, and click OK.

-------------------------------------------------------------------------------------



 MTB > # We can also do a test using Pearson's sample correlation coefficient.  
 MTB > # We can get the value of the estimated correlation as shown below.

 MTB > corr c1 c2
 
 Correlation of Hunter L and c. score = 0.571
 
 MTB > # A quick way to get a p-value based on the normal theory t statistic is 
 MTB > # to do a regression and look at the p-value for the slope parameter.
 MTB > # (With a simple regression, the t test about the slope is equivalent to 
 MTB > # the t test based on Pearson's sample correlation coefficient.)

 MTB > regress c2 1 c1
 
 The regression equation is
 c. score = - 1.02 + 0.0972 Hunter L
 
 Predictor       Coef       Stdev    t-ratio        p
 Constant      -1.024       2.540      -0.40    0.699
 Hunter L     0.09718     0.05278       1.84    0.108
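
To see the equivalence between the slope t test and the correlation t test, the
sketch below computes both from the raw data in Python.  The variable names are mine,
and the formulas are the ordinary least squares ones, not anything specific to
Minitab.

```python
from math import sqrt

hunter_l = [44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1]
score = [2.6, 3.1, 2.5, 5.0, 3.6, 4.0, 5.2, 2.8, 3.8]
n = len(hunter_l)

mx = sum(hunter_l) / n
my = sum(score) / n
sxx = sum((x - mx) ** 2 for x in hunter_l)
syy = sum((y - my) ** 2 for y in score)
sxy = sum((x - mx) * (y - my) for x, y in zip(hunter_l, score))

# Pearson's r and the t statistic built from it (n - 2 degrees of freedom).
r = sxy / sqrt(sxx * syy)
t_from_r = r * sqrt(n - 2) / sqrt(1 - r * r)

# Simple linear regression of score on Hunter L and the slope's t-ratio.
slope = sxy / sxx
intercept = my - slope * mx
resid_ss = syy - slope * sxy
se_slope = sqrt(resid_ss / (n - 2) / sxx)
t_from_slope = slope / se_slope

print(r, intercept, slope, se_slope)
print(t_from_r, t_from_slope)
```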
 
 MTB > # The p-value is about 0.108.  StatXact can be used to get an exact p-value
 MTB > # using the exact permutation-based null distribution instead of a null
 MTB > # distribution based on an assumption of normality.  StatXact's exact p-value
 MTB > # is about 0.10.
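
The permutation idea can be sketched in Python by brute force: keep the Hunter L
values fixed, shuffle the consumer scores over all 9! orderings, and see how often
the correlation is at least as large in absolute value as the one observed.  Using
|r| for the two-sided test is my assumption (StatXact may define its two-sided
p-value differently), so I would not expect the result to match its figure to many
digits.

```python
from itertools import permutations
from math import factorial

hunter_l = [44.4, 45.9, 41.9, 53.3, 44.7, 44.1, 50.7, 45.2, 60.1]
score = [2.6, 3.1, 2.5, 5.0, 3.6, 4.0, 5.2, 2.8, 3.8]
n = len(hunter_l)

mx = sum(hunter_l) / n
my = sum(score) / n
dx = [x - mx for x in hunter_l]
dy = [y - my for y in score]

# The sums of squares do not change under permutation, so ordering by the
# cross-product sum is the same as ordering by r.
observed = abs(sum(a * b for a, b in zip(dx, dy)))
tol = 1e-9  # guards against rounding when a permutation ties the observed value

extreme = 0
for perm in permutations(dy):
    if abs(sum(a * b for a, b in zip(dx, perm))) >= observed - tol:
        extreme += 1

p_perm = extreme / factorial(n)
print(extreme, p_perm)
```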

 
-------------------------------------------------------------------------------------
                          *** StatXact information ***

To do the test based on Pearson's coefficient, use
   Nonparametrics > Ordinal Response > Pearson's Correlation...
Assuming Var1 and Var2 contain the two columns of data, click them into the boxes for
Variable 1 and Variable 2.  Then select Exact under Compute, and click OK.

-------------------------------------------------------------------------------------



 MTB > save 'tuna'
 Saving worksheet in file: tuna.MTW