MTB > # I'll first use Minitab to do approximate chi-square tests, and then
MTB > # I'll give instructions for doing an exact version of Pearson's test
MTB > # using StatXact.
MTB > # I'll put the observed counts into c1. (Since all of the expected
MTB > # counts equal 18, there is no need to store them in a column.) I'll
MTB > # put the value of Pearson's statistic into k1, the asym. GLR test
MTB > # statistic into k2, and the corresponding p-values into k3 and k4.
MTB > set c1
DATA> 29 19 18 25 17 10 15 11
DATA> end
MTB > let k1 = sum( (c1 - 18)*(c1 - 18)/18 )
MTB > let k2 = 2*sum( c1*loge( c1/18 ) )
MTB > cdf k1 k3;
SUBC> chisquare 7.
MTB > let k3 = 1 - k3
MTB > cdf k2 k4;
SUBC> chisquare 7.
MTB > let k4 = 1 - k4
MTB > name k1 'q' k2 'GLR stat' k3 'Q p-val' k4 'GLRp-val'
MTB > print k1-k4
q         16.3333
GLR stat  16.1381
Q p-val   0.0222052
GLRp-val  0.0238480
MTB > # Since the tests are only approximate, I'll report the p-values using
MTB > # just two significant digits. For Pearson's test we have 0.022, and
MTB > # for the GLR test we have 0.024.
_______________________________________________________________________________
-------------------------------------------------------------------------------

To do an exact version of Pearson's chi-square test using StatXact, I need
to enter the data in a 1 by 8 table, and also give a "score" for each column
of the table corresponding to either the null hypothesis probability or the
expected count under the null hypothesis. I can create an appropriate 1 by 8
table using
   File > New > Table Data
and upon clicking OK, in the Table Settings box that appears I indicate that
I want 1 Table, with 1 Row and 8 Columns, and under Scores I click to check
the Column box, and then I click OK. Next I enter the counts (see the data
set) 29 through 11 across Row 1. Then I move down and enter 0.125 into each
of the column Score boxes.
Once the 8 counts and the 8 scores are all entered, I select the desired
test using
   Nonparametrics > One-Sample Goodness-of-Fit > Chi-Square
and in the Chi-Square Test box that appears, I click on Probability under
Column Scores and I click on Exact under Compute, and then I click OK. After
a second or two (or less time if you have a fast computer) the output should
appear.

The test statistic value of 16.33 is in agreement with what I got using
Minitab. The asymptotic p-value, which oddly is indicated to be a "2-Sided
P-Value," is given to be 0.02224, which is a bit different from the value of
about 0.02221 obtained with Minitab. Using R to obtain an asymptotic p-value,
I get 0.02224, in agreement with StatXact, and so we might conclude that
Minitab's value is a tad inaccurate. The exact p-value is given to be
0.02231. Upon rounding to two significant digits, the asymptotic and exact
p-values are in agreement, and if we use three significant digits, they
differ only slightly (0.0222 vs. 0.0223). So in this case, for which the
expected counts are all relatively large, the approximate p-value is very
good.
-------------------------------------------------------------------------------

We can also use StatXact to do the K-S test based on a discrete uniform
distribution. (Usually the K-S test is thought of as being used with
continuous distributions.) To do the K-S test, we need to put the data in
the case data editor (instead of the table data editor used for the
chi-square test). To do this use
   File > New...
and then select Case Data and click OK. Then enter the values 1, 2, 3, ...,
7, and 8 in the 1st column, and the counts 29, 19, 18, ..., 15, and 11 in
the 2nd column. Then use
   Nonparametrics > One-Sample Goodness-of-Fit > Kolmogorov...
Use the arrows to click Var 1 into the Response box, and Var 2 into the
Frequency box. Under Distribution, select Uniform Discrete for Type, and
enter 1 and 8 as the Minimum and Maximum values.
StatXact doesn't seem to be able to do an exact computation of the p-value
for this test, so under Compute select Exact Using Monte Carlo, and then
click OK. In perhaps 30 seconds to a minute you should get some output (even
though while you're waiting you might get the impression that something is
wrong). Assuming you had previously set things up so that 1,000,000 Monte
Carlo trials with a seed of 23456 were used, in the 1st column of the output
you should get a Monte Carlo estimate of the exact p-value of 0.004608.
Because an interval estimate of the exact p-value is (0.004434, 0.004782),
it might be best to state that the exact p-value is estimated to be less
than 0.005.

The p-value from the K-S test is smaller than the one from Pearson's
chi-square test. This is because the K-S test focuses on the fact that it
would be highly unusual to have more than 63% of the winners come from
positions 1 through 4 if all 8 positions were equally favorable. (Pearson's
test gives a somewhat small p-value because the observed proportions of
winners from the 8 positions are collectively different enough from 1/8 each
to be somewhat unlikely if all 8 positions are equally favorable. The K-S
test statistic further acknowledges that it'd be rather unlikely to have all
4 of the outer positions (5 through 8) have fewer than the average number of
winners (and have so many of the winners coming from the inner 4 positions)
if all 8 positions are equally favorable.)

(Note: The asymptotic p-value from the K-S test is not reliable when the
hypothesized distribution is discrete.)
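
As a cross-check on the numbers above, here is a minimal Python sketch that uses only the standard library. It is an illustration only, not the Minitab or StatXact method, and it rests on the following assumptions:

- The function names (chi2_sf_odd_df, ks_scaled, mc_ks_pvalue) are made up for this sketch.
- The chi-square upper-tail probability comes from the closed form that holds for odd degrees of freedom (here 7).
- The K-S statistic is taken to be the largest absolute difference between the empirical and hypothesized cumulative proportions at positions 1 through 8. A simulated value at least as large as the observed one counts toward the p-value. StatXact's exact convention may differ.
- Only 20,000 Monte Carlo trials are used, to keep the run short. Python's random number generator is not StatXact's, so even with the seed 23456 the estimate will not reproduce 0.004608. It should land in the same neighborhood.

```python
import math
import random

counts = [29, 19, 18, 25, 17, 10, 15, 11]   # winners from positions 1..8
n = sum(counts)                              # 144 races
expected = n / len(counts)                   # 18 per position under the null


def chi2_sf_odd_df(x, df):
    """Upper-tail chi-square probability for odd df, using the closed form
    erfc(sqrt(x/2)) + sqrt(2x/pi)*exp(-x/2)*sum_j x**j/(1*3*...*(2j+1)),
    where j runs from 0 to (df-3)/2."""
    if df < 1 or df % 2 != 1:
        raise ValueError("this closed form only covers odd df >= 1")
    series, term = 0.0, 1.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        series += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * series
    return math.erfc(math.sqrt(x / 2)) + tail


def ks_scaled(cts):
    """K-S distance from the discrete uniform, multiplied by len(cts)*sum(cts)
    so that the comparison between samples is done in exact integers."""
    m, total, cum, worst = len(cts), sum(cts), 0, 0
    for k, c in enumerate(cts, start=1):
        cum += c
        worst = max(worst, abs(m * cum - k * total))
    return worst


def mc_ks_pvalue(cts, trials, seed):
    """Monte Carlo estimate of the K-S p-value: the share of samples drawn
    under the null whose K-S distance is at least the observed one."""
    rng = random.Random(seed)
    m, total = len(cts), sum(cts)
    observed = ks_scaled(cts)
    hits = 0
    for _ in range(trials):
        sim = [0] * m
        for pos in rng.choices(range(m), k=total):
            sim[pos] += 1
        if ks_scaled(sim) >= observed:
            hits += 1
    return hits / trials


df = len(counts) - 1
q = sum((c - expected) ** 2 / expected for c in counts)      # Pearson
glr = 2 * sum(c * math.log(c / expected) for c in counts)    # GLR
p_q = chi2_sf_odd_df(q, df)
p_glr = chi2_sf_odd_df(glr, df)

d_obs = ks_scaled(counts) / (len(counts) * n)                # equals 19/144
p_ks = mc_ks_pvalue(counts, trials=20000, seed=23456)

print(f"Pearson q = {q:.4f}, asymptotic p-value = {p_q:.5f}")
print(f"GLR stat  = {glr:.4f}, asymptotic p-value = {p_glr:.5f}")
print(f"K-S D     = {d_obs:.4f}, Monte Carlo p-value = {p_ks:.4f}")
```

The two test statistics should match the Minitab values of 16.3333 and 16.1381, and the asymptotic Pearson p-value should round to 0.02224, the value reported by StatXact and R. The largest cumulative discrepancy occurs at position 4, where 91 of the 144 winners (about 63%) have accumulated instead of the expected 72.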