MTB > # I'll use Minitab to compute the value of the K-S statistic for the
MTB > # data given on the top of p. 126 of G&C and the distribution proposed
MTB > # in Problem 4.19 on p. 151 of G&C.  Then I'll use tables to make a
MTB > # statement about the p-value.  Finally, I'll give instructions for
MTB > # using StatXact to obtain an exact p-value.
MTB > set c1
DATA> 9800 10200 9300 8700 15200 6900 8600 9600 12200 15500 11600 7200
DATA> end
MTB > # In order to get a feel for the relationship between the data and the
MTB > # proposed dist'n, we can examine a Q-Q plot.  I'll plot the ordered
MTB > # pairs
MTB > #          ( x_(i), F_0^{-1}( i/(n+1) ) ).
MTB > set c2
DATA> 1:12
MTB > let c2 = c2/13
MTB > invcdf c2 c3;
SUBC> norm 10000 2000.
MTB > sort c1 c1
MTB > name c1 'obs data' c3 'hyp e.v.'
MTB > # (Note: The hyp e.v. values are approximations of the expected values
MTB > # of the order statistics from 12 normal random variables having mean
MTB > # 10,000 and standard deviation 2000.)
MTB > plot c3 c1

 [Character plot omitted: hyp e.v. (vertical axis, ticks at 7500, 9000,
  10500, and 12000) against obs data (horizontal axis, ticks from 7500 to
  15000 in steps of 1500), with 12 plotted points.]

MTB > # The dist'n underlying the data may be slightly skewed, but it doesn't appear
MTB > # to be highly incompatible with a normal dist'n.  The sample size is too small
MTB > # to reach a strong conclusion.
MTB > desc c1

                N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
obs data       12    10400     9700    10240     2773      801

              MIN      MAX       Q1       Q3
obs data     6900    15500     8625    12050

MTB > let k90 = 1
MTB > exec 'skku'
Executing from file: skku.MTB

 skewness    0.830365
 kurtosis   -0.0487697

MTB > # To compute the value of the K-S statistic, I'll first put the values of
MTB > # F_0( x_(i) ) into c4, the values of i/n into c5, and the values of (i-1)/n
MTB > # into c6.  Then I'll put the values of i/n - F_0( x_(i) ) into c7 and the
MTB > # values of F_0( x_(i) ) - (i-1)/n into c8.  The largest of the values in
MTB > # c7 and c8 will be the value of the test statistic.
MTB > cdf c1 c4;
SUBC> norm 10000 2000.
MTB > set c5
DATA> 1:12
DATA> end
MTB > let c5 = c5/12
MTB > set c6
DATA> 0:11
DATA> end
MTB > let c6 = c6/12
MTB > let c7 = c5 - c4
MTB > let c8 = c4 - c6
MTB > desc c7 c8

                N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
C7             12   0.0358   0.0381   0.0382   0.0655   0.0189
C8             12   0.0475   0.0452   0.0451   0.0655   0.0189

              MIN      MAX       Q1       Q3
C7        -0.0787   0.1268  -0.0225   0.0842
C8        -0.0435   0.1620  -0.0009   0.1058

Looking at the values under MAX, it can be seen that the value of the test
statistic is 0.1620.  Using Table F of G&C it can be seen that the p-value
exceeds 0.2.  Using Birnbaum's table, it can be determined that the p-value is
between 0.83986 and 0.99995 (but one might guess that it's somewhat close to
0.84 since the table gives us that P_0( D_12 >= 2/12 ) = 0.83986, and the test
statistic value of 0.1620 isn't much different from 2/12 = 0.1667).

To use StatXact, we put the data in the CaseData editor and then select
  Nonparametrics > One-Sample Goodness-of-Fit > Kolmogorov ...
Then click the variable into the Response box using the arrow, select Normal
from the Type menu under Distribution, and enter the values for the Mean and
Std-dev.  Finally click to select Exact (under Compute), and click OK.  The
value of the test statistic is 0.162, and the exact p-value is 0.8623 (and the
asymptotic p-value is 0.9111).

StatXact can also be used to do Lilliefors's test and the Shapiro-Wilk test.
With the data in the CaseData editor, select
  Nonparametrics > One-Sample Goodness-of-Fit > Lilliefors ...
Then click the variable into the Response box.  (Since StatXact does this test
using the Monte Carlo method, before running this test, the number of Monte
Carlo trials should be set to 1000000 and the random number seed should be set
to the fixed value of 23456.)  Finally, click OK to run Lilliefors's test.  The
Monte Carlo estimate of the exact p-value is about 0.23.
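
The K-S statistic from the Minitab session, and the Monte Carlo idea behind
Lilliefors's test, can be cross-checked outside Minitab and StatXact.  What
follows is only a rough Python sketch using the standard library, not a
description of how StatXact does the computation.  The function names are made
up for illustration, the number of trials (20000) is an assumption chosen to
keep the run short, and Python's random number generator is not StatXact's, so
using the same seed value will not reproduce StatXact's simulated samples.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

# Data from the Minitab session (entered with `set c1`).
DATA = [9800, 10200, 9300, 8700, 15200, 6900,
        8600, 9600, 12200, 15500, 11600, 7200]


def ks_statistic(sample, dist):
    """Largest of i/n - F_0(x_(i)) and F_0(x_(i)) - (i-1)/n over i = 1..n."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f0 = dist.cdf(x)
        d = max(d, i / n - f0, f0 - (i - 1) / n)
    return d


def lilliefors_statistic(sample):
    """K-S distance from the normal dist'n fitted by the sample mean and s.d."""
    n = len(sample)
    m = fmean(sample)
    s = sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
    return ks_statistic(sample, NormalDist(m, s))


def lilliefors_mc_pvalue(sample, trials=20000, seed=23456):
    """Fraction of simulated normal samples with a statistic >= the observed one.

    ASSUMPTION: trials=20000 is illustrative; the text uses 1000000 in StatXact.
    """
    rng = random.Random(seed)
    n = len(sample)
    observed = lilliefors_statistic(sample)
    hits = 0
    for _ in range(trials):
        sim = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if lilliefors_statistic(sim) >= observed:
            hits += 1
    return hits / trials


d_ks = ks_statistic(DATA, NormalDist(10000, 2000))   # fully specified F_0
d_lf = lilliefors_statistic(DATA)                    # parameters estimated
p_lf = lilliefors_mc_pvalue(DATA)
print(f"K-S statistic vs N(10000, 2000): {d_ks:.4f}")
print(f"Lilliefors statistic:            {d_lf:.4f}")
print(f"Monte Carlo p-value estimate:    {p_lf:.2f}")
```

The first printed value should agree with the 0.1620 found under MAX for C8.
With far fewer trials than the 1000000 used in StatXact, the Monte Carlo
estimate is noisier, so it should only be expected to land near 0.23.
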
(Note: StatXact reports 0.2293, but note that the confidence interval for the
exact p-value is (0.2282, 0.2304), which indicates that there is some
uncertainty associated with the estimated p-value.  So it would be silly to
report the p-value using four significant digits.  I think it's better to just
report any sort of approximate or estimated p-value using only two significant
digits.)

For the Shapiro-Wilk test use
  Nonparametrics > One-Sample Goodness-of-Fit > Shapiro-Wilk ...
Click the variable into the box and then click OK ... it's that simple.
StatXact always does this test using an approximate p-value formula.  So we can
round the reported p-value of 0.199 to 0.20.  (Note: The p-value is in the same
ballpark as the one from Lilliefors's test.)
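
Returning to the Monte Carlo estimate for Lilliefors's test: the interval
(0.2282, 0.2304) quoted in the note can be reproduced with a little arithmetic.
The sketch below assumes that the interval is a 99% normal-approximation
confidence interval for a proportion estimated from 1000000 independent trials.
The text does not state the confidence level or the formula, so both are
assumptions here, although the numbers do match under them.

```python
from math import sqrt
from statistics import NormalDist

p_hat = 0.2293        # Monte Carlo estimate reported by StatXact
trials = 1_000_000    # number of Monte Carlo trials set before the run
level = 0.99          # ASSUMPTION: confidence level, not stated in the text

# Normal-approximation interval for a proportion: p_hat +/- z * std_err
z = NormalDist().inv_cdf(0.5 + level / 2)
std_err = sqrt(p_hat * (1 - p_hat) / trials)
lower = p_hat - z * std_err
upper = p_hat + z * std_err

print(f"standard error: {std_err:.5f}")
print(f"interval: ({lower:.4f}, {upper:.4f})")   # prints (0.2282, 0.2304)
```

The half-width is about 0.001, so the third significant digit of the estimate
is already in doubt, which is consistent with reporting the p-value as 0.23.
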