MTB > # I'll use Minitab to compute the value of the K-S statistic for
MTB > # the data and proposed dist'n given in Problem 4.18 on p. 151 of
MTB > # G&C. Then I'll use tables to make a statement about the p-value.
MTB > # Finally, I'll give instructions for using StatXact to obtain an
MTB > # exact p-value.
MTB > set c1
DATA> 1.6 10.3 3.5 13.5 18.4 7.7 24.3 10.7 8.4 4.9 7.9 12.0 16.2 6.8 14.7
DATA> end
MTB >
MTB > # I'll store the data in a file so that I can then read it into StatXact.
MTB > write 'failtime' c1
Writing data to file: failtime.DAT
MTB > # In order to get a feel for the relationship between the data and the
MTB > # proposed dist'n, we can examine a Q-Q plot. I'll plot the ordered
MTB > # pairs
MTB > #     ( x_(i), F_0^{-1}( i/(n+1) ) ).
MTB > # For the exponential dist'n under consideration, we have
MTB > #     F_0^{-1}( i/(n+1) ) = 10*log( (n+1)/(n+1-i) ).
MTB > set c2
DATA> 1:15
DATA> end
MTB > let c3 = 10*loge( 16/(16-c2) )
MTB > name c1 'obs data' c3 'hyp e.v.'
MTB > # (Notes: (1) The hyp e.v. values are the approximate expected values for
MTB > # the order statistics from a sample of size 15 from the proposed (null hyp.)
MTB > # dist'n. (2) I must order the data values from smallest to largest in order
MTB > # to match them with the hyp e.v. values. (3) I could have used the invcdf
MTB > # command (along with the expo 10 subcommand) to obtain the desired inverse
MTB > # cdf values.)
MTB > sort c1 c1
MTB > # Here is the pertinent Q-Q plot.
MTB > width 61
MTB > height 23
MTB > plot c3 c1

[Character plot: 'hyp e.v.' (y-axis, 0.0 to 28.0) versus 'obs data'
(x-axis, 4.0 to 24.0); the 15 plotted points fall close to the line
having slope 1 and intercept 0.]

MTB > # There seems to be decent agreement between the plotted points and the
MTB > # comparison line (the line having slope 1 and intercept 0).
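(An aside for readers without Minitab: the hyp e.v. computation above can be replicated in a few lines of Python. This is a sketch translating the loge/sort steps of the session, not part of the original session.)

```python
import math

# Failure-time data from Problem 4.18 (G&C, p. 151)
data = [1.6, 10.3, 3.5, 13.5, 18.4, 7.7, 24.3, 10.7, 8.4, 4.9,
        7.9, 12.0, 16.2, 6.8, 14.7]
n = len(data)

# Approximate expected values of the order statistics under the null
# dist'n (exponential, mean 10): F_0^{-1}(i/(n+1)) = 10*ln((n+1)/(n+1-i))
hyp_ev = [10 * math.log((n + 1) / (n + 1 - i)) for i in range(1, n + 1)]

# Pair the sorted observations with the hypothesized expected values,
# as in the Q-Q plot; points near the line y = x support the null dist'n
qq_pairs = list(zip(sorted(data), hyp_ev))
```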
MTB > # But the small values are collectively larger than they should be (if the
MTB > # underlying dist'n is an exponential dist'n having mean 10) and the large
MTB > # values are collectively smaller than they should be. So while the mean may
MTB > # be close to 10, it appears that the standard deviation may be less than 10,
MTB > # and the true underlying dist'n may be relatively less stretched out than an
MTB > # exponential dist'n is. We can check on this by looking at the values of
MTB > # some summary statistics.
MTB > desc c1

              N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
obs data     15    10.73    10.30    10.38     6.01     1.55

            MIN      MAX       Q1       Q3
obs data   1.60    24.30     6.80    14.70

MTB > let k90 = 1
MTB > exec 'skku'
Executing from file: skku.MTB
skewness    0.661732
kurtosis    0.427545
MTB > # (Note: The sample skewness is appreciably less than 2, which is
MTB > # the skewness of an exponential dist'n.)
MTB > # Now I want to compute the value of the one-sample K-S test statistic. I'll
MTB > # put the values of F_0( x_(i) ) into c4, the values of i/n into c5, and the
MTB > # values of (i-1)/n into c6. Then I'll put the values of i/n - F_0( x_(i) )
MTB > # into c7 and the values of F_0( x_(i) ) - (i-1)/n into c8. The largest of
MTB > # the values in c7 and c8 will be the value of the test statistic (see p. 113
MTB > # of G&C).
MTB > cdf c1 c4;
SUBC> expo 10.
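(An aside before the session continues: the c4-through-c8 computation just described translates directly to Python. This sketch also adds a tail probability from the Kolmogorov limiting distribution, which is not part of the Minitab session; it is included only to show where an asymptotic p-value comes from.)

```python
import math

# Failure-time data (Problem 4.18), sorted as in the session
data = sorted([1.6, 10.3, 3.5, 13.5, 18.4, 7.7, 24.3, 10.7, 8.4, 4.9,
               7.9, 12.0, 16.2, 6.8, 14.7])
n = len(data)

# c4: F_0(x_(i)) for the exponential dist'n with mean 10
F0 = [1 - math.exp(-x / 10) for x in data]

# c7: i/n - F_0(x_(i));  c8: F_0(x_(i)) - (i-1)/n
c7 = [(i + 1) / n - F0[i] for i in range(n)]
c8 = [F0[i] - i / n for i in range(n)]

# The K-S statistic is the largest value appearing in either column
D = max(max(c7), max(c8))

# Asymptotic tail probability from the Kolmogorov limiting dist'n:
# P(D_n >= d) ~ 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 n d^2)
p_asym = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * n * D * D)
                 for k in range(1, 101))
```

D reproduces the 0.2934 obtained from the Minitab columns, and p_asym comes out near 0.151, agreeing with the asymptotic p-value StatXact reports further on.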
MTB > let c5 = c2/15
MTB > set c6
DATA> 0:14
DATA> end
MTB > let c6 = c6/15
MTB > let c7 = c5 - c4
MTB > let c8 = c4 - c6
MTB > print c7 c8

 ROW         C7         C8
   1   -0.081190   0.147856
   2   -0.161979   0.228645
   3   -0.187374   0.254040
   4   -0.226716   0.293383
   5   -0.203654   0.270320
   6   -0.146155   0.212822
   7   -0.101623   0.168289
   8   -0.109660   0.176326
   9   -0.056991   0.123658
  10   -0.032139   0.098806
  11   -0.007426   0.074093
  12    0.029926   0.036741
  13    0.064565   0.002101
  14    0.092151  -0.025484
  15    0.088037  -0.021370

MTB > desc c7 c8

        N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
C7     15  -0.0693  -0.0812  -0.0697   0.1062   0.0274
C8     15   0.1360   0.1479   0.1363   0.1062   0.0274

        MIN      MAX       Q1       Q3
C7  -0.2267   0.0922  -0.1620   0.0299
C8  -0.0255   0.2934   0.0367   0.2286

MTB > # Looking at the MAX values, it can be seen that the value of the
MTB > # test statistic is 0.2934. From Table F on p. 565 of G&C it can be
MTB > # determined that we have
MTB > #     0.1 < p-value < 0.2.
MTB > # (Note: These values match the answers given for Problem 4.18 on
MTB > # p. 612 of G&C.) Since 15*d_15 is about 4.40, using Birnbaum's
MTB > # table it can be concluded that
MTB > #     0.05483 < p-value < 0.19725
MTB > # (since from the table we have P_0( D_15 < 4/15 ) = 0.80275 and
MTB > # P_0( D_15 < 5/15 ) = 0.94517, and so it follows that P_0( D_15 >=
MTB > # 0.2934 ) is some value between P_0( D_15 >= 5/15 ) = 0.05483 and
MTB > # P_0( D_15 >= 4/15 ) = 0.19725). Upon combining the results, it
MTB > # can be concluded that
MTB > #     0.1 < p-value < 0.19725,
MTB > # which may seem a bit vague, but on the other hand it may be
MTB > # entirely sufficient for concluding that there is not strong
MTB > # evidence against the hypothesized distribution.
MTB > save 'failtime'
Saving worksheet in file: failtime.MTW

--------------------------------------------------------------------------------

StatXact Instructions for One-Sample K-S Test

Put the data into a column of the CaseData editor. Then (from the menus) select
  Nonparametrics > One-Sample Goodness-of-Fit > Kolmogorov ...
When the box for the Kolmogorov-Smirnov test comes up, put the proper variable
into the Response box by clicking the arrow. Then under Distribution, select
Exponential from the Type menu, and type 10 into the mean box. Under Compute,
click to select Exact, and then finally click OK.

In the output, the first column under Statistic is for the two-sided test
(against the general alternative). It can be seen that the value of the test
statistic is 0.2934 (in agreement with what I got using Minitab), and the
exact p-value is 0.1224 (and the asymptotic p-value is 0.1511).

----------------------------------------------------------------------------------

Doing Lilliefors's Test

StatXact won't do Lilliefors's test about an exponential dist'n (although it
will do Lilliefors's test of normality), but StatXact can still be helpful:
it can be used to get the value of the test statistic.

Put the data into a column of the CaseData editor. Then (from the menus) select
  Basic_Statistics > Descriptive Statistics ...
When the Descriptive Statistics box comes up, put the variable into the
Selected Variables box. Since the sample mean is among the default selections,
there may be no need to select it (but click to do so if it's not preselected).
Then click OK. Note that the sample mean is 10.73, so that you can enter it
when you do the K-S test.

Now select
  Nonparametrics > One-Sample Goodness-of-Fit > Kolmogorov ...
When the box for the Kolmogorov-Smirnov test comes up, put the proper variable
into the Response box by clicking the arrow. Then under Distribution, select
Exponential from the Type menu, and enter the sample mean into the mean box.
Then click OK.

The p-value in the output isn't correct for Lilliefors's test. But we can note
that the test statistic value is 0.2694 and compare it to the critical values
given in Table T of G&C (p. 598). Annoyingly, n = 15 isn't covered by the
table. If we use n = 14, we get that the p-value is between 0.05 and 0.10.
If we use n = 16, we get that it is between 0.01 and 0.05. To play it safe,
you can state 0.01 < p-value < 0.1, but this is pretty vague! (My guess is
that the p-value is around 0.05. This is partially confirmed by another table
I have for this version of Lilliefors's test (which isn't as accurate, but
I'll give you a copy anyway since it does contain some sample sizes not
covered by the table in G&C), which gives the 0.05 critical value for the
n = 15 case as 0.269, which is the value of the test statistic when rounded
to the nearest thousandth.)

Note: I find it rather odd that when one compares the p-value from the K-S
test done above to the p-value from Lilliefors's test, it is found that we
have stronger evidence that the data didn't come from *any* exponential
distribution than we have that it didn't come from an exponential
distribution having a mean of 10.
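(The Lilliefors statistic quoted above can also be checked without StatXact. This is a sketch: it is the same two-sided K-S computation as before, but with the exponential mean estimated by the sample mean, which is what makes it Lilliefors's version of the statistic.)

```python
import math

data = sorted([1.6, 10.3, 3.5, 13.5, 18.4, 7.7, 24.3, 10.7, 8.4, 4.9,
               7.9, 12.0, 16.2, 6.8, 14.7])
n = len(data)
xbar = sum(data) / n  # about 10.73, the value entered into StatXact's mean box

# Two-sided K-S statistic against the exponential dist'n with mean xbar;
# estimating the mean from the data is what changes the null distribution
# of the statistic (and hence the critical values in Table T of G&C)
F0 = [1 - math.exp(-x / xbar) for x in data]
D_lil = max(max((i + 1) / n - F0[i] for i in range(n)),
            max(F0[i] - i / n for i in range(n)))
```

Rounded to the nearest thousandth this gives 0.269, matching the critical-value comparison above.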