MTB > # I'll use Minitab to compute the value of the K-S statistic for
MTB > # the data and proposed dist'n given in Problem 4.18 on p. 151 of
MTB > # G&C. Then I'll use tables to make a statement about the p-value.
MTB > # Finally, I'll give instructions for using StatXact to obtain an
MTB > # exact p-value.
MTB > set c1
DATA> 1.6 10.3 3.5 13.5 18.4 7.7 24.3 10.7 8.4 4.9 7.9 12.0 16.2 6.8 14.7
DATA> end
MTB >
MTB > # I'll store the data in a file so that I can then read it into StatXact.
MTB > write 'failtime' c1
Writing data to file: failtime.DAT
MTB > # In order to get a feel for the relationship between the data and the
MTB > # proposed dist'n, we can examine a Q-Q plot. I'll plot the ordered
MTB > # pairs
MTB > # ( x_(i), F_0^{-1}( i/(n+1) ) ).
MTB > # For the exponential dist'n under consideration, we have
MTB > # F_0^{-1}( i/(n+1) ) = 10*log( (n+1)/(n+1-i) ).
MTB > set c2
DATA> 1:15
DATA> end
MTB > let c3 = 10*loge( 16/(16-c2) )
MTB > name c1 'obs data' c3 'hyp e.v.'
MTB > # (Notes: (1) The hyp e.v. values are the approximated expected values for
MTB > # the order statistics from a sample of size 15 from the proposed (null hyp.)
MTB > # dist'n. (2) I must order the data values from smallest to largest in order
MTB > # to match them with the hyp e.v. values. (3) I could have used the invcdf
MTB > # command (along with the expo 10 subcommand) to obtain the desired inverse
MTB > # cdf values.)
MTB > sort c1 c1
MTB > # Here is the pertinent Q-Q plot.
MTB > width 61
MTB > height 23
MTB > plot c3 c1
-
-
28.0+ *
-
hyp e.v.-
-
-
21.0+ *
-
-
- *
-
14.0+ *
-
- *
- *
- *
7.0+ *
- *
- **
- *
- * *
0.0+ *
--------+---------+---------+---------+---------+---------+--obs data
4.0 8.0 12.0 16.0 20.0 24.0
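As a cross-check (not part of the original Minitab session), here is a sketch in Python of the same Q-Q computation, with scipy's `expon.ppf` playing the role of Minitab's invcdf/expo commands:

```python
import numpy as np
from scipy.stats import expon

data = np.array([1.6, 10.3, 3.5, 13.5, 18.4, 7.7, 24.3, 10.7,
                 8.4, 4.9, 7.9, 12.0, 16.2, 6.8, 14.7])
n = len(data)
i = np.arange(1, n + 1)

# c3 above: F_0^{-1}( i/(n+1) ) = 10*log( (n+1)/(n+1-i) ) for the
# hypothesized exponential dist'n having mean 10
hyp_ev = 10 * np.log((n + 1) / (n + 1 - i))

# equivalently, via the inverse cdf (cf. the invcdf / expo 10 note above)
hyp_ev_alt = expon.ppf(i / (n + 1), scale=10)

# the plotted Q-Q pairs: ( x_(i), F_0^{-1}( i/(n+1) ) )
pairs = list(zip(np.sort(data), hyp_ev))
```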
MTB > # There seems to be decent agreement between the plotted points and the
MTB > # comparison line (the line having slope 1 and intercept 0). But the small
MTB > # values are collectively larger than they should be (if the underlying
MTB > # dist'n is an exponential dist'n having mean 10) and the large values are
MTB > # collectively smaller than they should be. So while the mean may be close
MTB > # to 10, it appears that the standard deviation may be less than 10, and the
MTB > # true underlying dist'n may be relatively less stretched out than an exponential
MTB > # dist'n is. We can check on this by looking at the values of some summary
MTB > # statistics.
MTB > desc c1
N MEAN MEDIAN TRMEAN STDEV SEMEAN
obs data 15 10.73 10.30 10.38 6.01 1.55
MIN MAX Q1 Q3
obs data 1.60 24.30 6.80 14.70
MTB > let k90 = 1
MTB > exec 'skku'
Executing from file: skku.MTB
skewness 0.661732
kurtosis 0.427545
MTB > # (Note: The sample skewness is appreciably less than 2, which is
MTB > # the skewness of an exponential dist'n.)
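The code of the skku macro isn't shown, but its skewness value appears to be the usual adjusted (bias-corrected) sample skewness; here is a sketch in Python, under that assumption about what the macro computes:

```python
import numpy as np
from scipy.stats import skew

data = np.array([1.6, 10.3, 3.5, 13.5, 18.4, 7.7, 24.3, 10.7,
                 8.4, 4.9, 7.9, 12.0, 16.2, 6.8, 14.7])

# Adjusted sample skewness; matches the 0.661732 printed above, assuming
# that's the statistic the skku macro computes.
g1 = skew(data, bias=False)

# For reference: an exponential dist'n has skewness 2, so the sample value
# is appreciably below the null-hypothesis value.
```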
MTB > # Now I want to compute the value of the one-sample K-S test statistic. I'll
MTB > # put the values of F_0( x_(i) ) into c4, the values of i/n into c5, and the
MTB > # values of (i-1)/n into c6. Then I'll put the values of i/n - F_0( x_(i) )
MTB > # into c7 and the values of F_0( x_(i) ) - (i-1)/n into c8. The largest of
MTB > # the values in c7 and c8 will be the value of the test statistic (see p. 113
MTB > # of G&C).
MTB > cdf c1 c4;
SUBC> expo 10.
MTB > let c5 = c2/15
MTB > set c6
DATA> 0:14
DATA> end
MTB > let c6 = c6/15
MTB > let c7 = c5 - c4
MTB > let c8 = c4 - c6
MTB > print c7 c8
ROW C7 C8
1 -0.081190 0.147856
2 -0.161979 0.228645
3 -0.187374 0.254040
4 -0.226716 0.293383
5 -0.203654 0.270320
6 -0.146155 0.212822
7 -0.101623 0.168289
8 -0.109660 0.176326
9 -0.056991 0.123658
10 -0.032139 0.098806
11 -0.007426 0.074093
12 0.029926 0.036741
13 0.064565 0.002101
14 0.092151 -0.025484
15 0.088037 -0.021370
MTB > desc c7 c8
N MEAN MEDIAN TRMEAN STDEV SEMEAN
C7 15 -0.0693 -0.0812 -0.0697 0.1062 0.0274
C8 15 0.1360 0.1479 0.1363 0.1062 0.0274
MIN MAX Q1 Q3
C7 -0.2267 0.0922 -0.1620 0.0299
C8 -0.0255 0.2934 0.0367 0.2286
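The column arithmetic above can be replicated outside Minitab; here is a sketch in Python of the same computation of the K-S statistic:

```python
import numpy as np

data = np.sort(np.array([1.6, 10.3, 3.5, 13.5, 18.4, 7.7, 24.3, 10.7,
                         8.4, 4.9, 7.9, 12.0, 16.2, 6.8, 14.7]))
n = len(data)
i = np.arange(1, n + 1)

F0 = 1 - np.exp(-data / 10)    # c4: F_0( x_(i) ), exponential with mean 10
d_plus = i / n - F0            # c7: i/n - F_0( x_(i) )
d_minus = F0 - (i - 1) / n     # c8: F_0( x_(i) ) - (i-1)/n

# the K-S statistic: the largest of the values in c7 and c8
D = max(d_plus.max(), d_minus.max())
```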
MTB > # Looking at the MAX values, it can be seen that the value of the
MTB > # test statistic is 0.2934. From Table F on p. 565 of G&C it can be
MTB > # determined that we have
MTB > # 0.1 < p-value < 0.2.
MTB > # (Note: These values match the answers given for Problem 4.18 on
MTB > # p. 612 of G&C.) Since 15*d_15 is about 4.40, using Birnbaum's
MTB > # table it can be concluded that
MTB > # 0.05483 < p-value < 0.19725
MTB > # (since from the table we have P_0( D_15 < 4/15 ) = 0.80275 and
MTB > # P_0( D_15 < 5/15 ) = 0.94517, and so it follows that P_0( D_15 >=
MTB > # 0.2934 ) is some value between P_0( D_15 >= 5/15 ) = 0.05483 and
MTB > # P_0( D_15 >= 4/15 )= 0.19725). Upon combining the results, it
MTB > # can be concluded that
MTB > # 0.1 < p-value < 0.19725,
MTB > # which may seem a bit vague, but on the other hand it may be
MTB > # entirely sufficient for concluding that there is not strong
MTB > # evidence against the hypothesized distribution.
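The Birnbaum table values used above can also be obtained from scipy's exact dist'n of the two-sided K-S statistic D_n (the `kstwo` distribution, available in scipy >= 1.5); a sketch:

```python
from scipy.stats import kstwo

n, d = 15, 0.2934

# Birnbaum's bracketing values:
p_lo = kstwo.sf(5 / n, n)   # P_0( D_15 >= 5/15 ), about 0.05483
p_hi = kstwo.sf(4 / n, n)   # P_0( D_15 >= 4/15 ), about 0.19725

# and the exact p-value itself, with no bracketing needed:
p_exact = kstwo.sf(d, n)
```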
MTB > save 'failtime'
Saving worksheet in file: failtime.MTW
--------------------------------------------------------------------------------
StatXact Instructions for One-Sample K-S Test
Put the data into a column of the CaseData editor. Then (from the menus) select
Nonparametrics > One-Sample Goodness-of-Fit > Kolmogorov ...
When the box for the Kolmogorov-Smirnov test comes up, put the proper variable
into the Response box by clicking the arrow. Then under Distribution, select
Exponential from the Type menu, and type 10 into the mean box. Under Compute,
click to select Exact, and then finally click OK.
In the output, the first column under Statistic is for the two-sided test
(against the general alternative). It can be seen that the value of the
test statistic is 0.2934 (in agreement with what I got using Minitab), and
the exact p-value is 0.1224 (and the asymptotic p-value is 0.1511).
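For those without StatXact, scipy's `kstest` can reproduce both numbers; a sketch, using the exact method and specifying the exponential dist'n with mean 10 via its scale parameter:

```python
from scipy.stats import kstest

data = [1.6, 10.3, 3.5, 13.5, 18.4, 7.7, 24.3, 10.7,
        8.4, 4.9, 7.9, 12.0, 16.2, 6.8, 14.7]

# Two-sided one-sample K-S test against an exponential dist'n with mean 10
# (scipy parameterizes the exponential by (loc, scale), with scale = mean).
res = kstest(data, 'expon', args=(0, 10), method='exact')

# res.statistic should be 0.2934, and res.pvalue should agree with
# StatXact's exact p-value of 0.1224.
```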
----------------------------------------------------------------------------------
Doing Lilliefors's Test
StatXact won't do Lilliefors's test about an exponential dist'n (although it will do
Lilliefors's test of normality), but it can still be used to obtain the value of the
test statistic.
Put the data into a column of the CaseData editor. Then (from the menus) select
Basic_Statistics > Descriptive Statistics ...
When the Descriptive Statistics box comes up, put the variable into the Selected
Variables box. Since the sample mean is among the default selections, there may
be no need to select it (but click to do so if it's not preselected). Then click OK.
Note that the sample mean is 10.73 so that you can enter it when you do the K-S test.
Now select
Nonparametrics > One-Sample Goodness-of-Fit > Kolmogorov ...
When the box for the Kolmogorov-Smirnov test comes up, put the proper variable
into the Response box by clicking the arrow. Then under Distribution, select
Exponential from the Type menu, and enter the sample mean into the mean box.
Then click OK.
The p-value in the output isn't correct for Lilliefors's test. But we can note
that the test statistic value is 0.2694 and compare it to the critical values given
in Table T of G&C (p. 598). Annoyingly, n=15 isn't covered by the table. If we use
n=14, we get that the p-value is between 0.05 and 0.10. If we use n=16, we get that
it is between 0.01 and 0.05. To play it safe, you can state
0.01 < p-value < 0.1,
but this is pretty vague! (My guess is that the p-value is around 0.05. This is
partially supported by another table I have for this version of Lilliefors's test
(it isn't as accurate, but I'll give you a copy anyway since it covers some sample
sizes not included in the table in G&C), which gives the 0.05 critical value for
the n = 15 case as 0.269, which is the value of the test statistic rounded to the
nearest thousandth.)
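Lilliefors's statistic can also be computed directly; here is a sketch in Python, using the rounded sample mean 10.73 (which is what gets typed into StatXact above, and which reproduces the 0.2694):

```python
import numpy as np

data = np.sort(np.array([1.6, 10.3, 3.5, 13.5, 18.4, 7.7, 24.3, 10.7,
                         8.4, 4.9, 7.9, 12.0, 16.2, 6.8, 14.7]))
n = len(data)
i = np.arange(1, n + 1)

xbar = 10.73                   # sample mean (rounded, as entered into StatXact)
F0 = 1 - np.exp(-data / xbar)  # fitted exponential cdf

# same max-of-two-columns computation as for the ordinary K-S statistic
D = max((i / n - F0).max(), (F0 - (i - 1) / n).max())

# NB: because the mean was estimated from the data, the null dist'n of D is
# NOT the ordinary K-S dist'n; D must be referred to Lilliefors critical
# values (e.g. Table T of G&C), not to Table F.
```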
Note: I find it rather odd that, comparing the p-value from the K-S test done above
with the p-value from Lilliefors's test, we have stronger evidence that the data
didn't come from *any* exponential distribution than we have that it didn't come
from an exponential distribution having a mean of 10.