StatXact Hints


Below are comments about

using the Monte Carlo option

In some cases, if you try to obtain an exact p-value with StatXact, you will get a message indicating that the sample size(s) is/are too large for an exact computation. In such cases, rather than rely on an asymptotic p-value (one based on a large-sample approximation, such as a normal or chi-square approximation), I think it's good to use the Monte Carlo option of StatXact.

Before using StatXact's Monte Carlo option, I think it's a good idea to change the "settings" to things other than the default values. (Plus, to make it easier for me to grade the homework, I want everyone to use the same settings that I do, so that we all get the same answer --- if the default settings are used, it would be possible for students to obtain different answers when using the Monte Carlo option.) To change the settings,
  1. pull down the Options menu on the main bar,
  2. then choose Monte Carlo.
  3. Change the Monte Carlo Sample Size from 10,000 to 100,000.
  4. click on Fixed for Random Number Seed (which deactivates the use of the Clock), and go with the default Fixed value of 23456.
  5. Next, change Frequency of Intermediate Display to 10,000.
  6. Finally, uncheck Use Importance Sampling where available,
  7. check the box to Save Monte Carlo Parameters Permanently,
  8. and click OK.
(Note: In general, it may be a good idea to use importance sampling, but when I explain in class how the Monte Carlo option works, I'll be referring to the simple version which does not use importance sampling, and so let's just use the simple version for the computation of p-value estimates for the homework.)
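
To give a rough idea of what the simple (no importance sampling) Monte Carlo computation is doing, here is a little Python sketch. The two-sample rank-sum setting and the data are hypothetical, and StatXact's internal details surely differ, but the basic recipe is the same: repeatedly rerandomize the data, recompute the test statistic, and report the proportion of rerandomizations giving a result at least as extreme as the one observed.

    import numpy as np

    rng = np.random.default_rng(23456)         # fixed seed, like the Fixed setting above
    x = np.array([1.8, 2.3, 2.9, 3.1])         # hypothetical sample 1
    y = np.array([3.4, 3.8, 4.0, 4.6, 5.1])    # hypothetical sample 2
    combined = np.concatenate([x, y])
    n1 = len(x)

    def rank_sum(values, n_first):
        # Wilcoxon rank-sum statistic: sum of the ranks of the first n_first values
        ranks = values.argsort().argsort() + 1
        return ranks[:n_first].sum()

    obs = rank_sum(combined, n1)
    B = 100_000                                # Monte Carlo sample size, as in the settings above
    count = 0
    for _ in range(B):
        perm = rng.permutation(combined)
        # one-sided alternative: sample 1 tends to be smaller, so small rank sums are extreme
        if rank_sum(perm, n1) <= obs:
            count += 1
    print(count / B)                           # Monte Carlo estimate of the exact p-value

(With 100,000 Monte Carlo samples, the standard error of the estimated p-value is at most about 0.0016, so the estimate should typically be good to within a few thousandths; StatXact also reports a confidence interval along with its Monte Carlo point estimate.)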


doing a one-way layout version of Page's test (using the Linear-by-linear Association test)

I presented this test in class. A description of a very similar test can be found on the bottom half of p. 88 of Miller's Beyond ANOVA: Basics of Applied Statistics. Miller's version gives equal weight to each sample in the test statistic through the use of the average rank for each sample, whereas the version that we can easily do using StatXact gives equal weight to each observation through the use of the rank sum for each sample. When the sample sizes are equal, the two versions of the test are equivalent, but the null mean and variance reported by StatXact will still differ from those given in Miller.

To do this test, enter your data like you would for the J-T test (i.e., on the CaseData spreadsheet, put the group indicators in one column, and put the data values into another column). The group indicators should go from 1 to k in an order corresponding to the monotone alternative. Then convert the CaseData to TableData by bringing the CaseData spreadsheet to the foreground, clicking on the CaseData menu, and selecting Convert to TableData; then put the group indicator variable as the rows and the observed responses as the columns, but also indicate that you want both Row and Column Scores, selecting the column scores to be of the Wilcoxon (Mid-Rank) variety. Finally, click OK.
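
In case it helps to see what that conversion sets up, here is a little Python sketch (with hypothetical data, not the H&W data) of the statistic involved: with the group indicators as the row scores and Wilcoxon mid-ranks as the column scores, the Linear-by-linear Association statistic is just the sum, over the observations, of (group score) times (mid-rank of the observation).

    import numpy as np
    from scipy.stats import rankdata

    group = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])          # group indicators (the row scores)
    y = np.array([40, 35, 38, 43, 44, 41, 49, 47, 52])     # hypothetical responses

    midranks = rankdata(y)          # Wilcoxon (mid-rank) column scores; ties get average ranks
    T = np.sum(group * midranks)    # linear-by-linear association statistic
    print(T)

Changing the "weights" discussed below amounts to nothing more than changing the values in the group vector (i.e., the row scores).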

Then go to the K Independent Samples portion of StatXact's Statistics menu and select Linear-by-linear Association. Click in the proper variables for the Population and Response, then click on Exact, and finally OK.

If you use the data in Table 6.6 on p. 205 of H&W, you should get a p-value of about 0.0198 (which is smaller than the value of about 0.0210 that results from the J-T test). If instead of using 1, 2, and 3 for the group indicators, you use 1, 2, and 4, then the p-value is about 0.0173. The "weights" of 1, 2, and 4 for the groups reward you with a smaller p-value because the last group is the one that's the most different. Using "weights" of 1, 4, and 5 results in a larger p-value (about 0.0322). By playing around with the weights you can get a variety of p-values, but the most commonly used version of the test would just use weights of 1, 2, ..., k. The easiest way to change the weights is to change the row scores on the table data layout.

If one runs a Linear-by-linear Association test using the "raw" data (before converting to the table with the (mid)rank scores), then it's like a permutation version of the test. If you use the data in Table 6.6 on p. 205 of H&W, you should get a p-value of about 0.0169.
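
Here is the corresponding Python sketch for the raw-data (permutation) version: the observed values themselves play the role of the column scores, and the reference distribution comes from permuting the group labels. Again the data are hypothetical, and a Monte Carlo sample stands in for the full exact enumeration.

    import numpy as np

    rng = np.random.default_rng(23456)
    group = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])          # hypothetical group indicators
    y = np.array([40, 35, 38, 43, 44, 41, 49, 47, 52])     # hypothetical responses

    obs = np.sum(group * y)         # raw-data linear-by-linear statistic
    B = 100_000
    count = 0
    for _ in range(B):
        # permute the group labels and recompute; large values favor an increasing trend
        if np.sum(rng.permutation(group) * y) >= obs:
            count += 1
    print(count / B)                # estimated one-sided p-value for the increasing alternative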


doing the Fligner-Wolfe test of Sec. 6.4 of H&W

  1. Go to CaseData, and enter all of the observation values into one column/variable, and in an adjacent column/variable, enter a 1 beside each observation of the control sample, and enter a 2 beside each observation belonging to any of the treatment samples (whether it be sample 2, 3, ..., or k).
  2. Then use these two columns/variables to do a W-M-W test.
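
Here is a little Python sketch of those same two steps (hypothetical data; SciPy's W-M-W routine stands in for StatXact's, and its p-value may be exact or asymptotic depending on the sample sizes and ties):

    import numpy as np
    from scipy.stats import mannwhitneyu

    group = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])          # 1 = control, 2 and 3 = treatments
    y = np.array([40, 35, 38, 43, 44, 41, 49, 47, 52])     # hypothetical responses

    fw_label = np.where(group == 1, 1, 2)     # step 1: pool all treatment samples under label 2
    control = y[fw_label == 1]
    treatment = y[fw_label == 2]

    # step 2: W-M-W test of control vs. pooled treatments;
    # 'less' corresponds to the treatments tending to give the larger responses
    stat, p = mannwhitneyu(control, treatment, alternative="less")
    print(stat, p)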

doing the Steel-Dwass-Critchlow-Fligner procedure of Sec. 6.5 of H&W

Although StatXact doesn't include the S-D-C-F procedure on its menus, it can be used to do most of the grubby computations for you. You can use StatXact to do k choose 2 W-M-W tests, and the Standardized values reported near the top of the W-M-W output are the z-scores that correspond to the expression in the brackets of (6.62) on p. 241 of H&W. If you multiply the Standardized values by the square root of 2, then you'll have the W*ij values needed for the S-D-C-F procedure (and you can complete the procedure by making use of the tables in H&W).
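
If you want to check one of those values by hand, here is a Python sketch of the arithmetic for a single pair of (hypothetical) samples. The no-ties variance formula is used, whereas StatXact adjusts the variance when there are ties, and the sign of the standardized value depends on which sample's statistic gets standardized.

    import numpy as np
    from scipy.stats import rankdata

    yi = np.array([40, 35, 38, 41])              # hypothetical sample i
    yj = np.array([43, 44, 49, 47, 52])          # hypothetical sample j

    ranks = rankdata(np.concatenate([yi, yj]))   # joint ranks for the pair (mid-ranks if ties)
    W = ranks[len(yi):].sum()                    # rank sum for sample j
    m, n = len(yj), len(yi)
    EW = m * (m + n + 1) / 2                     # null mean of the rank sum
    VW = m * n * (m + n + 1) / 12                # null variance (no-ties formula)
    z = (W - EW) / np.sqrt(VW)                   # the "Standardized" value on the W-M-W output
    W_star = np.sqrt(2) * z                      # the W*_ij needed for the S-D-C-F procedure
    print(W_star)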


doing Page's test of Sec. 7.2 of H&W

The test is on the K Related Samples portion of StatXact's Statistics menu. To perform the test, put the observations for each of the k treatments in a separate variable/column of the CaseData spreadsheet. (The observations must be put into the columns in the proper order, with the rows of the spreadsheet corresponding to the blocks.) Then select Page's test from the Statistics menu, and click each of the k variables into the Populations/Treatments box, in the proper order. (They should be listed down the box in the order corresponding to the monotone alternative.) Then click on Exact, and finally OK. (If the number of blocks is too large, then StatXact may not be able to do an exact test, in which case you should choose the Monte Carlo option, following the guidelines given above.)

As usual, StatXact does not let you specify the direction for the one-sided test that it performs. It will report either a p-value corresponding to the treatment effect monotonically increasing with the order in which the treatments are listed, or a p-value corresponding to the treatment effect monotonically decreasing with that order, whichever one is smaller. So you have to take the responsibility of checking to make sure that the p-value given for the one-sided test is the one that you want. (As usual, the p-value for the other one-sided test can be obtained from the StatXact output with just a little bit of work.)
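
In case it is useful to see the statistic that is being permuted, here is a Python sketch of Page's L for a small hypothetical block-by-treatment layout. The columns are the treatments, listed in the order of the hypothesized increasing trend, so large values of L favor that alternative.

    import numpy as np
    from scipy.stats import rankdata

    # rows = blocks, columns = treatments 1, ..., k in the hypothesized increasing order
    data = np.array([[7.0, 9.2, 10.1],
                     [6.5, 8.8, 11.3],
                     [7.8, 8.1,  9.9],
                     [6.9, 9.5, 10.6]])

    ranks = np.apply_along_axis(rankdata, 1, data)    # rank separately within each block
    R = ranks.sum(axis=0)                             # rank sum for each treatment
    L = np.sum(np.arange(1, data.shape[1] + 1) * R)   # Page's L = sum over j of j * R_j
    print(L)

(Recent versions of SciPy also include a page_trend_test function, which can serve as a rough check on a Monte Carlo answer.)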


doing procedures from Ch. 8 of H&W

Go to CaseData, and enter the x values into one column/variable, and enter the y values in an adjacent column/variable, making sure that each bivariate pair occupies the same row.

Then go to the Ordinal Response portion of StatXact's Statistics menu (the INFERENCE FOR MEASURES OF ASSOCIATION part) and select Kendall's Tau & Somers' D. Next, click in the two variables, click on Exact, and finally click OK.
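
If you want a quick check on the point estimate from outside StatXact, here is a Python sketch using SciPy, with hypothetical data. Depending on the sample size and the presence of ties, SciPy's p-value may be exact or based on an approximation, so be careful about comparing it with StatXact's values.

    import numpy as np
    from scipy.stats import kendalltau

    x = np.array([1.2, 2.5, 3.1, 4.8, 5.0, 6.3])     # hypothetical x values
    y = np.array([2.0, 2.9, 3.5, 3.2, 5.5, 6.1])     # hypothetical y values

    tau, p = kendalltau(x, y)     # point estimate of Kendall's tau and a p-value
    print(tau, p)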

The point estimate given by StatXact is fine, and the exact p-value is fine, but I have a HUGE problem with the asymptotic p-value. (I spotted this apparent mistake 3 years ago when using version 4 of StatXact, but unlike when I reported some mistakes in version 3, I never got around to reporting the mistakes (there were others as well) that I found in version 4. I had hoped that when version 5 came out, others would have discovered the mistakes, but apparently if they did, they didn't report them either.) (One would hope that the Cytel people would have tested the software better.) So that I can more easily focus the attention of the Cytel people on the mistakes, I'm going to put them here, on a separate web page.