Answers for HW #5

Spring 2008


Problem 1

part (a)

There is no good reason to rely on the robustness of Student's t test, since all of the two-sample nonparametric procedures are valid, and one of them produces a smaller p-value.

0.070 is the smallest of the valid p-values, and it is produced by Fisher's exact test. Other p-values are given below. (Note: Welch's test should not be used since under the null hypothesis the variances are equal.)
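As a rough illustration of how a one-sided p-value from Fisher's exact test arises, here is a minimal sketch in Python. The 2x2 counts are hypothetical (the homework data are not reproduced here), and the function name `fisher_exact_one_sided` is my own; the sketch just sums hypergeometric probabilities with the margins held fixed.

```python
from math import comb

def fisher_exact_one_sided(a, b, c, d):
    """Upper-tail p-value for the 2x2 table [[a, b], [c, d]].

    Conditions on all margins and sums the hypergeometric probabilities
    of the tables whose top-left cell is at least the observed value a.
    """
    row1 = a + b              # first row total
    col1 = a + c              # first column total
    n = a + b + c + d         # grand total
    largest = min(row1, col1) # largest possible top-left cell
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, largest + 1))
    return tail / comb(n, col1)

# Hypothetical counts, NOT the homework data.
p_value = fisher_exact_one_sided(3, 1, 1, 3)
print(p_value)
```

For these made-up counts the tail consists of the observed table and the one more extreme table, so the p-value is 17/70.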

part (b)

The value of the test statistic is about 1.6845, and the resulting p-value is about 0.11.

Problem 2

part (a)

Since it seems safe to assume that one distribution is stochastically larger than the other if they aren't identical, the two-sample nonparametric tests can be used for a test about the means. Because Welch's test doesn't produce the smallest p-value, one doesn't have to strongly consider whether or not it is an appropriate choice. (If one were to use Welch's test, its robustness would have to be relied upon, which may be a problem due to the small sample sizes and the fact that one distribution may be more strongly skewed than the other.) We should remove Student's t test from consideration because the variances may be rather unequal. (Note: One always has to make some assumptions, and in this case the data provide no evidence against the assumption that one distribution is stochastically larger than the other if they differ, but the data do provide some evidence of nonnormality and unequal variances. The fact that heteroscedasticity makes Welch's test the choice over Student's t test causes a problem, since with such small sample sizes the estimated df may be inaccurate. So, all in all, I worry less about assuming that one distribution is stochastically larger than the other if they differ than I do about the robustness of Welch's test.)

0.021 is the smallest of the valid p-values, and it is produced by the W-M-W test (using the exact version, breaking ties conservatively). (Note: The only tied values occur in the same sample, and so the same sum of ranks is obtained whether one uses midranks or only integer ranks.) Other p-values are given below. (Note: One cannot directly address the one-sided alternative with the W-W runs test, but the p-value from the W-W runs test can be applied to the two-sided alternative, and then, given the indication from the data of which mean is the greater one if they differ, that two-sided p-value can be used with the one-sided alternative. This inability to directly address a one-sided alternative with a W-W runs test generally results in low power when the test result is applied to a one-sided alternative.)
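The remark about ties can be checked with a small sketch. The data below are hypothetical (not the homework data), chosen so that the only tie lies within one sample, and the helper names are my own. The sketch also computes an exact upper-tail p-value for the rank sum by enumerating all assignments of ranks to the first sample; it uses midranks and does not try to reproduce any particular software's rule for breaking ties conservatively.

```python
from itertools import combinations

def midranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def integer_ranks(values):
    """Return ranks 1..N, breaking ties by original position (stable sort)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for pos, idx in enumerate(order):
        ranks[idx] = pos + 1
    return ranks

def exact_upper_p(x, y):
    """Exact P(rank sum of x >= observed) under the null, by enumeration."""
    ranks = midranks(list(x) + list(y))
    observed = sum(ranks[:len(x)])
    sums = [sum(c) for c in combinations(ranks, len(x))]
    return sum(s >= observed for s in sums) / len(sums)

# Hypothetical data: the only tie (12, 12) lies within x.
x = [12, 12, 15, 20]
y = [3, 5, 8, 14]
pooled = x + y
w_mid = sum(midranks(pooled)[:len(x)])       # rank sum of x using midranks
w_int = sum(integer_ranks(pooled)[:len(x)])  # rank sum of x using integer ranks
print(w_mid, w_int, exact_upper_p(x, y))
```

Both rank sums equal 24 here, because the tied values share the ranks 4 and 5 within the same sample either way. Enumeration is only practical for small samples like these.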

part (b)

Because it doesn't seem appropriate to assume a shift model, the interval associated with the W-M-W procedure, which is (4, 38), should not be used. Furthermore, since there is no good reason to assume equal variances, the Welch interval, which is (-2, 39), should be preferred to Student's t interval, which is (1, 36). (It should be noted that while it's not reasonable to believe that the assumption of normality underlying Welch's interval is met, relying on its robustness is the best that we can do given the procedures covered in STAT 554, since the other procedures can be more inaccurate if the variances differ.) Although the usual rule of thumb applied to the estimated standard error of the difference in sample means indicates rounding to the nearest tenth is appropriate, since the data values are only recorded to the nearest integer, and violations of the assumptions make the procedure subject to some inaccuracy, I will report the interval estimate as (-2, 39).
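For reference, the pieces of the Welch interval can be sketched as follows. The data are hypothetical (not the homework data) and the function names are my own. The sketch computes the estimated standard error and the Welch-Satterthwaite estimated df; the t critical value is passed in as an argument, since it has to be looked up for that df.

```python
from math import sqrt
from statistics import mean, variance

def welch_se_and_df(x, y):
    """Standard error of mean(x) - mean(y) and the Welch-Satterthwaite df."""
    vx = variance(x) / len(x)  # sample variance divided by sample size
    vy = variance(y) / len(y)
    se = sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return se, df

def welch_interval(x, y, t_crit):
    """Interval for the difference in means; t_crit is looked up for the df."""
    se, _ = welch_se_and_df(x, y)
    diff = mean(x) - mean(y)
    return diff - t_crit * se, diff + t_crit * se

# Hypothetical samples with clearly unequal spread.
x = [10, 12, 14, 16, 18]
y = [1, 20, 5, 30, 9]
se, df = welch_se_and_df(x, y)
print(se, df)
```

With samples this small the estimated df depends heavily on the two sample variances, which is one reason the resulting interval may be inaccurate.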


Problem 3

There is no good reason to rely on the robustness of Student's t test, since all of the two-sample nonparametric procedures are valid, and one of them produces a smaller p-value.

0.60 is the smallest of the valid p-values, and it is produced by the Wald-Wolfowitz runs test (exact version). Other p-values are given below. (Note: Welch's test should not be used since under the null hypothesis the variances are equal.)

Problem 4

It should be noted that we have matched pairs instead of two independent samples.

One can begin by examining some possibilities. Here are p-values from a variety of tests:

The Wilcoxon signed-rank test should be chosen, since it produces the smallest p-value of all of the reasonable candidates, and it's a perfectly valid test when doing a test for a treatment effect. The desired p-value is about 0.0026.

Johnson's test should not be used, since it is for tests about the mean of a skewed distribution. The accuracy of the t test and the tests based on the trimmed means should be decent, though perhaps not great. But since these tests are not exact tests, and they don't produce the smallest p-value, there is no need to rely on them. The sign test is perfectly valid, but it doesn't produce a p-value which is smaller than the perfectly valid one which results from the signed-rank test.
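To make the comparison between the signed-rank test and the sign test concrete, here is a minimal sketch of exact one-sided p-values for both, computed from matched-pair differences. The differences are hypothetical (not the homework data), the function names are my own, and the signed-rank function assumes no zero differences and no tied absolute values.

```python
from itertools import product
from math import comb

def signed_rank_exact_upper_p(diffs):
    """Exact P(T+ >= observed), assuming nonzero diffs with no tied |d|."""
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {a: r for r, a in enumerate(abs_sorted, start=1)}
    ranks = [rank_of[abs(d)] for d in diffs]
    observed = sum(r for r, d in zip(ranks, diffs) if d > 0)
    count = 0
    # Under the null, each of the 2^n sign assignments is equally likely.
    for signs in product((0, 1), repeat=len(diffs)):
        t_plus = sum(r for r, s in zip(ranks, signs) if s)
        if t_plus >= observed:
            count += 1
    return count / 2 ** len(diffs)

def sign_test_upper_p(diffs):
    """Exact P(number of positive diffs >= observed) under Binomial(n, 1/2)."""
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    return sum(comb(n, k) for k in range(positives, n + 1)) / 2 ** n

# Hypothetical matched-pair differences.
diffs = [5, 3, 8, -1, 6, 2, 7, 4]
p_signed_rank = signed_rank_exact_upper_p(diffs)
p_sign = sign_test_upper_p(diffs)
print(p_signed_rank, p_sign)
```

For these made-up differences the signed-rank p-value is 2/256 and the sign test p-value is 9/256. The signed-rank test gives the smaller value here because the single negative difference also has the smallest magnitude.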