Procedure demonstrated with an example

  1. The Wilcoxon rank-sum test (WMW test)
  2. Analyzing the data with WMW test
  3. Output and interpretation

The Wilcoxon rank-sum test (Wilcoxon-Mann-Whitney U-test, or WMW test)

A common experimental design has a test condition and a control condition. A two-sample t-test is a good choice when the test and control groups are independent and each follows a Normal distribution. If the normality condition is not met, a nonparametric test is needed. This section covers one such test, the Wilcoxon rank-sum test (equivalent to the Mann-Whitney U-test) for two samples. The test is preferred when:

  1. Comparing two samples.
  2. The two groups of data are independent.
  3. The type of variable could be continuous or ordinal.
  4. The data might not be normally distributed.

Analyzing the data with WMW test

Consider the following example. Soil respiration is a measure of microbial activity in soil, which affects plant growth. In one study, soil cores were taken from two locations in a forest: 1) under an opening in the forest canopy (the "gap" location) and 2) at a nearby area under heavy tree growth (the "growth" location). The amount of carbon dioxide given off by each soil core was measured (in mol CO2/g soil/hr).

The question is whether the gap and growth areas differ with respect to soil respiration; the null hypothesis is that they do not.

The data are in the file "soil.csv".

Open the data set in SAS, or import it with the following commands.

 
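   data soil;
      /* comma-delimited file; firstobs=2 skips the header row */
      infile "H:\sas\data\soil.csv" dlm=',' firstobs=2;
      /* group is a character variable ($), resp is numeric */
      input group $ resp;
   run;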
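
Before choosing between the t-test and the WMW test, the shape of the distribution in each group can be checked with normal QQ plots. The following is a minimal sketch, assuming the soil data set created above; the options shown are one reasonable choice and are not necessarily those used for the QQ plots referred to in this example.

```sas
/* Sketch: one normal QQ plot of resp per level of group.         */
/* noprint suppresses the summary tables so only plots are drawn. */
proc univariate data=soil noprint;
   class group;
   /* reference line uses the sample mean and standard deviation */
   qqplot resp / normal(mu=est sigma=est);
run;
```

Points that bend away from the reference line, especially in the tails, suggest that the Normal assumption is doubtful for that group.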

According to the QQ plots of the data (omitted here; please refer to the QQ plot instructions), the distributions do not appear Normal. Hence, a WMW test is run with the following commands.

 
   proc NPAR1WAY data=soil wilcoxon;
      title "Nonparametric test to compare respiration between growth and gap area";
      class group;
      var resp;
      exact wilcoxon;
   run;

The SAS procedure NPAR1WAY performs nonparametric tests. The option "wilcoxon" requests the Wilcoxon rank-sum test (plus a number of other statistics). The "class" and "var" statements work the same way as in the t-test procedure: "class" names the grouping variable and "var" names the response variable. The "exact" statement makes the procedure compute exact p-values (in addition to the asymptotic approximations it computes by default) for the tests listed after the statement. Including an "exact" statement is recommended when the sample size is relatively small.
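
To illustrate the parallel with the t-test procedure, the same "class" and "var" statements can be placed in PROC TTEST. This is only a sketch for comparison, assuming the same soil data set; its output is not shown or interpreted in this section, and the t-test is not the recommended analysis for these data.

```sas
/* Sketch: two-sample t-test with the same class/var layout as above */
proc ttest data=soil;
   class group;   /* grouping variable: growth vs. gap */
   var resp;      /* response variable: soil respiration */
run;
```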

Output and interpretation

          Nonparametric test to compare respiration between growth and gap area             
                                                               

                                     The NPAR1WAY Procedure

                         Wilcoxon Scores (Rank Sums) for Variable resp
                                  Classified by Variable group

                                   Sum of      Expected       Std Dev          Mean
             group        N        Scores      Under H0      Under H0         Score
            ------------------------------------------------------------------------
             growth       7         77.50          56.0      8.625543     11.071429
             gap          8         42.50          64.0      8.625543      5.312500

                               Average scores were used for ties.


                                    Wilcoxon Two-Sample Test

                               Statistic (S)               77.5000

                               Normal Approximation
                               Z                            2.4346
                               One-Sided Pr >  Z            0.0075
                               Two-Sided Pr > |Z|           0.0149

                               t Approximation
                               One-Sided Pr >  Z            0.0144
                               Two-Sided Pr > |Z|           0.0289

                               Exact Test
                               One-Sided Pr >=  S           0.0051
                               Two-Sided Pr >= |S - Mean|   0.0099

                            Z includes a continuity correction of 0.5.


                                       Kruskal-Wallis Test

                                    Chi-Square         6.2130
                                    DF                      1
                                    Pr > Chi-Square    0.0127



Based on the p-value of the exact two-sided test, which is 0.0099 (< 0.05), one can conclude that soil respiration differs significantly between the growth and gap areas.

When comparing growth and gap areas, we can also specify in advance that one area has higher (or lower) respiration than the other, that is, run a directional (one-sided) test. In that case we refer to the exact one-sided p-value of 0.0051, which is also less than 0.05, and conclude that the growth area has significantly higher respiration than the gap area.

Even for moderate sample sizes, the WMW test is almost as powerful as its parametric counterpart, the t-test. Thus, if the distributional assumptions are in doubt or the data are truly ordinal, one should not hesitate to use the WMW test instead of the t-test.