Two sample t-test with SAS

Idea and demo example

The idea of the two sample t-test is to compare two population means by comparing two independent samples. A common experimental design has a test condition and a control condition, with each subject randomly assigned to one of them. One variable is then measured and compared between the two conditions (samples).

Suppose a study compares two study methods to see whether they improve grades differently. There is a new method (treatment, or t) and a standard method (control, or c). Users are randomly assigned to one of the two methods. After they are trained with the method, their performance is measured as a grade. The data set is “reading.csv”. The question is whether the two methods make a difference.

The model you can set up for this problem is

          Grade (continuous) ~ method (categorical: 2 levels)

Read the data set into SAS.

    data read;
      /* allow values longer than the default 8 characters, e.g. "treatment" */
      length method $ 9;
      /* comma-delimited file; skip the header row */
      infile "H:\sas\data\reading.csv" dlm=',' firstobs=2;
      input method $ grade;
    run;

Checking assumptions

The two sample t-test assumes that the two samples are independent and that each sample comes from a Normal population.
When the assumptions are not met, other methods based on the two samples are possible.
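The Normality assumption can be checked group by group before running the test. The sketch below is a minimal example, assuming the data set read created by the DATA step above. PROC UNIVARIATE with the NORMAL option prints Shapiro-Wilk and related tests for each level of method. The PROC NPAR1WAY step shows one common fallback when Normality looks doubtful, the Wilcoxon rank-sum test; choosing that particular method here is an assumption for illustration.

```sas
/* Normality check for each group: the NORMAL option requests
   Shapiro-Wilk and related tests of Normality */
proc univariate data=read normal;
  class method;
  var grade;
run;

/* One possible fallback if Normality is doubtful (an assumption for
   illustration): the Wilcoxon rank-sum test */
proc npar1way data=read wilcoxon;
  class method;
  var grade;
run;
```

With only 5 observations per group, Normality tests have little power, so a large p-value from PROC UNIVARIATE is weak evidence at best.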
In this demo example, the two samples (control and treatment) are independent and pass the Normality check, so we continue with the two sample t-test. Note that the test is two-sided (sides=2), the significance level is 0.05, and the test compares the difference between the two means (mu1 - mu2) against 0 (h0=0).

Compare two independent samples

    proc ttest data=read sides=2 alpha=0.05 h0=0;
      title "Two sample t-test example";
      class method;
      var grade;
    run;

Reading the output

    two sample t example
    The TTEST Procedure
    Variable: Grade

    Method       N    Mean       Std Dev    Std Err    Minimum    Maximum
    control      5    88.6000    7.3007     3.2650     80.0000    98.0000
    treatment    5    101.6      2.0736     0.9274     99.0000    104.0
    Diff (1-2)        -13.0000   5.3666     3.3941

    Method       Method           Mean       95% CL Mean            Std Dev    95% CL Std Dev
    control                       88.6000    79.5350    97.6650     7.3007     4.3741    20.9789
    treatment                     101.6      99.0252    104.2       2.0736     1.2424    5.9587
    Diff (1-2)   Pooled           -13.0000   -20.8268   -5.1732     5.3666     3.6249    10.2811
    Diff (1-2)   Satterthwaite    -13.0000   -21.9317   -4.0683

    Method           Variances    DF        t Value    Pr > |t|
    Pooled           Equal        8         -3.83      0.0050
    Satterthwaite    Unequal      4.6412    -3.83      0.0141

    Equality of Variances
    Method      Num DF    Den DF    F Value    Pr > F
    Folded F    4         4         12.40      0.0318

Note that the results show both "Pooled" and "Satterthwaite" rows. Which one to read depends on a check of the sample variances. The test on Equality of Variances is given at the end of the output, and is repeated below.

    Equality of Variances
    Method      Num DF    Den DF    F Value    Pr > F
    Folded F    4         4         12.40      0.0318

Some people use this simple rule: if the Equality of Variances p-value (Pr > F) is below 0.05, treat the variances as unequal and read the "Satterthwaite" row; otherwise read the "Pooled" row.
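In this example, the Equality of Variances p-value is 0.0318 < 0.05, so we should read the "Satterthwaite" row. For example, the t value is -3.83 with 4.6412 degrees of freedom, and the two-sided p-value is Pr > |t| = 0.0141.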
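The numbers in the output can be reproduced from the group summary statistics, which helps show what each row means. The DATA step below is a sketch and not part of the original demo: the variable names are made up for illustration, and the inputs are the rounded N, Mean, and Std Dev values printed by PROC TTEST, so the results should agree with the output only up to rounding.

```sas
data _null_;
  /* summary statistics copied from the PROC TTEST output (rounded) */
  n1 = 5; m1 = 88.6;  s1 = 7.3007;   /* control   */
  n2 = 5; m2 = 101.6; s2 = 2.0736;   /* treatment */
  diff = m1 - m2;

  /* Pooled row: one common variance estimate, df = n1 + n2 - 2 */
  df_pooled = n1 + n2 - 2;
  sp = sqrt(((n1 - 1)*s1**2 + (n2 - 1)*s2**2) / df_pooled);
  t_pooled = diff / (sp * sqrt(1/n1 + 1/n2));
  p_pooled = 2 * (1 - probt(abs(t_pooled), df_pooled));

  /* Satterthwaite row: separate variances, approximate df */
  v1 = s1**2 / n1;
  v2 = s2**2 / n2;
  t_satt = diff / sqrt(v1 + v2);
  df_satt = (v1 + v2)**2 / (v1**2/(n1 - 1) + v2**2/(n2 - 1));
  p_satt = 2 * (1 - probt(abs(t_satt), df_satt));

  /* Folded F: larger variance over smaller variance, two-sided p-value.
     Both degrees of freedom are 4 here because n1 = n2 = 5. */
  f_value = max(s1**2, s2**2) / min(s1**2, s2**2);
  p_f = 2 * (1 - probf(f_value, n1 - 1, n2 - 1));

  /* write the results to the SAS log */
  put t_pooled= 8.2 df_pooled= 8.4 p_pooled= 8.4;
  put t_satt= 8.2 df_satt= 8.4 p_satt= 8.4;
  put f_value= 8.2 p_f= 8.4;
run;
```

Because the two groups have the same size, the Pooled and Satterthwaite t values coincide; only the degrees of freedom, and therefore the p-values, differ.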
The conclusion is to reject the null hypothesis: the reading grades under the two methods are significantly different. Note that SAS performs a two-sided test here, meaning the alternative hypothesis is that the two group means differ in either direction. If one wants to test whether one group mean is greater (or smaller) than the other, the two-sided p-value can be divided by 2, provided the observed difference lies in the hypothesized direction. For example, p-value/2 = 0.0141/2 = 0.007 < 0.05, hence the conclusion for the one-sided test is to reject the null hypothesis: the new method improves the grades.
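Instead of halving the p-value by hand, PROC TTEST can run the one-sided test directly through the same sides= option used earlier. The following is a minimal sketch, assuming the data set read and assuming the class levels are ordered as in the output above (control first, so the difference is control minus treatment). With sides=L the alternative hypothesis is mu1 - mu2 < 0, that is, the treatment grades are higher.

```sas
/* lower one-sided test: alternative is mean(control) - mean(treatment) < 0 */
proc ttest data=read sides=L alpha=0.05 h0=0;
  title "One-sided two sample t-test example";
  class method;
  var grade;
run;
```

Use sides=U for the opposite direction. The choice between the "Pooled" and "Satterthwaite" rows follows the same Equality of Variances check as in the two-sided case.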
© COPYRIGHT 2010 ALL RIGHTS RESERVED tqin@purdue.edu