Contingency procedure demonstrated with an example
The contingency test

The contingency table test is used when both the dependent and the independent variables are categorical. It is usually used to check whether two variables are related.
Analyzing simple data with counts

Suppose opinion and gender are recorded for each subject and the raw data are stored in opinion.csv. A common situation is that only the counts for each combination of categories are available, for example as in the frequency table below.

Frequency table:

    opinion   female   male
    yes           55     50
    no            65     80
In that case the counts, rather than the original data file, can be used to run the contingency test. The counts can be entered and analyzed as below.

    data simple;
      input opinion $ gender $ count;
      datalines;
    yes female 55
    yes male 50
    no female 65
    no male 80
    ;

    proc freq data=simple;
      tables opinion*gender / chisq nocol norow nopercent expected;
      weight count;
    run;

The "chisq" option requests a chi-square test; "nocol", "norow" and "nopercent" simplify the output by suppressing the column, row and overall percentages; and "expected" requests the expected values. The "weight" statement tells the procedure how many subjects there are for each combination of gender and opinion.

When the counts are not given and only the raw data file is available (one row per subject), the same test can be run as below without the "weight" statement. The results remain the same.

    data simple2;
      infile "H:\sas\data\opinion.csv" dlm=',' firstobs=2;
      input opinion $ gender $;
    run;

    proc freq data=simple2;
      tables opinion*gender / chisq nocol norow nopercent expected;
    run;

Output, interpretation and assumption checking

    The FREQ Procedure

    Table of opinion by gender

    opinion     gender

    Frequency
    Expected    female      male     Total
    -----------------------------------
    no              65        80       145
                  69.6      75.4
    -----------------------------------
    yes             55        50       105
                  50.4      54.6
    -----------------------------------
    Total          120       130       250

    Statistics for Table of opinion by gender

    Statistic                      DF       Value      Prob
    ------------------------------------------------------
    Chi-Square                      1      1.3920    0.2381
    Likelihood Ratio Chi-Square     1      1.3926    0.2380
    Continuity Adj. Chi-Square      1      1.1059    0.2930
    Mantel-Haenszel Chi-Square      1      1.3865    0.2390
    Phi Coefficient                       -0.0746
    Contingency Coefficient                0.0744
    Cramer's V                            -0.0746

    Fisher's Exact Test
    ----------------------------------
    Cell (1,1) Frequency (F)         65
    Left-sided Pr <= F           0.1465
    Right-sided Pr >= F          0.9046

    Table Probability (P)        0.0511
    Two-sided Pr <= P            0.2508

    Sample Size = 250

The results include a contingency table with observed and expected values. A list of tests follows, the first of which is the classic chi-square test.
With a large p-value of 0.2381 we do not reject the null hypothesis of independence, so there is no evidence that gender and opinion are associated. Note that the expected values for all combinations are large (69.6, 75.4, 50.4, 54.6; the usual rule of thumb is at least 5 in every cell), so the assumption of the chi-square test is met and the conclusion is sound. Otherwise, we should use Fisher's exact test at the end of the output, where the two-sided p-value is 0.2508.
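
To see where the chi-square value and the expected counts come from, the calculation can be repeated by hand in a data step. This is only a sketch for checking the arithmetic, not part of the original procedure: the data set name "handcheck" and the variable names are made up for illustration. Each expected count is (row total x column total) / sample size, and the statistic is the sum of (observed - expected)^2 / expected over the four cells.

```sas
/* Hand check of the Pearson chi-square statistic for the 2x2 table. */
/* Data set and variable names are hypothetical, chosen for this sketch. */
data handcheck;
  /* observed counts; rows: no, yes; columns: female, male */
  array obs[2,2] _temporary_ (65 80 55 50);
  array rowtot[2] _temporary_ (0 0);
  array coltot[2] _temporary_ (0 0);

  /* accumulate row totals, column totals and the sample size */
  n = 0;
  do i = 1 to 2;
    do j = 1 to 2;
      rowtot[i] = rowtot[i] + obs[i,j];
      coltot[j] = coltot[j] + obs[i,j];
      n = n + obs[i,j];
    end;
  end;

  /* expected count and chi-square contribution for each cell */
  chisq = 0;
  min_expected = .;
  do i = 1 to 2;
    do j = 1 to 2;
      expected = rowtot[i] * coltot[j] / n;
      chisq = chisq + (obs[i,j] - expected)**2 / expected;
      min_expected = min(min_expected, expected);  /* min() ignores missing */
    end;
  end;

  /* upper-tail probability with df = (2-1)*(2-1) = 1 */
  p_value = 1 - probchi(chisq, 1);

  /* write the results to the log */
  put chisq= 8.4 p_value= 8.4 min_expected= 8.1;
  keep n chisq p_value min_expected;
run;
```

The values written to the log should agree with the "Chi-Square" row of the PROC FREQ output (1.3920 with p-value 0.2381), and min_expected gives a quick way to check the expected-count assumption before trusting the chi-square p-value.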
© COPYRIGHT 2010 ALL RIGHTS RESERVED tqin@purdue.edu