Proportion Comparison: T or Chi?
If we want to compare two proportions from two independent groups, which test should we use: the t-test or the chi-squared test? Suppose we want to test whether the prevalence of overweight differs between men and women. We conducted a study and obtained an estimated overweight prevalence for each group. How can we compare those two estimates?
Student’s t-test is a very popular method for comparing population means, but in most cases it is applied only to numeric data because of one of its assumptions: the sample mean follows a normal distribution. In our case, however, we have a binary outcome for each sampled person (overweight or not). A more typical method for this situation is the chi-squared test.
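To make the chi-squared approach concrete, here is a minimal sketch using SciPy's `chi2_contingency`. The counts are hypothetical, invented purely for illustration, and turning off the continuity correction is our assumption rather than something stated in this post.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts, invented for illustration only.
# Rows: men, women. Columns: overweight, not overweight.
table = np.array([
    [45, 55],   # men: 45 of 100 overweight
    [30, 70],   # women: 30 of 100 overweight
])

# correction=False turns off Yates' continuity correction
# (an assumption; the post does not say which variant was used).
chi2, p_value, dof, expected = chi2_contingency(table, correction=False)

print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.4f}")
```

For these made-up counts the statistic is 4.8 on 1 degree of freedom, which exceeds the 3.841 critical value at the 0.05 level, so the test would reject equal prevalence.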
So, if we still want to use Student’s t-test to test whether there is a significant difference between the two overweight prevalences, is that valid? Each person's overweight status follows a Bernoulli distribution with parameter P, and the number of overweight people in a group follows a binomial distribution with parameters N and P, where N is the sample size of that group. By the central limit theorem (CLT), the binomial distribution is approximately normal when N is large; a common rule of thumb is that the approximation is adequate when NP and N(1-P) are both greater than 5. Thus, the sample proportion of overweight people is approximately normal with mean P and variance P(1-P)/N.
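As a quick sanity check on the normal approximation, the sketch below simulates many sample proportions and compares their mean and variance with P and P(1-P)/N, the values expected for the mean of N Bernoulli outcomes. The values of `P`, `N`, `n_repeats`, and the seed are illustrative choices of ours, not taken from any real survey.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

P, N = 0.5, 40          # illustrative values, not from a real survey
n_repeats = 100_000     # number of simulated groups

# Each draw is the number of overweight people in a group of size N;
# dividing by N gives the sample proportion for that group.
p_hat = rng.binomial(N, P, size=n_repeats) / N

empirical_mean = p_hat.mean()
empirical_var = p_hat.var()
theoretical_var = P * (1 - P) / N

print(f"mean of p_hat:     {empirical_mean:.4f} (theory {P})")
print(f"variance of p_hat: {empirical_var:.5f} (theory {theoretical_var:.5f})")
```

Note that the variance shrinks with N: it is the per-person variance P(1-P) divided by the group size.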
How different are the two tests in practice? We ran a simulation to compare the results of the chi-squared test and Student’s t-test. In each run, we simulate binary samples for the two groups from binomial distributions and check whether each method correctly rejects a false null hypothesis, or correctly fails to reject a true one.
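The original simulation code is not reproduced in this post, so the following is only a rough sketch of how such a simulation could look, not the code that produced the tables. The function name `rejection_rates`, the pooled-variance t-test, the chi-squared test without continuity correction, the 0.05 significance level, and the handling of degenerate samples are all our assumptions.

```python
import numpy as np
from scipy.stats import chi2_contingency, ttest_ind


def rejection_rates(n1, n2, theta1, theta2, n_sims=1000, alpha=0.05, seed=0):
    """Return (t-test rate, chi-squared rate) of rejecting H0: theta1 == theta2."""
    rng = np.random.default_rng(seed)
    t_rejects = 0
    chi2_rejects = 0
    for _ in range(n_sims):
        # One 0/1 outcome per person in each group.
        x1 = rng.binomial(1, theta1, size=n1)
        x2 = rng.binomial(1, theta2, size=n2)

        # Assumption: pooled-variance (Student's) two-sample t-test.
        t_p = ttest_ind(x1, x2, equal_var=True).pvalue
        # The p-value is NaN when both groups are constant and equal;
        # we count that case as "no rejection".
        if not np.isnan(t_p) and t_p < alpha:
            t_rejects += 1

        table = np.array([
            [x1.sum(), n1 - x1.sum()],
            [x2.sum(), n2 - x2.sum()],
        ])
        # The chi-squared test is undefined if a whole column is zero;
        # we also count that case as "no rejection".
        if (table.sum(axis=0) > 0).all():
            # Assumption: no Yates continuity correction.
            _, chi2_p, _, _ = chi2_contingency(table, correction=False)
            if chi2_p < alpha:
                chi2_rejects += 1
    return t_rejects / n_sims, chi2_rejects / n_sims


# Same group sizes as in the tables, with the null hypothesis true.
for n1, n2 in [(10, 30), (15, 15), (20, 20), (40, 40), (100, 100)]:
    t_rate, chi2_rate = rejection_rates(n1, n2, 0.5, 0.5)
    print(f"{n1:>4} {n2:>4}  t-test: {t_rate:.3f}  chi-squared: {chi2_rate:.3f}")
```

Because the seed and test variants here are our own choices, the printed rates should be close to, but will not exactly reproduce, the numbers reported in the tables.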
First, we check what proportion of 1000 simulations incorrectly reject the true null hypothesis when the true proportion (theta) is 0.5 in both groups.
Group 1 Size | Group 2 Size | T-test Rejection Rate | Chi-squared Rejection Rate |
---|---|---|---|
10 | 30 | 0.054 | 0.040 |
15 | 15 | 0.053 | 0.053 |
20 | 20 | 0.051 | 0.051 |
40 | 40 | 0.056 | 0.056 |
100 | 100 | 0.058 | 0.058 |
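When reading these rates, it helps to keep in mind how noisy an estimate from 1000 simulations is. The short calculation below gives the Monte Carlo standard error of an estimated rejection rate, assuming a nominal significance level of 0.05 (the level is our assumption; it is implied by the rates above but not stated explicitly).

```python
import math

n_sims = 1000   # simulated datasets per table row, as stated in the text
alpha = 0.05    # nominal significance level (assumed, not stated in the text)

# Standard error of an estimated proportion: sqrt(p * (1 - p) / n).
mc_se = math.sqrt(alpha * (1 - alpha) / n_sims)
lower, upper = alpha - 2 * mc_se, alpha + 2 * mc_se

print(f"Monte Carlo SE: {mc_se:.4f}")
print(f"Rough 2-SE band around {alpha}: [{lower:.3f}, {upper:.3f}]")
```

Every rate in the table above, from 0.040 to 0.058, falls inside this rough two-standard-error band, so none of them is clearly different from the nominal 0.05.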
Next, we check what proportion of 1000 simulations correctly reject the false null hypothesis under two settings: true thetas of 0.4 and 0.6 in the two groups (first table), and 0.3 and 0.7 (second table).
Group 1 Size | Group 2 Size | T-test Rejection Rate | Chi-squared Rejection Rate |
---|---|---|---|
10 | 30 | 0.229 | 0.212 |
15 | 15 | 0.179 | 0.179 |
20 | 20 | 0.226 | 0.226 |
40 | 40 | 0.455 | 0.455 |
100 | 100 | 0.827 | 0.827 |
Group 1 Size | Group 2 Size | T-test Rejection Rate | Chi-squared Rejection Rate |
---|---|---|---|
10 | 30 | 0.612 | 0.640 |
15 | 15 | 0.599 | 0.599 |
20 | 20 | 0.704 | 0.704 |
40 | 40 | 0.957 | 0.957 |
100 | 100 | 1.000 | 1.000 |
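As the simulation results show, the two tests give identical rejection rates whenever the two groups are the same size, even with as few as 15 people per group. They differ only in the unbalanced design (10 vs. 30): there the t-test rejects slightly more often under the true null (0.054 vs. 0.040) and when the thetas are 0.4 and 0.6 (0.229 vs. 0.212), while the chi-squared test rejects more often when the thetas are 0.3 and 0.7 (0.640 vs. 0.612). These simulations therefore do not show a consistent power advantage for either test. Because the chi-squared test is the more typical method for binary outcomes, we still recommend using it in this case (we hope to talk about “power” in a future post).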