East Carolina University
Department of Psychology
Effect of n1/n2 on Estimated d and rpb
When comparing two means, the most commonly employed effect size estimators are d and the point-biserial r. Each has its advocates and its critics. One factor to consider when choosing between them is the effect of disparate sample sizes on the two estimators, which I illustrate below.
Equal Sample Sizes
First we look at an analysis of two samples where the sample sizes are equal.
T-TEST GROUPS=A(1 2)
/MISSING=ANALYSIS
/VARIABLES=Y1
/CRITERIA=CI(.9500).
Group Statistics
| | A | N | Mean | Std. Deviation | Std. Error Mean |
| Y1 | 1 | 20 | 5.5000 | 2.30560 | .51555 |
| | 2 | 20 | 7.8000 | 2.30560 | .51555 |
Independent Samples Test (t-test for Equality of Means)
| | | t | df | Sig. (2-tailed) | Mean Difference |
| Y1 | Equal variances assumed | -3.155 | 38 | .003 | -2.30000 |
Notice that the two means differ by one standard deviation (2.3). That is, estimated d = 1.00, a large effect (Cohen's benchmark for a large effect was d = .8).
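The arithmetic behind that estimate can be checked from the summary statistics alone. What follows is a minimal Python sketch, not part of the SPSS analysis: the function names are my own, and it assumes the usual pooled standard deviation (each variance weighted by its degrees of freedom) as the standardizer for d.

```python
import math

def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation, weighting each variance by its degrees of freedom."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def estimated_d(mean1, mean2, n1, sd1, n2, sd2):
    """Absolute standardized mean difference (estimated d)."""
    return abs(mean1 - mean2) / pooled_sd(n1, sd1, n2, sd2)

# Values copied from the Group Statistics table for Y1.
d_equal = estimated_d(5.5, 7.8, 20, 2.30560, 20, 2.30560)
print(round(d_equal, 2))  # 1.0
```

With equal group sizes and equal standard deviations, the pooled standard deviation is just the common value, 2.3056.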
Now we compute the point biserial.
CORRELATIONS
/VARIABLES=Y1 WITH A
/PRINT=TWOTAIL NOSIG
/MISSING=PAIRWISE.
Correlations
| | | A |
| Y1 | Pearson Correlation | .456** |
| | Sig. (2-tailed) | .003 |
| | N | 40 |
**. Correlation is significant at the 0.01 level (2-tailed).
The value of the point biserial falls just short of Cohen's general benchmark for a large r, .50.
If we square the point biserial to get a proportion of variance, we obtain 20.8%. This is identical to the eta-squared that a one-way ANOVA on the same data would give. Cohen's benchmark for a large eta-squared is 14%.
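The point biserial and its square can also be recovered directly from the t test, because for two groups r squared equals t squared divided by (t squared plus df). The short Python sketch below uses the t and df reported above. The function name is illustrative, and it returns only the magnitude of r, since the sign depends on how the groups are coded.

```python
import math

def r_from_t(t, df):
    """Magnitude of the point-biserial r implied by an independent-samples t."""
    return math.sqrt(t**2 / (t**2 + df))

# t and df copied from the Independent Samples Test table for Y1.
r_pb = r_from_t(-3.155, 38)
print(round(r_pb, 3))     # 0.456
print(round(r_pb**2, 3))  # 0.208
```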
By any of these measures, the effect here is large or very nearly so.
(Very) Unequal Sample Sizes
Now let us look at the analysis of a data set where the sample sizes differ considerably. The standard deviations and the means differ very little from those in the first data set.
T-TEST GROUPS=B(1 2)
/MISSING=ANALYSIS
/VARIABLES=Y2
/CRITERIA=CI(.9500).
Group Statistics
| | B | N | Mean | Std. Deviation | Std. Error Mean |
| Y2 | 1 | 100 | 5.5000 | 2.25854 | .22585 |
| | 2 | 4 | 7.7750 | 2.24109 | 1.12055 |
Independent Samples Test (t-test for Equality of Means)
| | | t | df | Sig. (2-tailed) | Mean Difference |
| Y2 | Equal variances assumed | -1.976 | 102 | .051 | -2.27500 |
The means still differ by about one standard deviation: estimated d = 1.01, a large effect (Cohen's benchmark for a large effect was d = .8).
Now we compute the point biserial.
CORRELATIONS
/VARIABLES=Y2 WITH B
/PRINT=TWOTAIL NOSIG
/MISSING=PAIRWISE.
Correlations
| | | B |
| Y2 | Pearson Correlation | .192 |
| | Sig. (2-tailed) | .051 |
| | N | 104 |
The point biserial falls in Cohen's small (.1) to medium (.3) range.
If we square the point biserial to get a proportion of variance, we obtain 3.7%. This is identical to the eta-squared that a one-way ANOVA on the same data would give, and it falls in Cohen's small (1%) to medium (6%) range.
Although the estimated d indicates we have a large effect, the point biserial indicates that we have a small to medium effect.
How is the ratio of sample sizes having such an effect on the value of the point biserial?
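One way to see it is to write the point biserial as a function of estimated d and the two sample sizes. Combining t = d × sqrt(n1 n2 / (n1 + n2)) with r squared = t squared / (t squared + df) gives r = d / sqrt(d squared + (N − 2) N / (n1 n2)), where N = n1 + n2. The second term under the square root is smallest when n1 = n2 and grows as the split becomes more lopsided, so the same d yields a smaller r. The Python sketch below illustrates this. The function name and the intermediate 30/10 split are my own choices for illustration, not part of the analyses above.

```python
import math

def r_from_d(d, n1, n2):
    """Point-biserial r implied by estimated d (pooled SD) and the group sizes."""
    n = n1 + n2
    a = (n - 2) * n / (n1 * n2)  # grows as the n1/n2 split becomes more unequal
    return d / math.sqrt(d**2 + a)

# Hold d at 1.0 and vary only the split between the two groups.
for n1, n2 in [(20, 20), (30, 10), (100, 4)]:
    print(n1, n2, round(r_from_d(1.0, n1, n2), 3))
```

Entering the estimated d from each data set above (about 1.00 and 1.01) with its own sample sizes reproduces, to rounding, the correlations SPSS reported, .456 and .192.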
If you would like to read more about the differences between estimated d and the rpb as estimators of effect size, I recommend the following article: McGrath, R. E., & Meyer, G. J. (2006). When effect sizes disagree: The case of r and d. Psychological Methods, 11, 386-401.
Download the data used here -- SPSS sav format.
Contact information for the webmaster, Dr. Karl L. Wuensch
This page most recently revised on 14-November-2020.