What Is Pooled Standard Deviation?
Pooled standard deviation is the square root of the pooled variance, which is a weighted average of the sample variances from two or more groups. It is used when we assume the groups share a common population variance. Combining information from all groups, with each group weighted by its degrees of freedom (n - 1), gives a more precise estimate of that common variance than any single group can provide.
This statistic is essential for the independent two-sample t-test (assuming equal variances), Cohen's d effect size calculation, and certain ANOVA procedures. By pooling the information, we gain statistical power compared to using individual group estimates.
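Formula
For two groups with sample sizes n1 and n2 and sample standard deviations s1 and s2: s_p = sqrt(((n1-1)s1^2 + (n2-1)s2^2) / (n1 + n2 - 2)).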
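As a minimal sketch, the two-group calculation can be written in a few lines of Python. The function name `pooled_sd` and its argument order are illustrative choices, not part of any particular library; the inputs are summary statistics (sample size and sample standard deviation for each group).

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled SD of two groups (hypothetical helper).

    Each group's variance is weighted by its degrees of freedom (n - 1).
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least 2 observations")
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

# Equal sample sizes: the pooled variance is the plain average of the variances.
print(round(pooled_sd(50, 10.0, 50, 12.0), 3))  # 11.045
```

Note that the weights are n - 1 rather than n, so with unequal sample sizes the result differs slightly from a simple size-weighted average of the variances.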
When to Use It
- Two-sample t-test: When testing the difference between two group means under equal variance assumption.
- Effect size (Cohen's d): The denominator in Cohen's d uses pooled standard deviation.
- ANOVA: The within-group variance estimate is a pooled measure across all groups.
- Meta-analysis: Computing standardized mean differences so that results from studies with different sample sizes can be combined on a common scale.
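The first two uses in the list above can be sketched from raw data with only the Python standard library. The function names (`pooled_sd_from_data`, `cohens_d`, `pooled_t`) and the sample values are made up for illustration; the t statistic shown is the equal-variance (Student) form.

```python
import math
import statistics

def pooled_sd_from_data(a, b):
    """Pooled SD from two raw samples (hypothetical helper)."""
    n1, n2 = len(a), len(b)
    v1 = statistics.variance(a)  # sample variance, n - 1 denominator
    v2 = statistics.variance(b)
    return math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))

def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled SD."""
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd_from_data(a, b)

def pooled_t(a, b):
    """Equal-variance two-sample t statistic, with n1 + n2 - 2 degrees of freedom."""
    s_p = pooled_sd_from_data(a, b)
    se = s_p * math.sqrt(1 / len(a) + 1 / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Illustrative data only.
treatment = [12, 14, 16, 18, 20]
control = [10, 12, 14, 16, 18]
d = cohens_d(treatment, control)
t = pooled_t(treatment, control)
print(f"d = {d:.3f}, t = {t:.3f}")
```

Both quantities share the same numerator; they differ only in whether the pooled SD is scaled by the sample sizes.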
Example Calculations
| n1 | s1 | n2 | s2 | Pooled SD |
|---|---|---|---|---|
| 20 | 5.0 | 20 | 5.0 | 5.000 |
| 10 | 3.0 | 30 | 6.0 | 5.441 |
| 50 | 10.0 | 50 | 12.0 | 11.045 |
Frequently Asked Questions
What if variances are not equal?
If the equal variance assumption is violated, use Welch's t-test instead, which does not pool the variances. Levene's test can help determine if variances are significantly different.
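To make the contrast concrete, here is a rough sketch comparing the standard error of the mean difference under pooling with the unpooled version used by Welch's t-test, along with the Welch-Satterthwaite approximation for the degrees of freedom. The function names are hypothetical, and the inputs are summary statistics chosen for illustration.

```python
import math

def pooled_se(n1, s1, n2, s2):
    """Standard error of the mean difference under the equal-variance assumption."""
    s_p = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return s_p * math.sqrt(1 / n1 + 1 / n2)

def welch_se_df(n1, s1, n2, s2):
    """Welch's standard error and Welch-Satterthwaite degrees of freedom."""
    a = s1**2 / n1  # variance of the first group's mean
    b = s2**2 / n2  # variance of the second group's mean
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return se, df

# Illustrative case: unequal sample sizes and unequal spreads.
se_pooled = pooled_se(10, 3.0, 30, 6.0)
se_welch, df_welch = welch_se_df(10, 3.0, 30, 6.0)
print(f"pooled SE = {se_pooled:.3f} (df = {10 + 30 - 2})")
print(f"Welch SE  = {se_welch:.3f} (df = {df_welch:.1f})")
```

When the sample sizes and spreads differ, the two standard errors can diverge noticeably, which is why the choice of test matters.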
Can I pool more than two groups?
Yes. The formula generalizes to any number of groups: s_p = sqrt(sum of (ni-1)si^2 / sum of (ni-1)), where both sums run over all groups. The squared value, s_p^2, is the within-group mean square used in one-way ANOVA.
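A minimal sketch of the multi-group version, assuming each group is described by a (sample size, standard deviation) pair. The function name `pooled_sd_groups` is an illustrative choice.

```python
import math

def pooled_sd_groups(groups):
    """Pooled SD across any number of groups (hypothetical helper).

    groups: iterable of (n, s) pairs, one per group.
    """
    groups = list(groups)  # allow generators, since we iterate twice
    if len(groups) < 2 or any(n < 2 for n, _ in groups):
        raise ValueError("need at least 2 groups with at least 2 observations each")
    numerator = sum((n - 1) * s**2 for n, s in groups)
    denominator = sum(n - 1 for n, _ in groups)  # total within-group degrees of freedom
    return math.sqrt(numerator / denominator)

# Three illustrative groups of equal size.
print(round(pooled_sd_groups([(5, 1.0), (5, 2.0), (5, 3.0)]), 3))  # 2.16
```

With exactly two groups, this reduces to the two-sample formula, since the denominator becomes n1 + n2 - 2.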
Is larger or smaller pooled SD better?
A smaller pooled SD indicates less variability within groups, which, for the same sample sizes, makes it easier to detect differences between group means. It is not inherently better or worse; it simply describes the spread of the data.