What Is Grouped Data Standard Deviation?
When data is organized into frequency tables or class intervals, the individual values are no longer available, so you cannot compute the standard deviation directly from them. Instead, you treat every observation in a class as if it sat at the class midpoint and weight each midpoint by the frequency of observations in that class. The grouped data standard deviation is therefore an estimate of the spread of the data around the mean.
This method is widely used when dealing with large datasets that have been summarized into frequency distributions, such as exam scores grouped into ranges, income brackets, or age groups in census data.
Formulas
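In standard notation, the quantities used on this page can be written as follows. The symbols are chosen here for illustration: x_i is the midpoint of class i, f_i is its frequency, and N is the total frequency.

```latex
N = \sum_i f_i, \qquad
\bar{x} = \frac{\sum_i f_i x_i}{N}

% Population standard deviation (divide by N)
\sigma = \sqrt{\frac{\sum_i f_i \,(x_i - \bar{x})^2}{N}}

% Sample standard deviation (divide by N - 1)
s = \sqrt{\frac{\sum_i f_i \,(x_i - \bar{x})^2}{N - 1}}
```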
Step-by-Step Method
- Calculate the midpoint (x) for each class interval.
- Multiply each midpoint by its frequency: f * x.
- Sum all f * x values and divide by total frequency N to get the mean.
- Calculate (x - mean)^2 for each midpoint.
- Multiply each squared deviation by its frequency: f * (x - mean)^2.
- Sum all f * (x - mean)^2 values.
- Divide by N (population) or N-1 (sample) to get variance.
- Take the square root for standard deviation.
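The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `grouped_std` and its `sample` flag are hypothetical, and the input uses the class intervals and frequencies from the worked example.

```python
from math import sqrt

def grouped_std(midpoints, freqs, sample=False):
    """Estimate the standard deviation from class midpoints and frequencies."""
    n = sum(freqs)  # total frequency N
    # Mean: sum of f * x divided by N
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    # Sum of f * (x - mean)^2 over all classes
    ss = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    # Variance: divide by N (population) or N - 1 (sample)
    variance = ss / (n - 1 if sample else n)
    return sqrt(variance)

# Class intervals as (lower, upper); the midpoint is (lower + upper) / 2
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [5, 12, 18, 10, 5]
midpoints = [(lo + hi) / 2 for lo, hi in classes]

pop_sd = grouped_std(midpoints, freqs)
sample_sd = grouped_std(midpoints, freqs, sample=True)
print(round(pop_sd, 2), round(sample_sd, 2))  # 11.13 11.24
```

Passing midpoints rather than class bounds keeps the function usable when only midpoints are known.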
Worked Example
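| Class | Midpoint (x) | Frequency (f) | f*x | f*(x-mean)² |
|---|---|---|---|---|
| 10-20 | 15 | 5 | 75 | 1920.80 |
| 20-30 | 25 | 12 | 300 | 1105.92 |
| 30-40 | 35 | 18 | 630 | 2.88 |
| 40-50 | 45 | 10 | 450 | 1081.60 |
| 50-60 | 55 | 5 | 275 | 2080.80 |
| Total | | 50 | 1730 | 6192.00 |

Here N = 50 and the f*x column sums to 1730, so the mean is 1730 / 50 = 34.6. The last column sums to 6192, giving a population variance of 6192 / 50 = 123.84 (SD ≈ 11.13) and a sample variance of 6192 / 49 ≈ 126.37 (SD ≈ 11.24).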
Frequently Asked Questions
What is the difference between population and sample standard deviation for grouped data?
Population standard deviation divides the sum of squared deviations by N (total frequency), while sample standard deviation divides by N-1. Use sample SD when your data represents a sample from a larger population, and population SD when you have the entire population.
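One way to see the difference is to repeat each midpoint as many times as its frequency and pass the resulting list to Python's standard `statistics` module, where `pstdev` divides by N and `stdev` divides by N - 1. The sketch below uses the midpoints and frequencies from the worked example; the variable names are illustrative.

```python
import statistics

midpoints = [15, 25, 35, 45, 55]
freqs = [5, 12, 18, 10, 5]

# Repeat each midpoint f times, so the list has N = 50 entries.
expanded = [x for x, f in zip(midpoints, freqs) for _ in range(f)]

pop_sd = statistics.pstdev(expanded)    # divides by N
sample_sd = statistics.stdev(expanded)  # divides by N - 1

print(round(pop_sd, 3), round(sample_sd, 3))  # 11.128 11.241
```

The two results differ by a factor of sqrt(N / (N - 1)), so the gap shrinks as the total frequency grows.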
How accurate is grouped data standard deviation?
Grouped data SD is an approximation because it uses midpoints instead of actual values. The accuracy depends on how well the midpoints represent the data within each class. Smaller class intervals generally produce more accurate results.
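The effect of class width can be illustrated with a small experiment. The sketch below is an assumption-laden toy: the data (the integers 0 to 99) and the helper names are hypothetical, and classes are taken as half-open intervals starting at 0. It compares the exact population SD with the grouped estimate for a wide and a narrow class width.

```python
from math import sqrt

def population_sd(values):
    """Exact population SD from the raw values."""
    mean = sum(values) / len(values)
    return sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def grouped_population_sd(values, width):
    """Bin values into classes [k*width, (k+1)*width) and use class midpoints."""
    counts = {}
    for v in values:
        k = int(v // width)
        counts[k] = counts.get(k, 0) + 1
    n = len(values)
    mids = {k: (k + 0.5) * width for k in counts}  # midpoint of each class
    mean = sum(counts[k] * mids[k] for k in counts) / n
    ss = sum(counts[k] * (mids[k] - mean) ** 2 for k in counts)
    return sqrt(ss / n)

raw = list(range(100))  # hypothetical data: the integers 0..99
exact = population_sd(raw)
err_wide = abs(grouped_population_sd(raw, 50) - exact)    # 2 classes
err_narrow = abs(grouped_population_sd(raw, 10) - exact)  # 10 classes
print(err_narrow < err_wide)  # True
```

For this particular dataset the narrower classes give the smaller error; other data may behave differently, which is why the statement above says "generally".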
Can I use this for unequal class widths?
Yes, simply enter the correct midpoint for each class interval regardless of width. The formula works with any set of midpoints and frequencies, whether or not class widths are equal.
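As a brief illustration, the sketch below computes midpoints from class bounds of widths 10, 20 and 40 and then applies the same population formula. The classes and frequencies are made up for the example.

```python
from math import sqrt

# Hypothetical classes with unequal widths: (lower, upper, frequency)
classes = [(0, 10, 4), (10, 30, 6), (30, 70, 10)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]  # 5.0, 20.0, 50.0
freqs = [f for _, _, f in classes]

n = sum(freqs)
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n
sd = sqrt(variance)

print(mean, variance)  # 32.0 351.0
```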