What Is a Histogram?
A histogram is a graphical representation of a data distribution using rectangular bars. Each bar represents a range of values (a bin), and its height indicates how many data points fall within that range. Unlike a bar chart, a histogram shows continuous numerical data, so adjacent bars touch each other.
Histograms are fundamental in statistics for understanding the shape, center, and spread of data. They help identify patterns such as normal distribution, skewness, and the presence of outliers. They are used in quality control, scientific research, and data analysis across many fields.
How It Works
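The data range, from the minimum to the maximum value, is divided into equal-width intervals called bins. Each data value is assigned to exactly one bin, and the frequency (count) for each bin is tallied. A value that lands exactly on a boundary between two bins goes to one side by a fixed convention, commonly the bin to its right, with the maximum value kept in the last bin. The resulting counts show how the data is spread across the range.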
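The binning steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `histogram_counts`, the sample data, and the boundary convention (bins are half-open on the right, except the last bin, which also includes the maximum) are all assumptions made for the example.

```python
def histogram_counts(values, num_bins):
    """Return (edges, counts) for num_bins equal-width bins over the data range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; the bin width would be zero")
    width = (hi - lo) / num_bins

    counts = [0] * num_bins
    for v in values:
        # Offset from the minimum, measured in bin widths, gives the bin index.
        # Assumed convention: [left, right) bins; the maximum goes in the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1

    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


# Hypothetical sample data
data = [2, 3, 3, 5, 7, 8, 8, 8, 9, 10]
edges, counts = histogram_counts(data, 4)
print(edges)   # [2.0, 4.0, 6.0, 8.0, 10.0]
print(counts)  # [3, 1, 1, 5]
```

The counts always add up to the number of data points, which is a quick sanity check for any binning routine.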
Types of Histograms
| Type | Description | Use Case |
|---|---|---|
| Frequency | Shows count of data points per bin | Most common, general analysis |
| Relative Frequency | Shows proportion per bin | Comparing datasets of different sizes |
| Cumulative | Shows running total of counts up to and including each bin | Finding percentiles |
| Density | Bar heights scaled so the total bar area equals 1 | Probability estimation |
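
All four types in the table can be derived from the same per-bin counts. The sketch below assumes equal-width bins and uses made-up counts and a made-up bin width; the variable names are illustrative only.

```python
from itertools import accumulate

# Hypothetical per-bin counts and bin width (assumed for illustration)
counts = [3, 1, 1, 5]
bin_width = 2.0
total = sum(counts)

# Frequency: the raw counts themselves
frequency = counts

# Relative frequency: proportion of all data points in each bin
relative = [c / total for c in counts]

# Cumulative: running total of counts up to and including each bin
cumulative = list(accumulate(counts))

# Density: heights scaled so that sum(height * width) equals 1
density = [c / (total * bin_width) for c in counts]
area = sum(d * bin_width for d in density)

print(cumulative)  # [3, 4, 5, 10]
```

Here `relative` sums to 1 and `area` comes out at 1 (up to floating-point rounding), while the last entry of `cumulative` equals the total number of data points.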
Frequently Asked Questions
How many bins should I use?
A common rule of thumb is Sturges' formula: k = 1 + 3.322 * log10(n), where n is the number of data points and k is the number of bins. For 100 data points, this gives k = 1 + 3.322 * 2 = 7.644, which rounds up to 8 bins. Too few bins hide patterns; too many create noise.
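As a quick check of the arithmetic, Sturges' formula can be written as a small helper. The function name `sturges_bins` is hypothetical, and rounding up to the next whole number is an assumption of this sketch; the formula itself does not say how to round.

```python
import math


def sturges_bins(n):
    """Suggested bin count from Sturges' formula, rounded up (assumed rounding)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(1 + 3.322 * math.log10(n))


print(sturges_bins(100))   # 8
print(sturges_bins(1000))  # 11
```

Because the formula grows with the logarithm of n, a tenfold increase in data adds only about three bins, so treat the result as a starting point and adjust after looking at the plot.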
What is the difference between a histogram and a bar chart?
Histograms display continuous numerical data with no gaps between bars, while bar charts display categorical data with gaps. Histograms show frequency distributions; bar charts compare quantities across categories.
Can histograms show negative values?
Yes, histograms can display any numerical range including negative values. The bins simply cover whatever range your data spans, from the minimum to the maximum value.
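To illustrate, the sketch below bins a small set of values that spans zero. The data (imagined temperature readings), the bin count, and the boundary convention (the maximum goes in the last bin) are assumptions chosen for the example.

```python
# Hypothetical readings that include negative values
temps = [-12.5, -7.0, -3.2, 0.0, 1.5, 4.8, 7.5]
num_bins = 4

lo, hi = min(temps), max(temps)
width = (hi - lo) / num_bins
edges = [lo + i * width for i in range(num_bins + 1)]

counts = [0] * num_bins
for t in temps:
    # (t - lo) is never negative, so the index works the same for any sign of t
    idx = min(int((t - lo) / width), num_bins - 1)
    counts[idx] += 1

print(edges)   # [-12.5, -7.5, -2.5, 2.5, 7.5]
print(counts)  # [1, 2, 2, 2]
```

Measuring each value as an offset from the minimum is what makes the sign irrelevant: the bins are positioned relative to the data, not relative to zero.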