What Is a Box Plot?
A box plot (box-and-whisker plot) is a graphical representation of the five-number summary: minimum, Q1, median, Q3, and maximum. It summarizes a distribution's center, spread, and skewness in a compact format. Introduced by John Tukey in 1970, it is one of the most useful tools in exploratory data analysis.
The box spans from Q1 to Q3, so its length is the interquartile range (IQR = Q3 − Q1), and a line inside the box marks the median. Whiskers extend to the most extreme data points within 1.5×IQR of the box edges, so they reach the minimum and maximum only when no points lie farther out. Points beyond the whiskers are plotted individually as outliers.
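The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `box_plot_stats` and the sample data are made up for the example, and it assumes the "inclusive" quartile method from the standard library's `statistics.quantiles`. Other tools use different quartile conventions and may report slightly different values for the same data.

```python
import statistics

def box_plot_stats(data):
    """Return the values a Tukey-style box plot would draw for `data`."""
    values = sorted(data)
    # Assumption: "inclusive" quartiles; plotting libraries may differ.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data points inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }

stats = box_plot_stats([2, 4, 4, 5, 6, 7, 8, 9, 10, 25])
print(stats)
```

In this sample the upper whisker ends at 10 rather than at the maximum, because 25 lies more than 1.5×IQR above Q3 and is reported as an outlier instead.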
Box Plot Components
| Component | Meaning |
|---|---|
| Box | Spans Q1 to Q3; contains the middle 50% of the data |
| Median line | 50th percentile; divides the box |
| Whiskers | Extend to the most extreme data points within 1.5×IQR of the box edges |
| Outlier dots | Points more than 1.5×IQR beyond the box edges |
Outlier Detection
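- The lower fence is Q1 − 1.5×IQR and the upper fence is Q3 + 1.5×IQR. Values below the lower fence or above the upper fence are outliers.
- Values more than 3×IQR beyond the box edges (below Q1 − 3×IQR or above Q3 + 3×IQR) are extreme outliers.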
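As a rough sketch of these two rules, the snippet below measures how far each value lies outside the box and labels it mild (more than 1.5×IQR) or extreme (more than 3×IQR). The function name `classify_outliers`, the "mild" label, and the sample data are illustrative assumptions, and the quartiles again use the standard library's "inclusive" method, which may not match every plotting tool.

```python
import statistics

def classify_outliers(data):
    """Split outliers into mild (> 1.5*IQR) and extreme (> 3*IQR) from the box."""
    # Assumption: "inclusive" quartiles; other conventions shift the fences slightly.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    mild, extreme = [], []
    for v in sorted(data):
        # Distance outside the box; negative or zero for values inside it.
        distance = max(q1 - v, v - q3)
        if distance > 3 * iqr:
            extreme.append(v)
        elif distance > 1.5 * iqr:
            mild.append(v)
    return {"mild": mild, "extreme": extreme}

result = classify_outliers([2, 4, 4, 5, 6, 7, 8, 9, 10, 18, 25])
print(result)  # {'mild': [18], 'extreme': [25]}
```

For this sample, Q1 = 4.5 and Q3 = 9.5, so the fences sit at −3 and 17 and the 3×IQR limits at −10.5 and 24.5.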
FAQ
What does a symmetric box plot look like?
The median line is centered in the box, and the whiskers are approximately equal in length. Skewed data has an off-center median and unequal whiskers: for example, a median near Q1 with a longer upper whisker suggests right skew.
Can box plots compare groups?
Yes! Side-by-side box plots are excellent for comparing distributions across groups, making differences in center, spread, and outliers immediately visible.
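To make the comparison concrete, the sketch below computes the numbers that each box in a side-by-side plot would be drawn from. The group names, values, and the helper `summarize_groups` are invented for illustration, and the quartiles assume the standard library's "inclusive" method.

```python
import statistics

def summarize_groups(groups):
    """Map each group name to (min, Q1, median, Q3, max)."""
    summary = {}
    for name, values in groups.items():
        # Assumption: "inclusive" quartiles; other tools may differ slightly.
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        summary[name] = (min(values), q1, median, q3, max(values))
    return summary

# Hypothetical measurements for two groups.
groups = {
    "A": [12, 15, 17, 19, 22],
    "B": [10, 18, 20, 25, 40],
}
for name, five in summarize_groups(groups).items():
    print(name, five)
```

Here group B has the higher median (20 versus 17) and the wider spread, which is the kind of difference that stands out at a glance when the two boxes are drawn next to each other.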