What Is the Poisson Distribution?
The Poisson distribution models the probability of a given number of events occurring in a fixed interval of time or space, given that events occur independently at a constant average rate. Named after the French mathematician Siméon Denis Poisson, it is one of the most widely used discrete probability distributions in statistics.
Common examples include the number of phone calls a call center receives per hour, the number of typos per page in a book, the number of cars passing through a toll booth per minute, and the number of radioactive decay events per second. The distribution is characterized by a single parameter, λ (lambda), which equals both its mean and its variance.
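The claim that the single parameter is both the mean and the variance can be checked numerically. The sketch below assumes the standard Poisson probability mass function, λ^k e^(−λ) / k!, and truncates the infinite sum at k = 99, which is an illustrative cutoff rather than part of the definition. The name `poisson_pmf` is a hypothetical helper, not a library function.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Standard Poisson probability P(X = k) for rate lam (hypothetical helper)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 5.0
ks = range(100)  # assumed truncation point; the tail beyond it is negligible for lam = 5

# Mean: sum of k * P(X = k)
mean = sum(k * poisson_pmf(k, lam) for k in ks)

# Variance: sum of (k - mean)^2 * P(X = k)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(f"mean     = {mean:.6f}")
print(f"variance = {variance:.6f}")
```

Both values should come out very close to λ, up to floating-point error and the truncated tail.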
Formula
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

where λ is the average rate (the expected number of events in the interval), k is the number of events (k = 0, 1, 2, ...), and e is Euler's number (approximately 2.71828).
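As a minimal sketch, the probability P(X = k) = λ^k e^(−λ) / k! can be evaluated directly with Python's standard library. The function name `poisson_pmf` and the parameter name `lam` are illustrative choices, not part of any particular library.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for a Poisson distribution with average rate lam."""
    if k < 0:
        return 0.0  # negative counts have zero probability
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Probability of exactly 3 events when the average rate is 5
print(round(poisson_pmf(3, 5.0), 4))  # 0.1404
```

For very large k or λ, the intermediate terms can overflow a float, so a production implementation would more likely work in log space; this version favors readability.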
Probability Table (λ = 5, selected values of k)
| k | P(X = k) | P(X ≤ k) |
|---|---|---|
| 0 | 0.0067 | 0.0067 |
| 1 | 0.0337 | 0.0404 |
| 2 | 0.0842 | 0.1247 |
| 3 | 0.1404 | 0.2650 |
| 5 | 0.1755 | 0.6160 |
| 10 | 0.0181 | 0.9863 |
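The table values can be reproduced with a short script. This is a sketch under the assumption that the cumulative column P(X ≤ k) is the running sum of the point probabilities from 0 to k; `poisson_pmf` and `poisson_cdf` are hypothetical helper names.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for average rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k), computed as the sum of P(X = i) for i = 0..k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 5.0
print(" k  P(X = k)  P(X <= k)")
for k in (0, 1, 2, 3, 5, 10):  # the same k values as the table rows
    print(f"{k:>2}  {poisson_pmf(k, lam):.4f}    {poisson_cdf(k, lam):.4f}")
```

Changing `lam` produces the corresponding table for any other positive rate.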
Applications
- Telecommunications: Modeling call arrivals for staffing decisions.
- Quality Control: Counting defects per unit in manufacturing.
- Biology: Modeling the number of mutations in a DNA strand per generation.
- Insurance: Predicting the number of claims per time period.
- Web Analytics: Modeling page visits or server requests per minute.
Frequently Asked Questions
When should I use Poisson vs. Binomial?
Use Poisson when counting events in a continuous interval (time, space, area) with no fixed number of trials. Use Binomial when you have a fixed number of trials, each with a success/failure outcome. Poisson is the limiting case of Binomial as n grows large and p grows small while the product np stays fixed at λ.
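The limiting relationship can be illustrated numerically. The sketch below holds λ = np fixed at 5, an arbitrary illustrative value, and compares the Binomial probability of k = 3 successes with the Poisson probability as n increases. The helper names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for n independent trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for average rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam, k = 5.0, 3
target = poisson_pmf(k, lam)

for n in (10, 100, 1000, 10000):
    p = lam / n  # shrink p as n grows so that n * p stays equal to lam
    approx = binomial_pmf(k, n, p)
    print(f"n={n:>5}  binomial={approx:.5f}  poisson={target:.5f}  "
          f"diff={abs(approx - target):.5f}")
```

The difference column should shrink as n grows, which is why Poisson is often used as an approximation for Binomial counts of rare events.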
What are the assumptions of the Poisson distribution?
Events must occur independently and at a constant average rate, and two events cannot occur at exactly the same instant. For a sufficiently small interval, the probability of an event is approximately proportional to the length of that interval.
Can lambda be non-integer?
Yes. Lambda represents the average rate and can be any positive real number (e.g., 3.7 emails per hour). However, k (the number of events) must be a non-negative integer.