What Is P-Hat?
P-hat (written as p with a caret above it) is the sample proportion: the number of observed successes divided by the total number of observations in a sample. It is the standard point estimate of the unknown population proportion (p). For example, if 45 out of 100 surveyed people prefer brand A, then p-hat = 45/100 = 0.45.
P-hat is central to inferential statistics, particularly in hypothesis testing for proportions and in constructing confidence intervals. The sampling distribution of p-hat is approximately normal when the sample size is large enough. A common rule of thumb is np >= 10 and n(1-p) >= 10, checked with p-hat in place of p when p is unknown.
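As a minimal sketch of the two ideas above, the following Python computes p-hat for the brand A example and checks the large-sample rule of thumb. The function names `sample_proportion` and `large_sample_ok` are illustrative, not taken from any library, and the threshold of 10 is the rule of thumb quoted in the text.

```python
def sample_proportion(successes: int, n: int) -> float:
    """Return p-hat, the observed successes divided by the sample size."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    return successes / n


def large_sample_ok(n: int, p: float, threshold: float = 10.0) -> bool:
    """Check the rule of thumb np >= threshold and n(1 - p) >= threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


# Brand A example from the text: 45 of 100 respondents.
p_hat = sample_proportion(45, 100)
print(p_hat)                        # 0.45
print(large_sample_ok(100, p_hat))  # True (about 45 successes and 55 failures)
```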
Formulas
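The standard definitions, which are consistent with the example values in this article, can be written as follows. Here x is the number of successes, n is the sample size, and z* is the critical value for the chosen confidence level. The value z* = 1.96 for 95% confidence is an assumption that matches the 95% CI figures in the table.

```latex
\hat{p} = \frac{x}{n}, \qquad
\mathrm{SE}(\hat{p}) = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}, \qquad
\mathrm{CI} = \hat{p} \pm z^{*}\,\mathrm{SE}(\hat{p})
```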
Example Values
| x (successes) | n (sample size) | p-hat | SE (standard error) | 95% CI |
|---|---|---|---|---|
| 50 | 200 | 0.250 | 0.0306 | (0.190, 0.310) |
| 120 | 400 | 0.300 | 0.0229 | (0.255, 0.345) |
| 500 | 1000 | 0.500 | 0.0158 | (0.469, 0.531) |
| 90 | 100 | 0.900 | 0.0300 | (0.841, 0.959) |
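The rows above can be recomputed with a short script. This is a sketch that assumes the usual normal-approximation (Wald) interval with a critical value of 1.96. The name `wald_interval` is illustrative.

```python
import math

Z_95 = 1.96  # assumed critical value for a 95% confidence interval


def wald_interval(x, n, z=Z_95):
    """Return (p_hat, standard_error, (lower, upper)) for x successes in n trials."""
    p_hat = x / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se, (p_hat - z * se, p_hat + z * se)


# Recompute each (x, n) pair from the table.
for x, n in [(50, 200), (120, 400), (500, 1000), (90, 100)]:
    p_hat, se, (lower, upper) = wald_interval(x, n)
    print(f"{x:>4} {n:>5} {p_hat:.3f} {se:.4f} ({lower:.3f}, {upper:.3f})")
```

Rounded to the precision shown, the printed values should agree with the table.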
Applications
- Polling: Election polls use p-hat to estimate candidate support percentages.
- Quality Control: Manufacturing defect rates are estimated using sample proportions.
- Medicine: Drug trial success rates use p-hat to estimate treatment effectiveness.
- Market Research: Customer preference surveys report sample proportions.
Frequently Asked Questions
How large should my sample be?
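For the normal approximation to be reasonable, you need np-hat >= 10 and n(1-p-hat) >= 10, that is, at least 10 observed successes and 10 observed failures. For more precise estimates, increase n. The standard error is inversely proportional to the square root of n, so halving it requires roughly four times as many observations.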
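To illustrate how the standard error shrinks with sample size, here is a small sketch. The value p-hat = 0.5 is a hypothetical choice and `standard_error` is an illustrative name.

```python
import math


def standard_error(p_hat, n):
    """Standard error of p-hat under the normal approximation."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.5  # hypothetical sample proportion
for n in (100, 400, 1600):
    # Each fourfold increase in n cuts the standard error in half.
    print(n, round(standard_error(p_hat, n), 4))
```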
Is p-hat the same as probability?
Not exactly. P-hat is a statistic (calculated from sample data) that estimates the population proportion p, a fixed but unknown parameter. Different random samples yield different p-hat values, which cluster around the true p.
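A small simulation makes the distinction concrete. This sketch assumes a hypothetical true proportion of 0.3 and draws repeated samples of size 200. The function name `simulate_p_hats` and the seed are illustrative choices.

```python
import random


def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return the p-hat from each one."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same draws
    p_hats = []
    for _ in range(num_samples):
        successes = sum(rng.random() < p for _ in range(n))
        p_hats.append(successes / n)
    return p_hats


# Hypothetical population with true p = 0.3.
p_hats = simulate_p_hats(p=0.3, n=200, num_samples=1000)
mean_p_hat = sum(p_hats) / len(p_hats)
print(min(p_hats), max(p_hats), mean_p_hat)
```

Individual p-hat values differ from sample to sample, while their average should land close to the assumed p.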
What if p-hat is 0 or 1?
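If all observations are successes (p-hat = 1) or all are failures (p-hat = 0), the standard error formula gives 0, so the confidence interval collapses to a single point and falsely suggests no uncertainty. In such cases, use an adjusted method such as the Agresti-Coull interval, which adds pseudo-observations (about two successes and two failures at 95% confidence) before computing the interval.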
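One way to sketch the Agresti-Coull adjustment is shown below. It assumes a critical value of 1.96 for 95% confidence and clips the result to the range [0, 1]. The clipping and the function name `agresti_coull_interval` are choices made for this illustration.

```python
import math


def agresti_coull_interval(x, n, z=1.96):
    """Agresti-Coull interval: add z^2/2 pseudo-successes and z^2/2 pseudo-failures."""
    n_adj = n + z ** 2                 # adjusted sample size
    p_adj = (x + z ** 2 / 2) / n_adj   # adjusted proportion
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clip to [0, 1], since a proportion cannot fall outside that range.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Zero successes in 20 trials: the unadjusted standard error would be 0.
lower, upper = agresti_coull_interval(0, 20)
print(lower, upper)
```

Unlike the unadjusted formula, this produces an interval with nonzero width even when x is 0 or n.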