What Is a P-Value?
A p-value is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. It quantifies the strength of evidence against the null hypothesis. A small p-value (typically less than 0.05) indicates strong evidence against the null hypothesis, leading researchers to reject it.
P-values are fundamental to hypothesis testing in all branches of science. They help distinguish between results that are likely due to chance and those that reflect a genuine effect. However, p-values should not be interpreted as the probability that the null hypothesis is true or false.
How P-Values Are Calculated
For a z-test, the p-value is derived from the standard normal cumulative distribution function (CDF):
For a right-tailed test, p = 1 - Φ(z). For a left-tailed test, p = Φ(z). This calculator uses a high-precision polynomial approximation to the normal CDF.
Common Significance Levels
| Significance Level (α) | Critical Z (Two-Tailed) | Interpretation |
|---|---|---|
| 0.10 | ±1.645 | Weak evidence against H0 |
| 0.05 | ±1.960 | Moderate evidence (most common) |
| 0.01 | ±2.576 | Strong evidence |
| 0.001 | ±3.291 | Very strong evidence |
Frequently Asked Questions
What does a p-value of 0.05 mean?
A p-value of 0.05 means there is a 5% probability of observing results as extreme as (or more extreme than) the data, assuming the null hypothesis is true. It does not mean there is a 5% chance the null hypothesis is true. If p < 0.05, we typically reject the null hypothesis and conclude the result is statistically significant.
Can a p-value be greater than 1?
No. A p-value is a probability and must be between 0 and 1. If your calculation yields a value outside this range, there is an error in the computation. A p-value close to 1 means the data is very consistent with the null hypothesis.
What is the difference between one-tailed and two-tailed tests?
A two-tailed test checks for any difference (greater or less than), while a one-tailed test checks for a difference in a specific direction. Two-tailed tests are more conservative and are the default in most research. Use a one-tailed test only when you have a strong directional hypothesis before collecting data.