What Is a Confusion Matrix?
A confusion matrix is a table that summarizes the performance of a classification algorithm. It displays the counts of true positive, true negative, false positive, and false negative predictions, allowing you to see not just how many mistakes a model makes, but what types of mistakes it makes.
This tool is essential in machine learning, medical diagnostics, spam detection, and any binary classification task. It provides a much richer picture than accuracy alone, especially when dealing with imbalanced datasets where one class significantly outnumbers the other.
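As a minimal sketch of the idea, the four counts can be tallied directly from paired labels and predictions. The function name and the toy data below are hypothetical, assuming `1` marks the positive class and `0` the negative class:

```python
# Tally a 2x2 confusion matrix from true labels and predictions.
# Assumes 1 = positive class, 0 = negative class (toy data below).
def confusion_counts(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))  # (3, 3, 1, 1)
```

In practice a library routine (for example, scikit-learn's `confusion_matrix`) does the same tallying; the point here is only that each prediction falls into exactly one of the four cells.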
Key Metrics Formulas
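Using the standard definitions in terms of the TP, TN, FP, and FN counts (laid out in the matrix below), the key metrics can be computed as follows. The count values here are made up for illustration:

```python
# Standard metric formulas from the four confusion-matrix counts.
# The counts below are illustrative, not from a real model.
tp, tn, fp, fn = 80, 90, 10, 20

accuracy    = (tp + tn) / (tp + tn + fp + fn)  # overall fraction correct
precision   = tp / (tp + fp)                   # of predicted positives, fraction correct
recall      = tp / (tp + fn)                   # of actual positives, fraction caught
specificity = tn / (tn + fp)                   # of actual negatives, fraction caught
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.889 0.8 0.842
```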
The Matrix Layout
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | True Positive (TP) | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN) |
Interpreting Results
- High Precision, Low Recall: The model is conservative; when it predicts positive, it's usually right, but it misses many actual positives.
- Low Precision, High Recall: The model catches most positives but also flags many negatives incorrectly.
- F1-Score: The harmonic mean of precision and recall; useful when you need a single balanced metric.
- Specificity: The ability of the model to correctly identify negatives.
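The choice of harmonic mean in F1 matters: unlike the arithmetic mean, it drops sharply when precision and recall are imbalanced. A small sketch with toy numbers:

```python
# The harmonic mean (F1) penalizes imbalance between precision and
# recall far more than the arithmetic mean would (toy values).
def f1(p, r):
    return 2 * p * r / (p + r)

print(round(f1(0.9, 0.9), 3))   # 0.9  (balanced: equals the arithmetic mean)
print(round(f1(0.99, 0.2), 3))  # 0.333 (imbalanced: arithmetic mean would be 0.595)
```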
Frequently Asked Questions
When is accuracy misleading?
Accuracy can be misleading with imbalanced data. If 95% of cases are negative, a model that always predicts negative achieves 95% accuracy but has 0% recall for positives. In such cases, precision, recall, and F1-score are more informative metrics.
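The 95%-negative scenario above is easy to reproduce with synthetic data: a degenerate model that always predicts negative scores 95% accuracy while catching no positives at all.

```python
# Synthetic illustration of the imbalanced-data trap described above:
# 5% positives, and a "model" that always predicts negative.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
recall = tp / (tp + fn)

print(accuracy, recall)  # 0.95 0.0
```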
What is a good F1-score?
F1-scores range from 0 to 1. As a rough guide, a score above 0.9 is excellent, 0.7-0.9 is good, 0.5-0.7 is moderate, and below 0.5 suggests the model needs improvement. The acceptable threshold depends on the application and the cost of errors.
How do I improve my confusion matrix results?
Consider techniques like resampling imbalanced data, adjusting classification thresholds, feature engineering, trying different algorithms, or using ensemble methods. The optimal approach depends on which type of error (FP or FN) is more costly.
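Of these techniques, threshold adjustment is the simplest to sketch. Assuming a model that outputs probability scores (the scores and labels below are hypothetical), raising the threshold makes the model more conservative, trading false positives for false negatives:

```python
# Shifting the classification threshold trades FP for FN.
# Scores and labels below are hypothetical predicted probabilities.
scores = [0.2, 0.4, 0.55, 0.7, 0.9]

def predict(threshold):
    return [1 if s >= threshold else 0 for s in scores]

print(predict(0.5))  # [0, 0, 1, 1, 1]  -- default threshold
print(predict(0.8))  # [0, 0, 0, 0, 1]  -- stricter: fewer FP, more FN
```

Sweeping the threshold and recomputing precision and recall at each value is the basis of precision-recall curve analysis.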