What Is Shannon Entropy?
Shannon entropy, introduced by Claude Shannon in 1948, quantifies the average amount of information (or uncertainty) contained in a probability distribution. It is measured in bits (when using the base-2 logarithm) or nats (when using the natural logarithm). Higher entropy indicates greater uncertainty, i.e., a distribution closer to uniform.
Shannon entropy has become a cornerstone concept in information theory, data compression, cryptography, machine learning, ecology, linguistics, and many other fields. It provides a rigorous mathematical framework for measuring information content and is the theoretical basis for data compression algorithms.
Formula
For a discrete random variable with outcome probabilities p_1, ..., p_n, the Shannon entropy is
H = -(p_1 * log2(p_1) + p_2 * log2(p_2) + ... + p_n * log2(p_n))
with the convention that 0 * log2(0) = 0, so zero-probability outcomes contribute nothing. Using log2 gives the result in bits; using the natural logarithm gives nats.
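The entropy of a distribution can be computed directly from its probabilities. Here is a minimal Python sketch (the helper name `shannon_entropy` is my own, not from the article):

```python
import math

def shannon_entropy(probs, base=2):
    """Shannon entropy of a discrete probability distribution.

    Terms with p = 0 are skipped, following the convention 0 * log(0) = 0.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin carries 1 bit of entropy; a biased coin carries less.
print(shannon_entropy([0.5, 0.5]))  # 1.0 bit
print(shannon_entropy([0.9, 0.1]))  # ≈ 0.469 bits
```

Passing `base=math.e` instead would return the result in nats.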
Applications
- Data compression: Entropy defines the theoretical minimum average number of bits needed to encode messages from a source.
- Machine learning: Information gain (based on entropy) is used in decision tree algorithms to select the best features for splitting.
- Ecology: Shannon diversity index uses entropy to measure species diversity in ecological communities.
- Cryptography: Entropy measures the randomness of passwords, keys, and random number generators.
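To make the decision-tree application above concrete, here is a hedged sketch of information gain on toy data (the function names and example labels are my own, purely illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, splits):
    """Entropy reduction achieved by splitting `parent` into the given subsets."""
    n = len(parent)
    weighted_child_entropy = sum(len(s) / n * entropy(s) for s in splits)
    return entropy(parent) - weighted_child_entropy

# A perfect split removes all uncertainty: the gain equals the parent's entropy.
parent = ["yes", "yes", "no", "no"]
print(information_gain(parent, [["yes", "yes"], ["no", "no"]]))  # 1.0
```

A decision-tree learner would compare this gain across candidate features and split on the one with the highest value.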
Frequently Asked Questions
What is maximum entropy?
Maximum entropy occurs when all outcomes are equally likely (uniform distribution). For n outcomes, H_max = log2(n). This represents the state of maximum uncertainty. The ratio H/H_max (evenness) measures how close the distribution is to uniform.
What does zero entropy mean?
Zero entropy means there is no uncertainty; one outcome has probability 1 and all others have probability 0. The result is completely predictable and carries no information, since you already know what will happen.
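The zero-entropy case follows directly from the 0 * log2(0) = 0 convention, which the sketch below handles by skipping zero-probability terms:

```python
import math

def shannon_entropy(probs):
    # Terms with p = 0 are skipped (0 * log 0 = 0 by convention).
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A certain outcome is perfectly predictable: zero bits of entropy.
print(shannon_entropy([1.0, 0.0, 0.0]) == 0.0)  # True
```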