Next: Other useful definitions Up: Probability and entropy -- Previous: Meaning of probability.

Definition of Entropy and related functions

The number of elements in a set $\mathcal{A}$ is denoted by $|\mathcal{A}|$.

For a random variable X, |X| denotes the number of elements in the set $\mathcal{A}_X$ of values that X can take.

The entropy of X is defined by:
\[ H(X) = \sum_{x \in \mathcal{A}_X} P(x) \log \frac{1}{P(x)} \]
with the convention for $P(x) = 0$ that $0 \times \log 1/0 \equiv 0$, since $\lim_{\theta \to 0^+} \theta \log 1/\theta = 0$.
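
As a small illustration of the definition and of the convention for zero probabilities, here is a minimal Python sketch. The function name `entropy` and the choice of base-2 logarithms (so that the result is in bits) are assumptions made for this example; the definition itself does not fix the base.

```python
import math

def entropy(probs):
    """Entropy, in bits, of a distribution given as an iterable of probabilities.

    Assumption for this sketch: logarithms are taken to base 2.
    """
    # Outcomes with p == 0 are skipped, which implements the
    # convention 0 * log(1/0) = 0.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

fair_coin = entropy([0.5, 0.5])   # maximal uncertainty for two outcomes
certain = entropy([1.0, 0.0])     # no uncertainty at all
biased = entropy([0.9, 0.1])      # somewhere in between
```

Under these assumptions a fair coin has entropy of one bit, a certain outcome has entropy zero, and a biased coin lies strictly between the two.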

The entropy is a measure of the information content or `uncertainty' of x. The question of why entropy is a fundamental measure of information content will be discussed in the forthcoming chapters. Here we note some properties of this mathematical function.

The joint entropy of X, Y is:
\[ H(X,Y) = \sum_{x \in \mathcal{A}_X,\, y \in \mathcal{A}_Y} P(x,y) \log \frac{1}{P(x,y)} \]
Entropy is additive for independent random variables:
\[ H(X,Y) = H(X) + H(Y) \quad \text{if and only if} \quad P(x,y) = P(x)P(y) \]
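
The additivity property can be checked numerically. The sketch below builds a joint distribution as the product of two marginals and compares the joint entropy with the sum of the marginal entropies. The two marginals `p_x` and `p_y` are invented for illustration, and base-2 logarithms are assumed.

```python
import math

def entropy(probs):
    # Entropy in bits; zero-probability outcomes contribute nothing.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Hypothetical marginal distributions, chosen only for illustration.
p_x = {"a": 0.5, "b": 0.5}
p_y = {0: 0.25, 1: 0.75}

# Independence: P(x, y) = P(x) P(y).
joint = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

h_joint = entropy(joint.values())
h_sum = entropy(p_x.values()) + entropy(p_y.values())
```

For this product distribution `h_joint` and `h_sum` agree up to floating-point rounding.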
The conditional entropy of X given $y = b_k$ is the entropy of the probability distribution $P(x \mid y = b_k)$:
\[ H(X \mid y = b_k) = \sum_{x \in \mathcal{A}_X} P(x \mid y = b_k) \log \frac{1}{P(x \mid y = b_k)} \]
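The conditional entropy of X given Y is the average over y of the conditional entropy of X given y:
\[ H(X \mid Y) = \sum_{y \in \mathcal{A}_Y} P(y) \left[ \sum_{x \in \mathcal{A}_X} P(x \mid y) \log \frac{1}{P(x \mid y)} \right] = \sum_{x \in \mathcal{A}_X,\, y \in \mathcal{A}_Y} P(x,y) \log \frac{1}{P(x \mid y)} \]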
This measures the average uncertainty that remains about x when y is known.
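
The averaging in this definition can be made concrete with a short Python sketch. It computes H(X|Y) by forming each conditional distribution P(x | y), taking its entropy, and weighting by P(y). The joint distribution used here is a made-up example, and base-2 logarithms are assumed.

```python
import math
from collections import defaultdict

def entropy(probs):
    # Entropy in bits; zero-probability outcomes contribute nothing.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def conditional_entropy(joint):
    """H(X|Y) for a joint distribution given as {(x, y): P(x, y)}."""
    # Marginal P(y), obtained by summing the joint over x.
    p_y = defaultdict(float)
    for (x, y), p in joint.items():
        p_y[y] += p
    total = 0.0
    for y0, py0 in p_y.items():
        if py0 == 0:
            continue  # values of y that never occur carry no weight
        # Conditional distribution P(x | y = y0).
        cond = [p / py0 for (x, y), p in joint.items() if y == y0]
        total += py0 * entropy(cond)
    return total

# Hypothetical joint distribution, chosen only for illustration.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.0, (1, 1): 0.25}
h_x_given_y = conditional_entropy(joint)
```

In this example, learning y = 0 removes all uncertainty about x, while learning y = 1 leaves one bit; each case has probability 1/2, so the average is half a bit.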
Chain rule for entropy.
The joint entropy, conditional entropy and marginal entropy are related by:
\[ H(X,Y) = H(X) + H(Y \mid X) = H(Y) + H(X \mid Y) \]
In words, this says that the information content of XY is the information content of X plus the information content of Y given X.
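
A numerical check of the chain rule, as a sketch: H(X,Y), H(X) and H(Y|X) are each computed directly from their definitions and then compared. The joint distribution is the same kind of made-up example as one might use anywhere in this section, and base-2 logarithms are assumed.

```python
import math
from collections import defaultdict

# Hypothetical joint distribution {(x, y): P(x, y)}, for illustration only.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.0, (1, 1): 0.25}

# Marginal P(x), obtained by summing the joint over y.
p_x = defaultdict(float)
for (x, y), p in joint.items():
    p_x[x] += p

# H(X,Y) = sum P(x,y) log 1/P(x,y)
h_xy = sum(p * math.log2(1.0 / p) for p in joint.values() if p > 0)

# H(X) = sum P(x) log 1/P(x)
h_x = sum(p * math.log2(1.0 / p) for p in p_x.values() if p > 0)

# H(Y|X) = sum P(x,y) log 1/P(y|x), with P(y|x) = P(x,y) / P(x)
h_y_given_x = sum(
    p * math.log2(p_x[x] / p) for (x, y), p in joint.items() if p > 0
)
```

Here `h_xy` equals `h_x + h_y_given_x` up to rounding, as the chain rule requires.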
The mutual information between X and Y is
\[ H(X;Y) \equiv H(X) - H(X \mid Y) \]
and satisfies H(X;Y) = H(Y;X), and $H(X;Y) \geq 0$. It measures the average reduction in uncertainty about x that results from learning the value of y, or vice versa. Equivalently, it measures the amount of information that y conveys about x.
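
The symmetry and non-negativity of the mutual information can be checked on a small example. The sketch below computes H(X;Y) as H(X) - H(X|Y), then swaps the roles of the two variables and computes it again. The helper names `marginal` and `mutual_information` and the joint distribution are assumptions made for this illustration; logarithms are base 2.

```python
import math
from collections import defaultdict

def marginal(joint, axis):
    """Marginal over one coordinate (0 for x, 1 for y) of {(x, y): P(x, y)}."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[outcome[axis]] += p
    return out

def mutual_information(joint):
    """H(X;Y) = H(X) - H(X|Y), in bits."""
    p_x = marginal(joint, 0)
    p_y = marginal(joint, 1)
    h_x = sum(p * math.log2(1.0 / p) for p in p_x.values() if p > 0)
    # H(X|Y) = sum P(x,y) log 1/P(x|y), with P(x|y) = P(x,y) / P(y)
    h_x_given_y = sum(
        p * math.log2(p_y[y] / p) for (x, y), p in joint.items() if p > 0
    )
    return h_x - h_x_given_y

# Hypothetical joint distribution, chosen only for illustration.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.0, (1, 1): 0.25}
swapped = {(y, x): p for (x, y), p in joint.items()}

i_xy = mutual_information(joint)     # H(X) - H(X|Y)
i_yx = mutual_information(swapped)   # H(Y) - H(Y|X)
```

For this distribution both values come out at roughly 0.311 bits.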
The `entropy distance' between two random variables can be defined to be the difference between their joint entropy and their mutual information:
\[ D_H(X,Y) \equiv H(X,Y) - H(X;Y) \]

This quantity satisfies the axioms for a distance: $D_H(X,Y) \geq 0$, $D_H(X,X) = 0$, $D_H(X,Y) = D_H(Y,X)$, and $D_H(X,Z) \leq D_H(X,Y) + D_H(Y,Z)$.
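
The distance axioms can be spot-checked on a single example; this is a sanity check, not a proof. The sketch uses the identity H(X;Y) = H(X) + H(Y) - H(X,Y), which follows from the chain rule. The three-variable joint distribution (two independent fair bits and their exclusive-or) and the helper names are assumptions for illustration; logarithms are base 2.

```python
import math
from collections import defaultdict

def entropy(probs):
    # Entropy in bits; zero-probability outcomes contribute nothing.
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def marginal(joint, idx):
    """Marginal of a joint {(x, y, z): P} onto the coordinates listed in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out

def entropy_distance(joint, i, j):
    """Joint entropy minus mutual information for coordinates i and j."""
    h_i = entropy(marginal(joint, (i,)).values())
    h_j = entropy(marginal(joint, (j,)).values())
    h_ij = entropy(marginal(joint, (i, j)).values())
    mutual = h_i + h_j - h_ij   # identity implied by the chain rule
    return h_ij - mutual

# Hypothetical example: x and y are independent fair bits, z = x XOR y.
joint = {(0, 0, 0): 0.25, (0, 1, 1): 0.25, (1, 0, 1): 0.25, (1, 1, 0): 0.25}

d_xx = entropy_distance(joint, 0, 0)
d_xy = entropy_distance(joint, 0, 1)
d_yx = entropy_distance(joint, 1, 0)
d_yz = entropy_distance(joint, 1, 2)
d_xz = entropy_distance(joint, 0, 2)
```

In this example every pair of distinct variables is independent, so each pairwise distance equals the joint entropy of two bits, and the triangle inequality holds with room to spare.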

Figure 1.14 shows how the total information content H(X,Y) of a joint ensemble can be broken down.

Figure 1.14: The relationship between joint information, marginal information, conditional information and mutual information.



David J.C. MacKay
Sat May 10 23:05:10 BST 1997