next up previous
Next: Exercises Up: Probability and entropy -- Previous: Definition of Entropy and

Other useful definitions

The relative entropy or Kullback-Leibler divergence
between two probability distributions P(x) and Q(x) that are defined over the same alphabet tex2html_wrap_inline3537 is
 equation2443
The relative entropy satisfies tex2html_wrap_inline3716 (Gibbs' inequality) with equality only if tex2html_wrap_inline3718. Note that in general tex2html_wrap_inline3720, so this is not strictly a `distance', though it is sometimes called the `K-L distance'. This quantity is important in pattern recognition and neural networks.
Convex function.
A function f(x) is convex over (a,b) if for all tex2html_wrap_inline3726 and tex2html_wrap_inline3728,
eqnarray2446
(See figure 1.15.) A function f is strictly convex if, for all tex2html_wrap_inline3726, the equality holds only for tex2html_wrap_inline3734 and tex2html_wrap_inline3736.

  figure2373
Figure: A convex function.

Some strictly convex functions are

Jensen's inequality.
If f is a convex function and x is a random variable then:
equation2452
where tex2html_wrap_inline3764 denotes expectation. If f is strictly convex and tex2html_wrap_inline3768, then the random variable x is a constant (with probability 1).



David J.C. MacKay
Sat May 10 23:05:10 BST 1997