===========================================================================
CSC 310H                    Lecture 5 Summary                    Fall 2007
===========================================================================

------
Topics
------

- Expected codeword length
- Connection to entropy
- Entropy
- Shannon-Fano codes
- Reference: sections 4.1, 5.1, 5.3, 5.4.

------------------------
Expected codeword length
------------------------

Definition: L = L(c,X) = E[length(c(X))], where X is a single source
symbol. That is,

    L = Sum_{i=1}^I {p_i*l_i},

where p_i is the probability of symbol a_i and l_i is the length of
c(a_i).

Idea: the best code has the minimum possible L.

Observation: If X_1..X_N is a string of N source symbols, each distributed
like X, then

    E[length(c(X_1)..c(X_N))] = E[length(c(X_1))] + ... + E[length(c(X_N))]
                              = N*L

by linearity of expectation. (A small Python sketch of this computation
appears at the end of these notes.)

Reference: section 5.1.

---------------------
Connection to entropy
---------------------

Result: L = Sum_{i=1}^I {p_i*l_i} >= Sum_{i=1}^I {p_i*log(1/p_i)} = H.

Proof idea:
- Let K = Sum_{i=1}^I {2^{-l_i}}. The Kraft-McMillan (KM) inequality says
  that K <= 1, so log(K) <= 0.
- Since log(K) <= 0, write
  L = Sum_{i=1}^I {p_i*l_i} >= Sum_{i=1}^I {p_i*(l_i+log(K))}.
- Manipulate terms to obtain
  H-L <= Sum_{i=1}^I {p_i*log(2^{-l_i}/(K*p_i))}.
- Use ln(x) <= x-1 for all x > 0 (equivalently, log(x) <= (x-1)*log(e)) to
  show the RHS above is <= 0: it is at most
  log(e) * Sum_{i=1}^I {p_i*(2^{-l_i}/(K*p_i) - 1)}
  = log(e) * ((Sum_{i=1}^I {2^{-l_i}})/K - Sum_{i=1}^I {p_i})
  = log(e) * (1 - 1) = 0.
(A numeric check of this bound appears at the end of these notes.)

Reference: section 5.3. The book's proof uses "Gibbs' inequality"; the
sketch above does not.

-------
Entropy
-------

The quantity H = Sum_{i=1}^I {p_i*log(1/p_i)} is called the "entropy" of
the random experiment that has p_i as the probability of outcome a_i.

Define h(a_i) = log(1/p_i). We call this quantity the "information
content" of the outcome X=a_i. The entropy can then be seen as the
expected information content of the outcome. (A small Python sketch of
this computation appears at the end of these notes.)

References:
- Section 2.4 contains the definition of entropy.
- Section 4.1 discusses why entropy is a sensible measure of the
  information content of a random variable.

IMPORTANT:
- Entropy is a central concept in this course, and it is worth spending
  some time becoming familiar with it. Section 4.1 is a bit long, but
  quite informative.

------------------
Shannon-Fano codes
------------------

Ideas:
- A good code has L as small as possible.
- Ideally, choose lengths l_i = log(1/p_i), so that L = H.
- Unfortunately, these are not necessarily integers.
- Instead, choose l_i = ceiling(log(1/p_i)).
- Check that these lengths satisfy the KM inequality: since
  l_i >= log(1/p_i), we have 2^{-l_i} <= p_i, so
  Sum_{i=1}^I {2^{-l_i}} <= Sum_{i=1}^I {p_i} = 1.
- Check: H <= L < H+1, since log(1/p_i) <= l_i < log(1/p_i) + 1.
(A Python sketch of this construction appears at the end of these notes.)

Reference: read section 5.4. Note: the book doesn't call these codes
Shannon-Fano codes, but we do.
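
-------------------------------
Appendix: small Python sketches
-------------------------------

Expected codeword length. A minimal sketch of the computation
L = Sum_{i=1}^I {p_i*l_i}; the probabilities and the prefix code below
are made-up illustrative values, not examples from the book.

    # Expected codeword length L = sum_i p_i * l_i.
    p = [0.5, 0.25, 0.125, 0.125]     # p_i: probability of symbol a_i
    c = ["0", "10", "110", "111"]     # c(a_i): an example prefix code

    L = sum(p_i * len(cw) for p_i, cw in zip(p, c))
    print(L)                          # 1.75 bits per symbol

    # For a string of N source symbols, linearity of expectation gives
    # an expected encoded length of N*L.
    N = 100
    print(N * L)                      # 175.0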
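
Entropy and information content. The same made-up distribution as above;
for this dyadic distribution the ideal lengths log2(1/p_i) happen to be
integers, so L = H exactly.

    from math import log2

    p = [0.5, 0.25, 0.125, 0.125]

    h = [log2(1.0 / p_i) for p_i in p]   # information content h(a_i)
    H = sum(p_i * h_i for p_i, h_i in zip(p, h))
    print(h)                             # [1.0, 2.0, 3.0, 3.0]
    print(H)                             # 1.75, matching L above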
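
Numeric check of L >= H. Verifies, for one made-up distribution and one
set of lengths satisfying the KM inequality, that
H - L <= Sum {p_i*log(2^{-l_i}/(K*p_i))} <= 0, as in the proof sketch.

    from math import log2

    p = [0.4, 0.3, 0.2, 0.1]
    l = [1, 2, 3, 3]                  # Sum 2^{-l_i} = 1, so KM holds

    K = sum(2.0 ** -li for li in l)
    H = sum(pi * log2(1.0 / pi) for pi in p)
    L = sum(pi * li for pi, li in zip(p, l))
    bound = sum(pi * log2(2.0 ** -li / (K * pi)) for pi, li in zip(p, l))

    print(H, L)           # H ~= 1.8464, L = 1.9, so L >= H
    print(H - L, bound)   # both ~= -0.0536; here K = 1, so they coincide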
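
Shannon-Fano lengths. Computes l_i = ceiling(log2(1/p_i)) for a made-up
distribution and checks both the KM inequality and H <= L < H+1.

    from math import ceil, log2

    p = [0.4, 0.3, 0.2, 0.1]

    l = [ceil(log2(1.0 / pi)) for pi in p]
    K = sum(2.0 ** -li for li in l)
    H = sum(pi * log2(1.0 / pi) for pi in p)
    L = sum(pi * li for pi, li in zip(p, l))

    print(l)                 # [2, 2, 3, 4]
    print(K <= 1.0)          # True: a prefix code with these lengths exists
    print(H <= L < H + 1.0)  # True: L = 2.4 is within 1 bit of H ~= 1.8464

Any lengths satisfying the KM inequality can be realized by a prefix code
(e.g. by assigning codewords greedily in order of increasing length), so
these checks show the construction always yields a usable code.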