===========================================================================
CSC 310H                     Lecture 6 Summary                   Fall 2007
===========================================================================

------
Topics
------

- Optimal codes
- Operations that can only improve a code
- Huffman codes
- Optimality of Huffman codes
- Reference: section 5.5.

-------------
Optimal codes
-------------

Definition: A symbol code is "optimal" if its average codeword length L
is no larger than that of any other symbol code for the same source.

IMPORTANT: The optimal symbol code for a given source NEED NOT have L=H!
It might very well happen that L=H is unachievable. For example, for any
2-symbol source, no matter what the probabilities are, the optimal symbol
code for that source is always {0, 1}, which has L=1. However, the
entropy of that source depends on the symbol probabilities, and it can be
very close to 0 (e.g., with probabilities {0.99, 0.01}, H is about 0.08
bits). That a symbol code is optimal only means that no other symbol code
achieves a smaller L.

Definition: A symbol code is "complete" if it satisfies the
Kraft-McMillan (KM) inequality with equality, i.e.,

    Sum_{i=1}^I 2^{-l_i} = 1.

For example, {0, 10, 11} is complete (1/2 + 1/4 + 1/4 = 1), while
{0, 10, 110} is not (its Kraft sum is 7/8).

Theorem: An ID symbol code is complete iff, in the tree corresponding to
that code, every leaf at the last level either is a codeword or has a
codeword among its prefixes.

Idea:
- An optimal code must be complete (except for the degenerate case where
  some symbols have probability 0): if the Kraft sum were < 1, some
  codeword could be shortened while keeping the code ID.

---------------------------------------
Operations that can only improve a code
---------------------------------------

1. If xby is a codeword for various y and some bit b, and xb'y' is not a
   codeword for b' != b and any y', we can shorten every such xby into
   xy.

2. If p_i < p_j and l_i < l_j, swap the codewords for a_i and a_j (this
   gives the more probable symbol the shorter codeword, so L cannot
   increase).

3. If (2) above can no longer be applied, we can move the codeword for
   the second least probable symbol to be the sibling of the codeword
   for the least probable symbol (without increasing L).

Idea:
- Let C be an optimal ID code, so that (1) and (2) can no longer be
  applied. Then there exists another optimal code C' in which the
  codewords for the two least probable symbols are siblings.

-------------
Huffman codes
-------------

Ideas:
- Take the 2 least probable symbols, a_{I-1} and a_I.
- Combine them into a meta-symbol a' whose probability is the sum of the
  two symbols you start with, p' = p_{I-1} + p_I.
- Run the Huffman coding algorithm recursively on the reduced set of
  symbols {a_1, .., a_{I-2}, a'}.
- Take the codeword z generated for a' by this recursive call and make
  z0 and z1 the codewords for a_{I-1} and a_I.

Reference: read section 5.5.
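To make the recursion concrete, here is a minimal Python sketch of the
procedure just described (the function name, the dict-based interface,
and the example probabilities are illustrative choices, not from the
text; ties between equal probabilities are broken arbitrarily):

    def huffman(probs):
        """Recursive Huffman coding, as described above.
        probs: dict mapping symbol -> probability (needs >= 2 symbols).
        Returns a dict mapping symbol -> binary codeword (a string)."""
        if len(probs) == 2:
            a, b = probs                # basis case: the code is {0, 1}
            return {a: "0", b: "1"}
        # Take the 2 least probable symbols a_{I-1}, a_I ...
        ordered = sorted(probs, key=probs.get)
        a_I, a_Im1 = ordered[0], ordered[1]
        # ... and combine them into a meta-symbol a' with
        # p' = p_{I-1} + p_I.
        reduced = {s: p for s, p in probs.items()
                   if s not in (a_I, a_Im1)}
        meta = (a_Im1, a_I)             # a fresh symbol standing for a'
        reduced[meta] = probs[a_Im1] + probs[a_I]
        # Recurse on the reduced source, then split the codeword z
        # generated for a' into z0 and z1.
        code = huffman(reduced)
        z = code.pop(meta)
        code[a_Im1] = z + "0"
        code[a_I] = z + "1"
        return code

    source = {"a": 0.45, "b": 0.25, "c": 0.15, "d": 0.10, "e": 0.05}
    print(huffman(source))
    # {'a': '0', 'b': '11', 'c': '101', 'd': '1000', 'e': '1001'}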
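Using this sketch, the earlier claims can be checked numerically: for a
2-symbol source L = 1 even when H is close to 0, and the resulting code
is complete (KM sum exactly 1). This continues the script above
(entropy, avg_length, and kraft_sum are my own helper names):

    import math

    def entropy(probs):
        """Shannon entropy H, in bits per symbol."""
        return -sum(p * math.log2(p) for p in probs.values() if p > 0)

    def avg_length(probs, code):
        """Average codeword length L, in bits per symbol."""
        return sum(probs[s] * len(code[s]) for s in probs)

    def kraft_sum(code):
        """Sum_i 2^{-l_i}; it equals 1 iff the code is complete."""
        return sum(2.0 ** -len(w) for w in code.values())

    # A very skewed 2-symbol source: L = 1 although H is near 0.
    skewed = {"a": 0.99, "b": 0.01}
    print(avg_length(skewed, huffman(skewed)))   # 1.0
    print(entropy(skewed))                       # ~0.0808

    # The Huffman code for the 5-symbol source above is complete.
    print(kraft_sum(huffman(source)))            # 1.0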
---------------------------
Optimality of Huffman codes
---------------------------

Ideas:
- Induction on I: "for *any* source probabilities, the Huffman procedure
  applied to a source with I symbols produces an optimal code."
- Basis case: I=2, where the optimal code is always {0, 1}.
- Inductive case: assume the statement is true for I-1, and consider the
  procedure being applied to a source with symbols {a_1, .., a_I} and
  probabilities p_1 >= .. >= p_I.
- Inside the Huffman procedure, a code is generated for the source
  {a_1, .., a_{I-2}, a'} with probabilities {p_1, .., p_{I-2}, p'},
  where p' = p_{I-1} + p_I.
- Let C_H be the code output for the I-symbol source by the Huffman
  procedure.
- Let C'_H be the code output for the (I-1)-symbol source by the
  recursive call inside the Huffman procedure.
- Assume C_H is not optimal for the I-symbol source. We will derive a
  contradiction.
- Let C_O be an optimal code for the I-symbol source.
- By the fact above, we can assume that in C_O the codewords for a_{I-1}
  and a_I are siblings; let z be their parent.
- Let C'_O be the code for the (I-1)-symbol source in which z is the
  codeword for a'.
- Let L_H, L'_H, L_O, L'_O be the average codeword lengths of C_H, C'_H,
  C_O, C'_O, respectively. Then:
  - L_H = L'_H + p_{I-1} + p_I (the two codes agree, except that the
    codeword for a', of probability p' = p_{I-1} + p_I, is replaced by
    two codewords that are one bit longer).
  - L_O = L'_O + p_{I-1} + p_I (by the same reasoning).
  - L_O < L_H (by the assumption that C_H is not optimal).
  - L'_H <= L'_O (by the induction hypothesis, C'_H is optimal).
  - These four relations yield a contradiction:
    L_H = L'_H + p' <= L'_O + p' = L_O < L_H.

Reference: solution to solved exercise 5.16 on pp. 105-106.
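As a numeric sanity check of the two length relations used above, the
sketches from the Huffman section can be run on a made-up 4-symbol
source and its reduced version (with the tie-breaking in huffman(), the
standalone call on the reduced source builds the same tree as the
recursive call, so the recurrence is visible directly):

    # Check L_H = L'_H + p_{I-1} + p_I on a made-up source: the two
    # least probable symbols are c and d.
    probs = {"a": 0.40, "b": 0.30, "c": 0.20, "d": 0.10}
    L_H = avg_length(probs, huffman(probs))

    # Reduced source: c and d merged into a meta-symbol a', p' = 0.30.
    reduced = {"a": 0.40, "b": 0.30, "a'": 0.30}
    L_H_reduced = avg_length(reduced, huffman(reduced))

    print(L_H, L_H_reduced + probs["c"] + probs["d"])   # both ~1.9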