===========================================================================
CSC 310H                  Lecture 7 Summary                      Fall 2007
===========================================================================

------
Topics
------
- Extensions of the source
- Lossy coding
- Smallest sufficient subset and essential bit content
- Statement of Shannon's Noiseless Coding Theorem
- Reference: sections 5.6, 4.2, 4.3, 4.6

------------------------
Extensions of the source
------------------------

Reference:
- Formally defined in section 4.3, "Extended ensembles".
- Huffman coding on extended sources explained in 5.6, "The extra bit".

Ideas:
- When encoding the N-th extension, the per-symbol overhead is 1/N.
- As N increases, the number of bits per symbol needed to encode approaches H.

------------
Lossy coding
------------

Reference: sections 4.2, 4.3.

Ideas:
- For lossless coding, the number of bits per symbol is essentially H. Can we
  do better if we are prepared to allow for errors? I.e., lossy coding.
- In all the lossy coding discussion below, we only consider codes that assign
  codewords of the same length. Even though we might do a bit better with
  other schemes (such as Huffman coding), it turns out that the improvement
  would be minimal.
- So, to encode a set of r symbols, we will use a code in which all codewords
  have length ceil(log(r)).

----------------------------------------------------
Smallest sufficient subset and essential bit content
----------------------------------------------------

Reference: sections 4.2, 4.3.

Ideas:
- H_0(X) = log(|A_X|) is the "raw bit content" of X. To encode X with an
  equal-length code, we need precisely ceil(H_0(X)) bits per symbol.
- For 0 <= delta < 1, S_delta(X) is the smallest subset of A_X such that
  Pr[ x not in S_delta(X) ] <= delta.
- To achieve error probability <= delta, we simply ignore symbols outside
  S_delta and use an equal-length code on S_delta alone.
- H_delta(X) = log(|S_delta|) is the "essential bit content" of X. The number
  of bits per symbol for an equal-length code on S_delta is exactly
  ceil(H_delta(X)). (A small sketch of how S_delta and H_delta can be
  computed appears at the end of this summary.)

-----------------------------------------------
Statement of Shannon's Noiseless Coding Theorem
-----------------------------------------------

Reference: sections 4.3, 4.6.

- As with lossless coding, suppose we perform delta-error lossy coding on X^N,
  the N-th extension of the source X.
- The number of bits per symbol we achieve is exactly (1/N)*ceil(H_delta(X^N)).
  We drop the ceiling, as it becomes insignificant when N increases.
- How good is this compared to lossless coding, where we achieve almost H?

Theorem [Shannon]: For all epsilon > 0, for all 0 < delta < 1, there exists
N_0 such that for all N > N_0,

    H - epsilon < (1/N) * H_delta(X^N) < H + epsilon

Reference: The proof is in sections 4.4 and 4.5, and it uses the Law of Large
Numbers.

Notes:
- I will not ask you for details about the proof, but I expect you to
  completely understand the statement of the Theorem. This includes the
  definition of H_delta and the interplay of the parameters epsilon, delta
  and N.
- The second < says: if delta is very small, i.e., we allow only a tiny error
  probability, then as N grows, the per-symbol length of this equal-length
  code approaches H:

      (1/N) * H_delta(X^N) < H + epsilon

  This motivates the claim at the beginning of this lecture that using codes
  other than equal-length codes does not improve things by much: the best we
  can hope for is H anyway.
- The first < says: even if delta approaches 1, i.e., we allow a huge error
  probability, then as N grows, the per-symbol length we need to encode is
  still

      (1/N) * H_delta(X^N) > H - epsilon

  and the lower bound H - epsilon can be made arbitrarily close to the
  entropy H. In other words, allowing errors does not let us compress much
  below H. (The optional sketches below illustrate both H_delta and the
  theorem numerically.)
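
-----------------
Optional sketches
-----------------

These sketches are not from the textbook; they are just small illustrations
of the definitions above. The first one computes the essential bit content
H_delta(X) of a toy ensemble in Python. The probabilities and the function
name are made up for the example; the code simply carries out the greedy
construction of S_delta (keep the most probable outcomes until the excluded
probability mass is at most delta).

    import math

    def essential_bit_content(probs, delta):
        """H_delta(X) = log2 |S_delta(X)|, where S_delta(X) is the smallest
        subset of outcomes with Pr[ x not in S_delta(X) ] <= delta."""
        # Keep outcomes in order of decreasing probability until the excluded
        # probability mass drops to at most delta.
        kept_mass = 0.0
        size = 0
        for p in sorted(probs, reverse=True):
            if 1.0 - kept_mass <= delta:
                break
            kept_mass += p
            size += 1
        return math.log2(size)

    # Toy ensemble with 6 outcomes; raw bit content H_0(X) = log2(6) ~ 2.58.
    probs = [0.5, 0.25, 0.125, 0.0625, 0.03125, 0.03125]
    print(essential_bit_content(probs, 0.0))   # log2(6): no error allowed
    print(essential_bit_content(probs, 0.1))   # log2(4) = 2: two outcomes dropped

Note how a small delta already lowers the equal-length codeword length for
this ensemble from ceil(log2(6)) = 3 bits to 2 bits.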
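
The theorem itself can also be checked numerically. The sketch below is again
only an illustration under assumed parameters: a Bernoulli source with
Pr[1] = 0.1 (so H(X) is about 0.469 bits) and a made-up helper
per_symbol_essential_bits. It computes (1/N) * H_delta(X^N) exactly by
grouping the 2^N sequences of X^N by their number of ones, since all
sequences in a group are equally probable. For small N the answer depends
strongly on delta, but as N grows both choices of delta give a value close
to H, as the theorem predicts.

    import math

    def per_symbol_essential_bits(p, N, delta):
        """(1/N) * H_delta(X^N) for a Bernoulli(p) source, p = Pr[symbol = 1].

        All length-N sequences with k ones have the same probability, so
        S_delta is built by taking whole groups of sequences in order of
        decreasing probability (k = 0, 1, 2, ... when p < 0.5); only part of
        the last group may be needed."""
        assert 0.0 < p < 0.5 and 0.0 <= delta < 1.0
        target = 1.0 - delta            # probability mass S_delta must capture
        kept_mass = 0.0
        kept_count = 0
        for k in range(N + 1):          # k = number of ones in a sequence
            prob_seq = (p ** k) * ((1.0 - p) ** (N - k))  # prob. of one sequence
            group_size = math.comb(N, k)
            if kept_mass + group_size * prob_seq >= target:
                # Only part of this group is needed to reach mass 1 - delta.
                kept_count += math.ceil((target - kept_mass) / prob_seq)
                break
            kept_mass += group_size * prob_seq
            kept_count += group_size
        return math.log2(kept_count) / N

    p = 0.1
    H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))   # about 0.469 bits
    for N in (10, 100, 1000):
        for delta in (0.01, 0.5):
            rate = per_symbol_essential_bits(p, N, delta)
            print(f"N = {N:4d}  delta = {delta:4.2f}  (1/N)*H_delta(X^N) = {rate:.3f}")
    print(f"H(X) = {H:.3f}")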