===========================================================================
CSC 310                      Lecture 20 Summary                  Fall 2007
===========================================================================

Topics:
- Upper bound on number of codewords in any code with given distance
- Lower bound on number of codewords in a code with given distance
- Syndrome decoding
- Syndrome decoding for Hamming codes
- References: section 13.3; lecture 19 from the 2006 course by Prof. Roweis.

------------------------------------------------------------------
Upper bound on number of codewords in any code with given distance
------------------------------------------------------------------

Idea:
- Look at a code with distance d >= 2t+1 for some t.
- What is the maximum number of codewords it might contain?
- Consider a ball of radius t around each codeword in Hamming space, i.e.,
  the set of vectors that differ from the codeword in at most t
  coordinates.  Its size is 1 + (N choose 1) + ... + (N choose t).
- The radius-t balls around the codewords must be disjoint if d >= 2t+1, so
      (max # codewords) * (1 + (N choose 1) + ... + (N choose t)) <= 2^N
- This gives an upper bound on the maximum number of codewords we can have
  in a code of block size N and distance 2t+1.
- If the bound above is achieved with equality, we call the code "perfect".
- Very few perfect codes are known; Hamming codes are an example.  (A small
  Python sketch that evaluates this bound is given at the end of this
  summary.)

----------------------------------------------------------------
Lower bound on number of codewords in a code with given distance
----------------------------------------------------------------

- In general, we don't know how to build a code that has as many codewords
  as the inequality above would allow.
- How many codewords can we find using a simple algorithm?
- To achieve distance d, we must not pick any codeword that lies in a ball
  of radius d-1 around a codeword already chosen.
- Suppose we have picked r' codewords so far.  The balls of radius d-1
  around these codewords cover at most
      r' * (1 + (N choose 1) + ... + (N choose d-1))
  vectors of the whole space.  They may cover fewer if they overlap, but
  they cannot possibly cover more than this.
- If this number is strictly less than 2^N, there exists a vector that we
  can add to our code and still keep distance d.
- Thus, the minimum number of codewords we have when this procedure halts
  satisfies
      (min # codewords) * (1 + (N choose 1) + ... + (N choose d-1)) >= 2^N
  (This bound is also evaluated in the Python sketch at the end of this
  summary.)

-----------------
Syndrome decoding
-----------------

- Setup:
    - given a K-bit block x
    - the encoder computes an N-bit codeword w
    - we send w across the channel, which adds an N-bit noise vector n
    - the decoder receives the N-bit vector r = w + n
    - it corrects errors by selecting an N-bit codeword w' (based on r)
    - it decodes w' as a K-bit message x'
- Ideally, w' = w and x' = x.
- How does the decoder eliminate the noise n?
- "Syndrome decoding": compute z = H*r = H*(w+n) = H*w + H*n.
- Since w is a codeword, H*w = 0.  Thus, H*r = H*n.
- The job of the error-correcting part of the decoder is to find a noise
  vector n' such that H*r = H*n' and n' is the most probable noise vector
  for this channel.  Then r is corrected to w' = r + n'.
- For the BSC, noise vectors with low weight (number of 1s) are more
  probable than noise vectors with high weight.
- z is called the "syndrome" of n.
- For a general linear code, the syndrome is an (N-K)-bit string.  A
  nonzero syndrome signals an error, and there are 2^{N-K} - 1 possible
  nonzero values for z.  When N-K is large, it is totally impractical to
  have a table with that many entries.
- General algorithm to compute the syndrome decoding table (a Python
  sketch is given at the end of this summary):
    - set the table entry for every z != 0 to "none"
    - enumerate noise vectors n from most likely to least likely (e.g.,
      over the BSC, enumerate noise patterns with a single 1, followed by
      noise patterns of weight 2, and so on)
    - compute H*n
    - if the corresponding table entry is "none", save n in that location

-----------------------------------
Syndrome decoding for Hamming codes
-----------------------------------

- Consider a parity check matrix in which the columns are ordered
  according to the number from {1, ..., 2^{N-K} - 1} that they represent
  in binary.