===========================================================================
CSC 310                      Lecture 20 Summary                  Fall 2007
===========================================================================

Topics:
- Upper bound on number of codewords in any code with given distance
- Lower bound on number of codewords in a code with given distance
- Syndrome decoding
- Syndrome decoding for Hamming codes
- References: section 13.3; lecture 19 from the 2006 course by Prof. Roweis.

------------------------------------------------------------------
Upper bound on number of codewords in any code with given distance
------------------------------------------------------------------

Idea:
- Look at a code with distance d >= 2t+1 for some t.
- What is the maximum number of codewords it might contain?
- Consider a ball of radius t around each codeword in Hamming space, i.e.,
  the set of vectors that differ from the codeword in at most t
  coordinates.  Its size is 1 + (N choose 1) + ... + (N choose t).
- The radius-t balls around the codewords must be disjoint if d >= 2t+1, so
      (max # codewords) * (1 + (N choose 1) + ... + (N choose t)) <= 2^N
- This gives an upper bound on the maximum number of codewords we can have
  in a code of block size N and distance 2t+1.
- If the bound above is achieved with equality, we call the code "perfect".
- Very few perfect codes are known; Hamming codes are an example.  (A small
  Python sketch that evaluates this bound is given at the end of this
  summary.)

----------------------------------------------------------------
Lower bound on number of codewords in a code with given distance
----------------------------------------------------------------

- In general, we don't know how to build a code that has as many codewords
  as the inequality above would allow.
- How many codewords can we find using a simple algorithm?
- To achieve distance d, we must not pick any codeword that lies in a ball
  of radius d-1 around a codeword already chosen.
- Suppose we have picked r' codewords so far.  The balls of radius d-1
  around these codewords cover at most
      r' * (1 + (N choose 1) + ... + (N choose d-1))
  vectors of the whole space.  They may cover fewer if they overlap, but
  they cannot possibly cover more than this.
- If this number is strictly less than 2^N, there exists a vector that we
  can add to our code and still keep distance d.
- Thus, the minimum number of codewords we have when this procedure halts
  satisfies
      (min # codewords) * (1 + (N choose 1) + ... + (N choose d-1)) >= 2^N
  (This bound is also evaluated in the Python sketch at the end of this
  summary.)

-----------------
Syndrome decoding
-----------------

- Setup:
    - given a K-bit block x
    - the encoder computes an N-bit codeword w
    - we send w across the channel, which adds an N-bit noise vector n
    - the decoder receives the N-bit vector r = w + n
    - it corrects errors by selecting an N-bit codeword w' (based on r)
    - it decodes w' as a K-bit message x'
- Ideally, w' = w and x' = x.
- How does the decoder eliminate the noise n?
- "Syndrome decoding": compute z = H*r = H*(w+n) = H*w + H*n.
- Since w is a codeword, H*w = 0.  Thus, H*r = H*n.
- The job of the error-correcting part of the decoder is to find a noise
  vector n' such that H*r = H*n' and n' is the most probable noise vector
  for this channel.  Then r is corrected to w' = r + n'.
- For the BSC, noise vectors with low weight (number of 1s) are more
  probable than noise vectors with high weight.
- z is called the "syndrome" of n.
- For a general linear code, the syndrome is an (N-K)-bit string.  A
  nonzero syndrome signals an error, and there are 2^{N-K} - 1 possible
  nonzero values for z.  When N-K is large, it is totally impractical to
  have a table with that many entries.
- General algorithm to compute the syndrome decoding table (a Python
  sketch is given at the end of this summary):
    - set the table entry for every z != 0 to "none"
    - enumerate noise vectors n from most likely to least likely (e.g.,
      over the BSC, enumerate noise patterns with a single 1, followed by
      noise patterns of weight 2, and so on)
    - compute H*n
    - if the corresponding table entry is "none", save n in that location

-----------------------------------
Syndrome decoding for Hamming codes
-----------------------------------

- Consider a parity check matrix in which the columns are ordered
  according to the number from {1, ..., 2^{N-K} - 1} that they represent
  in binary.