===========================================================================
CSC 310H                  Lecture 8 Summary                      Fall 2007
===========================================================================

------
Topics
------

- Problems with block Huffman coding
- Intervals as codes
- Bit strings as intervals
- Encode blocks of symbols: Arithmetic coding, take 1.
- References: sections 5.6, 6.2, lectures 7-8 from 2006 course by prof
  Roweis.

----------------------------------
Problems with block Huffman coding
----------------------------------

Ideas:

1. In order to get the average number of bits per symbol close to the
   entropy, we need to consider large blocks of N symbols.

   a. If the source has I symbols, its N-th extension has I^N symbols. To
      compute a Huffman code for this extension, we need to consider all
      those very improbable strings and assign them codewords. The number
      of symbols we have to deal with thus grows exponentially with N!

   b. In order to emit a codeword, we need to read in all N symbols of the
      source block. This is not as big an issue as a. above, but it is
      still a consideration.

2. There is no easy way to alter the symbol probabilities once the
   encoding starts. For example, suppose that, as we read an input bit
   string where we assumed 0 occurs with probability 90%, we realize the
   probability of a 0 is really much closer to 80%. The only way to
   incorporate this knowledge is to recompute the entire Huffman code,
   which is expensive.

Reference: section 5.6.

------------------
Intervals as codes
------------------

Ideas:

- An ID code tree divides the available codespace into subintervals of the
  [0,1) interval, where each subinterval has a length which is a power
  of 2.

- If the symbol probabilities are not powers of 2, there is no way an ID
  code tree can assign codewords with the corresponding lengths.

- For now, try to use the intervals themselves as codewords.
  For example, if {a_1, a_2, a_3, a_4} have probabilities
  {1/3, 1/6, 1/6, 1/3}, assign the codewords:

    a_1 -> [0, 1/3)
    a_2 -> [1/3, 1/2)
    a_3 -> [1/2, 2/3)
    a_4 -> [2/3, 1)

------------------------
Bit strings as intervals
------------------------

- However, how do we transmit an interval? Its endpoints can be
  rational/real numbers.

- All the encoder needs to do is transmit a rational number that lies
  inside one of these intervals. The decoder can then figure out which
  symbol was encoded. For the example above, we could send 0 or 1/8 for
  a_1, as they both lie inside the interval corresponding to a_1.

- We will actually transmit a smaller interval which lies completely
  inside one of the codeword intervals, as follows.

- For a bit string b = b_1, .., b_k, define r(b) to be the real number
  whose binary representation is 0.b_1 .. b_k. For example, if b = 00101,
  r(b) = 0.00101 = 1*2^{-3} + 1*2^{-5} = 5/32.

- For a bit string b, define int(b) to be the real number interval
  [ r(b), r(b+1) ), where b+1 is the bit string b incremented by 1 as a
  binary number. For example, if b = 011,
  int(b) = [0.011, 0.100) = [3/8, 1/2).

- Note that the length of int(b) is 2^{-k}, where k is the length of b.

- So, upon seeing a_i, the encoder sends a bit string b such that int(b)
  is a subset of the interval corresponding to a_i. For example, with the
  code above, upon seeing a_2, the encoder can send b = 011, as
  [3/8, 1/2) is a subinterval of [1/3, 1/2). It could also send b = 0110,
  which corresponds to [3/8, 7/16).

---------------------------------------------------
Encode blocks of symbols: Arithmetic coding, take 1
---------------------------------------------------

Idea:

- Instead of encoding single source symbols, encode blocks of N symbols.

- Subdivide the interval corresponding to a_4 into intervals
  corresponding to a_4 a_1, .., a_4 a_4, in proportion to the symbol
  probabilities, and so on for longer blocks.
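The definitions of r(b) and int(b) above, and the search for a bit string whose interval fits inside a given codeword interval, can be sketched in code. A minimal sketch in Python (the function names are my own; `Fraction` keeps the arithmetic exact, and the search is brute force for clarity, not efficiency):

```python
from fractions import Fraction

def r(b):
    """r(b): the real number with binary expansion 0.b_1 b_2 .. b_k."""
    return sum(Fraction(int(bit), 2 ** (i + 1)) for i, bit in enumerate(b))

def interval(b):
    """int(b) = [ r(b), r(b) + 2^{-k} ), where k = len(b)."""
    lo = r(b)
    return (lo, lo + Fraction(1, 2 ** len(b)))

def shortest_bitstring(lo, hi):
    """Shortest b with int(b) a subset of [lo, hi), by brute force."""
    k = 1
    while True:
        for n in range(2 ** k):
            b = format(n, '0{}b'.format(k))   # all length-k bit strings
            blo, bhi = interval(b)
            if lo <= blo and bhi <= hi:
                return b
        k += 1
```

With the code above, `shortest_bitstring(Fraction(1, 3), Fraction(1, 2))` returns "011", matching the a_2 example: [3/8, 1/2) is the shortest dyadic interval inside [1/3, 1/2).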
Algo: Arithmetic Coding, take 1:

  (The input block is a_{i_1} a_{i_2} .. a_{i_N}.)

  u^0 <- 0
  v^0 <- 1
  for k <- 1 to N
      u^k <- u^{k-1} + (v^{k-1} - u^{k-1}) * sum_{j=1}^{i_k - 1}{ p_j }
      v^k <- u^k + (v^{k-1} - u^{k-1}) * p_{i_k}
  endfor
  send a bit string b such that int(b) is a subset of [u^N, v^N)
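The take-1 algorithm can be sketched end to end as follows. This is a Python sketch under my own naming, using exact fractions and a brute-force search for the final bit string; a decoder is included to show that the interval narrowing is reversible, though the lecture only states the encoder:

```python
from fractions import Fraction

def encode_block(indices, probs):
    """Arithmetic coding, take 1: narrow [u, v) once per symbol, then
    send the shortest bit string b with int(b) a subset of [u^N, v^N).

    indices: the block i_1 .. i_N, as 0-based positions into probs.
    probs:   symbol probabilities p_1 .. p_I, as Fractions summing to 1.
    """
    u, v = Fraction(0), Fraction(1)
    for i in indices:
        width = v - u
        u = u + width * sum(probs[:i], Fraction(0))  # sum of p_j for j < i
        v = u + width * probs[i]
    # Find the shortest b such that [r(b), r(b) + 2^{-k}) lies in [u, v).
    k = 0
    while True:
        k += 1
        step = Fraction(1, 2 ** k)
        for n in range(2 ** k):        # r(b) for length-k b is n * 2^{-k}
            lo = n * step
            if u <= lo and lo + step <= v:
                return format(n, '0{}b'.format(k))

def decode_block(bits, probs, n_symbols):
    """Recover the block: see which subinterval contains r(b), repeat."""
    x = sum(Fraction(int(bit), 2 ** (j + 1)) for j, bit in enumerate(bits))
    u, v = Fraction(0), Fraction(1)
    out = []
    for _ in range(n_symbols):
        width = v - u
        cum = Fraction(0)
        for i, p in enumerate(probs):
            if u + width * cum <= x < u + width * (cum + p):
                out.append(i)
                u, v = u + width * cum, u + width * (cum + p)
                break
            cum += p
    return out
```

For example, with probs = [1/3, 1/6, 1/6, 1/3] as above, encoding the single-symbol block a_2 (index 1) yields "011", agreeing with the earlier hand computation. Note that this take still suffers from a flaw the "take 1" label hints at: no bits are emitted until the entire block has been read.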