===========================================================================
CSC 310                     Lecture 14 Summary                    Fall 2007
===========================================================================

Topics:
- Transmission over a noisy channel
- Channels BSC, BEC, Z
- Entropy, joint entropy, conditional entropy, mutual information
- References: chapter 8, sections 9.1, 9.2, 9.3, 9.5.


---------------------------------
Transmission over a noisy channel
---------------------------------

- We want to deal with the following scenario: data must be sent from a
  source to a destination, and it might get corrupted along the way.

- We want to add redundancy to our data before transmitting it, so that
  even if a few errors occur during the transmission, we can still make
  use of the data that gets through.

- Given a channel, we will first compute how much information the channel
  can transmit.

- When errors occur, ideally we would like to correct them without
  requesting a retransmission. Even if we cannot correct them, it is
  still useful to detect errors.

- Suppose we have a very low tolerance for error. Does this mean that we
  have to sacrifice transmission rate? For example, if we use a
  repetition code (send each bit N times), by letting N grow we can make
  the probability of error as low as we want, but we end up sending N
  times as much data as the original message. (A rough calculation of
  this tradeoff is sketched after the channel examples below.)

- Unlike in the data compression part of the course, the best practical
  algorithms do not match the best known theoretical results.

- Reference: section 9.1.


--------------------
Channels BSC, BEC, Z
--------------------

Definition: A channel consists of a set of input symbols
A_X = { a_1, .., a_r }, a set of output symbols A_Y = { b_1, .., b_s },
and transition probabilities Q_{j|i} = Pr[output is b_j | input is a_i].

We will always assume that:
- there is a sequential correspondence between symbols sent and symbols
  received, that is, there are no insertions, deletions or permutations
  of symbols,
- the channel is memoryless, that is, the channel alters symbols
  independently of one another,
- the channel transition probabilities are fixed by the nature of the
  channel and cannot be changed in any way.

Examples:
- BSC (binary symmetric channel): each bit is flipped with probability f
  and transmitted correctly with probability 1 - f.
- BEC (binary erasure channel): each bit is erased (replaced by the
  symbol '?') with probability f and transmitted correctly otherwise.
- Z channel: a 0 is always transmitted correctly, while a 1 is flipped
  to 0 with probability f.

Reference: section 9.3.
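
To make the definition concrete, here is a minimal Python sketch (not
from the lecture) that represents each of the three example channels by
its transition probabilities Q_{j|i}, stored as a dictionary mapping an
input symbol to a distribution over output symbols. The value f = 0.1 is
an arbitrary choice used only for illustration.

    import random

    f = 0.1  # flip/erasure probability; arbitrary value for illustration

    # Each channel is a dict: input symbol -> {output symbol: Q_{j|i}}.
    BSC = {'0': {'0': 1 - f, '1': f},
           '1': {'0': f,     '1': 1 - f}}

    BEC = {'0': {'0': 1 - f, '?': f},
           '1': {'1': 1 - f, '?': f}}

    Z   = {'0': {'0': 1.0},
           '1': {'0': f, '1': 1 - f}}

    def transmit(channel, x):
        """Send one symbol x through the (memoryless) channel and return
        the output symbol, drawn according to the transition probabilities."""
        outputs = list(channel[x])
        weights = [channel[x][b] for b in outputs]
        return random.choices(outputs, weights=weights)[0]

    # Example: send a 1 through each channel once.
    for name, ch in [('BSC', BSC), ('BEC', BEC), ('Z', Z)]:
        print(name, '1 ->', transmit(ch, '1'))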
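
The repetition-code tradeoff mentioned earlier can also be checked
numerically. The following sketch assumes a BSC with flip probability
f = 0.1 and majority-vote decoding (both illustrative assumptions, not
stated in the lecture): the error probability goes to zero as N grows,
but the rate 1/N goes to zero with it.

    from math import comb

    def repetition_error(N, f):
        """Probability that majority-vote decoding fails for an N-fold
        repetition code over a BSC with flip probability f (N odd):
        decoding fails exactly when more than half of the N copies are
        flipped."""
        return sum(comb(N, k) * f**k * (1 - f)**(N - k)
                   for k in range(N // 2 + 1, N + 1))

    f = 0.1  # illustrative flip probability
    for N in (1, 3, 5, 11, 31):
        print(f"N = {N:2d}   rate = 1/{N:<2d}   P(error) = {repetition_error(N, f):.2e}")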

---------------------------------------------------------------
Entropy, joint entropy, conditional entropy, mutual information
---------------------------------------------------------------

Notation:
    X       = random variable denoting the input symbol
    Y       = random variable denoting the output symbol (this is a
              function of what we send through the channel, that is, X)
    p_i     = Pr[X = a_i]            (these we can set as we want)
    q_j     = Pr[Y = b_j]
    Q_{j|i} = Pr[Y = b_j | X = a_i]  (these are given with the channel)

Main Idea: The transition probabilities of the channel Q_{j|i} are
fixed. The only thing we can manipulate are the probabilities of the
symbols we send through the channel, that is, the p_i's.

Entropies and mutual information:

- H(X) is the entropy of the distribution on the input symbols.

- H(Y) is the entropy of the distribution on the output symbols.

- H(X,Y) is the joint entropy of X and Y. To compute it, we must compute
  the joint probabilities R_ij = Pr[X = a_i and Y = b_j]. H(X,Y) is then
  the entropy of that distribution.

- I(X;Y) is called the mutual information between random variables X and
  Y. It is defined as

      I(X;Y) := H(X) + H(Y) - H(X,Y).

- For a given j, H(X|Y = b_j) is the entropy of the conditional
  distribution Pr[X = a_i | Y = b_j].

- H(X|Y) is called the conditional entropy of X given Y, and it is
  defined as the average entropy of the conditional distributions
  X|Y = b_j:

      H(X|Y) := sum_j{ Pr[Y = b_j] * H(X|Y = b_j) }

- H(Y|X = a_i) and H(Y|X) are defined similarly.

- Note: Do not confuse H(X|Y = b) with H(X|Y)! Intuition:

  H(X|Y = b) : the uncertainty about the input X given that the output
               is equal to b.
  H(X|Y)     : the uncertainty about the input X given the output Y.
               This is what the decoder has to deal with: it sees Y and
               it needs to determine what symbol X was transmitted.

- Can check (chain rule): H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y).
  With this, we obtain:

      I(X;Y) = H(X) + H(Y) - H(X,Y)
             = H(X) - H(X|Y)
             = H(Y) - H(Y|X)

  (These identities are checked numerically in the sketch below.)

Reference: chapter 8 (which is not long), sections 9.2, 9.5.
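
Here is a minimal Python sketch of these definitions (not from the
lecture). Note that the code indexes the transition probabilities as
Q[i][j] = Pr[Y = b_j | X = a_i]; the BSC with f = 0.1 and a uniform
input distribution is an arbitrary choice used only to check the
identities above.

    from math import log2

    def entropy(dist):
        """Entropy in bits of a probability distribution given as a list."""
        return -sum(p * log2(p) for p in dist if p > 0)

    def channel_quantities(p, Q):
        """Given input probabilities p[i] = Pr[X = a_i] and transition
        probabilities Q[i][j] = Pr[Y = b_j | X = a_i], compute the
        entropies and mutual information defined above."""
        r, s = len(p), len(Q[0])
        # Joint probabilities R_ij = Pr[X = a_i and Y = b_j] = p_i * Q_{j|i}.
        R = [[p[i] * Q[i][j] for j in range(s)] for i in range(r)]
        # Output probabilities q_j = Pr[Y = b_j] = sum_i R_ij.
        q = [sum(R[i][j] for i in range(r)) for j in range(s)]

        HX = entropy(p)
        HY = entropy(q)
        HXY = entropy([R[i][j] for i in range(r) for j in range(s)])
        # Conditional entropies as averages of H(Y|X = a_i) and H(X|Y = b_j).
        HY_given_X = sum(p[i] * entropy(Q[i]) for i in range(r))
        HX_given_Y = sum(q[j] * entropy([R[i][j] / q[j] for i in range(r)])
                         for j in range(s) if q[j] > 0)
        I = HX + HY - HXY
        return HX, HY, HXY, HX_given_Y, HY_given_X, I

    # Example: BSC with flip probability f = 0.1 and uniform inputs.
    f = 0.1
    p = [0.5, 0.5]
    Q = [[1 - f, f],
         [f, 1 - f]]
    HX, HY, HXY, HXgY, HYgX, I = channel_quantities(p, Q)
    print(f"H(X) = {HX:.4f}   H(Y) = {HY:.4f}   H(X,Y) = {HXY:.4f}")
    print(f"I(X;Y) = {I:.4f}   H(X)-H(X|Y) = {HX - HXgY:.4f}   "
          f"H(Y)-H(Y|X) = {HY - HYgX:.4f}")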