===========================================================================
CSC 310                     Lecture 14 Summary                    Fall 2007
===========================================================================

Topics:
- Transmission over a noisy channel
- Channels BSC, BEC, Z
- Entropy, joint entropy, conditional entropy, mutual information
- References: chapter 8, sections 9.1, 9.2, 9.3, 9.5.


---------------------------------
Transmission over a noisy channel
---------------------------------

- We want to deal with the following scenario: data must be sent from a
  source to a destination, and it might get corrupted along the way.

- We want to add redundancy to our data before transmitting it, so that
  even if a few errors occur during the transmission, we can still make
  use of the data that gets through.

- Given a channel, we will first compute how much information the channel
  can transmit.

- When errors occur, ideally we would like to correct them without
  requesting a retransmission. Even if we cannot correct them, it is
  still useful to detect errors.

- Suppose we have a very low tolerance for error. Does this mean that we
  have to sacrifice transmission rate? For example, if we use a
  repetition code (send each bit N times), by letting N grow we can make
  the probability of error as low as we want, but we end up sending N
  times as much data as the original message. (A rough calculation of
  this tradeoff is sketched after the channel examples below.)

- Unlike in the data compression part of the course, the best practical
  algorithms do not match the best known theoretical results.

- Reference: section 9.1.


--------------------
Channels BSC, BEC, Z
--------------------

Definition: A channel consists of a set of input symbols
A_X = { a_1, .., a_r }, a set of output symbols A_Y = { b_1, .., b_s },
and transition probabilities Q_{j|i} = Pr[output is b_j | input is a_i].

We will always assume that:
- there is a sequential correspondence between symbols sent and symbols
  received, that is, there are no insertions, deletions or permutations
  of symbols,
- the channel is memoryless, that is, the channel alters symbols
  independently of one another,
- the channel transition probabilities are fixed by the nature of the
  channel and cannot be changed in any way.

Examples:
- BSC (binary symmetric channel): each bit is flipped with probability f
  and transmitted correctly with probability 1 - f.
- BEC (binary erasure channel): each bit is erased (replaced by the
  symbol '?') with probability f and transmitted correctly otherwise.
- Z channel: a 0 is always transmitted correctly, while a 1 is flipped
  to 0 with probability f.

Reference: section 9.3.
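
To make the definition concrete, here is a minimal Python sketch (not
from the lecture) that represents each of the three example channels by
its transition probabilities Q_{j|i}, stored as a dictionary mapping an
input symbol to a distribution over output symbols. The value f = 0.1 is
an arbitrary choice used only for illustration.

    import random

    f = 0.1  # flip/erasure probability; arbitrary value for illustration

    # Each channel is a dict: input symbol -> {output symbol: Q_{j|i}}.
    BSC = {'0': {'0': 1 - f, '1': f},
           '1': {'0': f,     '1': 1 - f}}

    BEC = {'0': {'0': 1 - f, '?': f},
           '1': {'1': 1 - f, '?': f}}

    Z   = {'0': {'0': 1.0},
           '1': {'0': f, '1': 1 - f}}

    def transmit(channel, x):
        """Send one symbol x through the (memoryless) channel and return
        the output symbol, drawn according to the transition probabilities."""
        outputs = list(channel[x])
        weights = [channel[x][b] for b in outputs]
        return random.choices(outputs, weights=weights)[0]

    # Example: send a 1 through each channel once.
    for name, ch in [('BSC', BSC), ('BEC', BEC), ('Z', Z)]:
        print(name, '1 ->', transmit(ch, '1'))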
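
The repetition-code tradeoff mentioned earlier can also be checked
numerically. The following sketch assumes a BSC with flip probability
f = 0.1 and majority-vote decoding (both illustrative assumptions, not
stated in the lecture): the error probability goes to zero as N grows,
but the rate 1/N goes to zero with it.

    from math import comb

    def repetition_error(N, f):
        """Probability that majority-vote decoding fails for an N-fold
        repetition code over a BSC with flip probability f (N odd):
        decoding fails exactly when more than half of the N copies are
        flipped."""
        return sum(comb(N, k) * f**k * (1 - f)**(N - k)
                   for k in range(N // 2 + 1, N + 1))

    f = 0.1  # illustrative flip probability
    for N in (1, 3, 5, 11, 31):
        print(f"N = {N:2d}   rate = 1/{N:<2d}   P(error) = {repetition_error(N, f):.2e}")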

---------------------------------------------------------------
Entropy, joint entropy, conditional entropy, mutual information
---------------------------------------------------------------

Notation:
    X       = random variable denoting the input symbol
    Y       = random variable denoting the output symbol (this is a
              function of what we send through the channel, that is, X)
    p_i     = Pr[X = a_i]            (these we can set as we want)
    q_j     = Pr[Y = b_j]
    Q_{j|i} = Pr[Y = b_j | X = a_i]  (these are given with the channel)

Main Idea: The transition probabilities of the channel Q_{j|i} are
fixed. The only thing we can manipulate are the probabilities of the
symbols we send through the channel, that is, the p_i's.

Entropies and mutual information:

- H(X) is the entropy of the distribution on the input symbols.

- H(Y) is the entropy of the distribution on the output symbols.

- H(X,Y) is the joint entropy of X and Y. To compute it, we must compute
  the joint probabilities R_ij = Pr[X = a_i and Y = b_j]. H(X,Y) is then
  the entropy of that distribution.

- I(X;Y) is called the mutual information between random variables X and
  Y. It is defined as

      I(X;Y) := H(X) + H(Y) - H(X,Y).

- For a given j, H(X|Y = b_j) is the entropy of the conditional
  distribution Pr[X = a_i | Y = b_j].

- H(X|Y) is called the conditional entropy of X given Y, and it is
  defined as the average entropy of the conditional distributions
  X|Y = b_j:

      H(X|Y) := sum_j{ Pr[Y = b_j] * H(X|Y = b_j) }

- H(Y|X = a_i) and H(Y|X) are defined similarly.

- Note: Do not confuse H(X|Y = b) with H(X|Y)! Intuition:

  H(X|Y = b) : the uncertainty about the input X given that the output
               is equal to b.
  H(X|Y)     : the uncertainty about the input X given the output Y.
               This is what the decoder has to deal with: it sees Y and
               it needs to determine what symbol X was transmitted.

- Can check (chain rule): H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y).
  With this, we obtain:

      I(X;Y) = H(X) + H(Y) - H(X,Y)
             = H(X) - H(X|Y)
             = H(Y) - H(Y|X)

  (These identities are checked numerically in the sketch below.)

Reference: chapter 8 (which is not long), sections 9.2, 9.5.
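
Here is a minimal Python sketch of these definitions (not from the
lecture). Note that the code indexes the transition probabilities as
Q[i][j] = Pr[Y = b_j | X = a_i]; the BSC with f = 0.1 and a uniform
input distribution is an arbitrary choice used only to check the
identities above.

    from math import log2

    def entropy(dist):
        """Entropy in bits of a probability distribution given as a list."""
        return -sum(p * log2(p) for p in dist if p > 0)

    def channel_quantities(p, Q):
        """Given input probabilities p[i] = Pr[X = a_i] and transition
        probabilities Q[i][j] = Pr[Y = b_j | X = a_i], compute the
        entropies and mutual information defined above."""
        r, s = len(p), len(Q[0])
        # Joint probabilities R_ij = Pr[X = a_i and Y = b_j] = p_i * Q_{j|i}.
        R = [[p[i] * Q[i][j] for j in range(s)] for i in range(r)]
        # Output probabilities q_j = Pr[Y = b_j] = sum_i R_ij.
        q = [sum(R[i][j] for i in range(r)) for j in range(s)]

        HX = entropy(p)
        HY = entropy(q)
        HXY = entropy([R[i][j] for i in range(r) for j in range(s)])
        # Conditional entropies as averages of H(Y|X = a_i) and H(X|Y = b_j).
        HY_given_X = sum(p[i] * entropy(Q[i]) for i in range(r))
        HX_given_Y = sum(q[j] * entropy([R[i][j] / q[j] for i in range(r)])
                         for j in range(s) if q[j] > 0)
        I = HX + HY - HXY
        return HX, HY, HXY, HX_given_Y, HY_given_X, I

    # Example: BSC with flip probability f = 0.1 and uniform inputs.
    f = 0.1
    p = [0.5, 0.5]
    Q = [[1 - f, f],
         [f, 1 - f]]
    HX, HY, HXY, HXgY, HYgX, I = channel_quantities(p, Q)
    print(f"H(X) = {HX:.4f}   H(Y) = {HY:.4f}   H(X,Y) = {HXY:.4f}")
    print(f"I(X;Y) = {I:.4f}   H(X)-H(X|Y) = {HX - HXgY:.4f}   "
          f"H(Y)-H(Y|X) = {HY - HYgX:.4f}")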