===========================================================================
CSC 373H               Lecture Summary for Week 6               Winter 2006
===========================================================================

RNA secondary structure.

  - Input:  A sequence of bases b_1,b_2,...,b_n, each b_i in {A,C,G,U}.
    Output:  A sequence of pairs (i_1,j_1),(i_2,j_2),...,(i_k,j_k) where k
        is as large as possible and:
          . for each pair (i,j), 1 <= i < j-4 <= n-4;
          . for each pair (i,j), {b_i,b_j} = {A,U} or {C,G};
          . no index is repeated (i.e., all i's and j's are distinct);
          . no two pairs "cross", i.e., for all pairs (i,j) and (i',j'),
            it is NOT the case that i < i' < j < j'.

  - Step 1: Define array

        OPT[i,j] = max number of pairs on b_i,...,b_j

  - Step 2: Write recurrence

        OPT[i,j] = 0  for all i >= j-4
        OPT[i,j] = max of:
            OPT[i,j-1],
            1 + OPT[i+1,j-1]  if b_i matches b_j,
            max( 1 + OPT[i,t-1] + OPT[t+1,j-1] )
                for all t in [i+1,j-5] such that b_t matches b_j

    The first term is the best possible if b_j is unmatched.
    The second term is the best possible if b_j is matched with b_i
    (when they match).
    The last term covers every other way b_j could be matched (with some
    b_t, i < t <= j-5), taking the best possible answer in each case.

  - Step 3: Bottom-up algorithm

    Observations: OPT[i,j] depends only on previously computed values
        "below and to the left" (same or larger i, smaller j).
        Also, every entry with i >= n-4 or j <= 5 satisfies i >= j-4 and
        so equals 0; these entries need no computation. Note that the
        loops below never assign row n-4 but do read OPT[n-4,n-1], so
        treat any unassigned entry as 0 (e.g., start with the whole
        array set to 0).

        for i := n-5 downto 1:
            for j := i to i+4:
                OPT[i,j] := 0
            for j := i+5 to n:
                OPT[i,j] := OPT[i,j-1]
                if b_i matches b_j and OPT[i,j] < 1 + OPT[i+1,j-1]:
                    OPT[i,j] := 1 + OPT[i+1,j-1]
                for t := i+1 to j-5:
                    if b_t matches b_j and
                            OPT[i,j] < 1 + OPT[i,t-1] + OPT[t+1,j-1]:
                        OPT[i,j] := 1 + OPT[i,t-1] + OPT[t+1,j-1]

    Example: see page 190 of textbook.

  - Step 4: Compute optimal answer

  - Runtime?
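
  - Sketch in Python (an illustration, not part of the lecture).

    The recurrence and bottom-up loops can be tried out with the short
    program below. It is a minimal sketch under several assumptions: the
    names matches, rna_opt_table and rna_pairs are hypothetical; the table
    is padded and set to 0 everywhere, so base-case entries (i >= j-4) are
    never special-cased; and rna_pairs shows one possible way to recover
    an optimal set of pairs by walking back through the table, which may
    differ from the method intended for Step 4. Pairs are reported with
    1-based indices, as in the notes. The three nested loops suggest a
    running time of O(n^3), with O(n^2) space for the table.

```python
def matches(x, y):
    """True if bases x and y can pair: {A,U} or {C,G}."""
    return {x, y} in ({"A", "U"}, {"C", "G"})


def rna_opt_table(b):
    """Return OPT where OPT[i][j] = max number of pairs on b_i..b_j (1-based)."""
    n = len(b)
    # Padded (n+2) x (n+2) table of zeros: every base case i >= j-4 is
    # already 0, including entries the loops below never assign.
    OPT = [[0] * (n + 2) for _ in range(n + 2)]
    base = " " + b  # shift by one so that base[i] is b_i
    for i in range(n - 5, 0, -1):              # i = n-5 downto 1
        for j in range(i + 5, n + 1):          # j = i+5 to n
            best = OPT[i][j - 1]               # b_j unmatched
            if matches(base[i], base[j]):      # b_j matched with b_i
                best = max(best, 1 + OPT[i + 1][j - 1])
            for t in range(i + 1, j - 4):      # t = i+1 to j-5
                if matches(base[t], base[j]):  # b_j matched with b_t
                    best = max(best, 1 + OPT[i][t - 1] + OPT[t + 1][j - 1])
            OPT[i][j] = best
    return OPT


def rna_pairs(b):
    """Recover one optimal list of 1-based pairs (i, j) from the table."""
    n = len(b)
    OPT = rna_opt_table(b)
    base = " " + b
    pairs = []
    stack = [(1, n)]                           # intervals still to explain
    while stack:
        i, j = stack.pop()
        if i >= j - 4:                         # base case: no pairs possible
            continue
        if OPT[i][j] == OPT[i][j - 1]:         # optimal with b_j unmatched
            stack.append((i, j - 1))
            continue
        # Otherwise b_j is matched with some b_t, i <= t <= j-5.
        # For t = i, OPT[i][i-1] is a padded 0, giving 1 + OPT[i+1][j-1].
        for t in range(i, j - 4):
            if (matches(base[t], base[j])
                    and OPT[i][j] == 1 + OPT[i][t - 1] + OPT[t + 1][j - 1]):
                pairs.append((t, j))
                stack.append((i, t - 1))
                stack.append((t + 1, j - 1))
                break
    return sorted(pairs)
```

    For instance, rna_pairs("ACCGGUAGU") returns [(1, 9), (2, 8)], and the
    corresponding table entry OPT[1][9] is 2. Initializing the whole table
    to 0 trades a little extra memory for not having to reason about which
    base-case entries are read.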