FRESCO: flexible alignment with rectangle scoring schemes.

Abstract

While the popular DNA sequence alignment tools incorporate powerful heuristics to allow for fast and accurate alignment of DNA, most of them still optimize the classical Needleman Wunsch scoring scheme. The development of novel scoring schemes is often hampered by the difficulty of finding an optimizing algorithm for each non-trivial scheme. In this paper we define the broad class of rectangle scoring schemes, and describe an algorithm and tool that can align two sequences with an arbitrary rectangle scoring scheme in polynomial time. Rectangle scoring schemes encompass some of the popular alignment scoring metrics currently in use, as well as many other functions. We investigate a novel scoring function based on minimizing the expected number of random diagonals observed with the given scores and show that it rivals the LAGAN and Clustal-W aligners, without using any biological or evolutionary parameters. The FRESCO program, freely available at http://compbio.cs.toronto.edu/fresco, gives bioinformatics researchers the ability to quickly compare the performance of other complex scoring formulas without having to implement new algorithms to optimize them.

Publication
Pacific Symposium on Biocomputing, 13: 3-14, 2008
Date
Links