Comparisons of Long Genomic Sequences.

Abstract

Comparing genomic sequences across related species is a fruitful source of biological insight, because functional elements such as exons tend to exhibit significant sequence similarity due to purifying selection, whereas non-functional regions tend to be less conserved. The first step in comparing genomic sequences is to align them, that is, to map the letters of one sequence to those of the others. Alignments fall into several categories: local alignments identify similarities between regions of each sequence, whereas global alignments find a mapping between all the letters of the sequences. Alignments can be either pairwise, comparing two sequences, or multiple, comparing several sequences. The main challenge in developing algorithms for genomic alignment is that they must be fast enough to handle megabase-long sequences and gigabase-long genomes while still mapping individual base pairs accurately. Generating alignments is computationally difficult, and visualizing them presents its own challenges, such as how to enable users to interact with the data and the processing programs in the context of enormous datasets. Visualization frameworks should be easy for a biologist to understand and should provide insight into the mutations that a particular region has undergone. Finally, alignments are useful only if they help shed light on the important functional elements in the genomic sequence. In this chapter, after a detailed discussion of algorithms used to construct genomic alignments and methods to visualize them, we give a short overview of several algorithms that use an alignment to improve predictions of transcription factor binding sites.
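
The distinction between global and local pairwise alignment can be illustrated with a minimal sketch. It uses the textbook dynamic-programming recurrences (Needleman-Wunsch for global, Smith-Waterman for local) with a linear gap penalty. The function name `alignment_scores` and the scoring values are assumptions chosen for illustration, not the chapter's own method. The quadratic running time is exactly why such direct approaches do not scale to megabase-long sequences.

```python
def alignment_scores(a, b, match=1, mismatch=-1, gap=-1):
    """Return (global_score, local_score) for two sequences.

    Illustrative only: the scoring parameters are arbitrary assumptions,
    and the O(len(a) * len(b)) cost is impractical for genomic-scale input.
    """
    # Global alignment: leading gaps are penalized, so the first row
    # grows more negative with every skipped letter.
    prev_global = [j * gap for j in range(len(b) + 1)]
    # Local alignment: an alignment may start anywhere, so scores floor at 0.
    prev_local = [0] * (len(b) + 1)
    best_local = 0

    for i in range(1, len(a) + 1):
        cur_global = [i * gap] + [0] * len(b)
        cur_local = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur_global[j] = max(
                prev_global[j - 1] + s,   # align a[i-1] with b[j-1]
                prev_global[j] + gap,     # gap in b
                cur_global[j - 1] + gap,  # gap in a
            )
            cur_local[j] = max(
                0,                        # start a fresh local alignment
                prev_local[j - 1] + s,
                prev_local[j] + gap,
                cur_local[j - 1] + gap,
            )
            best_local = max(best_local, cur_local[j])
        prev_global, prev_local = cur_global, cur_local

    # Global score covers all letters of both sequences;
    # local score is the best-scoring pair of regions.
    return prev_global[len(b)], best_local
```

For example, `alignment_scores("ACGT", "TTACGTTT")` gives a local score of 4, because the shared region `ACGT` matches exactly, while the global score is 0, because the four unmatched flanking letters must each be paid for with a gap.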

Publication
Handbook of Computational Molecular Biology, 2005