Computational Biology

SCARPA: Scaffolding Reads with Practical Algorithms

Abstract:
Current assemblers for high-throughput sequencing platforms can produce high-quality draft assemblies for small genomes. In contrast, the assemblies produced for complex genomes using short reads are typically very fragmented. The finishing stage, which requires additional sequencing, can benefit significantly from scaffolding. We have developed Scarpa, which combines fixed-parameter tractable and bounded algorithms with Linear Programming to produce accurate scaffolds. Scarpa also estimates the library size distribution and detects mis-assembled contigs.

Publication:
Donmez N., Brudno M. "SCARPA: scaffolding reads with practical algorithms"; Bioinformatics. 2013 Feb 15;29(4):428-34. doi: 10.1093/bioinformatics/bts716. [pdf]

Scarpa is available for academic and commercial use under the GNU General Public License (GPL) from GitHub.

Genome assembly for highly polymorphic genomes

Abstract:
Several recently sequenced genomes exhibit very high polymorphism rates. For these organisms, genome assembly remains a challenge. Single Nucleotide Polymorphisms (SNPs) and small indels (insertions/deletions) may be mistaken for sequencing errors, and larger variations such as Copy Number Variations (CNVs) or rearrangements make it difficult to assemble a single reference sequence. We introduce Hapsembler, an assembler designed to facilitate the assembly of such genomes. Hapsembler features a haplotype-aware error correction procedure and uses a novel structure called a mate pair graph to resolve ambiguities that arise from polymorphism and repeats.


Publication:
Donmez N., Brudno M. "Hapsembler: An assembler for highly polymorphic genomes"; International Conference on Research in Computational Molecular Biology (RECOMB) 2011, V. Bafna and S.C. Sahinalp (Eds.), LNBI 6577:38-52, 2011. [pdf]

Hapsembler is available for academic and commercial use under the GNU General Public License (GPL) from GitHub.

Polymorphism in Ciona savignyi


Abstract:
We compare two haploid genotypes of one C. savignyi individual and identify codons at which these genotypes differ by two non-synonymous substitutions. Using the C. intestinalis genome as an outgroup, we show that both substitutions tend to occur in the same genotype. In only 53 (34.4%) of 154 codons does one substitution occur in each of the two genotypes, although 77 (50.0%) such codons would be expected if substitutions were independent. We consider two feasible evolutionary causes for the observed pattern, substitutions driven by positive selection and compensatory substitutions, as well as several potential biases.
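The counts in the abstract can be checked with a short calculation. The sketch below assumes the simplest null model, in which each of the two substitutions lands on either genotype with equal probability, so a codon has one substitution in each genotype with probability 0.5. The exact one-sided binomial test is an illustrative choice; the abstract does not say which statistical test the authors used, and the names `binom_cdf`, `n_codons`, and `observed_split` are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p=0.5):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Counts taken from the abstract.
n_codons = 154        # codons with two non-synonymous substitutions
observed_split = 53   # codons with one substitution in each genotype

# Assumed null model: substitutions are independent, so p = 0.5.
expected_split = n_codons * 0.5
observed_fraction = observed_split / n_codons

# One-sided probability of seeing 53 or fewer split codons under the null.
p_lower = binom_cdf(observed_split, n_codons, 0.5)

print(f"observed fraction: {observed_fraction:.1%}")
print(f"expected under independence: {expected_split:.0f}")
print(f"P(X <= {observed_split}) under the null: {p_lower:.2e}")
```

A small tail probability under this assumed model is consistent with the abstract's claim that the two substitutions tend to fall in the same genotype more often than independence would predict.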

Publication:
Donmez N., Bazykin G., Brudno M., Kondrashov A.S. "Polymorphism due to multiple amino acid substitutions at a codon site within Ciona savignyi"; Genetics 181:685-690, 2009. [pdf]

Graphics

Concepture: recognizing gestures with repetitive patterns

Abstract:
We present Concepture, a framework based on regular language grammars for the authoring and recognition of sketched gestures with infinitely varying and repetitive patterns. Such gestures, while often seen in gesture-based applications, are currently hard-coded and not customizable. We endorse an example-based workflow, where users author gestures by sketching one or more example instances of the gesture. We algorithmically deconstruct these examples into perceptible stroke segments. Adjacent segment-pairs capture local spatial relationships between segments, and these segment-pairs form the alphabet of a regular language. We then initialize a grammar for our gesture by admitting strings that represent the user-provided examples. Grammar refinement is user-friendly, in that we automatically generate new candidate gestures that are visually presented to the user for verification as instances of the gesture. We show Concepture to be effective in efficiently authoring a number of common, yet difficult to recognize, gestures, and illustrate it using clip-art and image annotation applications.
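The idea of treating segment-pairs as an alphabet can be illustrated with ordinary regular expressions. The sketch below is not Concepture's implementation: the two-letter alphabet, the zigzag gesture, and the functions `initial_grammar` and `refined_grammar` are all hypothetical, and the generalization step is written by hand rather than derived from user feedback as the abstract describes.

```python
import re

# Hypothetical alphabet: each letter names one class of adjacent segment-pair.
#   'a' = an up-stroke followed by a down-stroke (a peak)
#   'b' = a down-stroke followed by an up-stroke (a valley)
# A zigzag with two peaks would then be encoded as the string "aba".
examples = ["aba", "ababa"]


def initial_grammar(example_strings):
    """Admit exactly the strings that encode the user-provided examples."""
    return re.compile("|".join(re.escape(s) for s in example_strings))


def refined_grammar():
    """Hand-written generalization: a peak followed by any number of
    valley-peak repetitions, i.e. zigzags of arbitrary length."""
    return re.compile(r"a(?:ba)+")


initial = initial_grammar(examples)
refined = refined_grammar()

candidate = "abababa"  # a longer zigzag the user never sketched
print(bool(initial.fullmatch(candidate)))  # rejected by the initial grammar
print(bool(refined.fullmatch(candidate)))  # accepted after generalization
```

In the workflow the abstract describes, candidates such as the longer zigzag would be shown to the user for confirmation before the grammar is widened; here that step is replaced by a fixed pattern for brevity.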


Publication:
Donmez N., Singh K. "Concepture: a regular language based framework for recognizing gestures with repetitive and variational patterns"; Proceedings of the International Symposium on Sketch-Based Interfaces and Modeling (SBIM '12). (Best Paper Award) [pdf]