Abstract
Algorithm development for comparing and aligning biological sequences has, until recently, been based on the SI model of mutational events, which assumes that sequences are modified through substitution, insertion, or deletion (the latter two collectively termed indels). While this model has worked fairly well, it has long been apparent that other mutational events occur. In this paper, we introduce a new model, the DSI model, which includes another common mutational event, tandem duplication. Tandem duplication produces tandem repeats, which are common in DNA, making up perhaps 10% of the human genome. They are responsible for some human diseases and may serve a multitude of functions in DNA regulation and evolution. Using the DSI model, we develop new exact and heuristic algorithms for comparing and aligning DNA sequences when they contain tandem repeats.
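To make the contrast between the two models concrete, the following is a minimal sketch, not the algorithms developed in the paper. It computes a standard edit distance under the SI model and applies a single tandem duplication to a sequence. The function names, the unit costs, and the example sequence are assumptions chosen for illustration only.

```python
def si_edit_distance(a: str, b: str, sub_cost: int = 1, indel_cost: int = 1) -> int:
    """Edit distance under the SI model (substitution, insertion, deletion).

    Unit costs are an assumption for illustration, not values from the paper.
    """
    # prev[j] holds the distance between a[:i-1] and b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * indel_cost] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else sub_cost)
            curr[j] = min(diagonal,                  # match or substitution
                          prev[j] + indel_cost,      # deletion from a
                          curr[j - 1] + indel_cost)  # insertion into a
        prev = curr
    return prev[len(b)]


def tandem_duplicate(seq: str, start: int, length: int) -> str:
    """Return seq with the unit seq[start:start+length] copied right after itself."""
    if length <= 0 or start < 0 or start + length > len(seq):
        raise ValueError("duplicated unit must lie inside the sequence")
    end = start + length
    return seq[:end] + seq[start:end] + seq[end:]


ancestor = "ACGTT"                                 # hypothetical example sequence
descendant = tandem_duplicate(ancestor, 1, 3)      # duplicates the unit "CGT"
print(descendant)                                  # ACGTCGTT
print(si_edit_distance(ancestor, descendant))      # 3
```

In this example a single duplication event is scored by the SI model as three independent insertions. That mismatch between the number of events and the number of scored operations is the kind of case a model with an explicit duplication operation is meant to address.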
Publication Info
- Year: 1997
- Type: article
- Volume: 4
- Issue: 3
- Pages: 351-367
- Citations: 47
- Access: Closed
Identifiers
- DOI: 10.1089/cmb.1997.4.351