Using Signal Processing Techniques For DNA Sequence Comparison
Proceedings Of The Fifteenth Annual Northeast Bioengineering Conference
The most widely used algorithms for the comparison of two sequences of DNA are O(m*n) in the lengths, m and n, of the sequences being compared. The authors present a comparison algorithm that is O(n log n) in the length, n, of the longer sequence. This algorithm uses techniques developed for rapid comparison of two discrete signals, in particular, cross-correlation using the fast Fourier transform (FFT). The authors treat the DNA as a discrete signal with each nucleotide base represented by a single point in the signal. The signal can assume only four possible values, each of which they represent by one of four complex numbers. The comparison is made by performing a cross-correlation between one signal and the complex conjugate of the other. Any significant peak in the resulting signal indicates a strong similarity between the two sequences. The authors present the results of comparing two strains of the human immunodeficiency virus, and of comparing human and simian immunodeficiency viruses. Their results suggest that this technique is a powerful method for comparing very long sequences of DNA.
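The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which four complex numbers are used, so the mapping A=1, C=j, G=-1, T=-j is an assumption, as are the names `BASE_TO_COMPLEX`, `fft`, and `cross_correlate`. The sketch uses a plain recursive radix-2 FFT from the standard library so that it stays self-contained.

```python
import cmath

# Assumed encoding; the abstract only says "one of four complex numbers".
BASE_TO_COMPLEX = {"A": 1 + 0j, "C": 1j, "G": -1 + 0j, "T": -1j}


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def cross_correlate(seq_a, seq_b):
    """Return {lag: score}, score = Re(sum over n of a[n + lag] * conj(b[n]))."""
    a = [BASE_TO_COMPLEX[base] for base in seq_a.upper()]
    b = [BASE_TO_COMPLEX[base] for base in seq_b.upper()]

    # Zero-pad to a power of two large enough to avoid circular wrap-around.
    size = 1
    while size < len(a) + len(b) - 1:
        size *= 2
    spectrum_a = fft(a + [0j] * (size - len(a)))
    spectrum_b = fft(b + [0j] * (size - len(b)))

    # Correlation theorem: multiply one spectrum by the conjugate of the other.
    product = [x * y.conjugate() for x, y in zip(spectrum_a, spectrum_b)]
    corr = [value / size for value in fft(product, inverse=True)]

    # Negative lags are stored at the end of the circular result.
    return {lag: corr[lag % size].real for lag in range(-(len(b) - 1), len(a))}


scores = cross_correlate("TTACGTAC", "ACGTAC")
best_lag = max(scores, key=scores.get)
print(best_lag, round(scores[best_lag], 6))
```

Under this assumed encoding, each matching pair of bases contributes 1 to the score at a given lag, while mismatches contribute j, -j, or -1, so the real part peaks where the two sequences align well. In the usage example the second sequence occurs in the first at offset 2, which is where the peak falls.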
Signal processing, DNA, Sequences, Signal processing algorithms, Humans, Genomics, Bioinformatics, Fourier transforms, Fast Fourier transforms, Educational institutions
Fifteenth Annual Northeast Bioengineering Conference
March 27-28, 1989
Erik Allen Cheever, '82; D. B. Searls; William A. Karunaratne, '91; and G. C. Overton. "Using Signal Processing Techniques For DNA Sequence Comparison". Proceedings Of The Fifteenth Annual Northeast Bioengineering Conference.