DNA sequence comparison by a novel probabilistic method

Chenglong Yu, Mo Deng, Stephen Yau

    Research output: Contribution to journal › Article › peer-review

    43 Citations (Scopus)


    This paper proposes a novel method for comparing DNA sequences. By using a graphical representation, we are able to construct the probability distributions of DNA sequences. These probability distributions can then be used to make similarity studies by using the symmetrised Kullback-Leibler divergence. After presenting our method, we test it using six DNA sequences taken from the threonine operons of Escherichia coli K-12 and Shigella flexneri. Our approach is then used to study the evolution of primates using mitochondrial DNA data. Our method allows us to reconstruct a phylogenetic tree for primate evolution. In addition, we use our technique to analyze the classification and phylogeny of the Tomato Yellow Leaf Curl Virus (TYLCV) based on its whole genome sequences. These examples show that large volumes of DNA sequences can be handled more easily and more quickly by our approach than by the existing multiple alignment methods. Moreover, our method, unlike other approaches, does not require human intervention, because it can be applied automatically.
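The comparison step described in the abstract, scoring two probability distributions with the symmetrised Kullback-Leibler divergence, can be illustrated with a minimal sketch. The abstract does not say how the distributions are built from the graphical representation, so this sketch swaps in smoothed k-mer frequencies as a stand-in; this is an assumption, not the authors' construction. The function names, the choice `k=2`, and the pseudocount are all hypothetical. The symmetrised divergence is taken here as the sum of the two directed divergences; some authors use half that sum.

```python
import math
from collections import Counter
from itertools import product


def kmer_distribution(seq, k=2, pseudocount=1.0):
    """Smoothed k-mer frequency distribution of a DNA sequence.

    Assumption: this is a stand-in for the paper's distribution, which is
    derived from a graphical representation not described in the abstract.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # The pseudocount keeps every probability positive so the logs are finite.
    total = sum(counts[m] for m in kmers) + pseudocount * len(kmers)
    return [(counts[m] + pseudocount) / total for m in kmers]


def kl_divergence(p, q):
    """Directed Kullback-Leibler divergence D(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetrised_kl(p, q):
    """Symmetrised divergence, taken here as D(p || q) + D(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Toy sequences for illustration only, not data from the paper.
seq_a = "ATGCGATACGCTTGAATGCGATAC"
seq_b = "ATGCGATACGCTTGAATGCGTTAC"
seq_c = "GGGGCCCCGGGGCCCCGGGGCCCC"

dist_ab = symmetrised_kl(kmer_distribution(seq_a), kmer_distribution(seq_b))
dist_ac = symmetrised_kl(kmer_distribution(seq_a), kmer_distribution(seq_c))
print(dist_ab, dist_ac)
```

Pairwise values computed this way form a distance matrix, which a standard tree-building method could then take as input. No alignment step is involved, which matches the abstract's claim that the approach can be applied automatically.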

    Original language: English
    Pages (from-to): 1484-1492
    Number of pages: 9
    Journal: Information Sciences
    Issue number: 8
    Publication status: Published - 15 Apr 2011


    • DNA
    • Graphical representation
    • Kullback-Leibler divergence
    • Probability distribution
    • Sequence comparison

