Alignment-free Whole Genome Comparison Using k-mer Forests

G. Gamage, N. Gimhana, A. Wickramarachchi, V. Mallawaarachchi, I. Perera

Research output: Chapter in Book/Report/Conference proceeding > Chapter > peer-review

4 Citations (Scopus)

Abstract

Phylogenetics is one of the main research disciplines in evolutionary biology. It relies on comparative data, mainly DNA sequences or raw sequencing reads. Alignment-based and alignment-free comparison are the two main approaches to similarity computation used to determine the genetic relatedness of different species. Alignment-based methods are relatively complex and become computationally challenging as genome size grows, as with mammalian datasets and complex metagenomic colonies. Moreover, they show poor accuracy in certain genetic comparisons due to misalignments and algorithmic tolerances. Alignment-free comparison methods address most of the challenges observed in alignment-based methods and perform much better in genetic distance computation. In this paper, we propose a novel alignment-free, pairwise distance calculation method based on k-mers. The method converts long DNA sequences into simplified k-mer forest structures, which makes comparison more convenient. Further, we use a specialized tree pruning approach, which considerably reduces tree comparison time compared to other alignment-free methods.
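The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a forest made of one prefix tree per leading base, a hypothetical count threshold `min_count` for pruning, and a plain Jaccard distance over the surviving k-mers as a stand-in for the paper's distance measure. All function names are illustrative.

```python
def kmers(seq, k):
    """Yield every overlapping substring of length k from seq."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_forest(seq, k):
    """Build one nested trie per leading base (an assumed forest layout).

    Each node maps a base to a [count, children] pair, where count is the
    number of k-mers whose path passes through that node.
    """
    forest = {}
    for kmer in kmers(seq, k):
        level = forest
        for base in kmer:
            node = level.setdefault(base, [0, {}])
            node[0] += 1
            level = node[1]
    return forest


def prune(forest, min_count):
    """Return a copy without any branch seen fewer than min_count times."""
    return {
        base: [count, prune(children, min_count)]
        for base, (count, children) in forest.items()
        if count >= min_count
    }


def leaves(forest, k, prefix=""):
    """Collect the full-length k-mers still present in the forest."""
    found = set()
    for base, (_, children) in forest.items():
        path = prefix + base
        if len(path) == k:
            found.add(path)
        else:
            found |= leaves(children, k, path)
    return found


def forest_distance(a, b, k):
    """Jaccard distance between the k-mer sets of two forests (assumed metric)."""
    ka, kb = leaves(a, k), leaves(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)
```

A pairwise comparison under these assumptions would call `build_forest` on each genome, optionally `prune` both forests, and pass them to `forest_distance`. Pruning shrinks the trees before comparison, which is the kind of saving the abstract attributes to its pruning step.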

Original language: English
Title of host publication: 2019 19th International Conference on Advances in ICT for Emerging Regions (ICTer)
Place of Publication: New Jersey, U.S.A.
Publisher: Institute of Electrical and Electronics Engineers
Pages: 1-7
Number of pages: 7
ISBN (Electronic): 978-1-7281-5156-4, 978-1-7281-5157-1, 978-1-7281-5154-0
ISBN (Print): 978-1-7281-5155-7
DOIs
Publication status: Published - 2019
Externally published: Yes
Event: 19th International Conference on Advances in ICT for Emerging Regions, ICTer 2019 - Colombo, Sri Lanka
Duration: 3 Sept 2019 to 4 Sept 2019

Publication series

Name: International Conference on Advances in ICT for Emerging Regions
ISSN (Print): 2377-6854
ISSN (Electronic): 2472-7598

Conference

Conference: 19th International Conference on Advances in ICT for Emerging Regions, ICTer 2019
Country/Territory: Sri Lanka
City: Colombo
Period: 3/09/19 to 4/09/19

Keywords

  • genetic comparison
  • genetic distance
  • k-mer forest
  • phylogenetics
