Kullback Leibler divergence in complete bacterial and phage genomes

Sajia Akhter, Ramy K. Aziz, Mona T. Kashef, Eslam S. Ibrahim, Barbara Bailey, Robert A. Edwards

Research output: Contribution to journal (Article)

Abstract

The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback-Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequences. Using these data, we demonstrate that (i) there is a significant difference between amino acid utilization in different phylogenetic groups of bacteria and phages; (ii) many of the bacteria with the most skewed amino acid utilization profiles, or the bacteria that host phages with the most skewed profiles, are endosymbionts or parasites; (iii) the skews in the distribution are not restricted to certain metabolic processes but are common across all bacterial genomic subsystems; (iv) amino acid utilization profiles strongly correlate with GC content in bacterial genomes but very weakly correlate with the G+C percent in phage genomes. These findings might be exploited to distinguish coding from non-coding sequences in large data sets, such as metagenomic sequence libraries, to help in prioritizing subsequent analyses.
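The metric described in the abstract can be illustrated with a minimal sketch. It assumes the standard definition D_KL(p || q) = Σ p_i log(p_i / q_i), where p is one genome's amino acid frequency profile and q is the mean profile across all genomes. The base-2 logarithm, the function names, and the toy sequences are assumptions made for illustration; they are not taken from the article and do not reproduce its pipeline.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_frequencies(proteins):
    """Relative frequency of each standard amino acid across the
    protein sequences of one genome (non-standard letters are skipped)."""
    counts = Counter()
    for seq in proteins:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def mean_profile(profiles):
    """Unweighted mean of several frequency profiles (an assumption;
    other weightings, e.g. by genome size, are possible)."""
    n = len(profiles)
    return {aa: sum(p[aa] for p in profiles) / n for aa in AMINO_ACIDS}


def kl_divergence(p, q):
    """D_KL(p || q) in bits. Terms with p_i == 0 contribute nothing."""
    total = 0.0
    for aa in AMINO_ACIDS:
        if p[aa] > 0:
            if q[aa] == 0:
                raise ValueError(f"q has zero mass for {aa} where p does not")
            total += p[aa] * math.log2(p[aa] / q[aa])
    return total


# Hypothetical toy "genomes": each is a list of protein sequences.
genomes = {
    "genome_a": ["MKKLLNNIIKKYNF", "MIKNNKKILLF"],
    "genome_b": ["MAGRPLAVGARGDE", "MSTAVLPGRAAE"],
}
profiles = {name: amino_acid_frequencies(seqs) for name, seqs in genomes.items()}
mean = mean_profile(list(profiles.values()))
for name, profile in profiles.items():
    print(name, round(kl_divergence(profile, mean), 4))
```

A larger value indicates an amino acid profile that is more skewed relative to the mean. Because the mean profile is built from the same genomes, it is nonzero wherever any individual profile is nonzero, so the divergence stays finite in this setup.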

Original language: English
Article number: e4026
Number of pages: 17
Journal: PeerJ
Volume: 5
Issue number: 11
DOIs: https://doi.org/10.7717/peerj.4026
Publication status: Published - 30 Nov 2017
Externally published: Yes

Bibliographical note

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Keywords

  • Genometrics
  • Genomics
  • Information theory
  • Metagenomics

Cite this

Akhter, S., Aziz, R. K., Kashef, M. T., Ibrahim, E. S., Bailey, B., & Edwards, R. A. (2017). Kullback Leibler divergence in complete bacterial and phage genomes. PeerJ, 5(11), [e4026]. https://doi.org/10.7717/peerj.4026