Finding novel genes in bacterial communities isolated from the environment

Lutz Krause, Naryttza N. Diaz, Daniela Bartels, Robert A. Edwards, Alfred Pühler, Forest Rohwer, Folker Meyer, Jens Stoye

Research output: Contribution to journal › Article › peer-review

61 Citations (Scopus)
16 Downloads (Pure)


Motivation: Novel sequencing techniques can give access to organisms that are difficult to cultivate using conventional methods. When applied to environmental samples, the data generated has some drawbacks, e.g. short length of assembled contigs, in-frame stop codons and frame shifts. Unfortunately, current gene finders cannot circumvent these difficulties. At the same time, the automated prediction of genes is a prerequisite for the increasing amount of genomic sequences to ensure progress in metagenomics.

Results: We introduce a novel gene finding algorithm that incorporates features overcoming the short length of the assembled contigs from environmental data, in-frame stop codons as well as frame shifts contained in bacterial sequences. The results show that by searching for sequence similarities in an environmental sample our algorithm is capable of detecting a high fraction of its gene content, depending on the species composition and the overall size of the sample. The method is valuable for hunting novel unknown genes that may be specific for the habitat where the sample is taken. Finally, we show that our algorithm can even exploit the limited information contained in the short reads generated by 454 technology for the prediction of protein coding genes.

Original language: English
Pages (from-to): e281-e289
Issue number: 14
Publication status: Published - 15 Jul 2006
Externally published: Yes

Bibliographical note

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email:
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact

