Fast and sensitive taxonomic classification for metagenomics with Kaiju

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Fast and sensitive taxonomic classification for metagenomics with Kaiju. / Menzel, Peter; Ng, Kim Lee; Krogh, Anders.

In: Nature Communications, Vol. 7, 11257, 2016.

Harvard

Menzel, P, Ng, KL & Krogh, A 2016, 'Fast and sensitive taxonomic classification for metagenomics with Kaiju', Nature Communications, vol. 7, 11257. https://doi.org/10.1038/ncomms11257

APA

Menzel, P., Ng, K. L., & Krogh, A. (2016). Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nature Communications, 7, Article 11257. https://doi.org/10.1038/ncomms11257

Vancouver

Menzel P, Ng KL, Krogh A. Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nature Communications. 2016;7:11257. https://doi.org/10.1038/ncomms11257

Author

Menzel, Peter ; Ng, Kim Lee ; Krogh, Anders. / Fast and sensitive taxonomic classification for metagenomics with Kaiju. In: Nature Communications. 2016 ; Vol. 7.

Bibtex

@article{ae6ac71d284b433fa5998ca124d0d9b9,
title = "Fast and sensitive taxonomic classification for metagenomics with Kaiju",
abstract = "The constantly decreasing cost and increasing output of current sequencing technologies enable large scale metagenomic studies of microbial communities from diverse habitats. Therefore, fast and accurate methods for taxonomic classification are needed, which can operate on increasingly larger datasets and reference databases. Recently, several fast metagenomic classifiers have been developed, which are based on comparison of genomic k-mers. However, nucleotide comparison using a fixed k-mer length often lacks the sensitivity to overcome the evolutionary distance between sampled species and genomes in the reference database. Here, we present the novel metagenome classifier Kaiju for fast assignment of reads to taxa. Kaiju finds maximum exact matches on the protein-level using the Burrows-Wheeler transform, and can optionally allow amino acid substitutions in the search using a greedy heuristic. We show in a genome exclusion study that Kaiju can classify more reads with higher sensitivity and similar precision compared to fast k-mer based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies more than twice as many reads in ten real metagenomes compared to programs based on genomic k-mers. Kaiju can process up to millions of reads per minute, and its memory footprint is below 5 GB of RAM, allowing the analysis on a standard PC. The program is available under the GPL3 license at: github.com/bioinformatics-centre/kaiju",
author = "Peter Menzel and Ng, {Kim Lee} and Anders Krogh",
year = "2016",
doi = "10.1038/ncomms11257",
language = "English",
volume = "7",
journal = "Nature Communications",
issn = "2041-1723",
publisher = "Nature Publishing Group",

}

RIS

TY - JOUR

T1 - Fast and sensitive taxonomic classification for metagenomics with Kaiju

AU - Menzel, Peter

AU - Ng, Kim Lee

AU - Krogh, Anders

PY - 2016

Y1 - 2016

N2 - The constantly decreasing cost and increasing output of current sequencing technologies enable large scale metagenomic studies of microbial communities from diverse habitats. Therefore, fast and accurate methods for taxonomic classification are needed, which can operate on increasingly larger datasets and reference databases. Recently, several fast metagenomic classifiers have been developed, which are based on comparison of genomic k-mers. However, nucleotide comparison using a fixed k-mer length often lacks the sensitivity to overcome the evolutionary distance between sampled species and genomes in the reference database. Here, we present the novel metagenome classifier Kaiju for fast assignment of reads to taxa. Kaiju finds maximum exact matches on the protein-level using the Burrows-Wheeler transform, and can optionally allow amino acid substitutions in the search using a greedy heuristic. We show in a genome exclusion study that Kaiju can classify more reads with higher sensitivity and similar precision compared to fast k-mer based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies more than twice as many reads in ten real metagenomes compared to programs based on genomic k-mers. Kaiju can process up to millions of reads per minute, and its memory footprint is below 5 GB of RAM, allowing the analysis on a standard PC. The program is available under the GPL3 license at: github.com/bioinformatics-centre/kaiju

AB - The constantly decreasing cost and increasing output of current sequencing technologies enable large scale metagenomic studies of microbial communities from diverse habitats. Therefore, fast and accurate methods for taxonomic classification are needed, which can operate on increasingly larger datasets and reference databases. Recently, several fast metagenomic classifiers have been developed, which are based on comparison of genomic k-mers. However, nucleotide comparison using a fixed k-mer length often lacks the sensitivity to overcome the evolutionary distance between sampled species and genomes in the reference database. Here, we present the novel metagenome classifier Kaiju for fast assignment of reads to taxa. Kaiju finds maximum exact matches on the protein-level using the Burrows-Wheeler transform, and can optionally allow amino acid substitutions in the search using a greedy heuristic. We show in a genome exclusion study that Kaiju can classify more reads with higher sensitivity and similar precision compared to fast k-mer based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies more than twice as many reads in ten real metagenomes compared to programs based on genomic k-mers. Kaiju can process up to millions of reads per minute, and its memory footprint is below 5 GB of RAM, allowing the analysis on a standard PC. The program is available under the GPL3 license at: github.com/bioinformatics-centre/kaiju

U2 - 10.1038/ncomms11257

DO - 10.1038/ncomms11257

M3 - Journal article

C2 - 27071849

VL - 7

JO - Nature Communications

JF - Nature Communications

SN - 2041-1723

M1 - 11257

ER -

ID: 148689508