Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju. / Tovo, Anna; Menzel, Peter; Krogh, Anders; Cosentino Lagomarsino, Marco; Suweis, Samir.

In: Nucleic Acids Research, Vol. 48, No. 16, e93, 2020.

Harvard

Tovo, A, Menzel, P, Krogh, A, Cosentino Lagomarsino, M & Suweis, S 2020, 'Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju', Nucleic Acids Research, vol. 48, no. 16, e93. https://doi.org/10.1093/nar/gkaa568

APA

Tovo, A., Menzel, P., Krogh, A., Cosentino Lagomarsino, M., & Suweis, S. (2020). Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju. Nucleic Acids Research, 48(16), Article e93. https://doi.org/10.1093/nar/gkaa568

Vancouver

Tovo A, Menzel P, Krogh A, Cosentino Lagomarsino M, Suweis S. Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju. Nucleic Acids Research. 2020;48(16):e93. https://doi.org/10.1093/nar/gkaa568

Author

Tovo, Anna ; Menzel, Peter ; Krogh, Anders ; Cosentino Lagomarsino, Marco ; Suweis, Samir. / Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju. In: Nucleic Acids Research. 2020 ; Vol. 48, No. 16.

Bibtex

@article{aee9c430831a48999d0d1634ba3a864b,
title = "Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju",
abstract = "Characterizing species diversity and composition of bacteria hosted by biota is revolutionizing our understanding of the role of symbiotic interactions in ecosystems. Determining microbiomes diversity implies the assignment of individual reads to taxa by comparison to reference databases. Although computational methods aimed at identifying the microbe(s) taxa are available, it is well known that inferences using different methods can vary widely depending on various biases. In this study, we first apply and compare different bioinformatics methods based on 16S ribosomal RNA gene and shotgun sequencing to three mock communities of bacteria, of which the compositions are known. We show that none of these methods can infer both the true number of taxa and their abundances. We thus propose a novel approach, named Core-Kaiju, which combines the power of shotgun metagenomics data with a more focused marker gene classification method similar to 16S, but based on emergent statistics of core protein domain families. We thus test the proposed method on various mock communities and we show that Core-Kaiju reliably predicts both number of taxa and abundances. Finally, we apply our method on human gut samples, showing how Core-Kaiju may give more accurate ecological characterization and a fresh view on real microbiomes.",
author = "Anna Tovo and Peter Menzel and Anders Krogh and {Cosentino Lagomarsino}, Marco and Samir Suweis",
year = "2020",
doi = "10.1093/nar/gkaa568",
language = "English",
volume = "48",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "16",
pages = "e93",
}

RIS

TY - JOUR

T1 - Taxonomic classification method for metagenomics based on core protein families with Core-Kaiju

AU - Tovo, Anna

AU - Menzel, Peter

AU - Krogh, Anders

AU - Cosentino Lagomarsino, Marco

AU - Suweis, Samir

PY - 2020

Y1 - 2020

N2 - Characterizing species diversity and composition of bacteria hosted by biota is revolutionizing our understanding of the role of symbiotic interactions in ecosystems. Determining microbiomes diversity implies the assignment of individual reads to taxa by comparison to reference databases. Although computational methods aimed at identifying the microbe(s) taxa are available, it is well known that inferences using different methods can vary widely depending on various biases. In this study, we first apply and compare different bioinformatics methods based on 16S ribosomal RNA gene and shotgun sequencing to three mock communities of bacteria, of which the compositions are known. We show that none of these methods can infer both the true number of taxa and their abundances. We thus propose a novel approach, named Core-Kaiju, which combines the power of shotgun metagenomics data with a more focused marker gene classification method similar to 16S, but based on emergent statistics of core protein domain families. We thus test the proposed method on various mock communities and we show that Core-Kaiju reliably predicts both number of taxa and abundances. Finally, we apply our method on human gut samples, showing how Core-Kaiju may give more accurate ecological characterization and a fresh view on real microbiomes.

AB - Characterizing species diversity and composition of bacteria hosted by biota is revolutionizing our understanding of the role of symbiotic interactions in ecosystems. Determining microbiomes diversity implies the assignment of individual reads to taxa by comparison to reference databases. Although computational methods aimed at identifying the microbe(s) taxa are available, it is well known that inferences using different methods can vary widely depending on various biases. In this study, we first apply and compare different bioinformatics methods based on 16S ribosomal RNA gene and shotgun sequencing to three mock communities of bacteria, of which the compositions are known. We show that none of these methods can infer both the true number of taxa and their abundances. We thus propose a novel approach, named Core-Kaiju, which combines the power of shotgun metagenomics data with a more focused marker gene classification method similar to 16S, but based on emergent statistics of core protein domain families. We thus test the proposed method on various mock communities and we show that Core-Kaiju reliably predicts both number of taxa and abundances. Finally, we apply our method on human gut samples, showing how Core-Kaiju may give more accurate ecological characterization and a fresh view on real microbiomes.

UR - http://www.scopus.com/inward/record.url?scp=85091264070&partnerID=8YFLogxK

U2 - 10.1093/nar/gkaa568

DO - 10.1093/nar/gkaa568

M3 - Journal article

C2 - 32633756

AN - SCOPUS:85091264070

VL - 48

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 16

M1 - e93

ER -

ID: 250253601