TY - JOUR
T1 - binny: an automated binning algorithm to recover high-quality genomes from complex metagenomic datasets
AU - Hickl, Oskar
AU - Queirós, Pedro
AU - Wilmes, Paul
AU - May, Patrick
AU - Heintz-Buschart, Anna
N1 - Publisher Copyright: © The Author(s) 2022. Published by Oxford University Press.
PY - 2022/11/1
Y1 - 2022/11/1
AB - The reconstruction of genomes is a critical step in genome-resolved metagenomics and for multi-omic data integration from microbial communities. Here, we present binny, a binning tool that produces high-quality metagenome-assembled genomes (MAGs) from both contiguous and highly fragmented genomes. Based on established metrics, binny outperforms or is highly competitive with commonly used and state-of-the-art binning methods and finds unique genomes that could not be detected by other methods. binny uses k-mer composition and coverage by metagenomic reads for iterative, nonlinear dimension reduction of genomic signatures, followed by automated contig clustering with cluster assessment using lineage-specific marker gene sets. When compared with seven widely used binning algorithms, binny provides substantial numbers of uniquely identified MAGs and almost always recovers the most near-complete (> 95% pure, > 90% complete) and high-quality (> 90% pure, > 70% complete) genomes from simulated datasets from the Critical Assessment of Metagenome Interpretation initiative. It also recovers substantially more high-quality draft genomes, as defined by the Minimum Information about a Metagenome-Assembled Genome standard, than any other tested method from a real-world benchmark composed of metagenomes from various environments.
KW - dimensionality reduction
KW - embedding
KW - iterative clustering
KW - MAGs
KW - marker gene sets
KW - metagenome-assembled genome
KW - t-SNE
UR - http://www.scopus.com/inward/record.url?scp=85153786293&partnerID=8YFLogxK
U2 - 10.1093/bib/bbac431
DO - 10.1093/bib/bbac431
M3 - Article
AN - SCOPUS:85153786293
SN - 1467-5463
VL - 23
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 6
M1 - bbac431
ER -