binny: an automated binning algorithm to recover high-quality genomes from complex metagenomic datasets

Oskar Hickl, Pedro Queirós, Paul Wilmes, Patrick May*, Anna Heintz-Buschart*

*Corresponding author for this work

Research output: Contribution to journal › Article › Research › peer-review

6 Citations (Scopus)

Abstract

The reconstruction of genomes is a critical step in genome-resolved metagenomics and for multi-omic data integration from microbial communities. Here, we present binny, a binning tool that produces high-quality metagenome-assembled genomes (MAGs) from both contiguous and highly fragmented genomes. Based on established metrics, binny outperforms or is highly competitive with commonly used and state-of-the-art binning methods and finds unique genomes that could not be detected by other methods. binny uses k-mer composition and coverage by metagenomic reads for iterative, nonlinear dimension reduction of genomic signatures, followed by automated contig clustering with cluster assessment using lineage-specific marker gene sets. When compared with seven widely used binning algorithms, binny provides substantial numbers of uniquely identified MAGs and almost always recovers the most near-complete (> 95% pure, > 90% complete) and high-quality (> 90% pure, > 70% complete) genomes from simulated datasets from the Critical Assessment of Metagenome Interpretation (CAMI) initiative. From a real-world benchmark comprising metagenomes from various environments, binny also recovers substantially more high-quality draft genomes, as defined by the Minimum Information about a Metagenome-Assembled Genome (MIMAG) standard, than any other tested method.
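The two quality tiers quoted in the abstract can be made concrete with a minimal sketch. The thresholds are taken from the abstract; the `Bin` class, the `quality_tier` function, and the example values are hypothetical and are not part of binny. Because every near-complete genome also meets the high-quality thresholds, the sketch reports only the strictest tier a bin satisfies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    """Hypothetical container for one bin's benchmark metrics (percent, 0-100)."""
    name: str
    purity: float
    completeness: float


def quality_tier(b: Bin) -> str:
    """Return the strictest tier from the abstract that the bin satisfies."""
    # Near-complete: > 95% pure and > 90% complete.
    if b.purity > 95 and b.completeness > 90:
        return "near-complete"
    # High-quality: > 90% pure and > 70% complete.
    if b.purity > 90 and b.completeness > 70:
        return "high-quality"
    return "other"


# Made-up example values, for illustration only.
bins = [
    Bin("bin_1", purity=98.2, completeness=93.5),
    Bin("bin_2", purity=92.0, completeness=75.0),
    Bin("bin_3", purity=85.0, completeness=99.0),
]
tiers = {b.name: quality_tier(b) for b in bins}
print(tiers)
```

The comparisons are strict, following the "> 95%" and "> 90%" wording of the abstract, so a bin with exactly 95% purity falls into the high-quality tier rather than the near-complete one.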

Original language: English
Article number: bbac431
Journal: Briefings in Bioinformatics
Volume: 23
Issue number: 6
Publication status: Published - 1 Nov 2022
Externally published: Yes

Keywords

  • dimensionality reduction
  • embedding
  • iterative clustering
  • MAGs
  • marker gene sets
  • metagenome-assembled genome
  • t-SNE
