TY - JOUR
T1 - Statistical and Machine Learning Techniques in Human Microbiome Studies
T2 - Contemporary Challenges and Solutions
AU - Moreno-Indias, Isabel
AU - Lahti, Leo
AU - Nedyalkova, Miroslava
AU - Elbere, Ilze
AU - Roshchupkin, Gennady
AU - Adilovic, Muhamed
AU - Aydemir, Onder
AU - Bakir-Gungor, Burcu
AU - Santa Pau, Enrique Carrillo de
AU - D’Elia, Domenica
AU - Desai, Mahesh S.
AU - Falquet, Laurent
AU - Gundogdu, Aycan
AU - Hron, Karel
AU - Klammsteiner, Thomas
AU - Lopes, Marta B.
AU - Marcos-Zambrano, Laura Judith
AU - Marques, Cláudia
AU - Mason, Michael
AU - May, Patrick
AU - Pašić, Lejla
AU - Pio, Gianvito
AU - Pongor, Sándor
AU - Promponas, Vasilis J.
AU - Przymus, Piotr
AU - Saez-Rodriguez, Julio
AU - Sampri, Alexia
AU - Shigdel, Rajesh
AU - Stres, Blaz
AU - Suharoschi, Ramona
AU - Truu, Jaak
AU - Truică, Ciprian Octavian
AU - Vilne, Baiba
AU - Vlachakis, Dimitrios
AU - Yilmaz, Ercument
AU - Zeller, Georg
AU - Zomer, Aldert L.
AU - Gómez-Cabrero, David
AU - Claesson, Marcus J.
AU - ML4Microbiome
N1 - Funding Information:
We are grateful to all COST Action CA18131 “Statistical and machine learning techniques in human microbiome studies” members for their contributions to the discussion about the topics in this perspective, and especially to the WG4 and WG1. Funding. This study was supported by the COST Action CA18131 “Statistical and machine learning techniques in human microbiome studies.” IM-I was supported by the “MS type I” program (CP16/00163) from the Instituto de Salud Carlos III and co-funded by Fondo Europeo de Desarrollo Regional-FEDER. MN was grateful for the additional support by the project “Information and Communication Technologies for a Single Digital Market in Science, Education and Security” of the Scientific Research Center, NIS-3317 and National roadmaps for research infrastructures (RIs) grant number NIS-3318. LL was supported by Academy of Finland (decision 295741). IE was supported by H2020-EU.4.b. project “Integration of knowledge and biobank resources in comprehensive translational approach for personalized prevention and treatment of metabolic disorders (INTEGROMED)” (grant agreement ID 857572). MD was supported by the Luxembourg National Research Fund (FNR) CORE grant (C18/BM/12585940).
Publisher Copyright:
Copyright © 2021 Moreno-Indias, Lahti, Nedyalkova, Elbere, Roshchupkin, Adilovic, Aydemir, Bakir-Gungor, Santa Pau, D’Elia, Desai, Falquet, Gundogdu, Hron, Klammsteiner, Lopes, Marcos-Zambrano, Marques, Mason, May, Pašić, Pio, Pongor, Promponas, Przymus, Saez-Rodriguez, Sampri, Shigdel, Stres, Suharoschi, Truu, Truică, Vilne, Vlachakis, Yilmaz, Zeller, Zomer, Gómez-Cabrero and Claesson.
PY - 2021/2/22
Y1 - 2021/2/22
N2 - The human microbiome has emerged as a central research topic in human biology and biomedicine. Current microbiome studies generate high-throughput omics data across different body sites, populations, and life stages. Although many of the challenges in microbiome research are similar to those of other high-throughput studies, the quantitative analyses need to address the heterogeneity of data, specific statistical properties, and the remarkable variation in microbiome composition across individuals and body sites. This has led to a broad spectrum of statistical and machine learning challenges that range from study design, data processing, and standardization to analysis, modeling, cross-study comparison, prediction, data science ecosystems, and reproducible reporting. Although many statistical and machine learning approaches and tools have been developed, new techniques are needed to deal with emerging applications and the vast heterogeneity of microbiome data. We review and discuss emerging applications of statistical and machine learning techniques in human microbiome studies and introduce the COST Action CA18131 “ML4Microbiome” that brings together microbiome researchers and machine learning experts to address current challenges such as standardization of analysis pipelines for reproducibility of data analysis results, benchmarking, improvement, or development of existing and new tools and ontologies.
AB - The human microbiome has emerged as a central research topic in human biology and biomedicine. Current microbiome studies generate high-throughput omics data across different body sites, populations, and life stages. Although many of the challenges in microbiome research are similar to those of other high-throughput studies, the quantitative analyses need to address the heterogeneity of data, specific statistical properties, and the remarkable variation in microbiome composition across individuals and body sites. This has led to a broad spectrum of statistical and machine learning challenges that range from study design, data processing, and standardization to analysis, modeling, cross-study comparison, prediction, data science ecosystems, and reproducible reporting. Although many statistical and machine learning approaches and tools have been developed, new techniques are needed to deal with emerging applications and the vast heterogeneity of microbiome data. We review and discuss emerging applications of statistical and machine learning techniques in human microbiome studies and introduce the COST Action CA18131 “ML4Microbiome” that brings together microbiome researchers and machine learning experts to address current challenges such as standardization of analysis pipelines for reproducibility of data analysis results, benchmarking, improvement, or development of existing and new tools and ontologies.
KW - ML4Microbiome
KW - biomarker identification
KW - machine learning
KW - microbiome
KW - personalized medicine
UR - http://www.scopus.com/inward/record.url?scp=85102370593&partnerID=8YFLogxK
UR - https://pubmed.ncbi.nlm.nih.gov/33692771
U2 - 10.3389/fmicb.2021.635781
DO - 10.3389/fmicb.2021.635781
M3 - Article
C2 - 33692771
AN - SCOPUS:85102370593
SN - 1664-302X
VL - 12
SP - 635781
JO - Frontiers in Microbiology
JF - Frontiers in Microbiology
M1 - 635781
ER -