Abstract
Computational competitions are the standard for benchmarking medical image analysis algorithms, but they typically use small curated test datasets acquired at a few centers, leaving a gap to the reality of diverse multicentric patient data. To this end, the Federated Tumor Segmentation (FeTS) Challenge represents the paradigm for real-world algorithmic performance evaluation. The FeTS challenge is a competition to benchmark (i) federated learning aggregation algorithms and (ii) state-of-the-art segmentation algorithms, across multiple international sites. Weight aggregation and client selection techniques were compared using a multicentric brain tumor dataset in realistic federated learning simulations, yielding benefits for adaptive weight aggregation, and efficiency gains through client sampling. Quantitative performance evaluation of state-of-the-art segmentation algorithms on data distributed internationally across 32 institutions yielded good generalization on average, albeit the worst-case performance revealed data-specific modes of failure. Similar multi-site setups can help validate the real-world utility of healthcare AI algorithms in the future.
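The abstract refers to benchmarking federated weight-aggregation and client-selection techniques. As a point of orientation only, the sketch below illustrates the generic pattern such methods build on (FedAvg-style weighted parameter averaging combined with random client sampling); it is not the method evaluated in the paper, and all names and parameters (`sample_clients`, `weighted_average`, `fraction`, the toy site data) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_clients(client_ids, fraction, rng):
    """Randomly select a subset of collaborators for this round (client sampling)."""
    k = max(1, int(round(fraction * len(client_ids))))
    return list(rng.choice(client_ids, size=k, replace=False))

def weighted_average(client_updates, client_weights):
    """FedAvg-style aggregation: average each parameter tensor across clients,
    weighted e.g. by local sample count or an adaptive, validation-based weight."""
    total = sum(client_weights.values())
    aggregated = {}
    for name in next(iter(client_updates.values())):
        aggregated[name] = sum(
            (client_weights[c] / total) * client_updates[c][name]
            for c in client_updates
        )
    return aggregated

# Toy example: 5 simulated sites, each holding a "model" with two parameter tensors.
clients = [f"site_{i}" for i in range(5)]
updates = {c: {"w": rng.normal(size=(3, 3)), "b": rng.normal(size=3)} for c in clients}
n_samples = {c: int(rng.integers(20, 200)) for c in clients}

selected = sample_clients(clients, fraction=0.6, rng=rng)
global_model = weighted_average(
    {c: updates[c] for c in selected},
    {c: float(n_samples[c]) for c in selected},
)
print(selected)
print(global_model["b"])
```

Sampling only a fraction of sites per round reduces communication and computation (the "efficiency gains" mentioned above), while the per-client weights are the lever that adaptive aggregation strategies tune.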
| Original language | English |
|---|---|
| Article number | 6274 |
| Number of pages | 20 |
| Journal | Nature Communications |
| Volume | 16 |
| Issue number | 1 |
| DOIs | 10.1038/s41467-025-60466-1 |
| Publication status | Published - 8 Jul 2025 |
Keywords
- Humans
- Benchmarking/methods
- Algorithms
- Brain Neoplasms/diagnostic imaging
- Image Processing, Computer-Assisted/methods
- Artificial Intelligence
- Magnetic Resonance Imaging
Cite this
In: Nature Communications, Vol. 16, No. 1, 6274, 08.07.2025.
Research output: Contribution to journal › Article › Research › peer-review
TY - JOUR
T1 - Towards fair decentralized benchmarking of healthcare AI algorithms with the Federated Tumor Segmentation (FeTS) challenge
AU - Zenk, Maximilian
AU - Baid, Ujjwal
AU - Pati, Sarthak
AU - Linardos, Akis
AU - Edwards, Brandon
AU - Sheller, Micah
AU - Foley, Patrick
AU - Aristizabal, Alejandro
AU - Zimmerer, David
AU - Gruzdev, Alexey
AU - Martin, Jason
AU - Shinohara, Russell T.
AU - Reinke, Annika
AU - Isensee, Fabian
AU - Parampottupadam, Santhosh
AU - Parekh, Kaushal
AU - Floca, Ralf
AU - Kassem, Hasan
AU - Baheti, Bhakti
AU - Thakur, Siddhesh
AU - Chung, Verena
AU - Kushibar, Kaisar
AU - Lekadir, Karim
AU - Jiang, Meirui
AU - Yin, Youtan
AU - Yang, Hongzheng
AU - Liu, Quande
AU - Chen, Cheng
AU - Dou, Qi
AU - Heng, Pheng Ann
AU - Zhang, Xiaofan
AU - Zhang, Shaoting
AU - Khan, Muhammad Irfan
AU - Azeem, Mohammad Ayyaz
AU - Jafaritadi, Mojtaba
AU - Alhoniemi, Esa
AU - Kontio, Elina
AU - Khan, Suleiman A.
AU - Mächler, Leon
AU - Ezhov, Ivan
AU - Kofler, Florian
AU - Shit, Suprosanna
AU - Paetzold, Johannes C.
AU - Loehr, Timo
AU - Wiestler, Benedikt
AU - Peiris, Himashi
AU - Pawar, Kamlesh
AU - Zhong, Shenjun
AU - Chen, Zhaolin
AU - Hayat, Munawar
AU - Egan, Gary
AU - Harandi, Mehrtash
AU - Isik Polat, Ece
AU - Polat, Gorkem
AU - Kocyigit, Altan
AU - Temizel, Alptekin
AU - Tuladhar, Anup
AU - Tyagi, Lakshay
AU - Souza, Raissa
AU - Forkert, Nils D.
AU - Mouches, Pauline
AU - Wilms, Matthias
AU - Shambhat, Vishruth
AU - Maurya, Akansh
AU - Danannavar, Shubham Subhas
AU - Kalla, Rohit
AU - Anand, Vikas Kumar
AU - Krishnamurthi, Ganapathy
AU - Nalawade, Sahil
AU - Ganesh, Chandan
AU - Wagner, Ben
AU - Reddy, Divya
AU - Das, Yudhajit
AU - Yu, Fang F.
AU - Fei, Baowei
AU - Madhuranthakam, Ananth J.
AU - Maldjian, Joseph
AU - Singh, Gaurav
AU - Ren, Jianxun
AU - Zhang, Wei
AU - An, Ning
AU - Hu, Qingyu
AU - Zhang, Youjia
AU - Zhou, Ying
AU - Siomos, Vasilis
AU - Tarroni, Giacomo
AU - Passerrat-Palmbach, Jonathan
AU - Rawat, Ambrish
AU - Zizzo, Giulio
AU - Kadhe, Swanand Ravindra
AU - Epperlein, Jonathan P.
AU - Braghin, Stefano
AU - Wang, Yuan
AU - Kanagavelu, Renuga
AU - Wei, Qingsong
AU - Yang, Yechao
AU - Liu, Yong
AU - Kotowski, Krzysztof
AU - Adamski, Szymon
AU - Machura, Bartosz
AU - Malara, Wojciech
AU - Zarudzki, Lukasz
AU - Nalepa, Jakub
AU - Shi, Yaying
AU - Gao, Hongjian
AU - Avestimehr, Salman
AU - Yan, Yonghong
AU - Akbar, Agus S.
AU - Kondrateva, Ekaterina
AU - Yang, Hua
AU - Li, Zhaopei
AU - Wu, Hung Yu
AU - Roth, Johannes
AU - Saueressig, Camillo
AU - Milesi, Alexandre
AU - Nguyen, Quoc D.
AU - Gruenhagen, Nathan J.
AU - Huang, Tsung Ming
AU - Ma, Jun
AU - Singh, Har Shwinder H.
AU - Pan, Nai Yu
AU - Zhang, Dingwen
AU - Zeineldin, Ramy A.
AU - Futrega, Michal
AU - Yuan, Yading
AU - Conte, Gian Marco
AU - Feng, Xue
AU - Pham, Quan D.
AU - Xia, Yong
AU - Jiang, Zhifan
AU - Luu, Huan Minh
AU - Dobko, Mariia
AU - Carré, Alexandre
AU - Tuchinov, Bair
AU - Mohy-ud-Din, Hassan
AU - Alam, Saruar
AU - Singh, Anup
AU - Shah, Nameeta
AU - Wang, Weichung
AU - Sako, Chiharu
AU - Bilello, Michel
AU - Ghodasara, Satyam
AU - Mohan, Suyash
AU - Davatzikos, Christos
AU - Calabrese, Evan
AU - Rudie, Jeffrey
AU - Villanueva-Meyer, Javier
AU - Cha, Soonmee
AU - Hess, Christopher
AU - Mongan, John
AU - Ingalhalikar, Madhura
AU - Jadhav, Manali
AU - Pandey, Umang
AU - Saini, Jitender
AU - Huang, Raymond Y.
AU - Chang, Ken
AU - To, Minh Son
AU - Bhardwaj, Sargam
AU - Chong, Chee
AU - Agzarian, Marc
AU - Kozubek, Michal
AU - Lux, Filip
AU - Michálek, Jan
AU - Matula, Petr
AU - Keřkovský, Miloš
AU - Kopřivová, Tereza
AU - Dostál, Marek
AU - Vybíhal, Václav
AU - Pinho, Marco C.
AU - Holcomb, James
AU - Metz, Marie
AU - Jain, Rajan
AU - Lee, Matthew D.
AU - Lui, Yvonne W.
AU - Tiwari, Pallavi
AU - Verma, Ruchika
AU - Bareja, Rohan
AU - Yadav, Ipsa
AU - Chen, Jonathan
AU - Kumar, Neeraj
AU - Gusev, Yuriy
AU - Bhuvaneshwar, Krithika
AU - Sayah, Anousheh
AU - Bencheqroun, Camelia
AU - Belouali, Anas
AU - Madhavan, Subha
AU - Colen, Rivka R.
AU - Kotrotsou, Aikaterini
AU - Vollmuth, Philipp
AU - Brugnara, Gianluca
AU - Preetha, Chandrakanth J.
AU - Sahm, Felix
AU - Bendszus, Martin
AU - Wick, Wolfgang
AU - Mahajan, Abhishek
AU - Balaña, Carmen
AU - Capellades, Jaume
AU - Puig, Josep
AU - Choi, Yoon Seong
AU - Lee, Seung Koo
AU - Chang, Jong Hee
AU - Ahn, Sung Soo
AU - Shaykh, Hassan F.
AU - Herrera-Trujillo, Alejandro
AU - Trujillo, Maria
AU - Escobar, William
AU - Abello, Ana
AU - Bernal, Jose
AU - Gómez, Jhon
AU - LaMontagne, Pamela
AU - Marcus, Daniel S.
AU - Milchenko, Mikhail
AU - Nazeri, Arash
AU - Landman, Bennett
AU - Ramadass, Karthik
AU - Xu, Kaiwen
AU - Chotai, Silky
AU - Chambless, Lola B.
AU - Mistry, Akshitkumar
AU - Thompson, Reid C.
AU - Srinivasan, Ashok
AU - Bapuraj, J. Rajiv
AU - Rao, Arvind
AU - Wang, Nicholas
AU - Yoshiaki, Ota
AU - Moritani, Toshio
AU - Turk, Sevcan
AU - Lee, Joonsang
AU - Prabhudesai, Snehal
AU - Garrett, John
AU - Larson, Matthew
AU - Jeraj, Robert
AU - Li, Hongwei
AU - Weiss, Tobias
AU - Weller, Michael
AU - Bink, Andrea
AU - Pouymayou, Bertrand
AU - Sharma, Sonam
AU - Tseng, Tzu Chi
AU - Adabi, Saba
AU - Xavier Falcão, Alexandre
AU - Martins, Samuel B.
AU - Teixeira, Bernardo C.A.
AU - Sprenger, Flávia
AU - Menotti, David
AU - Lucio, Diego R.
AU - Niclou, Simone P.
AU - Keunen, Olivier
AU - Hau, Ann Christin
AU - Pelaez, Enrique
AU - Franco-Maldonado, Heydy
AU - Loayza, Francis
AU - Quevedo, Sebastian
AU - McKinley, Richard
AU - Slotboom, Johannes
AU - Radojewski, Piotr
AU - Meier, Raphael
AU - Wiest, Roland
AU - Trenkler, Johannes
AU - Pichler, Josef
AU - Necker, Georg
AU - Haunschmidt, Andreas
AU - Meckel, Stephan
AU - Guevara, Pamela
AU - Torche, Esteban
AU - Mendoza, Cristobal
AU - Vera, Franco
AU - Ríos, Elvis
AU - López, Eduardo
AU - Velastin, Sergio A.
AU - Choi, Joseph
AU - Baek, Stephen
AU - Kim, Yusung
AU - Ismael, Heba
AU - Allen, Bryan
AU - Buatti, John M.
AU - Zampakis, Peter
AU - Panagiotopoulos, Vasileios
AU - Tsiganos, Panagiotis
AU - Alexiou, Sotiris
AU - Haliassos, Ilias
AU - Zacharaki, Evangelia I.
AU - Moustakas, Konstantinos
AU - Kalogeropoulou, Christina
AU - Kardamakis, Dimitrios M.
AU - Luo, Bing
AU - Poisson, Laila M.
AU - Wen, Ning
AU - Vallières, Martin
AU - Loutfi, Mahdi Ait Lhaj
AU - Fortin, David
AU - Lepage, Martin
AU - Morón, Fanny
AU - Mandel, Jacob
AU - Shukla, Gaurav
AU - Liem, Spencer
AU - Alexandre, Gregory S.
AU - Lombardo, Joseph
AU - Palmer, Joshua D.
AU - Flanders, Adam E.
AU - Dicker, Adam P.
AU - Ogbole, Godwin
AU - Oyekunle, Dotun
AU - Odafe-Oyibotha, Olubunmi
AU - Osobu, Babatunde
AU - Shu’aibu Hikima, Mustapha
AU - Soneye, Mayowa
AU - Dako, Farouk
AU - Dorcas, Adeleye
AU - Murcia, Derrick
AU - Fu, Eric
AU - Haas, Rourke
AU - Thompson, John A.
AU - Ormond, David Ryan
AU - Currie, Stuart
AU - Fatania, Kavi
AU - Frood, Russell
AU - Simpson, Amber L.
AU - Peoples, Jacob J.
AU - Hu, Ricky
AU - Cutler, Danielle
AU - Moraes, Fabio Y.
AU - Tran, Anh
AU - Hamghalam, Mohammad
AU - Boss, Michael A.
AU - Gimpel, James
AU - Kattil Veettil, Deepak
AU - Schmidt, Kendall
AU - Cimino, Lisa
AU - Price, Cynthia
AU - Bialecki, Brian
AU - Marella, Sailaja
AU - Apgar, Charles
AU - Jakab, Andras
AU - Weber, Marc André
AU - Colak, Errol
AU - Kleesiek, Jens
AU - Freymann, John B.
AU - Kirby, Justin S.
AU - Maier-Hein, Lena
AU - Albrecht, Jake
AU - Mattson, Peter
AU - Karargyris, Alexandros
AU - Shah, Prashant
AU - Menze, Bjoern
AU - Maier-Hein, Klaus
AU - Bakas, Spyridon
N1 - Funding: Research reported in this publication was partly funded by the Helmholtz Association (HA) within the project “Trustworthy Federated Data Analytics” (TFDA) (funding number ZT-I-OO1 4), and partly by the National Institutes of Health (NIH), under award numbers NCI:U01CA242871 (PI: S.Bakas) and NCI:U24CA279629 (PI: S.Bakas). K. Kushibar holds the Juan de la Cierva fellowship with a reference number FJC2021-047659-I. This work was supported in part by Hong Kong Research Grants Council Project No. T45-401/22-N. Team HT-TUAS was partly funded by Business Finland under Grant 33961/31/2020. They also acknowledge the CSC-Puhti supercomputer for their support and computational resources during FeTS 2021 and 2022. N. D. Forkert was supported by the Canadian Institutes of Health Research (CIHR Project Grant 462169). Jakub Nalepa was supported by the Silesian University of Technology funds through the Excellence Initiative–Research University program (Grant 02/080/SDU/10-21-01), and by the Silesian University of Technology funds through the grant for maintaining and developing research potential. Research reported in this publication was partly funded by R21EB030209, NIH/NIBIB (PI: Y. Yuan), UL1TR001433, NIH/NCATS, a research grant from Varian Medical Systems (Palo Alto, CA, USA) (PI: Y. Yuan). Y. Yuan also acknowledges the generous support of Herbert and Florence Irving/the Irving Trust. Z. Jiang was supported by National Cancer Institute (UG3 CA236536). H. Mohy-ud-Din was supported by a grant from the Higher Education Commission of Pakistan as part of the National Center for Big Data and Cloud Computing and the Clinical and Translational Imaging Lab at LUMS. M. Kozubek was supported by the Ministry of Health of the Czech Republic (grant NU21-08-00359 and conceptual development of research organization FNBr-65269705) and Ministry of Education, Youth and Sports of the Czech Republic (Project LM2023050). Václav Vybíhal was supported by MH CZ - DRO (FNBr, 65269705). Y. Gusev was supported by CCSG Grant number: NCI P30 CA51008. P. Vollmuth was supported by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - Project-ID 404521405, SFB 1389 - UNITE Glioblastoma, Work Package C02, and Priority Programme 2177 “Radiomics: Next Generation of Biomedical Imaging” (KI 2410/1-1 ∣ MA 6340/18-1). B. Landman was supported by NSF 2040462. A. Rao was supported by the NIH (R37CA214955-01A1). A. Falcão was supported by CNPq 304711/2023-3. P. Guevara was supported by the ANID-Basal projects AFB240002 (AC3E) and FB210017 (CENIA). Research reported in this publication was partly funded by the NSF Convergence Accelerator - Track D: ImagiQ: Asynchronous and Decentralized Federated Learning for Medical Imaging, Grant Number: 2040532, and R21CA270742 (Period of Funding: 09/15/20 - 05/31/21). Martin Vallières acknowledges funding from the Canada CIFAR AI Chairs Program. Stuart Currie receives salary support from a Leeds Hospitals Charity (9R01/1403) and Cancer Research UK (C19942/A28832) grants. Kavi Fatania is a 4ward North Clinical PhD fellow funded by Wellcome award (203914/Z/16/Z). Russell Frood is a Clinical Trials Fellow funded by CRUK (RCCCTF-Oct22/100002). This work was funded in part by National Institutes of Health R01CA233888 and the grant NCI:U24CA248265. © 2025. The Author(s).
PY - 2025/7/8
Y1 - 2025/7/8
N2 - Computational competitions are the standard for benchmarking medical image analysis algorithms, but they typically use small curated test datasets acquired at a few centers, leaving a gap to the reality of diverse multicentric patient data. To this end, the Federated Tumor Segmentation (FeTS) Challenge represents the paradigm for real-world algorithmic performance evaluation. The FeTS challenge is a competition to benchmark (i) federated learning aggregation algorithms and (ii) state-of-the-art segmentation algorithms, across multiple international sites. Weight aggregation and client selection techniques were compared using a multicentric brain tumor dataset in realistic federated learning simulations, yielding benefits for adaptive weight aggregation, and efficiency gains through client sampling. Quantitative performance evaluation of state-of-the-art segmentation algorithms on data distributed internationally across 32 institutions yielded good generalization on average, albeit the worst-case performance revealed data-specific modes of failure. Similar multi-site setups can help validate the real-world utility of healthcare AI algorithms in the future.
AB - Computational competitions are the standard for benchmarking medical image analysis algorithms, but they typically use small curated test datasets acquired at a few centers, leaving a gap to the reality of diverse multicentric patient data. To this end, the Federated Tumor Segmentation (FeTS) Challenge represents the paradigm for real-world algorithmic performance evaluation. The FeTS challenge is a competition to benchmark (i) federated learning aggregation algorithms and (ii) state-of-the-art segmentation algorithms, across multiple international sites. Weight aggregation and client selection techniques were compared using a multicentric brain tumor dataset in realistic federated learning simulations, yielding benefits for adaptive weight aggregation, and efficiency gains through client sampling. Quantitative performance evaluation of state-of-the-art segmentation algorithms on data distributed internationally across 32 institutions yielded good generalization on average, albeit the worst-case performance revealed data-specific modes of failure. Similar multi-site setups can help validate the real-world utility of healthcare AI algorithms in the future.
KW - Humans
KW - Benchmarking/methods
KW - Algorithms
KW - Brain Neoplasms/diagnostic imaging
KW - Image Processing, Computer-Assisted/methods
KW - Artificial Intelligence
KW - Magnetic Resonance Imaging
UR - https://www.scopus.com/pages/publications/105010224653
UR - https://pubmed.ncbi.nlm.nih.gov/40628696/
U2 - 10.1038/s41467-025-60466-1
DO - 10.1038/s41467-025-60466-1
M3 - Article
C2 - 40628696
AN - SCOPUS:105010224653
SN - 2041-1723
VL - 16
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 6274
ER -