On Evaluating Adversarial Robustness of Chest X-ray Classification: Pitfalls and Best Practices

Salah Ghamizi*, Maxime Cordy, Mike Papadakis, Yves Le Traon

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

1 Citation (Scopus)

Abstract

Vulnerability to adversarial attacks is a well-known weakness of Deep Neural Networks. While most studies focus on natural images with standardized benchmarks like ImageNet and CIFAR, little research has considered real-world applications, in particular in the medical domain. Our research shows that, contrary to previous claims, the robustness of chest X-ray classification is much harder to evaluate and leads to very different assessments depending on the dataset, the architecture, and the robustness metric. We argue that previous studies did not take into account the peculiarities of medical diagnosis, such as the co-occurrence of diseases, the disagreement of labellers (domain experts), the threat model of the attacks, and the risk implications of each successful attack. In this paper, we discuss the methodological foundations, review the pitfalls and best practices, and suggest new methodological considerations for evaluating the robustness of chest X-ray classification models. Our evaluation of three datasets, seven models, and 18 diseases is the largest evaluation of the robustness of chest X-ray classification models to date.
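To make the notion of an evasion attack concrete, here is a minimal sketch of the Fast Gradient Sign Method (FGSM), one standard attack family used in robustness evaluations of this kind. It uses a toy logistic "classifier" standing in for a chest X-ray model; the weights, bias, and 4-pixel "image" are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical linear model parameters (stand-in for a trained classifier).
w = np.array([1.5, -2.0, 0.5, 1.0])
b = 0.1

def predict_proba(x):
    """Sigmoid output of a linear model: p(disease present | image x)."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def fgsm(x, y, eps):
    """FGSM: step of size eps along the sign of the loss gradient w.r.t. x.

    For a logistic model with binary cross-entropy loss, the input
    gradient is (p - y) * w, so no autodiff framework is needed here.
    """
    p = predict_proba(x)
    grad = (p - y) * w
    # Perturb and keep pixel values in the valid [0, 1] range.
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

x = np.array([0.2, 0.8, 0.5, 0.4])  # clean "image", pixels in [0, 1]
y = 1.0                             # true label: disease present
x_adv = fgsm(x, y, eps=0.1)

# The perturbed input lowers the model's confidence in the true class.
print(predict_proba(x), predict_proba(x_adv))
```

The paper's point is that running such an attack is the easy part; the resulting "robust accuracy" number depends heavily on the threat model (here, the eps budget), the metric, and the multi-label structure of chest X-ray diagnosis.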

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3381
Publication status: Published - 2023
Externally published: Yes
Event: 2023 Workshop on Artificial Intelligence Safety, SafeAI 2023 - Washington, United States
Duration: 13 Feb 2023 - 14 Feb 2023

Keywords

  • Adversarial
  • Chest X-ray
  • CheXpert
  • CXR
  • Evasion
  • NIH
  • PadChest
  • Radiograph
  • Robustness
