A comparison between different prediction models for invasive breast cancer occurrence in the French E3N cohort

Laureen Dartois, Émilien Gauthier, Julia Heitzmann, Laura Baglietto, Stefan Michiels, Sylvie Mesrine, Marie Christine Boutron-Ruault, Suzette Delaloge, Stéphane Ragusa, Françoise Clavel-Chapelon*, Guy Fagherazzi

*Corresponding author for this work

Research output: Contribution to journal, Article, Research, peer-review

10 Citations (Scopus)


Breast cancer remains a global health concern with a lack of highly discriminating prediction models. The k-nearest-neighbor algorithm (kNN) estimates individual risks using an intuitive tool. This study compares the performance of this approach with that of the Cox and Gail models for 5-year breast cancer risk prediction. The study included 64,995 women from the French E3N prospective cohort. The sample was divided into a learning series (N = 51,821), used to train the models with fivefold cross-validation, and a validation series (N = 13,174), used to evaluate them. The area under the receiver operating characteristic curve (AUC) and the ratio of expected to observed numbers of cases (E/O) were estimated. In the learning and validation series, respectively, 393 and 78 premenopausal and 537 and 98 postmenopausal breast cancers were diagnosed. The discrimination values of the best combinations of predictors obtained from cross-validation ranged from 0.59 to 0.60. In the validation series, the AUC values in premenopausal and postmenopausal women were 0.583 [0.520; 0.646] and 0.621 [0.563; 0.679] using the kNN and 0.565 [0.500; 0.631] and 0.617 [0.561; 0.673] using the Cox model. The E/O ratios were 1.26 and 1.28 in premenopausal women and 1.44 and 1.40 in postmenopausal women. The applied Gail model provided AUC values of 0.614 [0.554; 0.675] and 0.549 [0.495; 0.604] and E/O ratios of 0.78 and 1.12. This study shows that prediction performance differed according to menopausal status when using parametric statistical tools. The k-nearest-neighbor approach performed well, and discrimination was improved in postmenopausal women compared with the Gail model.
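
For readers unfamiliar with the quantities named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a kNN risk defined as the fraction of cases among the k nearest neighbors under Euclidean distance, an AUC computed as the share of case/non-case pairs in which the case has the higher predicted risk, and an E/O ratio computed as the sum of predicted risks divided by the observed number of cases. The function names and the toy data are hypothetical and do not come from the E3N cohort.

```python
import math


def knn_risk(train_x, train_y, x, k):
    """Assumed kNN risk: fraction of cases (y == 1) among the k nearest
    training points to x, using Euclidean distance."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    by_distance = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))
    return sum(y for _, y in by_distance[:k]) / k


def auc(risks, outcomes):
    """Discrimination: probability that a randomly chosen case has a higher
    predicted risk than a randomly chosen non-case (ties count one half)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def expected_over_observed(risks, outcomes):
    """Calibration: expected number of cases (sum of predicted risks)
    divided by the observed number of cases."""
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no observed cases")
    return sum(risks) / observed


# Hypothetical toy data: one predictor per woman, outcome 1 = diagnosed case.
toy_x = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
toy_y = [0, 0, 0, 1, 1, 1]
toy_risks = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
toy_outcomes = [0, 0, 1, 0, 0, 1]

print(knn_risk(toy_x, toy_y, (2.4,), k=3))
print(auc(toy_risks, toy_outcomes))
print(expected_over_observed(toy_risks, toy_outcomes))
```

Under these definitions, an E/O ratio above 1 means the model predicts more cases than were observed, and a ratio below 1 means it predicts fewer. An AUC of 0.5 corresponds to no discrimination between cases and non-cases.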

Original language: English
Pages (from-to): 415-426
Number of pages: 12
Journal: Breast Cancer Research and Treatment
Issue number: 2
Publication status: Published - Apr 2015
Externally published: Yes


  • Breast cancer
  • Calibration
  • Discrimination
  • Gail model
  • Menopausal status
  • Nearest-neighbor algorithm
  • Postmenopausal women
  • Premenopausal women
  • Proportional hazard Cox regression
  • Risk score
  • Women

