ALTRUIST: a Python package to emulate a Virtual Digital Cohort Study using social media data

Charline Bour, Abir Elbeji, Luigi De Giovanni, Adrian Ahne, Guy Fagherazzi*

*Corresponding author for this work

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Epidemiological cohort studies play a crucial role in identifying risk factors for various outcomes among participants. These studies are often time-consuming and costly due to recruitment and long-term follow-up. Social media (SM) data has emerged as a valuable complementary source for digital epidemiology and health research, as online communities of patients regularly share information about their illnesses. Unlike traditional clinical questionnaires, SM offers unstructured but insightful information about patients' disease burden. Yet, there is limited guidance on analyzing SM data as a prospective cohort. We presented the concept of virtual digital cohort studies (VDCS) as an approach to replicate cohort studies using SM data. In this paper, we introduce ALTRUIST, an open-source Python package enabling standardized generation of VDCS on SM. ALTRUIST facilitates data collection, preprocessing, and analysis steps that mimic a traditional cohort study. We provide a practical use case focusing on diabetes to illustrate the methodology. By leveraging SM data, which offers large-scale and cost-effective information on users' health, we demonstrate the potential of VDCS as an essential tool for specific research questions. ALTRUIST is customizable and can be applied to data from various online communities of patients, complementing traditional epidemiological methods and promoting minimally disruptive health research.

Original language: English
Pages (from-to): 568-575
Number of pages: 7
Journal: IEEE Transactions on Big Data
Volume: 10
Issue number: 4
Publication status: Published - 5 Feb 2024

Keywords

  • Blogs
  • Cohort
  • Computational modeling
  • Digital Health
  • Diseases
  • Natural Language Processing
  • Python
  • Recruitment
  • Social Media
  • Social networking (online)
  • Sociology
  • Statistics
