TY - JOUR
T1 - A deep neural network approach to predicting clinical outcomes of neuroblastoma patients
AU - Tranchevent, Léon Charles
AU - Azuaje, Francisco
AU - Rajapakse, Jagath C.
N1 - PROCEEDINGS PAPER from 18th International Conference on Bioinformatics (InCoB 2019) Jakarta, Indonesia. 10-12 September 2019
PY - 2019/12/20
Y1 - 2019/12/20
N2 - Background: The availability of high-throughput omics datasets from large patient cohorts has allowed the development of methods that aim at predicting patient clinical outcomes, such as survival and disease recurrence. Such methods are also important to better understand the biological mechanisms underlying disease etiology and development, as well as treatment responses. Recently, different predictive models, relying on distinct algorithms (including Support Vector Machines and Random Forests), have been investigated. In this context, deep learning strategies are of special interest due to their demonstrated superior performance over a wide range of problems and datasets. One of the main challenges of such strategies is the "small n large p" problem. Indeed, omics datasets typically consist of small numbers of samples and large numbers of features relative to typical deep learning datasets. Neural networks usually tackle this problem through feature selection or by including additional constraints during the learning process. Methods: We propose to tackle this problem with a novel strategy that relies on a graph-based method for feature extraction, coupled with a deep neural network for clinical outcome prediction. The omics data are first represented as graphs whose nodes represent patients and whose edges represent correlations between the patients' omics profiles. Topological features, such as centralities, are then extracted from these graphs for every node. Lastly, these features are used as input to train and test various classifiers. Results: We apply this strategy to four neuroblastoma datasets and observe that models based on neural networks are more accurate than state-of-the-art models (DNN: 85%-87%, SVM/RF: 75%-82%). We explore how different parameters and configurations are selected in order to overcome the effects of the small data problem as well as the curse of dimensionality. Conclusions: Our results indicate that the deep neural networks capture complex features in the data that help predict patient clinical outcomes.
AB - Background: The availability of high-throughput omics datasets from large patient cohorts has allowed the development of methods that aim at predicting patient clinical outcomes, such as survival and disease recurrence. Such methods are also important to better understand the biological mechanisms underlying disease etiology and development, as well as treatment responses. Recently, different predictive models, relying on distinct algorithms (including Support Vector Machines and Random Forests), have been investigated. In this context, deep learning strategies are of special interest due to their demonstrated superior performance over a wide range of problems and datasets. One of the main challenges of such strategies is the "small n large p" problem. Indeed, omics datasets typically consist of small numbers of samples and large numbers of features relative to typical deep learning datasets. Neural networks usually tackle this problem through feature selection or by including additional constraints during the learning process. Methods: We propose to tackle this problem with a novel strategy that relies on a graph-based method for feature extraction, coupled with a deep neural network for clinical outcome prediction. The omics data are first represented as graphs whose nodes represent patients and whose edges represent correlations between the patients' omics profiles. Topological features, such as centralities, are then extracted from these graphs for every node. Lastly, these features are used as input to train and test various classifiers. Results: We apply this strategy to four neuroblastoma datasets and observe that models based on neural networks are more accurate than state-of-the-art models (DNN: 85%-87%, SVM/RF: 75%-82%). We explore how different parameters and configurations are selected in order to overcome the effects of the small data problem as well as the curse of dimensionality. Conclusions: Our results indicate that the deep neural networks capture complex features in the data that help predict patient clinical outcomes.
KW - Clinical outcome prediction
KW - Deep learning
KW - Deep neural network
KW - Disease prediction
KW - Graph topology
KW - Machine learning
KW - Network-based methods
UR - http://www.scopus.com/inward/record.url?scp=85077073432&partnerID=8YFLogxK
UR - https://www.ncbi.nlm.nih.gov/pubmed/31856829
U2 - 10.1186/s12920-019-0628-y
DO - 10.1186/s12920-019-0628-y
M3 - Conference article
C2 - 31856829
AN - SCOPUS:85077073432
SN - 1755-8794
VL - 12
JO - BMC Medical Genomics
JF - BMC Medical Genomics
M1 - 178
ER -