TY - JOUR
T1 - A general computational approach to predicting synergistic transcriptional cores that determine cell subpopulation identities
AU - Okawa, Satoshi
AU - Del Sol, Antonio
N1 - Funding Information:
Funding for open access charge: University of Luxembourg.
Publisher Copyright:
© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License
PY - 2019/4/23
Y1 - 2019/4/23
AB - Advances in single-cell RNA-sequencing techniques reveal the existence of distinct cell subpopulations. Identification of transcription factors (TFs) that define the identity of these subpopulations poses a challenge. Here, we postulate that identity depends on background subpopulations, and is determined by a synergistic core combination of TFs mainly uniquely expressed in each subpopulation, but also TFs more broadly expressed across background subpopulations. Building on this view, we develop a new computational method for determining such synergistic identity cores of subpopulations within a given cell population. Our method utilizes an information-theoretic measure for quantifying transcriptional synergy, and implements a novel algorithm for searching for optimal synergistic cores. It requires only single-cell RNA-seq data as input, and does not rely on any prior knowledge of candidate genes or gene regulatory networks. Hence, it can be directly applied to any cellular system, including those containing novel subpopulations. The method is capable of recapitulating known experimentally validated identity TFs in eight published single-cell RNA-seq datasets. Furthermore, some of these identity TFs are known to trigger cell conversions between subpopulations. Thus, this methodology can help design strategies for cell conversion within a cell population, guiding experimentalists in the field of stem cell research and regenerative medicine.
UR - http://www.scopus.com/inward/record.url?scp=85064974669&partnerID=8YFLogxK
U2 - 10.1093/nar/gkz147
DO - 10.1093/nar/gkz147
M3 - Article
C2 - 30820550
AN - SCOPUS:85064974669
SN - 0305-1048
VL - 47
SP - 3333
EP - 3343
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 7
ER -