Serial analysis of gene expression (SAGE) allows a detailed, simultaneous analysis of thousands of genes without requiring complete prior gene sequence information. However, because of the inherent complexity of SAGE data and the lack of complete structural and functional knowledge, mining large collections of SAGE data to extract useful knowledge poses great challenges to traditional analytical techniques. Moreover, SAGE data follow a specific statistical model that traditional data analysis techniques do not incorporate. The analysis of SAGE data therefore requires advanced, intelligent computational techniques that take into account both the underlying biology and the statistical nature of the data. By addressing the statistical properties of SAGE data, this paper presents a new self-adaptive neural network, the Poisson-based growing self-organizing map (PGSOM), which implements novel weight adaptation and neuron growing strategies. An empirical study of the key dynamic mechanisms of PGSOM is presented. PGSOM was tested on three datasets, including synthetic and experimental SAGE data. The results indicate that, in comparison to traditional techniques, PGSOM offers significant advantages for pattern discovery and visualization in SAGE data. The pattern discovery and visualization platform discussed in this paper can also be applied to other problem domains in which the data are better approximated by a Poisson distribution.
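To make the idea of a Poisson-based similarity concrete, the following is a minimal sketch of how a tag's count profile could be matched to a map neuron under a Poisson model. It is an illustration under stated assumptions, not the PGSOM weight adaptation or growing rule, which the abstract does not specify. The function names `poisson_nll` and `best_matching_unit`, the rescaling of each prototype to the tag's total count, and the example counts are all hypothetical.

```python
import math


def poisson_nll(counts, profile):
    """Negative Poisson log-likelihood of `counts` under a prototype `profile`.

    Assumption: the prototype is rescaled so that its expected counts sum to
    the observed total, so tags are compared by expression shape rather than
    by overall abundance. `profile` must contain at least one positive value.
    """
    total = sum(counts)
    scale = sum(profile)
    nll = 0.0
    for x, w in zip(counts, profile):
        # Expected count for this library; the floor avoids log(0).
        lam = max(total * w / scale, 1e-12)
        # -log P(x | lam) for a Poisson variable, using lgamma(x + 1) = log(x!).
        nll += lam - x * math.log(lam) + math.lgamma(x + 1)
    return nll


def best_matching_unit(counts, prototypes):
    """Index of the prototype under which `counts` is most likely."""
    return min(range(len(prototypes)),
               key=lambda k: poisson_nll(counts, prototypes[k]))


# Hypothetical prototypes and tag counts across four SAGE libraries.
prototypes = [[1.0, 1.0, 1.0, 1.0], [4.0, 2.0, 1.0, 1.0]]
low_tag = [8, 4, 2, 2]
high_tag = [80, 40, 20, 20]
flat_tag = [5, 5, 5, 5]

print(best_matching_unit(low_tag, prototypes))   # same shape as prototype 1
print(best_matching_unit(high_tag, prototypes))  # same shape, ten times the abundance
print(best_matching_unit(flat_tag, prototypes))  # uniform shape, prototype 0
```

In this sketch the low- and high-abundance tags map to the same prototype because the likelihood is evaluated after rescaling to each tag's total count. A plain Euclidean distance on raw counts would instead be dominated by the difference in abundance.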
Journal: IEEE Transactions on Information Technology in Biomedicine
Published: Jul 2008
- Clustering analysis
- Pattern discovery and visualization
- Self-adaptive neural networks (SANNs)
- Serial analysis of gene expression (SAGE)