Statistical analysis of correlated expression data from high throughput experiments
Wang, Peng ; Lyu, Pengfei ; Peddada, Shyamal ; Cao, Hongyuan
Department
Statistics and Data Science
Type
Journal article
Language
English
Abstract
Data obtained from high throughput experiments often exhibit complex dependencies among features. These dependencies arise from various sources, including genetic correlation, batch effects, technical replicates, and shared biological pathways. Ignoring them can lead to an inflated false discovery rate (FDR), reduced statistical power, and biased biological interpretations, so accounting for them properly is crucial for accurate detection of biological signals. We propose a new method called Analysis of Correlated Expressions (ACE) to compare the mean expression of features between two groups. ACE is based on a factor analytic model that accounts for dependence among features and also incorporates heterogeneity of variances between groups, a common feature of high throughput data. Furthermore, ACE does not require the data to be normally distributed. It is scalable and free of any unknown tuning parameters. Extensive simulation studies indicate that it is more powerful than many existing methods while controlling the FDR. Application of ACE to a microRNA dataset, a neuroblastoma gene expression dataset, and a Huntington's disease dataset resulted in some novel findings that were missed by existing methods.
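To make the problem setting concrete, the following is a minimal sketch of a conventional baseline for comparing mean expression between two groups. It is not ACE and not the authors' code. It computes an unequal-variance z statistic per feature (a large-sample normal approximation, assumed here for simplicity) and applies the Benjamini-Hochberg step-up procedure, treating features as if they were independent. That independence assumption is exactly what the abstract argues against. The function names `welch_z_pvalues` and `benjamini_hochberg` and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def welch_z_pvalues(group_a, group_b):
    """Two-sided p-values per feature from an unequal-variance z statistic.

    group_a, group_b: lists of samples; each sample is a list of feature values.
    Assumption: samples are large enough for a normal approximation.
    """
    n_a, n_b = len(group_a), len(group_b)
    pvals = []
    for j in range(len(group_a[0])):
        a = [row[j] for row in group_a]
        b = [row[j] for row in group_b]
        mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
        # Separate sample variances allow for heterogeneity between groups.
        var_a = sum((x - mean_a) ** 2 for x in a) / (n_a - 1)
        var_b = sum((x - mean_b) ** 2 for x in b) / (n_b - 1)
        z = (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)
        # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
        pvals.append(math.erfc(abs(z) / math.sqrt(2)))
    return pvals


def benjamini_hochberg(pvals, alpha=0.05):
    """Return sorted indices of features rejected by the BH step-up procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its BH threshold.
        if pvals[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Toy data (hypothetical): 50 independent features, the first 5 truly shifted,
# with a larger variance in group B than in group A.
rng = random.Random(0)
n_features = 50
shifted = set(range(5))
group_a = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(40)]
group_b = [
    [rng.gauss(3.0 if j in shifted else 0.0, 2.0) for j in range(n_features)]
    for _ in range(40)
]

pvals = welch_z_pvalues(group_a, group_b)
rejected = benjamini_hochberg(pvals, alpha=0.05)
print("rejected features:", rejected)
```

The toy features here are simulated independently, so the baseline behaves well. With correlated features, as described in the abstract, such a per-feature procedure may no longer control the FDR at the nominal level.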
Citation
P. Wang, P. Lyu, S. Peddada, H. Cao, "Statistical analysis of correlated expression data from high throughput experiments," Genetics, vol. 232, no. 1, Art. no. iyaf060, 2025, https://doi.org/10.1093/genetics/iyaf060.
Source
Genetics
Keywords
31 Biological Sciences; 3102 Bioinformatics and Computational Biology; 3105 Genetics; Algorithms; Computer Simulation; Data Interpretation, Statistical; Gene Expression Profiling; Humans; Huntington Disease; MicroRNAs; Models, Statistical; Neuroblastoma
Publisher
Oxford University Press
