Bridging the gap: Enhancing the generalizability of epigenetic clocks through transfer learning

Luo, Lan
Shang, Lulu
Goodrich, Jaclyn M
Peterson, Karen E
Song, Peter XK
Department
Statistics and Data Science
Type
Journal article
Language
English
Abstract
Changes in DNA methylation patterns exhibit a high correlation with chronological age. Epigenetic clocks, developed through statistical models that estimate epigenetic age using the methylation levels of cytosine-guanine dinucleotide (CpG) sites, have emerged as powerful tools for understanding aging and age-related diseases. Despite their popularity, the generalizability of these clocks across diverse populations remains a challenge. Some of the widely used epigenetic clocks, such as Horvath’s clock (Genome Biol. 14 (2013) 1–20) and the PedBE clock (Proc. Natl. Acad. Sci. USA 117 (2020) 23329–23335), are shown to perform poorly in our target cohort. This loss of prediction accuracy raises concerns about their viability in calculating biological age in distinct demographic and ethnic groups. Technically, the feature space of existing clocks is yielded with an obsolete technique, potentially leading to systematic bias in the analysis of all target data generated by the EPIC 850K array. To address both population heterogeneity and technological advances, we adopt a transfer learning framework to calibrate existing epigenetic clocks by borrowing shared knowledge from diverse datasets. Furthermore, our transfer learning is built on kriging- and DNN-based methods for feature adaptation, to close the gap between existing clocks and our target data. We analyze data collected from 523 blood samples from a cohort of children and adolescents in the Early Life Exposure in Mexico to Environmental Toxicants (ELEMENT) study and show that our proposed transfer learning methods significantly improve prediction performance compared to existing clocks. Performance is further enhanced by using the CpG sites profiled on the higher-resolution EPIC array. More importantly, calibrated clocks produce epigenetic age accelerations that correlate better with stages of sexual maturation. Our methodology demonstrates the potential to bridge the gap between different DNA methylation datasets and various profiling platforms, thereby enhancing the applicability of epigenetic clocks across diverse population groups and contributing to more accurate aging research.
Citation
L. Luo, L. Shang, J.M. Goodrich, K.E. Peterson, P.X.K. Song, "Bridging the gap: Enhancing the generalizability of epigenetic clocks through transfer learning," The Annals of Applied Statistics, vol. 20, no. 1, pp. 68-89, 2026, https://doi.org/10.1214/26-aoas2136.
Source
The Annals of Applied Statistics
Keywords
49 Mathematical Sciences, 4905 Statistics
Publisher
Institute of Mathematical Statistics