SCMBench: benchmarking domain-specific and foundation models for single-cell multi-omics data integration

Wang, Yixuan
Fan, Yimin
Wang, Xuesong
Yu, Tingyang
Zong, Yongshuo
Liu, Xinyuan
Zhong, Gaoyang
Liu, Meitong
Li, Qing
Lee, Kin Hei
and 10 more authors
Abstract
Recent advancements in single-cell sequencing technologies have led to the generation of vast amounts of multi-omics data, spurring the development of numerous integration tools. While multi-omics integration has significantly advanced cell research, there is still a lack of comprehensive evaluations and guidelines for these tools. This study benchmarks Domain-specific Models (DMs) and Foundation Models (FMs) for multi-omics data integration, assessing 23 methods with optimized hyperparameters on integration accuracy, biomarker detection, trajectory inference, and quantitative batch effect correction. We address current gaps in assessing the efficacy and limitations of FMs compared to DMs in the multi-omics integration task. Importantly, our comprehensive analysis goes beyond basic integration accuracy, focusing on the preservation of cellular characteristics, transcriptomic biomarkers, epigenomic regulatory elements, and development trajectories. This holistic approach enables researchers to extract meaningful insights from integration results, facilitating a deeper understanding of individual cells. Generally, we find FMs fall short of state-of-the-art DMs in this field. To bridge this performance gap, we propose a lightweight adaptation strategy that enhances their effectiveness in this task. Our findings serve as a guide for researchers in selecting suitable integration methods for specific single-cell analysis objectives and provide insights for future model design.
Citation
Y. Wang, Y. Fan, X. Wang, T. Yu, Y. Zong, X. Liu, et al., "SCMBench: benchmarking domain-specific and foundation models for single-cell multi-omics data integration," Nature Communications, 2026, https://doi.org/10.1038/s41467-026-72570-x.
Source
Nature Communications
Keywords
31 Biological Sciences, 3102 Bioinformatics and Computational Biology, 32 Biomedical and Clinical Sciences
Publisher
Springer Nature