Knowledge-Guided Multi-Modality Transformer for Multi-Label Genetic Mutation Prediction
Huang, Gexin ; Wu, Chenfei ; Li, Mingjie ; Chang, Xiaojun ; Sun, Ying ; Xing, Lei ; Liang, Xiaodan ; Lin, Liang ; Yang, Guang ; Zhao, Shen
Department
Computer Vision
Type
Journal article
Language
English
Abstract
Genetic mutations are clinically significant biomarkers that guide cancer diagnosis and treatment. Predicting genetic mutations from whole slide images (WSIs) provides a cost-effective alternative to traditional genetic testing, but existing methods relying on multiple binary classifiers are inefficient at modeling the intrinsic biological relationships between genes and inevitably suffer from class imbalance. We present the Biological-knowledge-enhanced PathGenomic multi-label Transformer (BPGT), the first multi-label framework for genetic mutation prediction from WSIs that explicitly incorporates intrinsic biological knowledge to guide feature learning. BPGT jointly models inter-gene dependencies and spatial pathology features via two components: (1) a gene encoder that constructs biologically informed gene priors through two carefully designed sub-modules: (a) a gene graph whose node features combine the genes’ linguistic descriptions and cancer phenotypes, and whose edges are defined by pathway associations and mutation consistencies; (b) a knowledge association module that fuses linguistic and biomedical knowledge into gene priors via transformer-based graph representation learning, capturing the intrinsic relationships among mutations of different genes; and (2) a label decoder that integrates these knowledge-driven gene priors with spatially relevant WSI regions via a modality fusion mechanism and employs a comparative multi-label loss to improve discrimination between mutation profiles. These designs enable BPGT to address label imbalance, capture co-mutation patterns, and leverage non-visual domain knowledge in an end-to-end learning paradigm. We validate BPGT on nine cancers from TCGA and two from CPTAC, comprising over 48 million image patches. Across diverse cancers and genes, BPGT consistently outperforms state-of-the-art methods in genetic mutation prediction accuracy and generalization.
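The label decoder described above fuses knowledge-driven gene priors with spatially relevant WSI regions and emits one mutation probability per gene. A minimal NumPy sketch of that idea follows, assuming illustrative dimensions, random features, and a shared linear head, none of which come from the paper: each gene prior acts as a query that cross-attends over WSI patch features, and a sigmoid produces the multi-label output.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes (hypothetical, not from the paper)
n_genes, n_patches, d = 8, 100, 32   # gene labels, WSI patches, feature dim

gene_priors = rng.standard_normal((n_genes, d))    # knowledge-driven gene embeddings (queries)
patch_feats = rng.standard_normal((n_patches, d))  # WSI patch features (keys/values)

# Cross-attention: each gene query attends to spatially relevant patches
attn = softmax(gene_priors @ patch_feats.T / np.sqrt(d))   # (n_genes, n_patches)
fused = attn @ patch_feats                                 # (n_genes, d)

# Shared linear head (hypothetical) mapping fused features to per-gene
# mutation probabilities -> one sigmoid score per label, not a softmax
w = rng.standard_normal(d) / np.sqrt(d)
probs = sigmoid(fused @ w)                                 # (n_genes,)
```

In a trained model the priors, patch features, and head would be learned end-to-end, and the per-gene sigmoid outputs would be supervised jointly (here, by the paper's comparative multi-label loss) rather than by independent binary classifiers.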
Citation
Huang, G., Wu, C., Li, M., Chang, X., Sun, Y., Xing, L., Liang, X., Lin, L., Yang, G., &amp; Zhao, S. (2026). Knowledge-Guided Multi-Modality Transformer for Multi-Label Genetic Mutation Prediction. Pattern Recognition, 175, 113047. https://doi.org/10.1016/j.patcog.2026.113047
Source
Pattern Recognition
Keywords
46 Information and Computing Sciences, 4603 Computer Vision and Multimedia Computation, 4605 Data Management and Data Science, 4611 Machine Learning
Publisher
Elsevier
