
RRGMambaFormer: A hybrid Transformer-Mamba architecture for radiology report generation

Li, Hongzhao
Liu, Siwei
Wang, Hui
Jiang, Xiaoheng
Jiu, Mingyuan
Chen, Li
Lu, Yang
Li, Shupan
Xu, Mingliang
Department
Machine Learning
Type
Journal article
Date
2025
Language
English
Abstract
Radiology report generation (RRG) is a critical yet time-consuming task in clinical practice, requiring a high level of expertise to ensure accuracy. Automating this process with generative AI has the potential to significantly improve efficiency and alleviate the burden on radiologists. In this paper, we introduce RRGMambaFormer, a novel hybrid architecture for cross-modal radiology report generation that processes both medical images and text. The architecture combines Transformer modeling with Mamba sequence processing to address key challenges in cross-modal medical data, including modality alignment, semantic gap bridging, and the computational inefficiency of traditional Transformer architectures. By replacing traditional positional encoding with a Mamba preprocessing block and incorporating an additional Mamba layer in the decoder, RRGMambaFormer enhances dynamic adaptability to both visual and textual modalities, improves sequence modeling, and reduces computational overhead. Furthermore, a multi-granularity contextual memory block balances local visual details with global textual context, ensuring semantic consistency between image features and generated reports. Experimental results demonstrate that RRGMambaFormer not only improves report quality through better cross-modal feature integration but also significantly reduces computational complexity and inference time, achieving superior performance with fewer parameters than existing Transformer-based models. Our source code is available here.
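The abstract's central design choice, replacing fixed positional encodings with a recurrent Mamba-style preprocessing block, can be illustrated with a toy linear state-space scan. This is a minimal sketch only, not the paper's implementation: the function `ssm_preprocess` and all matrix shapes below are assumptions, and a real Mamba block uses input-dependent (selective) parameters rather than fixed matrices.

```python
import numpy as np

def ssm_preprocess(x, A, B, C):
    """Toy linear state-space scan over a token sequence.

    Instead of adding fixed positional encodings, each token embedding is
    mixed with a recurrent hidden state, so position information is carried
    implicitly by the recurrence.

    x: (seq_len, d_model) token embeddings
    A: (d_state, d_state) state-transition matrix
    B: (d_state, d_model) input projection
    C: (d_model, d_state) output projection
    """
    h = np.zeros(A.shape[0])
    out = np.empty_like(x)
    for t, xt in enumerate(x):
        h = A @ h + B @ xt      # update hidden state with the current token
        out[t] = xt + C @ h     # residual: token plus position-aware context
    return out

rng = np.random.default_rng(0)
seq_len, d_model, d_state = 8, 16, 4
x = rng.normal(size=(seq_len, d_model))
A = 0.9 * np.eye(d_state)                    # slowly decaying memory
B = 0.1 * rng.normal(size=(d_state, d_model))
C = 0.1 * rng.normal(size=(d_model, d_state))
y = ssm_preprocess(x, A, B, C)
```

Because the hidden state evolves token by token, identical tokens at different positions receive different outputs, which is how a recurrent preprocessor can carry positional information without explicit encodings.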
Citation
H. Li et al., “RRGMambaFormer: A hybrid Transformer-Mamba architecture for radiology report generation,” Expert Syst. Appl., vol. 279, p. 127419, Jun. 2025, doi: 10.1016/j.eswa.2025.127419.
Source
Expert Systems with Applications
Keywords
Mamba, Radiology report generation, Transformer
Publisher
Elsevier