Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification

Alkhunaizi, Naif
Almalik, Faris
Al-Refai, Rouqaiah
Naseer, Muzammal
Nandakumar, Karthik
Department
Computer Vision
Type
Conference proceeding
Date
2025
Language
English
Abstract
With the advent of large pre-trained transformer models, fine-tuning these models for various downstream tasks is a critical problem. The paucity of training data, the existence of data silos, and stringent privacy constraints exacerbate this fine-tuning problem in the medical imaging domain, creating a strong need for algorithms that enable collaborative fine-tuning of pre-trained models. Moreover, the large size of these models necessitates the use of parameter-efficient fine-tuning (PEFT) to reduce the communication burden in federated learning. In this work, we systematically investigate various federated PEFT strategies for adapting a Vision Transformer (ViT) model (pre-trained on a large natural image dataset) for medical image classification. Apart from evaluating known PEFT techniques, we introduce new federated variants of PEFT algorithms such as visual prompt tuning (VPT), low-rank decomposition of visual prompts, stochastic block attention fine-tuning, and hybrid PEFT methods like low-rank adaptation (LoRA)+VPT. Moreover, we perform a thorough empirical analysis to identify the optimal PEFT method for the federated setting and understand the impact of data distribution on federated PEFT, especially for out-of-domain (OOD) and non-IID data. The key insight of this study is that while most federated PEFT methods work well for in-domain transfer, there is a substantial accuracy vs. efficiency trade-off when dealing with OOD and non-IID scenarios, which is commonly the case in medical imaging. Specifically, every order of magnitude reduction in fine-tuned/exchanged parameters can lead to a 4% drop in accuracy. Thus, the choice of the initial model is critical for the effectiveness of federated PEFT: rather than starting with general vision models, it is preferable to use medical foundation models (if available) learned using in-domain medical image data. Code: https://github.com/Naiftt/PEFT.
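To make the communication-efficiency argument concrete, the sketch below works through the LoRA parameter arithmetic for a single ViT projection layer. This is not the authors' implementation (their code is at the GitHub link above); it is a minimal NumPy illustration assuming a hidden size of 768 and rank r = 4, both hypothetical choices for this example.

```python
import numpy as np

# Illustrative LoRA arithmetic for one ViT projection (hidden size 768),
# assuming rank r = 4. In federated PEFT, only the low-rank factors A and B
# are trained and exchanged each round; the pre-trained weight W stays frozen.
d, r = 768, 4
full_params = d * d            # frozen pre-trained weight W (d x d)
lora_params = r * d + d * r    # trainable factors A (r x d) and B (d x r)

rng = np.random.default_rng(0)
W = rng.standard_normal((d, d)) * 0.02  # frozen pre-trained weight
A = rng.standard_normal((r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection (zero init)

x = rng.standard_normal((1, d))
# LoRA forward pass: y = x W^T + (x A^T) B^T.
# With B initialized to zero, the adapted model matches the frozen one.
y = x @ W.T + (x @ A.T) @ B.T

print(lora_params, full_params)  # 6144 vs. 589824: ~1% of the parameters
```

Exchanging only A and B gives roughly a 100x reduction in per-round communication for this layer, which is the kind of order-of-magnitude saving the abstract weighs against the observed accuracy drop in OOD and non-IID settings.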
Citation
N. Alkhunaizi, F. Almalik, R. Al-Refai, M. Naseer, and K. Nandakumar, “Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 15274 LNCS, pp. 236–245, 2025, doi: 10.1007/978-3-031-77610-6_22.
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Conference
International Conference on Medical Image Computing and Computer-Assisted Intervention
Keywords
Federated Learning, Out-of-Domain Transfer, Parameter-Efficient Fine-tuning, Vision Transformers
Publisher
Springer Nature