AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning
Awais, Muhammad ; Alharthi, Ali Husain Salem Abdulla ; Kumar, Amandeep ; Cholakkal, Hisham ; Anwer, Rao Muhammad
Department
Computer Vision
Type
Conference proceeding
Date
2025
Language
English
Abstract
Significant progress has been made in advancing large multimodal conversational models (LMMs), capitalizing on vast repositories of image-text data available online. Despite this progress, these models often encounter substantial domain gaps, hindering their ability to engage in complex conversations across new domains. Recent efforts have aimed to mitigate this issue, but they rely on domain-specific image-text data to curate instruction-tuning data. However, many domains, such as agriculture, lack such vision-language data. In this work, we propose an approach to construct instruction-tuning data that harnesses vision-only data for the agriculture domain. We utilize diverse agricultural datasets spanning multiple domains, curate class-specific information, and employ large language models (LLMs) to construct an expert-tuning set, resulting in the 70k AgroInstruct dataset. We then expert-tune on this data to create AgroGPT, an efficient LMM that can hold complex agriculture-related conversations and provide useful insights. We also develop AgroEvals for evaluation and compare AgroGPT's performance with large open- and closed-source models. AgroGPT excels at identifying fine-grained agricultural concepts, can act as an agriculture expert, and provides helpful information for multimodal agriculture questions. The code, datasets, and models are available at https://github.com/awaisrauf/agroGPT.
Citation
M. Awais, A. H. Salem Abdulla Alharthi, A. Kumar, H. Cholakkal and R. M. Anwer, "AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning," 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ, USA, 2025, pp. 5687-5696, doi: 10.1109/WACV61041.2025.00555.
Source
Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
Conference
2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
Publisher
IEEE
