Re-thinking Vision Transformers for Remote Sensing Scene Understanding

Noman, Mubashir
Department
Computer Vision
Type
Dissertation
Date
2024
Language
English
Abstract
This thesis proposes new vision transformer-based methodologies for remote sensing (RS) scene understanding problems. The first contribution is a new approach for the RS change detection (CD) task that achieves faster convergence and addresses the reliance on pre-training on external CD data followed by fine-tuning on the target benchmark. The second contribution is an efficient CD method that leverages rich contextual information to precisely estimate change regions. The third contribution is a new change encoder that leverages local and global feature representations to capture both subtle and large changes and thereby precisely estimate the change regions. The fourth contribution is a method that leverages a large multimodal model (LMM) to describe the changes between RS images. Lastly, this thesis investigates transformer pre-training for multi-spectral satellite imagery, leveraging multi-scale information that is effectively utilized across multiple modalities.
Citation
M. Noman, "Re-thinking Vision Transformers for Remote Sensing Scene Understanding", PhD Dissertation, Computer Vision, MBZUAI, Abu Dhabi, UAE, 2024