Automated Generation of Chest X-Ray Reports
Nazarov, Otabek
Author
Supervisor
Department
Machine Learning
Embargo End Date
Type
Thesis
Date
2022
License
Language
English
Abstract
In this work, we focus on (i) understanding the relative importance of the encoder and decoder components, and (ii) developing a new reward for REINFORCE-based model optimization to improve the clinical accuracy of the generated reports. We analyze four image encoding approaches (direct, fine-grained, CLIP-based, and Cluster-CLIP-based) in conjunction with three decoders on the large-scale MIMIC-CXR dataset. Among these encoders, the Cluster-CLIP visual encoder is a novel approach that aims to generate more discriminative and explainable representations. CLIP-based encoders produce results comparable to traditional CNN-based encoders in terms of NLP metrics, while fine-grained encoding outperforms all other encoders in terms of both NLP and clinical accuracy metrics, thereby validating the importance of image encoders in extracting semantic information effectively. We also propose a new reward for REINFORCE-based optimization that relies on question-answering (QA) transformer models: the QA model selects the most relevant spans of the generated reports, and the model is optimized with respect to those spans. The QA-based reward does not perform as well as existing rewards in REINFORCE-based optimization, but we outline its current weaknesses and propose modifications to improve it.
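The span-based reward idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: `select_spans` is a hypothetical keyword-based stand-in for the QA transformer, `FINDING_TERMS` is an invented term list, the reward is a simple token-level F1 between selected spans, and the loss uses a generic REINFORCE form with a scalar baseline.

```python
from collections import Counter
from typing import List

# Hypothetical term list; a real system would query a QA transformer instead.
FINDING_TERMS = ("effusion", "pneumothorax", "consolidation", "cardiomegaly")


def select_spans(report: str) -> List[str]:
    """Toy stand-in for the QA model: keep sentences mentioning a finding term."""
    sentences = [s.strip().lower() for s in report.split(".") if s.strip()]
    return [s for s in sentences if any(term in s for term in FINDING_TERMS)]


def token_f1(pred_tokens: List[str], ref_tokens: List[str]) -> float:
    """Token-level F1 between two token lists (1.0 if both are empty)."""
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def span_reward(generated: str, reference: str) -> float:
    """Score only the selected spans of the generated and reference reports."""
    gen_tokens = " ".join(select_spans(generated)).split()
    ref_tokens = " ".join(select_spans(reference)).split()
    return token_f1(gen_tokens, ref_tokens)


def reinforce_loss(log_probs: List[float], reward: float, baseline: float) -> float:
    """Generic REINFORCE loss for one sampled report: -(r - b) * sum(log p)."""
    return -(reward - baseline) * sum(log_probs)


if __name__ == "__main__":
    reference = "No pleural effusion. Heart size is normal."
    sampled = "Small pleural effusion. Heart size is normal."
    r = span_reward(sampled, reference)
    print(r, reinforce_loss([-0.5, -0.5], r, baseline=0.5))
```

In this sketch, only sentences picked by the span selector contribute to the reward, so a sampled report that matches the reference on routine sentences but omits a finding scores low. In practice the log-probabilities would come from the decoder and the gradient would be taken through them by an autodiff framework.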
Citation
O. Nazarov, "Automated Generation of Chest X-Ray Reports", M.S. Thesis, Machine Learning, MBZUAI, Abu Dhabi, UAE, 2022.
