Cluster-SCP: Similarity and Contrastive Learning to Enhance Pseudo Labels for Fine Tuning under Few Labels

Alsuhaibani, Abdullah
Alalawi, Abdulrahman
Razzak, Imran
Jameel, Shoaib
Wang, Xianzhi
Xu, Guandong
Department
Computational Biology
Type
Journal article
License
http://creativecommons.org/licenses/by/4.0/
Language
English
Abstract
Language models often underperform when fine-tuned with limited labelled data. To reduce dependence on costly annotation, recent studies have explored clustering and pseudo labelling to leverage unlabelled text with only a few labelled examples for fine-tuning. However, these methods face persistent challenges: clustering errors and pseudo-label mismatches can degrade fine-tuning performance. We propose Cluster-SCP, a novel framework that improves intra-cluster coherence and reduces the number of clusters to generate more reliable pseudo labels. The framework begins with K-Means initialization and applies an iterative refinement process, Intermediate Pseudo Labels, comprising two stages: embedding reassignment to minimize clustering errors, and cluster merging via contrastive learning with graph readouts. The refined pseudo labels are first used to fine-tune the language model, which is subsequently refined using a small set of labels. Experiments on three benchmark datasets show that Cluster-SCP consistently outperforms baseline and state-of-the-art methods, achieving at least a 5.3% accuracy improvement while reducing the number of clusters by 22%.
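The two refinement stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`reassign`, `merge_clusters`), the toy 2-D embeddings, and the 0.9 threshold are all hypothetical, the initial labels merely stand in for a K-Means result, and a plain cosine-similarity threshold on centroids is swapped in for the paper's contrastive learning with graph readouts.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroids(embeddings, labels):
    """Mean vector of each cluster, keyed by cluster label."""
    groups = {}
    for vec, lab in zip(embeddings, labels):
        groups.setdefault(lab, []).append(vec)
    return {lab: [sum(col) / len(vecs) for col in zip(*vecs)]
            for lab, vecs in groups.items()}

def reassign(embeddings, labels):
    """Stage 1 (assumed form): move each embedding to its most similar centroid."""
    cents = centroids(embeddings, labels)
    return [max(cents, key=lambda lab: cosine(vec, cents[lab]))
            for vec in embeddings]

def merge_clusters(embeddings, labels, threshold=0.9):
    """Stage 2 (stand-in): merge clusters whose centroids are highly similar."""
    cents = centroids(embeddings, labels)
    parent = {lab: lab for lab in cents}

    def find(x):
        # Follow parent links to the representative label of a merged group.
        while parent[x] != x:
            x = parent[x]
        return x

    labs = sorted(cents)
    for i, a in enumerate(labs):
        for b in labs[i + 1:]:
            if cosine(cents[a], cents[b]) >= threshold:
                ra, rb = find(a), find(b)
                if ra != rb:
                    parent[max(ra, rb)] = min(ra, rb)
    return [find(lab) for lab in labels]

# Toy data: hypothetical 2-D "embeddings" and noisy initial cluster labels
# (a stand-in for K-Means output; the last point is deliberately misplaced).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9],
              [0.8, 0.2], [0.7, 0.3], [0.2, 0.8]]
initial = [0, 0, 1, 1, 2, 2, 0]

reassigned = reassign(embeddings, initial)      # -> [2, 2, 1, 1, 2, 0, 1]
final = merge_clusters(embeddings, reassigned)  # -> [0, 0, 1, 1, 0, 0, 1]
print(reassigned, final)
```

In this toy run the misplaced point moves to the cluster it resembles, and the two near-duplicate clusters collapse into one, leaving two pseudo-label groups instead of three. In the described framework, such refined pseudo labels would then drive the first fine-tuning pass before the small labelled set is used.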
Citation
A. Alsuhaibani, A. Alalawi, I. Razzak, S. Jameel, X. Wang, G. Xu, "Cluster-SCP: Similarity and Contrastive Learning to Enhance Pseudo Labels for Fine Tuning under Few Labels," World Wide Web, vol. 29, no. 3, pp. 28-28, 2026, https://doi.org/10.1007/s11280-026-01415-w.
Source
World Wide Web
Keywords
46 Information and Computing Sciences, 4611 Machine Learning
Publisher
Springer Nature