Learning Compact Discriminant Representation via Low-rank Bilinear Pooling

Song, Kun
Li, Hao
Cheng, Gong
Han, Junwei
Nie, Feiping
Gu, Bin
Karray, Fakhri
Department
Machine Learning
Type
Journal article
Date
2025
Language
English
Abstract
In this paper, we interpret bilinear pooling as a hard-sample generation module and find that, while producing discriminative bilinear features, it significantly expands the variances of the first-order vectors. Combined with the extremely high dimensionality of the resulting bilinear features, these variances lead to overfitting in subsequent learning models. To solve this issue, we construct a bi-level optimization problem in which the high-level problem is the supervised classification loss and the low-level problem is principal component analysis (PCA). We then find that PCA on bilinear features is equivalent to spectral clustering, which allows us to prove mathematically that the first log2(C) principal components can carry the discriminant information of C classes. Removing the remaining principal components reduces the dimensionality and the variances simultaneously. To the best of our knowledge, this is the first work to provide a lower bound for dimension reduction in bilinear pooling. However, the PCA projection matrix L is prone to overfitting because it has many parameters. To address this, we propose a rank-k general bilinear projection (RK-GBP) that decomposes L into two small matrices, U and V, which have fewer learnable parameters. Unlike the traditional bilinear projections used in factorized bilinear pooling (FBiP), RK-GBP preserves the orthogonality of the columns of L by constraining the columns of U and V to be orthogonal. For computational efficiency, we relax the PCA in the low-level task into a dictionary learning problem, obtaining rank-k orthogonal factorization bilinear pooling (RK-OFBP). RK-OFBP can be regarded as a general form of current factorization bilinear pooling methods (e.g., Hadamard product-based ones).
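The claim that orthogonality of a large projection can be enforced through two small factors can be illustrated with a minimal NumPy sketch. It assumes, purely for illustration, that the large matrix is the Kronecker product of U and V; the abstract does not state how RK-GBP composes L from U and V, so this is not the paper's construction. The sizes `d` and `k` and the helper `orthonormal_columns` are hypothetical.

```python
import numpy as np


def orthonormal_columns(rows, cols, rng):
    """Return a rows x cols matrix with orthonormal columns (reduced QR)."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q


rng = np.random.default_rng(0)
d, k = 6, 2  # toy sizes, chosen only for illustration

U = orthonormal_columns(d, k, rng)
V = orthonormal_columns(d, k, rng)

# Assumed construction (not taken from the paper): L = U kron V,
# a (d*d) x (k*k) projection acting on vectorized bilinear features.
L = np.kron(U, V)

# Orthonormal columns in U and V imply orthonormal columns in L,
# because (U kron V)^T (U kron V) = (U^T U) kron (V^T V) = I.
gram = L.T @ L
assert np.allclose(gram, np.eye(k * k))

# The projection of a bilinear feature vec(x y^T) can be computed
# from the small factors alone, without forming the d*d feature or L.
x = rng.standard_normal(d)
y = rng.standard_normal(d)
full = L.T @ np.outer(x, y).reshape(-1)            # explicit d*d feature
compact = np.outer(U.T @ x, V.T @ y).reshape(-1)   # uses only U and V
assert np.allclose(full, compact)

# Parameter counts: the explicit matrix versus the two factors.
params_full = L.size              # d*d*k*k entries
params_factored = U.size + V.size  # 2*d*k entries
```

Under this assumption the factored form stores `2*d*k` numbers instead of `d*d*k*k`, which is the kind of parameter saving the abstract attributes to decomposing L into U and V.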
Finally, we evaluate our approach on fine-grained image and large-scale datasets, demonstrating that the proposed method not only produces extremely low-dimensional features but also outperforms other methods in classification tasks. For example, RK-OFBP can use 32-dimensional vectors to achieve results comparable to B-CNN [1] (dimension: 512 × 512) on the 200-class classification task.
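The dimensions quoted above can be put side by side with a short calculation. This is only arithmetic on the numbers in the abstract; rounding log2(C) up to an integer when C is not a power of two is an assumption made here, not something the abstract states.

```python
import math

num_classes = 200      # C, from the 200-class task in the abstract
feature_dim = 512      # first-order feature dimension used by B-CNN

# Full bilinear feature: outer product of two 512-dimensional vectors.
bilinear_dim = feature_dim * feature_dim

# Number of principal components suggested by the log2(C) bound,
# rounded up to an integer (rounding is an assumption of this sketch).
lower_bound = math.ceil(math.log2(num_classes))

# Dimension reported for RK-OFBP in the abstract.
compact_dim = 32

# How many times smaller the compact feature is than the bilinear one.
ratio = bilinear_dim // compact_dim

print(bilinear_dim, lower_bound, compact_dim, ratio)
```

With these numbers the bilinear feature has 262,144 dimensions, the log2(C) bound rounds up to 8, and the 32-dimensional feature is 8,192 times smaller than the full bilinear one.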
Citation
K. Song et al., "Learning Compact Discriminant Representation via Low-rank Bilinear Pooling," in IEEE Transactions on Pattern Analysis and Machine Intelligence, doi: 10.1109/TPAMI.2025.3601355
Source
IEEE Transactions on Pattern Analysis and Machine Intelligence
Keywords
Rank-k Bilinear Projection, Bilinear Pooling, Linear Dimensionality Reduction, Normalized Cuts, Spectral Clustering
Publisher
IEEE