Resistant convex clustering: How does the fusion penalty enhance resistance?
Sun, Qiang ; Zhang, Archer Gong ; Liu, Chenyu ; Tan, Kean Ming
Department
Statistics and Data Science
Type
Journal article
Date
2025
Language
English
Abstract
Convex clustering is a convex relaxation of k-means and hierarchical clustering. It solves a convex optimization problem whose objective function is a squared error loss plus a fusion penalty that encourages the estimated centroids of observations in the same cluster to be identical. However, when data are contaminated, convex clustering with a squared error loss fails even when there is only one arbitrary outlier. To address this challenge, we propose a resistant convex clustering method. Theoretically, we show that the new estimator is resistant to arbitrary outliers: it does not break down until more than half of the observations are arbitrary outliers. Perhaps surprisingly, the fusion penalty can help enhance resistance by fusing the estimators to the cluster centers of uncontaminated samples, but not the other way around. Numerical studies demonstrate the competitive performance of the proposed method. The accompanying R package is available as Rcvxclustr. © 2025, Institute of Mathematical Statistics.
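For orientation, the objective described in the abstract (a squared error loss plus a fusion penalty) can be sketched in a few lines of Python. This is only an illustration of the standard, non-resistant formulation under stated assumptions: the 1/2 scaling, the Euclidean norm in the penalty, uniform pairwise weights, and the names `convex_clustering_objective`, `X`, `U`, and `lam` are choices made here, not taken from the article or from the Rcvxclustr package. The resistant loss proposed in the article is not shown.

```python
import math


def convex_clustering_objective(X, U, lam):
    """Evaluate a squared error loss plus a pairwise fusion penalty.

    X   : list of observations, each a list of floats
    U   : list of centroids, one per observation (same shape as X)
    lam : non-negative tuning parameter for the fusion penalty

    Assumptions (not from the article): 1/2 scaling on the loss,
    Euclidean norm in the penalty, and uniform pairwise weights.
    """
    n = len(X)
    # Squared error loss between each observation and its own centroid.
    loss = 0.5 * sum(
        sum((x - u) ** 2 for x, u in zip(X[i], U[i])) for i in range(n)
    )
    # Fusion penalty: sum of distances between every pair of centroids.
    # It is zero exactly when all centroids coincide.
    penalty = sum(
        math.dist(U[i], U[j]) for i in range(n) for j in range(i + 1, n)
    )
    return loss + lam * penalty
```

With `lam = 0` the objective is minimized by setting each centroid to its own observation; as `lam` grows, the penalty pulls centroids together until they coincide, which is how the fused centroids define clusters.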
Citation
Q. Sun, A. G. Zhang, C. Liu, and K. M. Tan, “Resistant convex clustering: How does the fusion penalty enhance resistance?,” Electronic Journal of Statistics, vol. 19, no. 1, pp. 1199–1230, Jan. 2025, doi: 10.1214/25-EJS2359.
Source
Electronic Journal of Statistics
Keywords
Breakdown point, fusion penalty, outliers, resistance, robustness
Publisher
Institute of Mathematical Statistics and Bernoulli Society
