
Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis

Luo, Ruichen
Stich, Sebastian Urban
Horváth, Samuel
Takáč, Martin
Department
Machine Learning
Type
Conference proceeding
Date
2025
Language
English
Abstract
LocalSGD and SCAFFOLD are widely used methods in distributed stochastic optimization, with numerous applications in machine learning, large-scale data processing, and federated learning. However, rigorously establishing their theoretical advantages over simpler methods, such as minibatch SGD (MbSGD), has proven challenging, as existing analyses often rely on strong assumptions, unrealistic premises, or overly restrictive scenarios. In this work, we revisit the convergence properties of LocalSGD and SCAFFOLD under a variety of existing or weaker conditions, including gradient similarity, Hessian similarity, weak convexity, and Lipschitz continuity of the Hessian. Our analysis shows that (i) LocalSGD achieves faster convergence compared to MbSGD for weakly convex functions without requiring stronger gradient similarity assumptions; (ii) LocalSGD benefits significantly from higher-order similarity and smoothness; and (iii) SCAFFOLD demonstrates faster convergence than MbSGD for a broader class of non-quadratic functions. These theoretical insights provide a clearer understanding of the conditions under which LocalSGD and SCAFFOLD outperform MbSGD.
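Illustration
For orientation, the following is a minimal toy sketch (not the authors' code) contrasting the two baselines discussed in the abstract: minibatch SGD, which takes one synchronized step per communication round using the averaged gradient, and LocalSGD, in which each worker takes K local steps before the iterates are averaged. The objectives, data, step size, and dimensions below are hypothetical illustration values, not taken from the paper.

# Toy comparison of one training run of MbSGD vs. LocalSGD on M workers,
# each holding a heterogeneous quadratic f_i(x) = 0.5 * ||A_i x - b_i||^2.
import numpy as np

rng = np.random.default_rng(0)
M, d, K, lr = 4, 5, 10, 0.02          # workers, dimension, local steps, step size (assumed values)

A = [rng.standard_normal((d, d)) for _ in range(M)]   # synthetic local data
b = [rng.standard_normal(d) for _ in range(M)]

def grad(i, x, noise=0.1):
    """Stochastic gradient of f_i at x with additive Gaussian noise."""
    return A[i].T @ (A[i] @ x - b[i]) + noise * rng.standard_normal(d)

def mbsgd_round(x):
    """Minibatch SGD: one step per round using the gradient averaged over workers."""
    g = np.mean([grad(i, x) for i in range(M)], axis=0)
    return x - lr * g

def localsgd_round(x):
    """LocalSGD: each worker takes K local steps, then the local iterates are averaged."""
    local_iterates = []
    for i in range(M):
        xi = x.copy()
        for _ in range(K):
            xi -= lr * grad(i, xi)
        local_iterates.append(xi)
    return np.mean(local_iterates, axis=0)

x_mb = np.zeros(d)
x_loc = np.zeros(d)
for _ in range(50):                    # 50 communication rounds
    x_mb = mbsgd_round(x_mb)
    x_loc = localsgd_round(x_loc)

obj = lambda x: np.mean([0.5 * np.linalg.norm(A[i] @ x - b[i]) ** 2 for i in range(M)])
print(f"MbSGD objective after 50 rounds:    {obj(x_mb):.4f}")
print(f"LocalSGD objective after 50 rounds: {obj(x_loc):.4f}")

SCAFFOLD augments the LocalSGD update with control variates that correct each worker's local drift; the paper's analysis concerns when these local-update methods provably outperform MbSGD.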
Citation
R. Luo, S. U. Stich, S. Horváth, and M. Takáč, “Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis,” in Proc. 28th Int. Conf. Artif. Intell. Stat. (AISTATS 2025), vol. 258, Proc. Mach. Learn. Res., Mai Khao, Thailand, May 3–5, 2025, pp. 2539–2547.
Source
Proceedings of Machine Learning Research
Conference
28th International Conference on Artificial Intelligence and Statistics, AISTATS 2025
Keywords
Data Handling, Federated Learning, Learning Systems, Machine Learning, Optimization, Convergence Properties, Convex Functions, Fast Convergence, Higher-order Similarity, Large-scale Data Processing, Lipschitz Continuity, Stochastic Optimization, Stochastic Systems
Publisher
ML Research Press