A Two-Level Dirichlet Framework for Heterogeneous Federated Network
Arisdakessian, Sarhad ; Wehbi, Osama ; Abdel Wahab, Omar ; Mourad, Azzam ; Otrok, Hadi ; Guizani, Mohsen
Department
Machine Learning
Type
Journal article
Date
2025
Language
English
Abstract
Real-world federated learning (FL) deployments commonly confront heterogeneous data distributions across geographically dispersed or otherwise diverse client devices, posing major challenges for global model convergence and service performance. From a distributed information network and management perspective, effectively orchestrating model training under these non-IID conditions is crucial for preserving both training efficiency and system reliability. To evaluate FL algorithms under realistic conditions, researchers frequently perform synthetic non-IID partitioning of benchmark datasets, with Dirichlet-based label splits being one of the most common techniques. While this single-level Dirichlet partitioning successfully simulates label imbalance, it fails to capture the rich, subpopulation-level heterogeneity present in many real-life scenarios. This limitation arises because single-level Dirichlet partitioning only skews label distributions across clients, without considering the underlying feature-space variations that naturally emerge within subpopulations. In this paper, we propose a feature-aware two-level hierarchical Dirichlet distribution approach as an advanced alternative to the traditional Dirichlet partition in the realm of FL. Our method first clusters the dataset in a feature-embedding space, then applies two-tiered Dirichlet sampling at both the cluster and the within-cluster levels. Ultimately, this hierarchical Dirichlet technique has direct implications for distributed learning networks, offering a robust testbed for optimizing federated training workflows and maintaining system-wide performance in distributed learning applications. Experiments and simulations show that our approach yields more realistic data partitions and thereby stress-tests federated algorithms more thoroughly.
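Since this record carries no reference implementation, the following is a minimal sketch of how such a two-level partition could look, assuming k-means clustering of precomputed feature embeddings and two concentration parameters alpha1 (cluster level) and alpha2 (within-cluster level). All function names, defaults, and the k-means choice are illustrative assumptions, not the authors' exact procedure.

import numpy as np
from sklearn.cluster import KMeans

def two_level_dirichlet_partition(embeddings, labels, n_clients,
                                  n_clusters=10, alpha1=0.5, alpha2=0.5, seed=0):
    """Hypothetical two-level (hierarchical) Dirichlet partition sketch.

    Level 1: a Dirichlet draw over clients decides each client's share of
    every feature cluster. Level 2: a second Dirichlet draw skews the label
    split inside each cluster, so clients differ in both subpopulation
    composition and within-subpopulation label balance.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    # Cluster the dataset in a feature-embedding space
    # (k-means is an illustrative choice, not necessarily the paper's).
    clusters = KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=seed).fit_predict(embeddings)
    client_idx = [[] for _ in range(n_clients)]
    for c in range(n_clusters):
        members = np.where(clusters == c)[0]
        if members.size == 0:
            continue
        # Level 1: each client's share of cluster c.
        cluster_share = rng.dirichlet(alpha1 * np.ones(n_clients))
        for y in np.unique(labels[members]):
            pool = members[labels[members] == y]
            rng.shuffle(pool)
            # Level 2: within-cluster label skew, modulated by level-1 shares.
            p = cluster_share * rng.dirichlet(alpha2 * np.ones(n_clients))
            p /= p.sum()
            cuts = (np.cumsum(p)[:-1] * pool.size).astype(int)
            for k, part in enumerate(np.split(pool, cuts)):
                client_idx[k].extend(part.tolist())
    return client_idx

Under this reading, a small alpha1 concentrates entire subpopulations on few clients, while a small alpha2 additionally skews labels within each subpopulation, so the two knobs control coarse and fine heterogeneity independently.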
Citation
S. Arisdakessian, O. Wehbi, O. A. Wahab, A. Mourad, H. Otrok and M. Guizani, "A Two-Level Dirichlet Framework for Heterogeneous Federated Network," in IEEE Transactions on Network Science and Engineering, doi: 10.1109/TNSE.2025.3597541
Source
IEEE Transactions on Network Science and Engineering
Keywords
Federated Networks, Federated Learning, Non-IID Data, Distributed Learning Networks, Data Heterogeneity, Probabilistic Data Partitioning, Networked Systems, Complex Networks
Publisher
IEEE
