Federated Learning Aggregation Algorithm Based on AP Clustering Algorithm

doi:10.3969/j.issn.1006-2475.2024.04.002

Abstract

In traditional federated learning, each client trains a local model independently on its private data, and a central server aggregates the local models into a shared global model. However, because of statistical heterogeneity such as non-independent and identically distributed (Non-IID) data, a single global model often cannot fit every client. To address this problem, this paper proposes APFL, a federated learning aggregation algorithm for Non-IID data based on the AP clustering algorithm. In APFL, the server computes a similarity matrix between clients from their data characteristics, uses the AP clustering algorithm to partition the clients into clusters, and builds a multi-center framework that computes suitable personalized model weights for each client. In experiments on the FMNIST and CIFAR10 datasets, APFL improves accuracy over the traditional federated learning algorithm FedAvg by 1.88 percentage points on FMNIST and 6.08 percentage points on CIFAR10. The results show that the proposed APFL can improve the accuracy of federated learning on Non-IID data.

Key words: federated learning; non-independent and identically distributed; AP clustering algorithm
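
The pipeline outlined in the abstract (similarity matrix, AP clustering, one model per cluster) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the client features, the similarity measure, or the weighting rule, so the following are assumptions made here for illustration only: each client is described by a label histogram, similarity is the negative squared Euclidean distance, the AP preference is the median similarity, and each cluster model is a sample-size-weighted average of its members' weights. All function names are hypothetical.

```python
from statistics import median


def similarity_matrix(features):
    """Pairwise similarity: negative squared Euclidean distance (assumed metric)."""
    n = len(features)
    S = [[-sum((a - b) ** 2 for a, b in zip(features[i], features[k]))
          for k in range(n)] for i in range(n)]
    # AP "preference" on the diagonal: median off-diagonal similarity (assumed default).
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = pref
    return S


def affinity_propagation(S, iters=200, damping=0.5):
    """Plain AP message passing for a fixed number of iterations (no convergence check).

    Returns, for each client, the index of the exemplar client of its cluster.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(pos) - pos[k]  # positive responsibilities from i != k
            for i in range(n):
                new = support if i == k else min(0.0, R[k][k] + support - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    score = [R[k][k] + A[k][k] for k in range(n)]
    exemplars = [k for k in range(n) if score[k] > 0]
    if not exemplars:  # fallback: treat all clients as a single cluster
        exemplars = [max(range(n), key=lambda k: score[k])]
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


def cluster_average(weights, sizes, labels):
    """Size-weighted average of flat weight vectors per cluster (assumed rule)."""
    models = {}
    for c in set(labels):
        members = [i for i, lab in enumerate(labels) if lab == c]
        total = sum(sizes[i] for i in members)
        dim = len(weights[members[0]])
        models[c] = [sum(sizes[i] * weights[i][d] for i in members) / total
                     for d in range(dim)]
    return models


# Toy example: six clients described by two-class label histograms.
features = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3],
            [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
labels = affinity_propagation(similarity_matrix(features))
client_weights = [[1.0], [2.0], [3.0], [10.0], [20.0], [30.0]]
models = cluster_average(client_weights, [1] * 6, labels)
personalized = [models[lab] for lab in labels]  # model sent back to each client
```

Unlike k-means, AP does not need the number of clusters in advance, which is presumably why it suits a setting where the server does not know how many distinct data distributions the clients hold.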

CLC number:

TP309

AO Bochao, FAN Bingbing. Federated Learning Aggregation Algorithm Based on AP Clustering Algorithm[J]. Computer and Modernization, 2024(04): 5-11.

