Selected Publications
2019 - Present
See Google Scholar for a comprehensive list of publications.
2022
- Hybrid Local SGD for Federated Learning with Heterogeneous Communications. Yuanxiong Guo, Ying Sun, Rui Hu, and 1 more author. In International Conference on Learning Representations, 2022
Communication is a key bottleneck in federated learning, where a large number of edge devices collaboratively learn a model under the orchestration of a central server without sharing their own training data. While local SGD has been proposed to reduce the number of FL rounds and has become the algorithm of choice for FL, its total communication cost is still prohibitive when each device needs to communicate with the remote server repeatedly over bandwidth-limited networks. In light of both device-to-device (D2D) and device-to-server (D2S) cooperation opportunities in modern communication networks, this paper proposes a new federated optimization algorithm dubbed hybrid local SGD (HL-SGD) for FL settings where devices are grouped into a set of disjoint clusters with high D2D communication bandwidth. HL-SGD subsumes previously proposed algorithms such as local SGD and gossip SGD and enables us to strike the best balance between model accuracy and runtime. We analyze the convergence of HL-SGD in the presence of heterogeneous data for general nonconvex settings. We also perform extensive experiments and show that the use of hybrid model aggregation via D2D and D2S communications in HL-SGD can largely speed up the training time of federated learning.
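A minimal sketch of the hybrid aggregation pattern described above, assuming a toy NumPy model vector, synthetic gradients, and illustrative averaging periods (`tau_d2d` and `tau_d2s` are hypothetical names, not the paper's notation); the actual HL-SGD update rules and convergence analysis are in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_clusters, devices_per_cluster = 10, 3, 4
tau_d2d, tau_d2s, lr, rounds = 2, 6, 0.1, 12   # illustrative periods only

# one local model per device, grouped into disjoint clusters
models = [[rng.normal(size=dim) for _ in range(devices_per_cluster)]
          for _ in range(n_clusters)]

def local_sgd_step(w):
    # stand-in for a real stochastic gradient on the device's private data
    grad = w + rng.normal(scale=0.01, size=w.shape)
    return w - lr * grad

for t in range(1, rounds + 1):
    models = [[local_sgd_step(w) for w in cluster] for cluster in models]
    if t % tau_d2d == 0:                     # cheap intra-cluster (D2D) averaging
        models = [[np.mean(cluster, axis=0)] * devices_per_cluster
                  for cluster in models]
    if t % tau_d2s == 0:                     # infrequent global (D2S) averaging
        global_w = np.mean([w for cluster in models for w in cluster], axis=0)
        models = [[global_w.copy() for _ in range(devices_per_cluster)]
                  for _ in range(n_clusters)]

print("final global model norm:", np.linalg.norm(global_w))
```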
- IoT Device Friendly and Communication-Efficient Federated Learning via Joint Model Pruning and Quantization. Pavana Prakash, Jiahao Ding, Rui Chen, and 5 more authors. IEEE Internet of Things Journal, 2022
Federated learning (FL), through its novel applications and services, has enhanced its presence as a promising tool in the Internet of Things (IoT) domain. Specifically, in a multiaccess edge computing setup with a host of IoT devices, FL is most suitable since it leverages distributed client data to train high-performance deep learning (DL) models while keeping the data private. However, the underlying deep neural networks (DNNs) are huge, preventing their direct deployment onto resource-constrained and memory-limited IoT devices. Besides, frequent exchange of model updates between the central server and clients in FL could result in a communication bottleneck. To address these challenges, in this article, we introduce GWEP, a model compression-based FL method. It utilizes joint quantization and model pruning to reap the benefits of DNNs while meeting the capabilities of resource-constrained devices. Consequently, by reducing the computational, memory, and network footprint of FL, low-end IoT devices may be able to participate in the FL process. In addition, we provide theoretical guarantees of FL convergence. Through empirical evaluations, we demonstrate that our approach significantly outperforms the baseline algorithms, being up to 10.23 times faster with 11 times fewer communication rounds, while achieving high model compression, energy efficiency, and learning performance.
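A hedged sketch of the compression idea (magnitude pruning followed by uniform quantization of a model update); the parameter names and values are hypothetical, and GWEP's joint pruning/quantization design and convergence guarantees are specified in the paper.

```python
import numpy as np

def prune_and_quantize(update, keep_ratio=0.2, n_bits=4):
    """Magnitude-prune an update, then uniformly quantize the surviving values.

    Illustrative only: GWEP's joint pruning/quantization scheme and its
    convergence-preserving parameter choices are specified in the paper.
    """
    flat = update.ravel().copy()
    k = max(1, int(keep_ratio * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]        # indices of top-k magnitudes
    mask = np.zeros_like(flat, dtype=bool)
    mask[idx] = True
    flat[~mask] = 0.0

    # uniform quantization of the kept values to 2**(n_bits-1)-1 signed levels
    vmax = np.abs(flat[mask]).max()
    levels = 2 ** (n_bits - 1) - 1
    q = np.round(flat / vmax * levels) / levels * vmax if vmax > 0 else flat
    return q.reshape(update.shape)

rng = np.random.default_rng(1)
update = rng.normal(size=(8, 8))
compressed = prune_and_quantize(update)
print("nonzeros kept:", np.count_nonzero(compressed), "of", update.size)
```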
- Constructing Mobile Crowdsourced COVID-19 Vulnerability Map with Geo-Indistinguishability. Rui Chen, Liang Li, Ying Ma, and 4 more authors. IEEE Internet of Things Journal, 2022
Preventing COVID-19 disease from spreading in communities requires proactive and effective healthcare resource allocation, such as vaccinations. A fine-grained COVID-19 vulnerability map is essential to detect high-risk communities and guide effective vaccine policy. A mobile-crowdsourcing-based self-reporting approach is a promising solution. However, accurate mobile-crowdsourcing-based map construction requires participants to report their actual locations, raising serious privacy concerns. To address this issue, we propose a novel approach to effectively construct a reliable community-level COVID-19 vulnerability map based on mobile crowdsourced COVID-19 self-reports without compromising participants’ location privacy. We design a geo-perturbation scheme where participants can locally obfuscate their locations with the geo-indistinguishability guarantee to protect their location privacy against any adversaries’ prior knowledge. To minimize the data utility loss caused by location perturbation, we first design an unbiased vulnerability estimator and formulate the generation of location perturbation probabilities as a convex optimization. Its objective is to minimize the estimation error of the direct vulnerability estimator under the constraints of geo-indistinguishability. Given the perturbed locations, we integrate the perturbation probabilities with a spatial smoothing method to obtain reliable community-level vulnerability estimates that are robust to the small-sample-size problem incurred by location perturbation. Considering the fast-spreading nature of the coronavirus, we integrate the vulnerability estimates into a modified susceptible-infected-removed (SIR) model with vaccination to build a future trend map, which helps guide vaccine allocation when supply is limited. Extensive simulations based on real-world data demonstrate the proposed scheme’s superiority over peer designs satisfying geo-indistinguishability in terms of estimation accuracy and reliability.
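The paper derives location perturbation probabilities from a convex program; as a generic baseline that also satisfies geo-indistinguishability, the planar Laplace mechanism can be sketched as below (the coordinates and `epsilon` value are illustrative assumptions, not the paper's optimized scheme).

```python
import numpy as np
from scipy.special import lambertw

def planar_laplace(location, epsilon, rng):
    """Perturb a 2-D location (planar coordinates, e.g., km) with the planar
    Laplace mechanism, a standard way to satisfy geo-indistinguishability.
    The paper instead derives perturbation probabilities from a convex program
    that minimizes estimation error; this is only a generic baseline.
    """
    theta = rng.uniform(0.0, 2.0 * np.pi)
    p = rng.uniform(0.0, 1.0)
    # inverse CDF of the radial distance uses the Lambert W function (branch -1)
    radius = -(1.0 / epsilon) * (lambertw((p - 1.0) / np.e, k=-1).real + 1.0)
    return location + radius * np.array([np.cos(theta), np.sin(theta)])

rng = np.random.default_rng(2)
true_location = np.array([3.2, 7.5])               # hypothetical coordinates in km
print(planar_laplace(true_location, epsilon=0.7, rng=rng))
```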
- Agent-Level Differentially Private Federated Learning via Compressed Model Perturbation. Yuanxiong Guo, Rui Hu, and Yanmin Gong. In IEEE Conference on Communications and Network Security, 2022
Federated learning (FL) involves a network of distributed agents that collaboratively learn a common model without sharing their raw data. Privacy and communication are two critical concerns of federated learning, but they are often treated separately in the literature. While random noise can be added during the federated learning process to defend against privacy inference attacks, its magnitude is linearly proportional to the model size, which can be very large for modern deep neural networks and lead to severe degradation in model accuracy. On the other hand, various compression techniques have been proposed to improve the communication efficiency of federated learning, but their interplay with privacy protection is largely ignored. Motivated by the observation that privacy protection and communication reduction are closely related in the context of FL, we propose a new federated learning scheme called CMP-Fed that achieves agent-level differential privacy with high model accuracy by leveraging communication compression techniques in FL with large model sizes. The key component of CMP-Fed is compressed model perturbation (CMP), which first compresses the shared model updates before perturbing them with random noise at each communication round of federated learning. Experimental results based on the Fashion-MNIST dataset show that CMP-Fed can largely outperform the existing differentially private federated learning schemes in terms of model accuracy under the same privacy guarantee while still enjoying the communication benefit of model compression.
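A minimal sketch of the compress-then-perturb order that CMP follows, assuming top-k sparsification, L2 clipping, and Gaussian noise with hypothetical parameters; the paper's exact compressor and privacy calibration differ.

```python
import numpy as np

def compressed_model_perturbation(update, keep_ratio, clip, sigma, rng):
    """Top-k sparsify a clipped update, then add Gaussian noise to the kept
    coordinates only. Illustrative of the compress-then-perturb order used by
    CMP-Fed; the paper's exact compressor and noise calibration differ.
    """
    flat = update.ravel()
    flat = flat * min(1.0, clip / (np.linalg.norm(flat) + 1e-12))   # L2 clipping
    k = max(1, int(keep_ratio * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    out = np.zeros_like(flat)
    out[idx] = flat[idx] + rng.normal(scale=sigma * clip, size=k)   # noise on k coords
    return out.reshape(update.shape)

rng = np.random.default_rng(3)
noisy = compressed_model_perturbation(rng.normal(size=1000), keep_ratio=0.05,
                                      clip=1.0, sigma=1.0, rng=rng)
print("nonzero coordinates sent:", np.count_nonzero(noisy))
```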
- Scalable and Low-Latency Federated Learning with Cooperative Mobile Edge Networking. Zhenxiao Zhang, Zhidong Gao, Yuanxiong Guo, and 1 more author. IEEE Transactions on Mobile Computing, 2022
Federated learning (FL) enables collaborative model training without centralizing data. However, the traditional FL framework is cloud-based and suffers from high communication latency. On the other hand, the edge-based FL framework, which relies on an edge server co-located with an access point for model aggregation, has low communication latency but suffers from degraded model accuracy due to the limited coverage of the edge server. In light of high-accuracy but high-latency cloud-based FL and low-latency but low-accuracy edge-based FL, this paper proposes a new FL framework based on cooperative mobile edge networking called cooperative federated edge learning (CFEL) to enable both high-accuracy and low-latency distributed intelligence at mobile edge networks. Considering the unique two-tier network architecture of CFEL, a novel federated optimization method dubbed cooperative edge-based federated averaging (CE-FedAvg) is further developed, wherein each edge server both coordinates collaborative model training among the devices within its own coverage and cooperates with other edge servers to learn a shared global model through decentralized consensus. Experimental results based on benchmark datasets show that CFEL can significantly speed up convergence and reduce the training time needed to achieve a target model accuracy compared with prior FL frameworks.
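A toy illustration of the two-tier pattern, assuming each edge server averages its own devices and then the edge servers run a few gossip steps with a hand-picked doubly stochastic mixing matrix; this is not the CE-FedAvg specification from the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
dim, n_edges, devices_per_edge = 6, 3, 5

# tier 1: each edge server averages the updates of its own devices
device_models = rng.normal(size=(n_edges, devices_per_edge, dim))
edge_models = device_models.mean(axis=1)

# tier 2: decentralized consensus among edge servers over a small topology,
# using a doubly stochastic mixing matrix (Metropolis-style weights)
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
for _ in range(10):                      # a few gossip rounds per FL round
    edge_models = W @ edge_models

print("disagreement after consensus:",
      np.linalg.norm(edge_models - edge_models.mean(axis=0)))
```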
2021
- Aggregation-Based Colocation Datacenter Energy Management in Wholesale Markets. Yuanxiong Guo, Miao Pan, and Yanmin Gong. IEEE Transactions on Cloud Computing, 2021
In this paper, we study how colocation datacenter energy cost can be effectively reduced in the wholesale electricity market via cooperative power procurement. Intuitively, by aggregating workloads and renewables across a group of tenants in a colocation datacenter, the overall power demand uncertainty of the colocation datacenter can be reduced, resulting in less chance of being penalized when participating in the wholesale electricity market. We use cooperative game theory to model the cooperative electricity procurement process of tenants as a cooperative game and show the cost-saving benefits of aggregation. Then, a cost allocation scheme based on the marginal contribution of each tenant to the total expected cost is proposed to distribute the aggregation benefits among the participating tenants. Besides, we propose a proportional cost allocation scheme to distribute the aggregation benefits among the participating tenants after realizations of power demand and market prices. Finally, numerical experiments based on real-world traces are conducted to illustrate the benefits of aggregation compared to noncooperative power procurement.
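A small sketch of a marginal-contribution cost allocation under a hypothetical aggregated cost function; the paper applies this idea to the expected wholesale-market cost of the colocation datacenter, so the cost model below is purely illustrative.

```python
import numpy as np

def marginal_contribution_allocation(tenants, cost):
    """Split the grand-coalition cost using each tenant's marginal contribution
    C(N) - C(N \\ {i}), rescaled so the shares sum to C(N). A generic sketch
    with a hypothetical cost function, not the paper's market model.
    """
    grand = cost(tenants)
    marginals = {i: grand - cost([j for j in tenants if j != i]) for i in tenants}
    total = sum(marginals.values())
    return {i: grand * marginals[i] / total for i in tenants}

# hypothetical aggregated cost: pooling demands reduces the uncertainty penalty
demands = {"t1": 10.0, "t2": 25.0, "t3": 15.0}

def cost(coalition):
    load = sum(demands[i] for i in coalition)
    return 0.0 if not coalition else 1.0 * load + 5.0 * np.sqrt(load)

print(marginal_contribution_allocation(list(demands), cost))
```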
- Data-Driven Spectrum Trading with Secondary Users’ Differential Privacy Preservation. Jingyi Wang, Xinyue Zhang, Qixun Zhang, and 4 more authors. IEEE Transactions on Dependable and Secure Computing, 2021
Spectrum trading benefits both secondary users (SUs) and primary users (PUs), but it poses great challenges to maximizing PUs’ revenue, since SUs’ demands are uncertain and each SU’s traffic portfolio contains private information. In this paper, we propose a data-driven spectrum trading scheme that maximizes PUs’ revenue and preserves SUs’ demand differential privacy. Briefly, we introduce a novel network architecture consisting of the primary service provider (PSP), the secondary service provider (SSP), and the secondary traffic estimator and database (STED). Under the proposed architecture, the PSP aggregates available spectrum from PUs and sells the spectrum to the SSP at a fixed wholesale price, directly to SUs at a spot price, or both. The PSP therefore has to accurately estimate SUs’ demands. To this end, the STED exploits a data-driven approach to choose sampled SUs and construct a reference distribution of SUs’ demands, and utilizes the reference distribution to estimate the demand distribution of all SUs. Moreover, the STED adds noise to preserve the demand differential privacy of sampled SUs before it answers demand estimation queries from the PSP. With the estimated SUs’ demand, we formulate the revenue maximization problem as a risk-averse optimization, develop feasible solutions, and verify its effectiveness through both theoretical proof and simulations.
- Incentivizing differentially private federated learning: A multi-dimensional contract approach. Maoqiang Wu, Dongdong Ye, Jiahao Ding, and 3 more authors. IEEE Internet of Things Journal, 2021
Federated learning is a promising tool in the Internet-of-Things (IoT) domain for training a machine learning model in a decentralized manner. Specifically, the data owners (e.g., IoT device consumers) keep their raw data and only share their local computation results to train the global model of the model owner (e.g., an IoT service provider). When executing the federated learning task, the data owners contribute their computation and communication resources. In this situation, the data owners face privacy issues where attackers may infer data properties or recover the raw data based on the shared information. Considering these disadvantages, the data owners will be reluctant to use their data to participate in federated learning without a well-designed incentive mechanism. In this article, we deliberately design an incentive mechanism jointly considering the task expenditure and privacy issues of federated learning. Based on a differentially private federated learning (DPFL) framework that can prevent privacy leakage of the data owners, we model the contribution as well as the computation, communication, and privacy costs of each data owner. The three types of costs are data owners’ private information unknown to the model owner, which thus forms an information asymmetry. To maximize the utility of the model owner under such information asymmetry, we leverage a 3-D contract approach to design the incentive mechanism. The simulation results validate the effectiveness of the proposed incentive mechanism with the DPFL framework compared to other baseline mechanisms.
- Data-Driven Caching with Users’ Content Preference Privacy in Information-Centric Networks. Xinyue Zhang, Hongning Li, Jingyi Wang, and 4 more authors. IEEE Transactions on Wireless Communications, 2021
Information-centric networking (ICN) as an emerging networking paradigm has recently gained significant attention due to its improvement of content delivery efficiency. The built-in network storage for caching is a key component in ICN to provide low-latency service and reduce heavy backhaul traffic by caching popular content. However, users’ content preferences contain sensitive individual characteristics that are distinguishable from others’. Therefore, in this work, we propose a data-driven caching revenue maximization problem that takes users’ local differential privacy into consideration. Specifically, we employ dBitFlip, a local differential privacy (LDP) mechanism, to locally add differentially private noise to users’ content preference information. We leverage a data-driven approach to predict content popularity based on the reference distribution constructed from the reported noisy content preference data, mathematically characterize the distance between the noisy reference distribution and the true distribution by a tolerance level, and prove the relationship among the tolerance level, the differential privacy budget, and the confidence level. We provide feasible solutions to the proposed revenue maximization problem and conduct simulations to show the effectiveness of the proposed scheme.
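For intuition, the sketch below uses plain single-bit randomized response with an unbiased debiasing step as a generic LDP baseline for preference reporting; the paper itself employs dBitFlip, a different bit-level LDP mechanism.

```python
import numpy as np

def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), flip it otherwise.
    A generic LDP baseline; the paper uses dBitFlip rather than this scheme.
    """
    p_true = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    return bit if rng.random() < p_true else 1 - bit

def debias_frequency(noisy_bits, epsilon):
    # unbiased estimate of the true fraction of ones from the noisy reports
    p = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    return (np.mean(noisy_bits) - (1.0 - p)) / (2.0 * p - 1.0)

rng = np.random.default_rng(7)
true_bits = rng.random(10000) < 0.3             # 30% of users prefer this content
reports = [randomized_response(int(b), 1.0, rng) for b in true_bits]
print("estimated preference rate:", debias_frequency(reports, 1.0))
```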
- Federated Learning with Sparsification-Amplified Privacy and Adaptive Optimization. Rui Hu, Yanmin Gong, and Yuanxiong Guo. In IJCAI, 2021
Federated learning (FL) enables distributed agents to collaboratively learn a centralized model without sharing their raw data with each other. However, data locality does not provide sufficient privacy protection, and it is desirable to facilitate FL with a rigorous differential privacy (DP) guarantee. Existing DP mechanisms would introduce random noise with magnitude proportional to the model size, which can be quite large in deep neural networks. In this paper, we propose a new FL framework with sparsification-amplified privacy. Our approach integrates random sparsification with gradient perturbation on each agent to amplify the privacy guarantee. Since sparsification would increase the number of communication rounds required to achieve a certain target accuracy, which is unfavorable for the DP guarantee, we further introduce acceleration techniques to help reduce the privacy cost. We rigorously analyze the convergence of our approach and utilize Renyi DP to tightly account for the end-to-end DP guarantee. Extensive experiments on benchmark datasets validate that our approach outperforms previous differentially private FL approaches in both privacy guarantee and communication efficiency.
- PRAM: a Practical Sybil-Proof Auction Mechanism for Dynamic Spectrum Access with Untruthful Attackers. Xuewen Dong, Yuanyu Zhang, Yuanxiong Guo, and 3 more authors. IEEE Transactions on Mobile Computing, 2021
Auction is becoming increasingly popular for dynamic spectrum access (DSA), while it is extremely vulnerable to sybil attacks. Existing studies on sybil-proof DSA auction impractically assume that attackers bid truthfully based on true appraisals. This paper, for the first time, considers untruthful attackers and investigates the sybil-proof auction design in such more hazardous scenarios. To justify the new assumption, we first show that attackers obtain higher utilities by bidding untruthfully, especially in networks with inadequate channels. Based on this novel finding, we then design a practical sybil attack model named EqualSumBid Sybil, where attackers follow an equal-sum rule (i.e., the sum bid value of the multiple identities of an attacker equals the bid value when it bids with only one identity) instead of their true appraisals. To ensure efficient DSA under the new attack, we finally propose the PRAM, a Practical sybil-pRoof Auction Mechanism, where suspicious identity merging and bid-independent bidder sorting methods are introduced to alleviate the effect of untruthfulness on spectrum auction. Furthermore, winner selection and payment methods are designed to resist the EqualSumBid Sybil attack. Theoretical analyses and numerical results show that PRAM not only resists the EqualSumBid Sybil attack but also achieves individual rationality and truthfulness.
- Concentrated Differentially Private Federated Learning With Performance Analysis. Rui Hu, Yuanxiong Guo, and Yanmin Gong. IEEE Open Journal of the Computer Society, 2021
Federated learning engages a set of edge devices to collaboratively train a common model without sharing their local data and has an advantage in user privacy over traditional cloud-based learning approaches. However, recent model inversion attacks and membership inference attacks have demonstrated that shared model updates during the interactive training process could still leak sensitive user information. Thus, it is desirable to provide a rigorous differential privacy (DP) guarantee in federated learning. The main challenge in providing DP is to maintain high utility of the federated learning model under the repeatedly introduced randomness of DP mechanisms, especially when the server is not fully trusted. In this paper, we investigate how to provide DP to the most widely adopted federated learning scheme, federated averaging. Our approach combines local gradient perturbation, secure aggregation, and zero-concentrated differential privacy (zCDP) for better utility and privacy protection without a trusted server. We jointly consider the performance impacts of the randomness introduced by the DP mechanism, client sampling, and data subsampling in our approach, and theoretically analyze the convergence rate and end-to-end DP guarantee with non-convex loss functions. We also demonstrate that our proposed method has a good utility-privacy trade-off through extensive numerical experiments on a real-world dataset.
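The zCDP accounting behind Gaussian perturbation can be illustrated with the standard facts that a Gaussian mechanism with L2 sensitivity Δ and noise scale σ is (Δ²/2σ²)-zCDP, that zCDP composes additively across rounds, and that ρ-zCDP implies (ρ + 2√(ρ ln(1/δ)), δ)-DP. The numbers below are hypothetical and ignore the client-sampling and data-subsampling amplification analyzed in the paper.

```python
import math

def zcdp_of_gaussian(sensitivity, sigma):
    # A Gaussian mechanism with L2 sensitivity D and noise std sigma is
    # rho-zCDP with rho = D^2 / (2 * sigma^2).
    return sensitivity ** 2 / (2.0 * sigma ** 2)

def zcdp_to_dp(rho, delta):
    # Standard conversion: rho-zCDP implies (rho + 2*sqrt(rho*ln(1/delta)), delta)-DP.
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))

# hypothetical settings: clipping bound 1.0, per-round noise std 4.0, 200 rounds
rho_per_round = zcdp_of_gaussian(1.0, 4.0)
rho_total = 200 * rho_per_round          # zCDP composes additively over rounds
print("total rho:", rho_total)
print("epsilon at delta=1e-5:", zcdp_to_dp(rho_total, 1e-5))
```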
2020
- Joint Task Offloading and Resource Allocation in UAV-Enabled Mobile Edge Computing. Zhe Yu, Yanmin Gong, Shimin Gong, and 1 more author. IEEE Internet of Things Journal, 2020
Mobile edge computing (MEC) is an emerging technology to support resource-intensive yet delay-sensitive applications using small cloud-computing platforms deployed at the mobile network edges. However, the existing MEC techniques are not applicable to situations where the number of mobile users increases explosively or the network facilities are sparsely distributed. In view of this insufficiency, unmanned aerial vehicles (UAVs) have been employed to improve the connectivity of ground Internet of Things (IoT) devices due to their high altitude. This article proposes an innovative UAV-enabled MEC system involving the interactions among IoT devices, the UAV, and edge clouds (ECs). The system deploys and operates a UAV properly to facilitate MEC service provisioning to a set of IoT devices in regions where the existing ECs cannot be accessed by IoT devices due to terrestrial signal blockage or shadowing. The UAV and ECs in the system collaboratively provide MEC services to the IoT devices. For optimal service provisioning in this system, we formulate an optimization problem aiming at minimizing the weighted sum of the service delay of all IoT devices and UAV energy consumption by jointly optimizing the UAV position, communication and computing resource allocation, and task splitting decisions. However, the resulting optimization problem is highly nonconvex and thus difficult to solve optimally. To tackle this problem, we develop an efficient algorithm based on successive convex approximation to obtain suboptimal solutions. Numerical experiments demonstrate that our proposed collaborative UAV-EC offloading scheme largely outperforms baseline schemes that solely rely on the UAV or ECs for MEC in IoT.
- Personalized federated learning with differential privacy. Rui Hu, Yuanxiong Guo, Hongning Li, and 2 more authors. IEEE Internet of Things Journal, 2020
To provide intelligent and personalized services on smart devices, machine learning techniques have been widely used to learn from data, identify patterns, and make automated decisions. Machine learning processes typically require a large amount of representative data that are often collected through crowdsourcing from end users. However, user data could be sensitive in nature, and training machine learning models on these data may expose sensitive information of users, violating their privacy. Moreover, to meet the increasing demand for personalized services, these learned models should capture users’ individual characteristics. This article proposes a privacy-preserving approach for learning effective personalized models on distributed user data while guaranteeing the differential privacy of user data. Practical issues in a distributed learning system, such as user heterogeneity, are considered in the proposed approach. In addition, the convergence property and privacy guarantee of the proposed approach are rigorously analyzed. The experimental results on realistic mobile sensing data demonstrate that the proposed approach is robust to user heterogeneity and offers a good tradeoff between accuracy and privacy.
- Private Empirical Risk Minimization with Analytic Gaussian Mechanism for Healthcare System. Jiahao Ding, Sai Mounika Errapotu, Yuanxiong Guo, and 3 more authors. IEEE Transactions on Big Data, 2020
With the wide-ranging application of machine learning in healthcare for helping humans make crucial decisions, data privacy becomes an inevitable concern due to the utilization of sensitive data such as patient records and company registers. Thus, constructing a privacy-preserving machine learning model while still maintaining high accuracy becomes a challenging problem. In this paper, we propose two differentially private algorithms, i.e., Output Perturbation with aGM (OPERA) and Gradient Perturbation with aGM (GRPUA), for empirical risk minimization, a useful method to obtain a globally optimal classifier, by leveraging the analytic Gaussian mechanism (aGM) to achieve privacy preservation of sensitive medical data in a healthcare system. We theoretically analyze and prove utility upper bounds of the proposed algorithms and compare them with prior algorithms in the literature. The analyses show that in the high-privacy regime, our proposed algorithms can achieve a tighter utility bound for both settings: strongly convex and non-strongly convex loss functions. Besides, we evaluate the proposed private algorithms on three benchmark datasets, i.e., Adult, BANK, and IPUMS-BR. The simulation results demonstrate that our approaches can achieve higher accuracy and lower objective values compared with existing ones on all three datasets while providing differential privacy guarantees.
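A hedged sketch of output perturbation for regularized logistic regression using the classical Gaussian mechanism and the standard 2/(nλ) sensitivity bound; the paper's OPERA/GRPUA algorithms instead use the tighter analytic Gaussian mechanism calibration, which is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def output_perturbation(X, y, lam, epsilon, delta, rng):
    """Train a regularized logistic regression and release a noisy weight
    vector via the classical Gaussian mechanism. Illustrative only: OPERA uses
    the tighter analytic Gaussian mechanism calibration instead.
    Assumes rows of X have L2 norm <= 1 so the L2 sensitivity of the minimizer
    is at most 2 / (n * lam) (Chaudhuri et al.-style bound).
    """
    n = X.shape[0]
    clf = LogisticRegression(C=1.0 / (n * lam), fit_intercept=False).fit(X, y)
    w = clf.coef_.ravel()
    sensitivity = 2.0 / (n * lam)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return w + rng.normal(scale=sigma, size=w.shape)

rng = np.random.default_rng(5)
X = rng.normal(size=(500, 10))
X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))   # enforce norm <= 1
y = (X @ rng.normal(size=10) > 0).astype(int)
print(output_perturbation(X, y, lam=0.1, epsilon=1.0, delta=1e-5, rng=rng)[:3])
```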
2019
- Dynamic multi-tenant coordination for sustainable colocation data centers. Yuanxiong Guo, Miao Pan, Yanmin Gong, and 1 more author. IEEE Transactions on Cloud Computing, 2019
Colocation data centers are an important type of data center with some unique challenges in managing energy consumption. Tenants in a colocation data center usually manage their servers independently without coordination, leading to inefficiency. To address this issue, we propose a formulation of coordinated energy management for colocation data centers. Considering the randomness of workload arrivals and the electricity cost function, we formulate it as a stochastic optimization problem and then develop an online algorithm to solve it efficiently. Our algorithm is based on Lyapunov optimization, which only needs to track the instantaneous values of the underlying random factors without requiring any knowledge of their statistics or future information. Moreover, the alternating direction method of multipliers (ADMM) is utilized to implement our algorithm in a decentralized way, making it easy to implement in practice. We analyze the performance of our online algorithm, proving that it is asymptotically optimal and robust to the statistics of the involved random factors. Moreover, extensive trace-based simulations are conducted to illustrate the effectiveness of our approach.
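A generic drift-plus-penalty loop in the spirit of Lyapunov optimization, with a made-up single-queue toy problem and hypothetical parameters `V` and `cap`; the paper's formulation, ADMM-based decomposition, and optimality proof are considerably richer.

```python
import numpy as np

rng = np.random.default_rng(8)
T, V, cap = 1000, 20.0, 10.0        # horizon, cost-delay tradeoff weight, max service
Q, total_cost = 0.0, 0.0            # virtual queue of unserved workload, running cost

for t in range(T):
    arrival = rng.uniform(0.0, 8.0)          # random workload arriving this slot
    price = rng.uniform(0.5, 2.0)            # random electricity price this slot
    # drift-plus-penalty: choose service s in {0, cap} minimizing V*price*s - Q*s,
    # i.e., serve at full capacity only when the backlog justifies the price
    s = cap if Q > V * price else 0.0
    Q = max(Q + arrival - s, 0.0)            # queue update needs no future knowledge
    total_cost += price * s

print("time-average cost:", total_cost / T, "final backlog:", Q)
```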
- Beef Up the Edge: Spectrum-Aware Placement of Edge Computing Services for the Internet of Things. Haichuan Ding, Yuanxiong Guo, Xuanheng Li, and 1 more author. IEEE Transactions on Mobile Computing, 2019
In this paper, we introduce a network entity called point of connection (PoC), which is equipped with customized powerful communication, computing, and storage (CCS) capabilities, and design a data transportation network (DART) of interconnected PoCs to facilitate the provision of Internet of Things (IoT) services. By exploiting the powerful CCS capabilities of PoCs, DART brings both communication and computing services much closer to end devices so that resource-constrained IoT devices could have access to the desired communication and computing services. To achieve the design goals of DART, we further study the spectrum-aware placement of edge computing services. We formulate the service placement as a stochastic mixed-integer optimization problem and propose an enhanced coarse-grained fixing procedure to facilitate efficient solution finding. Through extensive simulations, we demonstrate the effectiveness of the resulting spectrum-aware service placement strategies and the proposed solution approach.
- DP-ADMM: ADMM-based distributed learning with differential privacy. Zonghao Huang, Rui Hu, Yuanxiong Guo, and 2 more authors. IEEE Transactions on Information Forensics and Security, 2019
Alternating direction method of multipliers (ADMM) is a widely used tool for machine learning in distributed settings where a machine learning model is trained over distributed data sources through an interactive process of local computation and message passing. Such an iterative process could cause privacy concerns of data owners. The goal of this paper is to provide differential privacy for ADMM-based distributed machine learning. Prior approaches on differentially private ADMM exhibit low utility under high privacy guarantee and assume the objective functions of the learning problems to be smooth and strongly convex. To address these concerns, we propose a novel differentially private ADMM-based distributed learning algorithm called DP-ADMM, which combines an approximate augmented Lagrangian function with time-varying Gaussian noise addition in the iterative process to achieve higher utility for general objective functions under the same differential privacy guarantee. We also apply the moments accountant method to analyze the end-to-end privacy loss. The theoretical analysis shows that the DP-ADMM can be applied to a wider class of distributed learning problems, is provably convergent, and offers an explicit utility-privacy tradeoff. To our knowledge, this is the first paper to provide explicit convergence and utility properties for differentially private ADMM-based distributed learning algorithms. The evaluation results demonstrate that our approach can achieve good convergence and model accuracy under high end-to-end differential privacy guarantee.
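A toy consensus-ADMM loop for distributed least squares with Gaussian noise added to each agent's shared update, to show where perturbation enters the message passing; DP-ADMM's approximate augmented Lagrangian, time-varying noise schedule, and moments-accountant analysis are in the paper and are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(6)
n_agents, dim, rho, sigma, iters = 5, 4, 1.0, 0.05, 30

# synthetic local least-squares problems: f_i(x) = 0.5 * ||A_i x - b_i||^2
A = [rng.normal(size=(20, dim)) for _ in range(n_agents)]
x_true = rng.normal(size=dim)
b = [Ai @ x_true + 0.1 * rng.normal(size=20) for Ai in A]

x = [np.zeros(dim) for _ in range(n_agents)]
u = [np.zeros(dim) for _ in range(n_agents)]
z = np.zeros(dim)

for _ in range(iters):
    for i in range(n_agents):
        # closed-form local primal update for the least-squares subproblem
        x[i] = np.linalg.solve(A[i].T @ A[i] + rho * np.eye(dim),
                               A[i].T @ b[i] + rho * (z - u[i]))
        # Gaussian perturbation of the shared variable (illustrative placement;
        # DP-ADMM uses an approximate Lagrangian and a time-varying noise scale)
        x[i] = x[i] + rng.normal(scale=sigma, size=dim)
    z = np.mean([x[i] + u[i] for i in range(n_agents)], axis=0)
    u = [u[i] + x[i] - z for i in range(n_agents)]

print("consensus error vs. ground truth:", np.linalg.norm(z - x_true))
```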
- Exploiting backscatter-aided relay communications with hybrid access model in device-to-device networks. Shimin Gong, Lin Gao, Jing Xu, and 3 more authors. IEEE Transactions on Cognitive Communications and Networking, 2019
The backscatter and active RF radios can complement each other and bring potential performance gains. In this paper, we envision a dual-mode radio structure that allows each device to make smart decisions on mode switching between backscatter communications (i.e., the passive mode) and RF communications (i.e., the active mode), according to the channel and energy conditions. The flexibility in mode switching also makes transmission control and network optimization more complicated. To exploit the radio diversity gain, we consider a wireless powered device-to-device network of hybrid radios and propose a sum throughput maximization by jointly optimizing energy beamforming and transmission scheduling in the two radio modes. We further exploit the user cooperation gain by allowing the passive radios to relay for the active radios. As such, the sum throughput maximization is reformulated into a non-convex optimization problem. We first present a sub-optimal algorithm based on successive convex approximation, which optimizes the relays’ reflection coefficients by iteratively solving semi-definite programs. We also devise a set of heuristic algorithms with reduced computational complexity, which are shown to significantly improve the sum throughput and are amenable to practical implementation.