Paper title
Similarity-based Label Inference Attack against Training and Inference of Split Learning
Paper authors
Paper abstract
Split learning is a promising paradigm for privacy-preserving distributed learning. The learning model can be cut into multiple portions to be collaboratively trained at the participants by exchanging only the intermediate results at the cut layer. Understanding the security performance of split learning is critical for many privacy-sensitive applications. This paper shows that the exchanged intermediate results, including the smashed data (i.e., extracted features from the raw data) and gradients during training and inference of split learning, can already reveal the private labels. We mathematically analyze the potential label leakages and propose the cosine and Euclidean similarity measurements for gradients and smashed data, respectively. Then, the two similarity measurements are shown to be unified in Euclidean space. Based on the similarity metric, we design three label inference attacks to efficiently recover the private labels during both the training and inference phases. Experimental results validate that the proposed approaches can achieve close to 100% accuracy of label attacks. The proposed attack can still achieve accurate predictions against various state-of-the-art defense mechanisms, including DP-SGD, label differential privacy, gradient compression, and Marvell.
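The abstract states that the cosine and Euclidean similarity measurements can be unified in Euclidean space. One standard identity that makes this kind of unification plausible is that, for vectors scaled to unit length, the squared Euclidean distance equals 2 - 2 * (cosine similarity). The sketch below checks that identity numerically. It is an illustration only, not the paper's attack or derivation: the vectors `g1`, `g2`, `g3` and all function names are hypothetical, chosen just to make the relationship concrete.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def normalize(u):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in u))
    return [a / norm for a in u]

def squared_euclidean(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Hypothetical toy vectors standing in for per-sample gradients
# (assumption: g1 and g2 share a label, g3 has a different one).
g1 = [0.9, 0.1, -0.2]
g2 = [1.8, 0.3, -0.5]
g3 = [-0.7, 0.4, 0.6]

def compare(u, v):
    """Return (cosine similarity, squared distance after normalization)."""
    cos = cosine_similarity(u, v)
    dist2 = squared_euclidean(normalize(u), normalize(v))
    return cos, dist2

for name, (u, v) in {"g1 vs g2": (g1, g2), "g1 vs g3": (g1, g3)}.items():
    cos, dist2 = compare(u, v)
    # After normalization, dist2 and 2 - 2*cos agree up to rounding error.
    print(name, "cos =", round(cos, 4), "dist^2 =", round(dist2, 4),
          "2 - 2cos =", round(2 - 2 * cos, 4))
```

In this toy setting, the pair with the higher cosine similarity is also the pair with the smaller normalized Euclidean distance, so ranking by either measure gives the same ordering.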
