Abstract
Train-time data poisoning attacks compromise machine learning models by introducing adversarial examples during training, causing misclassification. We propose universal data purification methods using a stochastic transform Ψ(x), implemented via iterative Langevin dynamics of Energy-Based Models (EBMs) and Denoising Diffusion Probabilistic Models (DDPMs). Our approach purifies poisoned data with minimal impact on classifier generalization and achieves state-of-the-art defense performance without requiring specific knowledge of the attack or classifier.
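As a rough illustration of the stochastic transform Ψ(x) described above, the Langevin update iterates x ← x − (ε/2)∇ₓE(x) + √ε·η with Gaussian noise η. The sketch below is a minimal toy version: the quadratic energy `E(x) = ½‖x‖²` stands in for a learned EBM, and all names and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def langevin_purify(x, grad_energy, step_size=0.05, n_steps=500, rng=None):
    """Purify a sample via iterative Langevin dynamics:
    x <- x - (eps/2) * grad E(x) + sqrt(eps) * noise.
    `grad_energy` would be the gradient of a trained EBM in practice;
    here it is a hypothetical stand-in."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x, dtype=float).copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size * grad_energy(x) + np.sqrt(step_size) * noise
    return x

# Toy energy E(x) = 0.5 * ||x||^2, whose gradient is simply x.
grad_E = lambda x: x

x_poisoned = np.full(8, 5.0)  # a point far from the low-energy region
x_purified = langevin_purify(x_poisoned, grad_E)
```

After enough steps the dynamics draw the sample toward the energy model's low-energy manifold, which is the mechanism by which imperceptible poison perturbations are washed out while the underlying image content is preserved.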
Citation
Bhat, Sunay, Jeffrey Jiang, Omead Pooladzandi, Alexander Branch, and Gregory Pottie. 2024. “PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics.” arXiv preprint arXiv:2405.18627.
@article{bhat2024puregen,
  title={PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics},
  author={Bhat, Sunay and Jiang, Jeffrey and Pooladzandi, Omead and Branch, Alexander and Pottie, Gregory},
  journal={arXiv preprint arXiv:2405.18627},
  year={2024}
}