Youran Ye, Dejin Wang, Ajinkya Bhandare
Selective Adversarial Training reduces computational cost by perturbing only critical samples, achieving robustness comparable to or better than full PGD adversarial training.
Adversarial training makes machine learning models more robust to adversarial attacks, but it is computationally expensive because every training sample must be perturbed. This paper introduces Selective Adversarial Training, which concentrates the perturbation budget on the samples most likely to affect model robustness rather than processing every sample equally. By selecting samples that lie near the decision boundary or whose gradients align with the main optimization direction, the method substantially reduces the computational burden. Experiments show that this approach achieves robustness similar to, or better than, traditional full adversarial training while cutting computation time in half.
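To make the idea concrete, below is a minimal PyTorch sketch of one training step under a selection criterion of this kind. It is not the paper's implementation: it assumes a logit-margin score as the proxy for distance to the decision boundary (the gradient-alignment criterion is omitted for brevity), and the names `select_critical`, `selective_adv_step`, and `fraction` are illustrative.

```python
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard PGD: iteratively ascend the loss inside an L-inf ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()


def select_critical(logits, y, fraction=0.5):
    """Pick the samples with the smallest logit margin, i.e. those closest
    to the decision boundary and most likely to flip under perturbation.
    (Assumed criterion; the paper's exact scoring rule may differ.)"""
    true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    others = logits.clone()
    others.scatter_(1, y.unsqueeze(1), float("-inf"))
    margin = true_logit - others.max(dim=1).values
    k = max(1, int(fraction * len(y)))
    return margin.topk(k, largest=False).indices


def selective_adv_step(model, optimizer, x, y, fraction=0.5):
    """One step of selective adversarial training: attack only the selected
    subset and train on the mixed batch of adversarial and clean samples."""
    with torch.no_grad():
        idx = select_critical(model(x), y, fraction)
    x_mixed = x.clone()
    x_mixed[idx] = pgd_attack(model, x[idx], y[idx])
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_mixed), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

With `fraction=0.5`, the expensive multi-step PGD attack runs on only half of each batch, which is consistent with the reported halving of computation time; the remaining samples still contribute clean gradients to the update.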