
Gray-box adversarial training

Vivek, BS and Mopuri, KR and Babu, RV (2018) Gray-box adversarial training. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8 - 14 September 2018, Munich, pp. 213-228.

ECCV 2018_11219_213-228_2018.pdf - Published Version

Official URL: https://doi.org/10.1007/978-3-030-01267-0_13


Adversarial samples are perturbed inputs crafted to mislead machine learning systems. A training mechanism called adversarial training, which presents adversarial samples along with clean samples, has been introduced to learn robust models. To scale adversarial training to large datasets, these perturbations can only be crafted using fast and simple methods (e.g., gradient ascent). However, it has been shown that adversarial training converges to a degenerate minimum, where the model appears to be robust by generating weaker adversaries. As a result, the models are vulnerable to simple black-box attacks. In this paper, we (i) demonstrate the shortcomings of the existing evaluation policy, (ii) introduce novel variants of white-box and black-box attacks, dubbed “gray-box adversarial attacks”, based on which we propose a novel evaluation method to assess the robustness of the learned models, and (iii) propose a novel variant of adversarial training, named “Gray-box Adversarial Training”, that uses intermediate versions of the models to seed the adversaries. Experimental evaluation demonstrates that models trained using our method exhibit better robustness compared to both undefended and adversarially trained models.
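
As a rough illustration of the single-step gradient-ascent perturbation mentioned in the abstract, the sketch below perturbs one input of a toy logistic-regression model. It is not the authors' implementation: the model, the weights `w_mid` and `w_final`, the helper names, and the step size `eps` are all hypothetical, chosen only to show how an adversary can be seeded either by the final model or by an intermediate version of it.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(x, y, w, b):
    """Binary cross-entropy of a logistic model on a single sample."""
    p = sigmoid(np.dot(w, x) + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def sign_gradient_step(x, y, w, b, eps):
    """One sign-gradient ascent step on the loss, taken w.r.t. the input x."""
    # For logistic regression, d(loss)/dx = (sigmoid(w.x + b) - y) * w.
    grad_x = (sigmoid(np.dot(w, x) + b) - y) * w
    return x + eps * np.sign(grad_x)

# Hypothetical weights (assumptions for illustration only):
# an intermediate checkpoint and the final model.
w_mid, b_mid = np.array([0.8, -0.4, -0.1]), 0.0
w_final, b_final = np.array([1.0, -0.5, 0.2]), 0.1

x = np.array([0.5, 0.1, -0.3])  # clean input
y = 1.0                         # true label
eps = 0.1                       # max-norm perturbation budget

# Adversary seeded by the final model itself.
x_adv_final = sign_gradient_step(x, y, w_final, b_final, eps)
# Adversary seeded by the intermediate checkpoint.
x_adv_mid = sign_gradient_step(x, y, w_mid, b_mid, eps)

# All losses are measured on the final model.
clean_loss = bce_loss(x, y, w_final, b_final)
final_seed_loss = bce_loss(x_adv_final, y, w_final, b_final)
mid_seed_loss = bce_loss(x_adv_mid, y, w_final, b_final)
```

In this toy setup both perturbations stay within the `eps` budget and raise the final model's loss on the sample. A real adversarial-training loop would mix such perturbed samples with clean ones in each mini-batch, which this sketch does not attempt.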

Item Type: Conference Paper
Publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Additional Information: The copyright for this article belongs to the Authors.
Keywords: Artificial intelligence; Learning systems; Adversarial perturbations; Evaluation policy; Experimental evaluation; Gradient ascent; Large datasets; Machine learning models; On-machines; Robust models; Computer vision
Department/Centre: Division of Interdisciplinary Sciences > Computational and Data Sciences
Date Deposited: 02 Sep 2022 04:30
Last Modified: 02 Sep 2022 04:30
URI: https://eprints.iisc.ac.in/id/eprint/76359
