Tuesday Dec 26, 2023
arXiv preprint - Model-tuning Via Prompts Makes NLP Models Adversarially Robust
In this episode, we discuss Model-tuning Via Prompts Makes NLP Models Adversarially Robust by Mrigank Raman, Pratyush Maini, J. Zico Kolter, Zachary C. Lipton, and Danish Pruthi. The paper presents Model-tuning Via Prompts (MVP), a method that significantly improves the adversarial robustness of pretrained language models compared with the standard approach of fine-tuning with a multilayer perceptron head (MLP-FT). Instead of attaching an MLP head to the model, MVP appends a prompt to the input. This yields an average 8% improvement in performance against adversarial attacks across various datasets and models, and it even surpasses state-of-the-art defenses by 3.5%. The research suggests that MVP's robustness gains stem from better alignment with pre-training tasks and from avoiding the vulnerabilities introduced by the random initialization of MLP parameters.
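For listeners who want a concrete picture of the prompt-instead-of-head idea, here is a minimal sketch in plain Python. It only illustrates the general pattern: append a template containing a mask slot to the input, then read class scores from the language model's predictions for a few candidate words. The template text, the candidate words, the function names, and the logit values are all hypothetical placeholders, not taken from the paper, and no real model is called.

```python
import math

# Hypothetical template and candidate words, chosen for illustration only.
TEMPLATE = "{text} This is [MASK]."
VERBALIZER = {"positive": "good", "negative": "bad"}


def build_prompt(text, template=TEMPLATE):
    """Append the prompt template to the raw input text."""
    return template.format(text=text)


def class_probabilities(mask_logits, verbalizer=VERBALIZER):
    """Turn logits at the [MASK] position into class probabilities.

    mask_logits maps a vocabulary word to its logit at the mask slot.
    Only the candidate words in the verbalizer are compared, via a
    softmax restricted to those words.
    """
    scores = {label: mask_logits[word] for label, word in verbalizer.items()}
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


if __name__ == "__main__":
    prompt = build_prompt("A charming, well-acted film.")
    print(prompt)
    # Made-up logits standing in for a masked language model's output.
    fake_logits = {"good": 2.0, "bad": -1.0, "the": 0.5}
    print(class_probabilities(fake_logits))
```

In this setup the pretrained language-modeling head does the classification, so no randomly initialized MLP parameters are added on top of the model, which is the contrast with MLP-FT that the episode describes.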