AIR-ML
Jailbreak Attack
GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
We introduce Generative Adversarial Suffix Prompter (GASP), a novel framework that combines human-readable prompt generation with Latent Bayesian Optimization (LBO) to improve adversarial suffix creation in a fully black-box setting.
Advik Raj Basani, Xiao Zhang
Generative Adversarial Suffix Prompter (GASP)
In our NeurIPS'25 paper, we develop the Generative Adversarial Suffix Prompter (GASP), a novel black-box attack framework that uses latent Bayesian optimization to generate human-readable adversarial suffixes.
Advik Raj Basani, Xiao Zhang
AdvSuffixes Dataset