AIR-ML
Article
Understanding Adversarially Robust Generalization via Weight-Curvature Index
We introduce the Weight-Curvature Index (WCI), a novel metric that captures the interplay between model parameters and loss landscape curvature to better understand and improve adversarially robust generalization in deep learning.
Yuelin Xu, Xiao Zhang
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
We propose AutoDefense, a multi-agent defense framework that filters harmful responses from LLMs.
Yifan Zeng, Yiran Wu, Xiao Zhang, Huazheng Wang, Qingyun Wu
Transferable Availability Poisoning Attacks
We propose an availability poisoning attack for generating transferable poisoned data across different victim learners.
Yiyong Liu, Michael Backes, Xiao Zhang