Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
Yang Ouyang | Hengrui Gu | Shuhang Lin | Wenyue Hua | Jie Peng | Bhavya Kailkhura | Meijun Gao | Tianlong Chen | Kaixiong Zhou
Paper Details:
Month: April
Year: 2025
Location: Albuquerque, New Mexico
Venue: NAACL
Citations: No Citations Yet
URL: https://github.com/
Field Of Study