Break the Breakout: Reinventing LM Defense Against Jailbreak Attacks with Self-Refine
Heegyu Kim | Hyunsouk Cho
Paper Details:
Month: May
Year: 2025
Location: Albuquerque, New Mexico
Venue: TrustNLP | WS
Citations: No Citations Yet
URL:
https://github.com/HeegyuKim/refine-a-broken
https://www.ets.org/toeic.html
https://huggingface.co/PKU-Alignment/beaver-7b-v1.0-reward
https://huggingface.co/PKU-Alignment/beaver-7b-v1.0-cost
https://github.com/arobey1/smooth-llm
Field Of Study