Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Youliang Yuan | Wenxiang Jiao | Wenxuan Wang | Jen-tse Huang | Jiahao Xu | Tian Liang | Pinjia He | Zhaopeng Tu
Paper Details:
Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL
Citations: No Citations Yet
URLs:
https://www.anthropic.com/news/
https://github.com/tatsu-lab/alpaca_eval
https://llama.meta.com/llama3/
https://cdn
https://old.reddit.com/r/ChatGPT/
https://www.jailbreakchat.com/
https://github.com/llm-attacks/llm-attacks/
Field Of Study