Why Safeguarded Ships Run Aground? Aligned Large Language Models’ Safety Mechanisms Tend to Be Anchored in The Template Region
Chak Tou Leong
Qingyu Yin
Jian Wang
Wenjie Li
Paper Details:
Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL