from Benign import Toxic: Jailbreaking the Language Model via Adversarial Metaphors

Yu Yan | Sheng Sun | Zenghao Duan | Teli Liu | Min Liu | Zhiyi Yin | LeiJingyu LeiJingyu | Qi Li

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL