NLPExplorer

HiddenDetect: Detecting Jailbreak Attacks against Multimodal Large Language Models via Monitoring Hidden States

Yilei Jiang | Xinyan Gao | Tianshuo Peng | Yingshui Tan | Xiaoyong Zhu | Bo Zheng | Xiangyu Yue

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL

Citations: No Citations Yet
URL: https://github.com/leigest519/

Field Of Study