Rescorla-Wagner Steering of LLMs for Undesired Behaviors over Disproportionate Inappropriate Context

Rushi Wang | Jiateng Liu | Cheng Qian | Yifan Shen | Yanzhou Pan | Zhaozhuo Xu | Ahmed Abbasi | Heng Ji | Denghui Zhang

Paper Details:

Month: November
Year: 2025
Location: Suzhou, China
Venue: EMNLP
