NLPExplorer

PrivacyScalpel: Enhancing LLM Privacy via Interpretable Feature Intervention with Sparse Autoencoders

Ahmed Frikha | Muhammad Reza Ar Razi | Krishna Kanth Nakka | Ricardo Mendes | Xue Jiang | Xuebing Zhou

Paper Details:

Month: November
Year: 2025
Location: Suzhou, China
Venue: BlackboxNLP | WS

Field Of Study