BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models
Yi Zeng | Weiyu Sun | Tran Huynh | Dawn Song | Bo Li | Ruoxi Jia
Paper Details:
Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: EMNLP
Citations: No Citations Yet
URLs:
https://trojandetection.ai/
https://huggingface.co/ethz-spylab/poisoned_
https://www.virtueai
https://github
https://github.com/fra31/
https://github.com/KrystofM/rlhf_
https://github.com/neverix/
https://github.com/vinid/
https://huggingface.co/datasets/
https://huggingface.co/datasets/
https://github.com/google-research/
https://github.com/CaoYuanpu/
https://www.alignmentforum
https://huggingface.co/datasets/tatsu-lab/
https://github.com/CommissarSilver/CVT/tree/
Field Of Study