GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis
Yueqi Xie | Minghong Fang | Renjie Pi | Neil Gong
Paper Details:
Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL
Citations: No Citations Yet
URL:
https://github.com/xyq7/GradSafe
https://platform.openai.com/finetune
https://platform.openai.com/docs/guides/
https://perspectiveapi.com/
https://azure.microsoft.com/en-us/products/
https://www
https://openreview.net/forum?
https://www
https://scikit-learn.org/stable/
Field Of Study