NLPExplorer
Stephen Casper
Number of Papers:- 2
Number of Citations:- 0
First ACL Paper:- 2020
Latest ACL Paper:- 2023
Venues:-
EMNLP
BlackboxNLP
NLP4ConvAI
WS
ACL
Co-Authors:-
Abdelrhman Saleh
Dylan Hadfield-Menell
Jacob Andreas
Kevin Liu
Stuart M Shieber
2025
What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks
BlackboxNLP | WS
Nathalie Maria Kirch | Constantin Niko Weisser | Severin Field | Helen Yannakoudakis | Stephen Casper
2023
Cognitive Dissonance: Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness?
EMNLP
Kevin Liu | Stephen Casper | Dylan Hadfield-Menell | Jacob Andreas
2020
Probing Neural Dialog Models for Conversational Understanding
ACL | NLP4ConvAI | WS
Abdelrhman Saleh | Tovly Deutsch | Stephen Casper | Yonatan Belinkov | Stuart Shieber