NLPExplorer
BlackboxNLP - 2025
Total Papers: 33
Total Papers across all years: 176
Total Citations: 0
Findings of the BlackboxNLP 2025 Shared Task: Localizing Circuits and Causal Variables in Language Models
Dana Arad | Yonatan Belinkov | Hanjie Chen | Najoung Kim | Hosein Mohebbi | Aaron Mueller | Gabriele Sarti | Martin Tutek
Fine-Grained Manipulation of Arithmetic Neurons
Wenyu Du | Rui Zheng | Tongxu Luo | Stephen Chung | Jie Fu
The Unheard Alternative: Contrastive Explanations for Speech-to-Text Models
Lina Conti | Dennis Fucci | Marco Gaido | Matteo Negri | Guillaume Wisniewski | Luisa Bentivogli
Language Dominance in Multilingual Large Language Models
Nadav Shani | Ali Basirat
Exploring Large Language Models’ World Perception: A Multi-Dimensional Evaluation through Data Distribution
Zhi Li | Jing Yang | Ying Liu
Not a nuisance but a useful heuristic: Outlier dimensions favor frequent tokens in language models
Iuri Macocco | Nora Graichen | Gemma Boleda | Marco Baroni
The Comparative Trap: Pairwise Comparisons Amplifies Biased Preferences of LLM Evaluators
Hawon Jeong | ChaeHun Park | Jimin Hong | Hojoon Lee | Jaegul Choo
CE-Bench: Towards a Reliable Contrastive Evaluation Benchmark of Interpretability of Sparse Autoencoders
Alex Gulko | Yusen Peng | Sachin Kumar
Emergent Convergence in Multi-Agent LLM Annotation
Angelina Parfenova | Alexander Denzler | Jürgen Pfeffer
PrivacyScalpel: Enhancing LLM Privacy via Interpretable Feature Intervention with Sparse Autoencoders
Ahmed Frikha | Muhammad Reza Ar Razi | Krishna Kanth Nakka | Ricardo Mendes | Xue Jiang | Xuebing Zhou
Can LLMs Detect Ambiguous Plural Reference? An Analysis of Split-Antecedent and Mereological Reference
Dang Thi Thao Anh | Rick Nouwen | Massimo Poesio
Mechanistic Fine-tuning for In-context Learning
Hakaze Cho | Peng Luo | Mariko Kato | Rin Kaenbyou | Naoya Inoue
Evil twins are not that evil: Qualitative insights into machine-generated prompts
Nathanaël Carraz Rakotonirina | Corentin Kervadec | Francesca Franzon | Marco Baroni
Steering Prepositional Phrases in Language Models: A Case of with-headed Adjectival and Adverbial Complements in Gemma-2
Stefan Arnold | Rene Gröbner
A Pipeline to Assess Merging Methods via Behavior and Internals
Yutaro Sigrist | Andreas Waldis
Conference Topic Distribution (chart; categories: Linguistic, Task, Approach, Language, Dataset)
Conference Citation Distribution
Conference Papers have no Citations yet