NLPExplorer

BlackboxNLP - 2024

Total Papers: 36

Total Papers across all years: 176

Total Citations: 0

Language Models Linearly Represent Sentiment

Curt Tigges | Oskar J. Hollinsworth | Atticus Geiger | Neel Nanda

Probing Language Models on Their Knowledge Source

Zineddine Tighidet | Jiali Mei | Benjamin Piwowarski | Patrick Gallinari

Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers

Amit Ben Artzy | Roy Schwartz

Investigating Layer Importance in Large Language Models

Yang Zhang | Yanfei Dong | Kenji Kawaguchi

On the alignment of LM language generation and human language comprehension

Lena Sophia Bolliger | Patrick Haller | Lena Ann Jäger

How Language Models Prioritize Contextual Grammatical Cues?

Hamidreza Amirzadeh | Afra Alishahi | Hosein Mohebbi

MultiContrievers: Analysis of Dense Retrieval Representations

Seraphina Goldfarb-Tarrant | Pedro Rodriguez | Jane Dwivedi-Yu | Patrick Lewis

IvRA: A Framework to Enhance Attention-Based Explanations for Language Models with Interpretability-Driven Training

Sean Xie | Soroush Vosoughi | Saeed Hassanpour

Multi-property Steering of Large Language Models with Dynamic Activation Composition

Daniel Scalena | Gabriele Sarti | Malvina Nissim

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

Self-Assessment Tests are Unreliable Measures of LLM Personality

Akshat Gupta | Xiaoyang Song | Gopala Anumanchipalli

Can We Statically Locate Knowledge in Large Language Models? Financial Domain and Toxicity Reduction Case Studies

Routing in Sparsely-gated Language Models responds to Context

Stefan Arnold | Marian Fietta | Dilara Yesilbas

Attribution Patching Outperforms Automated Circuit Discovery

Aaquib Syed | Can Rager | Arthur Conmy

Wrapper Boxes for Faithful Attribution of Model Predictions to Training Data

Yiheng Su | Junyi Jessy Li | Matthew Lease