Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization

Seungwoo Son | Wonpyo Park | Woohyun Han | Kyuyeun Kim | Jaeho Lee

Paper Details:

Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: EMNLP

Citations: None yet

URL: None found
