QUIK: Towards End-to-end 4-Bit Inference on Generative Large Language Models
Saleh Ashkboos | Ilia Markov | Elias Frantar | Tingxuan Zhong | Xincheng Wang | Jie Ren | Torsten Hoefler | Dan Alistarh
Paper Details:
Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: EMNLP
Citations: No Citations Yet
URL: https://huggingface.co/tiiuae
Field Of Study