Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?
Cheng Zhang | Jianyi Cheng | Ilia Shumailov | George Constantinides | Yiren Zhao
Paper Details:
Month: December
Year: 2023
Location: Singapore
Venue: EMNLP
Citations: No Citations Yet
URLs:
https://github.com/ChengZhang-98/llm-mixed-q
https://github.com/huggingface/transformers
https://github.com/pytorch/pytorch
https://optuna.readthedocs.io/en/stable/
https://github.com/IST-DASLab/gptq
https://github.com/mit-han-lab/smoothquant