Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding
Jun Zhang | Jue Wang | Huan Li | Lidan Shou | Ke Chen | Gang Chen | Sharad Mehrotra
Paper Details:
Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL
Citations: No Citations Yet
URL:
https://github.com/dilab-zju/
https://github.com/bayesian-optimization/
https://github.com/
https://github.com/huggingface/accelerate
https://github.com/TimDettmers/bitsandbytes
https://github.com/locuslab/wanda