Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling

Xianzhen Luo | Yixuan Wang | Qingfu Zhu | Zhiming Zhang | Xuanyu Zhang | Qing Yang | Dongliang Xu

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL