TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection

Wei Wu | Zhuoshi Pan | Kun Fu | Chao Wang | Liyi Chen | Yunchu Bai | Tianfu Wang | Zheng Wang | Hui Xiong

Paper Details:

Month: November
Year: 2025
Location: Suzhou, China
Venue: EMNLP