SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation

Aurick Qiao | Zhewei Yao | Samyam Rajbhandari | Yuxiong He

Paper Details:

Month: November
Year: 2025
Location: Suzhou, China
Venue: EMNLP