FINCH: Prompt-guided Key-Value Cache Compression for Large Language Models

Giulio Corallo | Paolo Papotti

Paper Details:

Year: 2024
Location: Cambridge, MA
Venue: TACL