Scaling LLM Inference Efficiently with Optimized Sample Compute Allocation

Kexun Zhang | Shang Zhou | Danqing Wang | William Yang Wang | Lei Li

Paper Details:

Month: April
Year: 2025
Location: Albuquerque, New Mexico
Venue: NAACL
