TensorOpera Router: A Multi-Model Router for Efficient LLM Inference

Dimitris Stripelis | Zhaozhuo Xu | Zijian Hu | Alay Dilipbhai Shah | Han Jin | Yuhang Yao | Jipeng Zhang | Tong Zhang | Salman Avestimehr | Chaoyang He

Paper Details:

Month: November
Year: 2024
Location: Miami, Florida, US
Venue: EMNLP