RelayAttention for Efficient Large Language Model Serving with Long System Prompts
Lei Zhu | Xinjiang Wang | Wayne Zhang | Rynson Lau
Paper Details:
Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL
Citations: No Citations Yet
URLs:
https://github.com/rayleizhu/vllm-ra
https://xxxxx/yyy
https://github.com/vllm-project/vllm
https://www.anthropic.com/in
https://github.com
https://bard.google.com
https://www.microsoft.com/en
https://github.com/NVIDIA/Tensor
https://openai.com/research/tr
https://openai.com/blog/chatgp
https://openai.com/blog/custom
https://sharegpt.com/
Field Of Study