CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers

Longwei Zou | Qingyang Wang | Han Zhao | Jiangang Kong | Yi Yang | Yangdong Deng

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL
