Sharper and Faster mean Better: Towards More Efficient Vision-Language Model for Hour-scale Long Video Understanding

Daoze Zhang | Yuze Zhao | Jintao Huang | Yingda Chen

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL