Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search

Linhao Yu | Xingguang Ji | Yahui Liu | Fanheng Kong | Chenxi Sun | Jingyuan Zhang | Hongzhi Zhang | V. W. | Fuzheng Zhang | Deyi Xiong

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL