Disambiguating Reference in Visually Grounded Dialogues through Joint Modeling of Textual and Multimodal Semantic Structures

Shun Inadumi | Nobuhiro Ueda | Koichiro Yoshino

Paper Details:

Month: July
Year: 2025
Location: Vienna, Austria
Venue: ACL