ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback
Ju-Seung Byun |
Jiyun Chun |
Jihyung Kil |
Andrew Perrault |
Paper Details:
Month: November
Year: 2024
Location: Miami, Florida, USA
Venue:
EMNLP |