How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective

Teng Xiao | Mingxiao Li | Yige Yuan | Huaisheng Zhu | Chao Cui | Vasant G Honavar

Paper Details:

Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: EMNLP