Dynamic Rewarding with Prompt Optimization Enables Tuning-free Self-Alignment of Language Models

Somanshu Singla | Zhen Wang | Tianyang Liu | Abdullah Ashfaq | Zhiting Hu | Eric P. Xing

Paper Details:

Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: EMNLP