Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability

Tyler A. Chang | Zhuowen Tu | Benjamin K. Bergen

Paper Details:

Year: 2024
Location: Cambridge, MA
Venue: TACL