Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls
Feiyang Kang | Newsha Ardalani | Michael Kuchnik | Youssef Emad | Mostafa Elhoushi | Shubhabrata Sengupta | Shang-Wen Li | Ramya Raghavendra | Ruoxi Jia | Carole-Jean Wu
Paper Details:
Month: November
Year: 2025
Location: Suzhou, China
Venue: EMNLP
Citations: No Citations Yet
URL: https://ai.meta.com/blog/llama-4-multimodal-
Field Of Study