TokenSmith: Streamlining Data Editing, Search, and Inspection for Large-Scale Language Model Training and Interpretability
Mohammad Aflah Khan | Ameya Godbole | Johnny Wei | Ryan Yixiang Wang | James Flemings | Krishna P. Gummadi | Willie Neiswanger | Robin Jia
Paper Details:
Month: November
Year: 2025
Location: Suzhou, China
Venue: EMNLP
Citations: No Citations Yet
URL:
https://github.com/aflah02/TokenSmith
https://aflah02.github.io/TokenSmith/
https://www.youtube.com/watch?v=cDO8VE9fZvU
https://github.com/NVIDIA/NeMo
https://github.com/LLM360/k2-train
https://github.com/multimodal-art-projection/
https://github.com/alibaba/Megatron-LLaMA
https://huggingface.co/collections/swiss-ai/
https://github.com/thunlp/Seq1F1B
https://github.com/sail-sg/
https://github.com/kwai/Megatron-Kwai
https://github.com/ROCm/Megatron-LM
https://github.com/EleutherAI/tokengrams
https://github.com/aflah02/tokensmith?tab=
https://streamlit.io/
https://github.com/EleutherAI/tokengrams?tab=
https://docs.python.org/3/library/gc.html
https://github.com/aflah02/TokenSmith/tree/
https://peps.python.org/pep-0484/