VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music
Jiatong Shi | Hye-jin Shim | Jinchuan Tian | Siddhant Arora | Haibin Wu | Darius Petermann | Jia Qi Yip | You Zhang | Yuxun Tang | Wangyou Zhang | Dareen Safar Alharthi | Yichen Huang | Koichi Saito | Jionghao Han | Yiwen Zhao | Chris Donahue | Shinji Watanabe
Paper Details:
Month: April
Year: 2025
Location: Albuquerque, New Mexico
Venue: NAACL | WS
URL
https://github.com/
https://github.com/wavlab-speech/versa
https://youtu
https://github.com/espnet/espnet
https://github.com/open-mmlab/Amphion
https://github.com/unilight/sheet
https://pypi.org/project/speechmos
https://pypi.org/project/fast-bss-eval
https://github.com/modelscope/ClearerVoice-Studio
https://github.com/haoheliu/audioldm_eval
https://github.com/Stability-AI/stable-audio-metrics
https://github.com/SonyCSLParis/audio-metrics
https://github.com/microsoft/fadtk
https://github.com/schmiph2/pysepm
https://github.com/facebookresearch/audiocraft/blob/main/docs/METRICS.md
https://github.com/Ashvala/AQUA-Tk
https://github.com/shinjiwlab/versa
https://github.com/modelscope/clearervoice-studio
https://keithito.com/
https://github.com/stability-
https://colab.research.google.com/drive/
https://github.com/wavlab-speech/
https://huggingface.co/facebook/encodec_24khz
https://github.com/descriptinc/descript-audio-codec/releases/download/0.0.4/weights_24khz.pth
https://huggingface.co/fnlp/AnyGPT-speech-modules/tree/main/speechtokenizer
https://huggingface.co/Dongchao/UniAudio/resolve/main/16k_50dim_9.zip
https://huggingface.co/espnet/owsmdata_soundstream_16k_200epoch
https://huggingface.co/ftshijt/espnet_codec_dac_large_v1.4_360epoch
https://huggingface.co/kyutai/mimi
https://huggingface.co/Alethia/BigCodec/resolve/main/bigcodec.pt
https://huggingface.co/novateur/WavTokenizer-large-speech-75token
https://github.com/espnet/espnet/tree/
https://huggingface.co/espnet/kan-bayashi_ljspeech_vits
https://huggingface.co/espnet/speechlm_tts_v1
https://huggingface.co/2Noise/ChatTTS
https://huggingface.co/model-scope/CosyVoice-300M
https://www.modelscope.cn/syq163/outputs.git
https://huggingface.co/myshell-ai/MeloTTS-English
https://huggingface.co/parler-tts/parler-tts-mini-v1
https://huggingface.co/WhisperSpeech/WhisperSpeech/blob/main/t2s-v1.95-small-8lang.model
https://huggingface.co/Plachta/VALL-E-X/resolve/main/vallex-checkpoint.pt
https://huggingface.co/amphion/valle
https://huggingface.co/amphion/naturalspeech2_libritts
https://huggingface.co/espnet/opencpop_naive_rnn_dp
https://huggingface.co/espnet/opencpop_xiaoice
https://github.com/MoonInTheRiver/DiffSinger/releases/download/pretrain-model/0228_opencpop_ds100_rel.zip
https://huggingface.co/espnet/opencpop_visinger
https://huggingface.co/espnet/opencpop_visinger2
https://huggingface.co/espnet/opencpop_svs2_toksing_pretrain
https://huggingface.co/yifengyu/svs_train_visinger2plus_mert_raw_phn_None_zh_200epoch
https://huggingface.co/cvssp/audioldm2-music
https://huggingface.co/facebook/musicgen-large
https://github.com/RetroCirce/MusicLDM?tab=readme-ov-file#step-4-run-musicldm
https://huggingface.co/riffusion/riffusion-model-v1
https://huggingface.co/stabilityai/stable-audio-open-1.0
https://github.com/LAION-AI/CLAP