conformer
deepspeed; sys_platform == 'linux'
diffusers
grpcio
grpcio-tools
hydra-core
HyperPyYAML
inflect
librosa
lightning
matplotlib
modelscope
networkx
omegaconf
onnxruntime-gpu; sys_platform == 'linux'
onnxruntime; sys_platform == 'darwin' or sys_platform == 'win32'
openai-whisper
protobuf
pydantic
rich
soundfile
tensorboard
wget
gdown
pyarrow
jieba
pypinyin
pydub
audiosegment
srt
ffmpeg-python
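The `sys_platform` markers in the requirements mean that pip installs a different ONNX Runtime package depending on the operating system, and installs deepspeed only on Linux. As a rough illustration of how those markers resolve, the sketch below mirrors them in plain Python. The function names are hypothetical and are not part of this repository; `sys.platform` is the value pip compares `sys_platform` against.

```python
import sys
from typing import Optional


def onnxruntime_requirement(platform: str = sys.platform) -> Optional[str]:
    """Return the ONNX Runtime package the requirement markers would select."""
    if platform == "linux":
        return "onnxruntime-gpu"
    if platform in ("darwin", "win32"):
        return "onnxruntime"
    # No marker matches, so neither variant would be installed.
    return None


def deepspeed_selected(platform: str = sys.platform) -> bool:
    """deepspeed carries the marker sys_platform == 'linux'."""
    return platform == "linux"
```

Calling `onnxruntime_requirement()` with no argument reports the choice for the machine it runs on.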
A ComfyUI custom node for CosyVoice. Example workflows are in the workflows directory.
Supports converting an SRT file to speech with single-voice or multiple-voice cloning.
input
output
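To make the SRT feature more concrete, here is a minimal sketch of how subtitle cues could be split into timed text segments before each one is sent to the synthesizer. It is not this node's actual implementation: `Cue` and `parse_srt` are hypothetical names, and the parser uses only the standard library, whereas the project's requirements list the `srt` package for this job.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches an SRT timestamp line such as "00:00:01,000 --> 00:00:03,500".
_TIME = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media
    text: str


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> List[Cue]:
    """Parse SRT text into cues; blocks without a timestamp line are skipped."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = _TIME.search(line)
            if match:
                g = match.groups()
                # Everything after the timestamp line is the spoken text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues
```

Each resulting `Cue` carries the text to synthesize plus the time window it should fit into, which is the information a per-cue voice clone would need.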
Tested on an RTX 2080 Ti (11 GB) with torch==2.3.0+cu121 and Python 3.10.8.
use case | tts_text | prompt_text | prompt_wav | instruct_text | output
--- | --- | --- | --- | --- | ---
base tts | 你好,我是通义生成式语音大模型,请问有什么可以帮您的吗 | | | |
3s clone tts | 收到好友从远方寄来的生日礼物,那份意外的惊喜与深深的祝福让我心中充满了甜蜜的快乐,笑容如花儿般绽放。 | 希望你以后能够做的比我还好呦。 | | |
cross lingual | “And then later on, fully acquiring that company. So keeping management in line, interest in line with the asset that’s coming into the family is a reason why sometimes we don’t buy the whole thing.” | | | |
instruct | 在面对挑战时,他展现了非凡的勇气与智慧。 | | | Theo 'Crimson', is a fiery, passionate rebel leader. Fights with fervor for justice, but struggles with impulsiveness. |
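The table can be read as a mapping from each use case to the inputs it needs. The sketch below encodes that reading so a workflow could be checked before it runs. `REQUIRED_INPUTS` and `missing_inputs` are hypothetical names, not part of this node. Note one assumption in particular: the prompt_wav cells are blank in the table text, so listing prompt_wav as required for "3s clone tts" and "cross lingual" is inferred from the fact that both modes clone a voice from a reference recording.

```python
from typing import Dict, List

# Field names come from the table header. The prompt_wav entries are an
# assumption, since the table text does not show that column's contents.
REQUIRED_INPUTS: Dict[str, List[str]] = {
    "base tts": ["tts_text"],
    "3s clone tts": ["tts_text", "prompt_text", "prompt_wav"],
    "cross lingual": ["tts_text", "prompt_wav"],
    "instruct": ["tts_text", "instruct_text"],
}


def missing_inputs(use_case: str, inputs: Dict[str, object]) -> List[str]:
    """Return the required field names that are absent or empty."""
    required = REQUIRED_INPUTS[use_case]
    return [name for name in required if not inputs.get(name)]
```

An empty list from `missing_inputs` means every field assumed to be required for that use case has been supplied.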
Tested on Python 3.10, an RTX 2080 Ti (11 GB), and torch==2.3.0+cu121.
Make sure ffmpeg works from your command line.
For Linux:
apt update
apt install ffmpeg
For Windows, you can install ffmpeg automatically with WingetUI.
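One way to confirm that ffmpeg is reachable after installation is a small check like the sketch below. It looks the executable up on PATH and runs `ffmpeg -version`. The helper name `ffmpeg_version` is hypothetical and not part of this repository.

```python
import shutil
import subprocess
from typing import Optional


def ffmpeg_version(executable: str = "ffmpeg") -> Optional[str]:
    """Return the first line of `<executable> -version`, or None if unavailable."""
    path = shutil.which(executable)
    if path is None:
        return None  # not on PATH
    try:
        result = subprocess.run(
            [path, "-version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None
```

If `ffmpeg_version()` returns `None`, the Python process cannot start ffmpeg, and the same is likely true for the node.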
Then:
## in ComfyUI/custom_nodes
git clone https://github.com/AIFSH/CosyVoice-ComfyUI.git
cd CosyVoice-ComfyUI
pip install -r requirements.txt
The model weights will be downloaded from ModelScope.
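If you prefer to fetch the weights ahead of time, something like the following sketch may work. It uses `snapshot_download` from the `modelscope` package with model IDs published by the upstream CosyVoice project. Which models this node actually downloads, and the `pretrained_models` directory layout, are assumptions. `local_dir_for` and `download_weights` are hypothetical helper names.

```python
from pathlib import Path

# Model IDs published by the upstream CosyVoice project on ModelScope.
# Which of these this node fetches is an assumption.
MODEL_IDS = [
    "iic/CosyVoice-300M",
    "iic/CosyVoice-300M-SFT",
    "iic/CosyVoice-300M-Instruct",
]


def local_dir_for(model_id: str, root: str = "pretrained_models") -> Path:
    """Map 'iic/CosyVoice-300M' to '<root>/CosyVoice-300M' (assumed layout)."""
    return Path(root) / model_id.split("/")[-1]


def download_weights(model_id: str, root: str = "pretrained_models") -> Path:
    """Download one model snapshot into its local directory and return the path."""
    # Imported lazily so local_dir_for works without modelscope installed.
    from modelscope import snapshot_download

    target = local_dir_for(model_id, root)
    snapshot_download(model_id, local_dir=str(target))
    return target
```

Calling `download_weights(MODEL_IDS[0])` would need network access and several gigabytes of free disk space.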