ComfyUI-EdgeTTS

★ 68

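Text-to-speech · Microsoft Edge TTS · Multilingual support · ComfyUI node
A ComfyUI text-to-speech node based on Microsoft Edge TTS. It supports multiple languages and voices, is easy to integrate and customize, and generates natural-sounding speech to improve interactive experiences.
💡 Convert text to natural speech inside a ComfyUI workflow, for interactive prompts or voice-overs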
🍴 8 Forks · 💻 Python · 🔄 2026-01-25
📦 requirements.txt
edge-tts>=7.0.0
torchaudio
torchcodec==0.9
openai-whisper>=20231117
numpy
#mutagen>=1.47.0  # For audio metadata handling
torch
googletrans-py>=3.0.0
deep-translator>=1.11.4
edgeTTS
TTS-STT
📄 README

ComfyUI Audio Nodes

ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging Microsoft’s Edge TTS capabilities. It enables seamless conversion of text into natural-sounding speech, supporting multiple languages and voices. Ideal for enhancing user interactions, this node is easy to integrate and customize, making it perfect for various applications.

https://github.com/user-attachments/assets/a5b9165b-a413-49fd-989e-0ef3141afce7

Updates

  • V1.2.2 (2026-01-25) – Voice ID update and requirements update. For more information, please see the update log.
  • V1.2.1 (2025-07-23) – Voice ID update and bug fixes. For more information, please see the update log.
  • V1.2.0 (2025-06-20) – Simplified voice display format, improved performance with lazy loading and caching, and reduced memory usage. For more information, please see the update log.
  • V1.1.0 (2025-01-24) – Added 19 new languages and 38 new voices, with more detailed characteristics for existing Chinese voices. For more information, please see the update log.

Features

Edge TTS Node

  • Edge TTS: Convert text to speech using Microsoft Edge TTS
  • Multiple languages and voices support
  • Adjustable speech rate and pitch
  • High-quality voice synthesis
  • Configurable via config.json

Speech to Text Node

  • Whisper STT: High-accuracy speech recognition
  • Multiple language support with auto-detection
  • Multiple model sizes (tiny to large)
  • Supports ComfyUI audio format
  • Language detection confidence reporting

Audio File Node

  • Save Audio: Export audio files
  • Supports WAV, MP3, FLAC formats
  • Quality presets (high/medium/low)
  • Custom file naming and paths
  • Automatic file numbering
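
The automatic file numbering listed above can be pictured with a small standard-library sketch. This is not the node's actual code: the function name `next_numbered_path`, the `<prefix>_<NNNN>.<ext>` naming scheme, and the four-digit padding are all assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path


def next_numbered_path(directory: Path, prefix: str, ext: str) -> Path:
    """Return the next unused path of the form <prefix>_<NNNN>.<ext>.

    Hypothetical helper: the real Save Audio node may use another scheme.
    """
    directory.mkdir(parents=True, exist_ok=True)
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.{re.escape(ext)}$")
    # Collect the numbers already used by files with the same prefix and extension.
    used = [
        int(match.group(1))
        for path in directory.iterdir()
        if (match := pattern.match(path.name))
    ]
    next_index = max(used, default=0) + 1
    return directory / f"{prefix}_{next_index:04d}.{ext}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        first = next_numbered_path(out_dir, "speech", "wav")
        first.touch()  # simulate saving the first file
        second = next_numbered_path(out_dir, "speech", "wav")
        print(first.name, second.name)  # speech_0001.wav speech_0002.wav
```

Taking the highest existing number plus one, rather than the first gap, keeps new files sorted after older ones even if some were deleted.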

Installation

Method 1. Install through ComfyUI-Manager: search for ComfyUI-EdgeTTS and install it.

Then install the packages listed in the requirements.txt found in the ComfyUI-EdgeTTS folder:

```bash
./ComfyUI/python_embeded/python -m pip install -r requirements.txt
```

Method 2. Clone this repository into your ComfyUI custom_nodes folder:

```bash
cd ComfyUI/custom_nodes
git clone https://github.com/1038lab/ComfyUI-EdgeTTS.git
```

Then install the packages listed in the requirements.txt found in the ComfyUI-EdgeTTS folder:

```bash
./ComfyUI/python_embeded/python -m pip install -r requirements.txt
```

Requirements

  • Python packages (see requirements.txt)
  • FFmpeg on system PATH (required by Whisper STT)
  • Example (Windows PowerShell): run $env:Path += ";F:\FFmpeg\bin", then restart ComfyUI
  • CUDA compatible GPU (optional, for faster Whisper processing)
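
Before running the Whisper STT node, it may help to confirm that FFmpeg is actually visible on PATH. The following is a minimal standard-library sketch; the helper names `find_executable` and `version_line` are illustrative and not part of this project.

```python
import shutil
import subprocess
from typing import Optional


def find_executable(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def version_line(executable: str) -> str:
    """Return the first line printed by `<executable> -version`."""
    result = subprocess.run(
        [executable, "-version"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[0]


if __name__ == "__main__":
    ffmpeg_path = find_executable()
    if ffmpeg_path is None:
        print("ffmpeg was not found on PATH; Whisper STT will not be able to decode audio.")
    else:
        print(f"Found {ffmpeg_path}: {version_line(ffmpeg_path)}")
```

Run it with the same Python interpreter and environment that launches ComfyUI, since PATH can differ between shells.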

Important: torchaudio 2.9+ requires torchcodec

    If you use torch/torchaudio 2.9 or newer, torchaudio.load/save requires torchcodec.

    Recommended fix (match PyTorch 2.9.x):

    ./ComfyUI/python_embeded/python -m pip uninstall -y torchcodec
    ./ComfyUI/python_embeded/python -m pip install --no-cache-dir "torchcodec==0.9"
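
To see whether the note above applies to a given environment, the installed versions can be inspected without importing torch. This is a rough sketch using only `importlib.metadata`; the helper names are hypothetical, and it only checks that torchcodec is present, not that its version matches the installed PyTorch.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Parse a version such as '2.9.1+cu128' into (2, 9)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def torchcodec_missing(torchaudio_version: Optional[str],
                       torchcodec_version: Optional[str]) -> bool:
    """True when torchaudio is 2.9 or newer but torchcodec is not installed."""
    if torchaudio_version is None:
        return False
    return major_minor(torchaudio_version) >= (2, 9) and torchcodec_version is None


if __name__ == "__main__":
    audio = installed_version("torchaudio")
    codec = installed_version("torchcodec")
    print(f"torchaudio={audio} torchcodec={codec}")
    if torchcodec_missing(audio, codec):
        print("torchaudio 2.9+ detected without torchcodec; see the fix above.")
```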

Text to Speech

  • Add Edge TTS node to workflow
  • Input text and select voice
  • Adjust speed and pitch if needed
  • Connect to Save Audio node for export
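
Outside the ComfyUI graph, the same steps (text, voice, rate and pitch, then export) can be sketched by calling the edge-tts package directly. This is only an approximation of what the node does, not its implementation. The helper `signed` and the wrapper `synthesize` are illustrative names, and the voice ID `en-US-JennyNeural` and output file name are example values.

```python
import asyncio


def signed(value: int, unit: str) -> str:
    """Format an integer with an explicit sign, e.g. '+10%' or '-5Hz'.

    edge-tts expects rate and pitch as signed strings in this form.
    """
    return f"{value:+d}{unit}"


async def synthesize(text: str, voice: str, out_path: str,
                     rate_percent: int = 0, pitch_hz: int = 0) -> None:
    """Synthesize `text` with edge-tts and write the audio to `out_path`."""
    import edge_tts  # third-party; imported here so the helpers work without it

    communicate = edge_tts.Communicate(
        text,
        voice,
        rate=signed(rate_percent, "%"),
        pitch=signed(pitch_hz, "Hz"),
    )
    await communicate.save(out_path)


if __name__ == "__main__":
    # Formatting only; this part needs no network access.
    print(signed(10, "%"), signed(-5, "Hz"))  # +10% -5Hz
    # To synthesize (requires edge-tts installed and network access):
    # asyncio.run(synthesize("Hello from ComfyUI.", "en-US-JennyNeural",
    #                        "hello.mp3", rate_percent=10, pitch_hz=-5))
```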

Speech to Text

  • Add Whisper STT node
  • Connect audio input
  • Select model size and language (or auto-detect)
  • Run to get transcription
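
For reference, the steps above roughly correspond to the following use of the openai-whisper package, including the language detection confidence mentioned under Features. This is a sketch under assumptions, not the node's code: `transcribe_with_confidence` and `most_likely_language` are illustrative names, and the node may load audio and report confidence differently.

```python
from typing import Dict, Tuple


def most_likely_language(probs: Dict[str, float]) -> Tuple[str, float]:
    """Pick the language code with the highest probability."""
    language = max(probs, key=probs.get)
    return language, probs[language]


def transcribe_with_confidence(audio_path: str, model_size: str = "tiny") -> dict:
    """Detect the spoken language, then transcribe the file with Whisper."""
    import whisper  # third-party (openai-whisper); needs FFmpeg on PATH

    model = whisper.load_model(model_size)
    # Language detection runs on a 30-second window of the audio.
    audio = whisper.pad_or_trim(whisper.load_audio(audio_path))
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)
    _, probs = model.detect_language(mel)
    language, confidence = most_likely_language(probs)
    result = model.transcribe(audio_path, language=language)
    return {
        "text": result["text"].strip(),
        "language": language,
        "confidence": confidence,
    }


if __name__ == "__main__":
    # Hand-written probabilities, only to show the helper's behaviour.
    print(most_likely_language({"en": 0.9, "fr": 0.1}))  # ('en', 0.9)
```

Smaller models such as `tiny` load quickly but are less accurate; larger ones benefit most from a CUDA GPU.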

Supported Voices

| Language | Female Voices | Male Voices |
|----------|---------------|-------------|
| Chinese-Mainland | XiaoXiao (News, Novel, Warm), XiaoYi (Cartoon, Novel, Lively) | Yunjian (Sports, Novel, Passion), Yunxi (Novel, Lively), Yunxia (Cartoon, Novel), Yunyang (News, Professional) |
| Chinese-Cantonese | HiuGaai (Friendly), HiuMaan (Friendly) | WanLung (Friendly) |
| Chinese-Taiwan | HsiaoChen (Friendly), HsiaoYu (Friendly) | YunJhe (Friendly) |
| English-US | Jenny (Friendly), Aria (Positive), Ana (Cute), Michelle (Friendly) | Guy (Passion), Christopher (Authority), Eric (Rational), Roger (Lively), Steffan (Rational) |
| English-GB | Libby (Friendly), Maisie (Friendly), Sonia (Friendly) | Ryan (Friendly), Thomas (Friendly) |
| English-AU | Natasha (Friendly) | William (Friendly) |
| Japanese | Nanami (Friendly) | Keita (Friendly) |
| Korean | SunHi (Friendly) | InJoon (Friendly), Hyunsu (Multilingual) |
| French-FR | Denise (Friendly), Eloise (Friendly), Vivienne (Multilingual) | Henri (Friendly), Remy (Multilingual) |
| French-CA | Sylvie (Friendly) | Jean (Friendly), Antoine (Friendly) |
| German-DE | Katja (Friendly), Amala (Friendly), Seraphina (Multilingual) | Conrad (Friendly), Killian (Friendly), Florian (Multilingual) |

    More voices are available in config.json, including voices for:

  • German (AT/CH)
  • Spanish (ES/MX)
  • Russian
  • Italian
  • Portuguese (BR/PT)
  • Dutch
  • Polish
  • Turkish
  • Arabic
  • Hindi
  • Indonesian
  • Vietnamese
  • Thai
  • Ukrainian
  • And many more…

    Each language provides at least one male and one female voice, so you can choose a voice style that fits your needs.
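
To explore which voices exist for a locale, one option is to query the edge-tts package and group the results by gender. The sketch below does not read this project's config.json, whose layout is not described here. It assumes the voice records carry `Locale`, `Gender`, and `ShortName` keys, as returned by `edge_tts.list_voices()`, and `voices_by_gender` is an illustrative helper name.

```python
import asyncio
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def voices_by_gender(voices: Iterable[Mapping[str, str]],
                     locale: str) -> Dict[str, List[str]]:
    """Group the voice short names of one locale by gender."""
    grouped = defaultdict(list)
    for voice in voices:
        if voice["Locale"] == locale:
            grouped[voice["Gender"]].append(voice["ShortName"])
    return {gender: sorted(names) for gender, names in grouped.items()}


async def fetch_voices() -> list:
    """Fetch the current voice list from the service (needs network access)."""
    import edge_tts  # third-party; imported here so the helper works without it

    return await edge_tts.list_voices()


if __name__ == "__main__":
    # Hand-written sample records, so the example runs offline.
    sample = [
        {"Locale": "en-US", "Gender": "Female", "ShortName": "en-US-JennyNeural"},
        {"Locale": "en-US", "Gender": "Male", "ShortName": "en-US-GuyNeural"},
        {"Locale": "ja-JP", "Gender": "Female", "ShortName": "ja-JP-NanamiNeural"},
    ]
    print(voices_by_gender(sample, "en-US"))
    # With network access: voices_by_gender(asyncio.run(fetch_voices()), "en-US")
```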

Credits

  • Edge TTS: Microsoft Edge TTS
  • Whisper: OpenAI Whisper