# ThinkSound ComfyUI Requirements
# Core dependencies needed for ThinkSound functionality

# Critical ThinkSound dependencies (these were missing and caused errors)
alias-free-torch==0.0.6
descript-audio-codec==1.0.0
vector-quantize-pytorch==1.9.14

# Essential for functionality
einops==0.7.0
open-clip-torch>=2.20.0
huggingface_hub
safetensors
sentencepiece>=0.1.99
tqdm

# Optional but recommended for full compatibility
auraloss==0.4.0
encodec==0.1.1
lightning>=2.0.0
einops-exts==0.0.4
ema-pytorch==0.2.3
k-diffusion==0.1.1
PyWavelets==1.4.1
pandas>=2.0.0
importlib-resources>=5.0.0

A ComfyUI wrapper implementation of ThinkSound – an advanced AI model for generating high-quality audio from text descriptions and video content using Chain-of-Thought (CoT) reasoning.
https://github.com/user-attachments/assets/b3f090a7-fe58-4bb0-8e21-cb19377aa9cf
ThinkSound uses multimodal AI to understand both text and video:
The following Python packages will be installed automatically:
torch>=2.0.1
torchaudio>=2.0.2
torchvision>=0.15.0
transformers>=4.20.0
accelerate>=0.20.0
alias-free-torch==0.0.6
descript-audio-codec==1.0.0
vector-quantize-pytorch==1.9.14
einops==0.7.0
open-clip-torch>=2.20.0
huggingface_hub
safetensors
sentencepiece>=0.1.99
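If imports fail after installation, it can help to compare the installed versions against the exact pins listed above. The following is a minimal sketch using only the standard library; the `PINNED` table and the `check_pins` helper are illustrative names, not part of the wrapper, and only the `==` pins from the list are covered.

```python
from importlib.metadata import PackageNotFoundError, version

# Exact pins copied from the package list above (illustrative subset;
# packages with ">=" constraints are not checked here).
PINNED = {
    "alias-free-torch": "0.0.6",
    "descript-audio-codec": "1.0.0",
    "vector-quantize-pytorch": "1.9.14",
    "einops": "0.7.0",
}


def check_pins(pins, get_version=version):
    """Return {package: problem} for every pin that is missing or mismatched."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if found != wanted:
            problems[name] = f"installed {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    # Prints nothing when every pinned package matches.
    for name, problem in check_pins(PINNED).items():
        print(f"{name}: {problem}")
```

Run it with the same Python interpreter that ComfyUI uses, since a separate virtual environment would report different packages.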
```bash
cd ComfyUI/custom_nodes/
```
```bash
git clone https://github.com/ShmuelRonen/ComfyUI-ThinkSound_Wrapper.git
cd ComfyUI-ThinkSound_Wrapper
```
```
ComfyUI-ThinkSound_Wrapper/
├── __init__.py
├── nodes.py
├── requirements.txt
├── thinksound/
│   ├── data/
│   ├── models/
│   ├── inference/
│   └── …
└── README.md
```
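A quick way to confirm that the clone matches this layout is to test for each listed entry. This is a small sketch, not part of the wrapper: `EXPECTED` and `missing_entries` are hypothetical names, the elided `…` entries are not checked, and the path in the `__main__` block assumes the script is run from the directory that contains `ComfyUI/`.

```python
from pathlib import Path

# Entries taken from the tree above; the elided entries are not checked.
EXPECTED = [
    "__init__.py",
    "nodes.py",
    "requirements.txt",
    "thinksound/data",
    "thinksound/models",
    "thinksound/inference",
]


def missing_entries(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumed location; adjust to where your ComfyUI checkout lives.
    wrapper = Path("ComfyUI/custom_nodes/ComfyUI-ThinkSound_Wrapper")
    for rel in missing_entries(wrapper):
        print(f"missing: {rel}")
```

An empty output means every listed file and folder was found; a missing `thinksound/` subfolder is the likely cause of the "ThinkSound source code not installed" message.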
Option A: Install all dependencies (recommended)
pip install -r requirements.txt
Option B: Install minimal dependencies
pip install torch torchaudio torchvision transformers accelerate
pip install alias-free-torch==0.0.6 descript-audio-codec==1.0.0 vector-quantize-pytorch==1.9.14
pip install einops open-clip-torch huggingface_hub safetensors sentencepiece
🔗 Download Models (Google Drive)
```
ComfyUI/models/thinksound/
├── thinksound_light.ckpt
├── vae.ckpt
├── synchformer_state_dict.pth
└── (other model files)
```
```bash
mkdir -p ComfyUI/models/thinksound
```
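Large downloads sometimes end up missing or truncated, so it may be worth checking the model folder before starting ComfyUI. The sketch below reports the size of each file named in the layout above; `EXPECTED_MODELS` and `model_sizes` are illustrative names, and the relative path assumes the script is run from the directory that contains `ComfyUI/`.

```python
from pathlib import Path

# File names taken from the layout above; "(other model files)" is not checked.
EXPECTED_MODELS = [
    "thinksound_light.ckpt",
    "vae.ckpt",
    "synchformer_state_dict.pth",
]


def model_sizes(model_dir, names=EXPECTED_MODELS):
    """Map each expected file name to its size in bytes, or None if absent."""
    model_dir = Path(model_dir)
    sizes = {}
    for name in names:
        path = model_dir / name
        sizes[name] = path.stat().st_size if path.is_file() else None
    return sizes


if __name__ == "__main__":
    for name, size in model_sizes("ComfyUI/models/thinksound").items():
        if size is None:
            print(f"{name}: missing")
        elif size == 0:
            print(f"{name}: empty file, the download may have failed")
        else:
            print(f"{name}: {size / 1e6:.1f} MB")
```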
```
🎉 ThinkSound modules imported successfully!
✅ SUCCESS: Found FeaturesUtils in thinksound.data.v2a_utils.feature_utils_224
```
After installation, you’ll find these nodes in ComfyUI:
ThinkSound Model Loader: input thinksound_model (select your .ckpt file); output thinksound_model
ThinkSound Feature Utils Loader: inputs vae_model, synchformer_model; output feature_utils

    ThinkSound Model Loader ─────────┐
                                     ├── ThinkSound Sampler ── Audio Output
    ThinkSound Feature Utils Loader ─┘
Example 1: Simple Audio
Caption: "Dog barking"
CoT Description: "Generate the sound of a medium-sized dog barking outdoors. The barking should be natural and energetic, with slight echo to suggest an open space. Include 3-4 distinct barks with realistic timing between them."
Example 2: Complex Scene
Caption: "Ocean waves at beach"
CoT Description: "Create gentle ocean waves lapping against the shore. Add subtle sounds of water receding over sand and pebbles. Include distant seagull calls and a light ocean breeze for natural ambiance."
Example 3: Musical Content
Caption: "Jazz piano"
CoT Description: "Generate a smooth jazz piano melody in a minor key. Include syncopated rhythms, bluesy chord progressions, and subtle improvisation. The tempo should be moderate and relaxing, perfect for a late-night cafe atmosphere."
Issue: “ThinkSound source code not installed”
Solution: Ensure you've downloaded the ThinkSound repository to the 'thinksound' folder
Issue: “ImportError: No module named 'alias_free_torch'”
Solution: Install missing dependencies:
pip install alias-free-torch==0.0.6 descript-audio-codec==1.0.0 vector-quantize-pytorch==1.9.14
Issue: “Input type (float) and bias type (struct c10::Half) should be the same”
Solution: This is resolved automatically with fp32 precision. Restart ComfyUI if you see this error.
Issue: “Tensors must have same number of dimensions”
Solution: Update to the latest version of the nodes. This was fixed in recent updates.
Issue: Models not loading
Solution:
1. Check that models are in ComfyUI/models/thinksound/
2. Verify model file names match the dropdown options
3. Check ComfyUI console for specific error messages
To update the project:
git pull origin main
cd thinksound && git pull

This project is a wrapper implementation based on ThinkSound by FunAudioLLM. Please refer to the original ThinkSound repository for licensing information.
Contributions are welcome! Please:
If you encounter issues:
Enjoy creating amazing audio with ThinkSound! 🎵✨