ComfyUI-IF_MemoAvatar

★ 174

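Talking avatar video generation · Audio-driven · Emotion transfer
A ComfyUI node based on MEMO that generates emotionally expressive talking-avatar videos from a single portrait and an audio clip, with audio-driven expression transfer and high-quality output.
💡 Quickly generate an emotionally expressive talking-avatar video from a single photo and an audio clip.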
🍴 11 Forks · 💻 Python · 🔄 2025-03-09
📦 requirements.txt
git-lfs
diffusers>=0.31.0
audio-separator
albumentations
numba
librosa
modelscope
transformers>=4.46.3
numpy>=1.26.4
PyYAML>=6.0.1
moviepy>=1.0.3
pillow>=10.4.0
librosa==0.10.2
audio-separator==0.24.1
funasr==1.0.27
modelscope
insightface==0.7.3
accelerate==1.1.1
albumentations==1.4.21
black==23.12.1
einops==0.8.0
ffmpeg-python==0.2.0
huggingface-hub==0.26.2
imageio==2.36.0
imageio-ffmpeg==0.5.1
hydra-core==1.3.2
jax==0.4.35
mediapipe==0.10.18
modelscope==1.20.1
omegaconf==2.3.0
onnxruntime>=1.20.1
onnxruntime-gpu>=1.20.1
opencv-python-headless==4.10.0.84
scikit-learn>=1.5.2
scipy>=1.14.1
tqdm>=4.67.1
📄 README

ComfyUI-IF_MemoAvatar

Memory-Guided Diffusion for Expressive Talking Video Generation

ORIGINAL REPO

MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation

Longtao Zheng*, Yifan Zhang*, Hanzhong Guo, Jiachun Pan, Zhenxiong Tan, Jiahao Lu, Chuanxin Tang, Bo An, Shuicheng Yan

_Project Page | arXiv | Model_

This repository contains the example inference script for the MEMO-preview model. The gif demo below is compressed. See our project page for full videos.


Overview

This is a ComfyUI implementation of MEMO (Memory-Guided Diffusion for Expressive Talking Video Generation), which enables the creation of expressive talking avatar videos from a single image and audio input.

Features

  • Generate expressive talking head videos from a single image
  • Audio-driven facial animation
  • Emotional expression transfer
  • High-quality video output

Demo video: https://github.com/user-attachments/assets/bfbf896d-a609-4e0f-8ed3-16ec48f8d85a

    Installation

    * xformers is NOT REQUIRED, but installing it is recommended *

    * MAKE SURE you have your Hugging Face (HF) token set in your environment variables *
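
    To confirm the token is visible before launching ComfyUI, a small check like the one below may help. It is only a sketch: the README does not say which variable name this node reads, so the names here are an assumption based on the two variables the huggingface_hub library looks up (`HF_TOKEN`, and the older `HUGGING_FACE_HUB_TOKEN`). The helper `find_hf_token` is hypothetical and not part of this repo.

```python
import os

# Assumption: these are the variable names huggingface_hub checks for a token.
# HF_TOKEN is the current name; HUGGING_FACE_HUB_TOKEN is the older one.
TOKEN_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")


def find_hf_token(env=None):
    """Return the name of the first token variable that is set and non-empty, else None."""
    env = os.environ if env is None else env
    for name in TOKEN_VARS:
        if env.get(name, "").strip():
            return name
    return None


if __name__ == "__main__":
    found = find_hf_token()
    if found:
        print(f"Hugging Face token found in {found}")
    else:
        print("No Hugging Face token found in the environment")
```

    Run it from the same shell or environment that starts ComfyUI, since variables set elsewhere may not be inherited.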

    Clone the repo into your ComfyUI custom_nodes folder, then run:

    cd ComfyUI-IF_MemoAvatar
    pip install -r requirements.txt

    I removed xformers from requirements.txt because on Windows it only works with a particular matching PyTorch setup.

    If you are on Linux, you can just run:

    pip install xformers 

    Windows users: first check whether xformers is already installed in your environment:

    pip show xformers 

    If no version is shown, install the latest xformers by following this free guide, which also walks through setting up a good ComfyUI environment:

    Installing Triton and Sage Attention Flash Attention

    https://www.youtube.com/watch?v=nSUGEdm2wU4
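
    The same xformers check that `pip show xformers` performs can also be done from Python, which is handy when you are unsure which interpreter ComfyUI is using. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not something the repo provides.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("xformers")
    if version is None:
        print("xformers is not installed in this environment")
    else:
        print(f"xformers {version} is installed")
```

    Run it with the Python executable that launches ComfyUI so the result reflects the right environment.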

    Model Files

    The models will automatically download to the following locations in your ComfyUI installation:

    models/checkpoints/memo/
    ├── audio_proj/
    ├── diffusion_net/
    ├── image_proj/
    ├── misc/
    │ ├── audio_emotion_classifier/
    │ ├── face_analysis/
    │ └── vocal_separator/
    └── reference_net/
    models/wav2vec/
    models/vae/sd-vae-ft-mse/
    models/emotion2vec/emotion2vec_plus_large/
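
    If a download was interrupted, it can be useful to see which of the folders listed above are missing or still empty. The sketch below assumes the layout shown in the tree and takes the path to your ComfyUI `models` folder as an argument; `missing_model_dirs` is a hypothetical helper, and it checks only that each folder exists and contains something, not that the files are complete.

```python
from pathlib import Path

# Folders from the tree above, relative to the ComfyUI "models" directory.
EXPECTED_DIRS = [
    "checkpoints/memo/audio_proj",
    "checkpoints/memo/diffusion_net",
    "checkpoints/memo/image_proj",
    "checkpoints/memo/misc/audio_emotion_classifier",
    "checkpoints/memo/misc/face_analysis",
    "checkpoints/memo/misc/vocal_separator",
    "checkpoints/memo/reference_net",
    "wav2vec",
    "vae/sd-vae-ft-mse",
    "emotion2vec/emotion2vec_plus_large",
]


def missing_model_dirs(models_root):
    """Return the expected folders that do not exist or are empty under models_root."""
    root = Path(models_root)
    missing = []
    for rel in EXPECTED_DIRS:
        folder = root / rel
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # "models" is a placeholder; point it at your own ComfyUI models folder.
    for rel in missing_model_dirs("models"):
        print(f"missing or empty: {rel}")
```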
    

    Copy the models from the face_analysis/models folder directly into face_analysis.

    For now, until I make sure this is sorted out, duplicate the files rather than moving them, because HF will detect the empty folder and download them again every time.
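
    The copy step can be scripted so the originals stay in place. This is a sketch under the assumption that the folder in question is `models/checkpoints/memo/misc/face_analysis` from the tree above and that it contains a `models` subfolder; `duplicate_into_parent` is a hypothetical helper name. It copies rather than moves, as described, so the source folder is never left empty.

```python
import shutil
from pathlib import Path


def duplicate_into_parent(face_analysis_dir):
    """Copy everything in <face_analysis_dir>/models up into <face_analysis_dir>.

    The originals are left untouched. Returns the names of the items copied.
    """
    target = Path(face_analysis_dir)
    source = target / "models"
    copied = []
    if not source.is_dir():
        return copied
    for item in sorted(source.iterdir()):
        destination = target / item.name
        if item.is_dir():
            # dirs_exist_ok lets the copy be re-run safely (Python 3.8+).
            shutil.copytree(item, destination, dirs_exist_ok=True)
        else:
            shutil.copy2(item, destination)
        copied.append(item.name)
    return copied


if __name__ == "__main__":
    # Placeholder path; adjust it to your own ComfyUI installation.
    print(duplicate_into_parent("models/checkpoints/memo/misc/face_analysis"))
```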

    If you don’t see a models.json, or loading errors out, create one yourself with this content:

    {
      "detection": [
        "scrfd_10g_bnkps"
      ],
      "recognition": [
        "glintr100"
      ],
      "analysis": [
        "genderage",
        "2d106det",
        "1k3d68"
      ]
    }

    Also create a version.txt containing:

    0.7.3
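
    Both files can be generated with a short script instead of by hand. This sketch writes exactly the content shown above and skips any file that already exists. The README does not state which folder the files belong in, so the target folder is passed as a parameter; using the face_analysis folder is an assumption, and `ensure_metadata_files` is a hypothetical helper name.

```python
import json
from pathlib import Path

# Content copied from the models.json and version.txt shown above.
MODELS_JSON = {
    "detection": ["scrfd_10g_bnkps"],
    "recognition": ["glintr100"],
    "analysis": ["genderage", "2d106det", "1k3d68"],
}
VERSION = "0.7.3"


def ensure_metadata_files(folder):
    """Create models.json and version.txt in folder if absent; return the names created."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    created = []

    models_json = folder / "models.json"
    if not models_json.exists():
        models_json.write_text(json.dumps(MODELS_JSON, indent=2), encoding="utf-8")
        created.append("models.json")

    version_txt = folder / "version.txt"
    if not version_txt.exists():
        version_txt.write_text(VERSION, encoding="utf-8")
        created.append("version.txt")

    return created
```

    For example, `ensure_metadata_files("models/checkpoints/memo/misc/face_analysis")` would create whichever of the two files is missing there, assuming that is the right folder for your setup.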