inference-gpu[yolo-world]==0.9.13




Have you ever fantasized about crafting a single masterpiece from elements of different photos? 😲 What if I told you that manually prompting each object’s shape, color, and features for diffusion could be a thing of the past? And let’s be real, using GPT to describe each object step by step can feel like chasing an AI in slow motion. 🐢💤
Here, I introduce a complete pipeline that takes two images as input, allowing you to choose which objects from the images you want to fuse seamlessly.
Clone this repository into the `custom_nodes` directory in ComfyUI:
```bash
git clone https://github.com/ducido/ObjectFusion_ComfyUI_nodes
```
_Note: Place the three CLIP models into `models/clip`._
```bash
conda create -n objectfusion python=3.10 -y
conda activate objectfusion
pip install -r custom_nodes/ObjectFusion_ComfyUI_nodes/requirements.txt
wget https://huggingface.co/camenduru/YoloWorld-EfficientSAM/resolve/main/efficient_sam_s_gpu.jit -P custom_nodes/ObjectFusion_ComfyUI_nodes/Custom_ComfyUI_YoloWorld_EfficientSAM
```
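To confirm that the steps above put everything where the nodes expect it, a small check like the following may help. It is only a sketch: the paths are copied from the install commands and the note about `models/clip`, the helper name `missing_paths` is made up for illustration, and it assumes you run it from the ComfyUI root folder.

```python
from pathlib import Path

# Paths relative to the ComfyUI root, taken from the install commands above.
EXPECTED = [
    "custom_nodes/ObjectFusion_ComfyUI_nodes/requirements.txt",
    "custom_nodes/ObjectFusion_ComfyUI_nodes/Custom_ComfyUI_YoloWorld_EfficientSAM/efficient_sam_s_gpu.jit",
    "models/clip",
]


def missing_paths(comfy_root="."):
    """Return the expected paths that do not exist under comfy_root (hypothetical helper)."""
    root = Path(comfy_root)
    return [p for p in EXPECTED if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected files and folders are present.")
```

This only checks that the files and folders exist. It does not verify that the CLIP models inside `models/clip` are the right ones.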
_All folders except CROP_OBJECT come from other repositories. Thank you to their authors for the amazing work, I really appreciate it. I have also made some minor modifications to fit this project. Here are the details:_
- ... BBOX, categories.
- ... {ID} - {class} - {confidence}).
- ... prompt.
- ... object1, desc_obj1, object2, desc_obj2.
- ... default value to "" because the newest ComfyUI frontend considers a None value a bug.

Contributions are welcome! Please open an issue or submit a pull request.
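For anyone adapting the nodes, the empty-string default mentioned in the modification notes above can be sketched as follows. This is a hypothetical node written against ComfyUI's usual custom-node conventions (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`), not code from this repository. The class name and the way the four strings are joined are assumptions; only the input names `object1`, `desc_obj1`, `object2`, `desc_obj2` come from the notes.

```python
class ExampleObjectPrompt:
    """Hypothetical node for illustration; not taken from the repository."""

    @classmethod
    def INPUT_TYPES(cls):
        # Every string input defaults to "" rather than None.
        return {
            "required": {
                "object1": ("STRING", {"default": ""}),
                "desc_obj1": ("STRING", {"default": "", "multiline": True}),
                "object2": ("STRING", {"default": ""}),
                "desc_obj2": ("STRING", {"default": "", "multiline": True}),
            }
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "build"
    CATEGORY = "example"

    def build(self, object1="", desc_obj1="", object2="", desc_obj2=""):
        # Assumed combination rule: "<object> <description>" per object,
        # joined with commas, skipping anything left empty.
        pairs = [(object1, desc_obj1), (object2, desc_obj2)]
        parts = [" ".join(s for s in pair if s) for pair in pairs]
        prompt = ", ".join(p for p in parts if p)
        return (prompt,)
```

Because every default is a string, leaving a field blank in the UI produces an empty prompt fragment instead of a `None` value.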
This project is licensed under the MIT License.