comfyui-lmstudio-image-to-text-node

★ 45

图像描述视觉模型本地APIComfyUI节点

将 LM Studio 视觉模型接入 ComfyUI，通过本地 lmstudio API 将图像转换为文本描述，支持自定义系统提示、多种输入组合与模型管理。

💡 在 ComfyUI 流程中把图片生成文字描述以便注释或作为提示上下文。

🍴 12 Forks💻 Python🔄 2026-03-11

🔗 GitHub 原文

📦

网盘下载

复制链接后前往夸克网盘下载

https://pan.quark.cn/s/f414772aa5c3

📦 requirements.txt

lmstudio
requests

📄 README

LM Studio Nodes for ComfyUI

Author: Matt John Powell

This extension provides a suite of custom nodes for ComfyUI that deeply integrate LM Studio’s capabilities using the official lmstudio Python SDK. It allows you to leverage locally run models for various generative tasks directly within your ComfyUI workflows.

The nodes offer functionalities including:

Unified Generation: Generate text from text prompts, image prompts, or both combined.

Image to Text: Generate detailed text descriptions of images using vision models.

Text Generation: Generate text based on a given prompt using language models, with streaming support.

Model Management: List available models, load models into memory with a TTL, and unload them.

Model Selection: Dynamically select models based on type (LLM/Vision) and text filters.

Setup Assistance: Helper node for SDK installation check, model listing, and download guidance.

Random List Picker: Paste any newline-separated list of items and randomly select one or many each run — great for injecting random genres, styles, moods, subjects, or any variable into your prompts.

Workflow Example

Here’s an example of how the LM Studio nodes can be used in a ComfyUI workflow:

*(Ensure this image accurately reflects a workflow using the new nodes)*

Features

Utilizes the official lmstudio Python SDK for robust connection and interaction.

Supports text-only, image-only (vision), and combined text+image inputs.

Identifies models using model_key strings (e.g., llama-3.2-1b-instruct).

Provides dedicated nodes for specific tasks (Image-to-Text, Text Generation).

Includes nodes for managing model memory (Load/Unload/List).

Offers a dynamic model selector node for easier workflow building.

Setup node provides guidance for installing the SDK and getting models.

Customizable system prompts for context setting.

Control over generation parameters like max_tokens, temperature, and seed.

Optional streaming output for the Text Generation node.

Debug mode for detailed console logging.

Automatic connection to the LM Studio server (no need to specify IP/Port in nodes).

Random List Picker utility node for injecting randomness into prompts — supports weighted items, templates, multi-pick, shuffle, case control, exclusions, and reproducible seeds.

Installation

Navigate to your ComfyUI custom_nodes directory.

Clone this repository:

“`bash

cd /path/to/ComfyUI/custom_nodes

git clone https://github.com/mattjohnpowell/comfyui-lmstudio-nodes.git ComfyExpo-LMStudioNodes

# Using ‘ComfyExpo-LMStudioNodes’ as the directory name to avoid potential conflicts

“`

Install Dependencies: Ensure the lmstudio package is installed in ComfyUI’s Python environment:

“`bash

# Activate your ComfyUI Python environment (if using venv, conda, etc.) first!

pip install lmstudio

“`

Troubleshooting

LM Studio Version Compatibility

If you encounter errors like:

Error processing with LM Studio: Object missing required field 'bosToken' - at '$.promptTemplate.jinjaPromptTemplate'

Other SDK-related errors after upgrading LM Studio

This usually means your LM Studio SDK version is incompatible with your LM Studio version. Solution:

Quick Fix: Run the upgrade helper script:

“`bash

python upgrade_lmstudio.py

“`

Manual Fix: Upgrade the LM Studio SDK:

“`bash

pip install lmstudio –upgrade

“`

Restart ComfyUI completely after upgrading.

Backward Compatibility

The nodes now support backward compatibility with older workflow files that used:

model parameter instead of model_key

ip_address and port parameters for HTTP connections

When old parameters are detected, the nodes will:

Show a deprecation warning

Automatically use the legacy HTTP mode if ip_address and port are provided

Map model parameter to model_key for SDK mode

Recommendation: Update old workflows to use the new SDK-based parameters for better performance and reliability.

*(You might need to use a specific pip command depending on your ComfyUI setup, e.g., for portable builds)*

Restart ComfyUI.

Usage

Add the nodes to your workflow by right-clicking the canvas and searching for their names (e.g., “LM Studio Unified (Expo)”).

LM Studio Unified (Expo)

Handles text-only, image-only, or combined text+image generation.

Inputs:

model_key (STRING, required): The key of the LM Studio model to use (e.g., llama-3.2-1b-instruct or a vision model key). Default: llama-3.2-1b-instruct.

system_prompt (STRING, required): System prompt for the AI. Default: “You are a helpful AI assistant.”

seed (INT, required): Seed for reproducibility (-1 for random). Default: -1.

image (IMAGE, optional): Input image (requires a vision-capable model_key).

text_input (STRING, optional): Text prompt. Default: “”.

max_tokens (INT, optional): Max tokens for the response. Default: 1000.

temperature (FLOAT, optional): Generation temperature. Default: 0.7.

debug (BOOLEAN, optional): Enable console logging. Default: False.

Output:

Generated Text (STRING): The model’s response.

*(Note: At least one of image or text_input must be provided for the node to run).*

LM Studio I2T (Expo)

Generates text descriptions for images using vision models.

Inputs:

model_key (STRING, required): The key of the vision-capable LM Studio model. Default: qwen2-vl-2b-instruct.

system_prompt (STRING, required): System prompt for the AI. Default: “This is a chat between a user and an assistant. The assistant is an expert in describing images, with detail and accuracy”.

seed (INT, required): Seed for reproducibility (-1 for random). Default: -1.

image (IMAGE, required): The input image to be described.

user_prompt (STRING, optional): The prompt asking about the image. Default: “Describe this image in detail”.

max_tokens (INT, optional): Max tokens for the response. Default: 1000.

temperature (FLOAT, optional): Generation temperature. Default: 0.7.

debug (BOOLEAN, optional): Enable console logging. Default: False.

Output:

Description (STRING): The generated text description.

LM Studio Text Gen (Expo)

Generates text based on a text prompt using language models.

Inputs:

model_key (STRING, required): The key of the LM Studio language model. Default: llama-3.2-1b-instruct.

system_prompt (STRING, required): System prompt for the AI. Default: “You are a helpful AI assistant.”.

seed (INT, required): Seed for reproducibility (-1 for random). Default: -1.

prompt (STRING, optional): The input prompt for text generation. Default: “Generate a creative story:”.

max_tokens (INT, optional): Max tokens for the response. Default: 1000.

temperature (FLOAT, optional): Generation temperature. Default: 0.7.

stream_output (BOOLEAN, optional): Stream response fragments (Note: final output in ComfyUI is still the complete text). Default: False.

debug (BOOLEAN, optional): Enable console logging. Default: False.

Output:

Generated Text (STRING): The generated text.

*(Note: Requires a non-empty prompt to run).*

LM Studio Model Mgr (Expo)

Manages models loaded via the LM Studio SDK.

Inputs:

action (COMBO, required): Action to perform (LIST, LOAD, UNLOAD). Default: LIST.

model_key (STRING, optional): The model key to load or unload. Required for LOAD/UNLOAD. Default: “”.

model_type (COMBO, optional): Filter models by type (ALL, LLM, EMBEDDING). Default: ALL.

load_ttl (INT, optional): Time-to-live (seconds) for loaded models (how long to keep in memory after last use). Used with LOAD. Default: 3600.

debug (BOOLEAN, optional): Enable console logging. Default: False.

Output:

Result (STRING): Confirmation message or list of models.

LM Studio Model Sel (Expo)

Dynamically provides a model key based on filters, useful for connecting to other nodes.

Inputs:

model_type (COMBO, required): Type of model to list (LLM, Vision). Default: LLM.

filter_text (STRING, optional): Text to filter model names/keys by. Default: “”.

Output:

Model Key (STRING): The model_key of the first matching model (sorted alphabetically), or a default if none match.

*(Note: This node attempts to list models available via the SDK at workflow load time. Ensure LM Studio server is running when building workflows).*

LM Studio Setup (Expo)

Provides helper actions related to SDK setup and model discovery.

Inputs:

action (COMBO, required): Action to perform (INSTALL SDK, GET MODEL, LIST MODELS). Default: LIST MODELS.

model_key (STRING, required): Model key relevant to the action (used for GET MODEL). Default: llama-3.2-1b-instruct.

Output:

Result (STRING): Status message, command guidance, or list of models.

*(Note: INSTALL SDK attempts pip install lmstudio. GET MODEL provides instructions for using the lms CLI tool).*

Random List Picker

A general-purpose utility node. Paste any newline-separated list of items and the node randomly selects one (or many) each time the workflow runs, then assembles a ready-to-use prompt string. No LM Studio connection required — works standalone.

Inputs:

|—|—|—|—|

| template | STRING (multiline) | *(empty)* | Prompt template using {item} as a placeholder, e.g. "Create a {item} track with heavy bass." Overrides prefix/suffix when non-empty. |

| count | INT | 1 | Number of unique items to pick. Automatically capped at the list size. |

| seed | INT | -1 | Random seed. -1 = new random result every run; any other value = reproducible. |

Outputs:

| Output | Type | Description |

|—|—|—|

| selected_item | STRING | The chosen item(s), joined with separator. |

| prompt | STRING | Fully assembled prompt (template or prefix + item + suffix). Wire this directly into any text input. |

| item_count | INT | Total usable items after exclusions. |

| selected_index | INT | Zero-based index of the first chosen item (-1 in shuffle mode). |

Example — random music genre prompt:

items:
  rock
  pop
  jazz::3
  classical
  hip-hop

template:  Create a {item} song with a melancholic mood.

Each run the node draws a weighted-random genre and outputs something like:

Create a jazz song with a melancholic mood.

Example — pick 3 unique styles:

count: 3
separator: ", "
template: Generate artwork combining {item}.

Output prompt: Generate artwork combining watercolour, impressionist, cyberpunk.

LM Studio Setup

Install and run LM Studio on your machine.

Download desired models within LM Studio (ensure you have vision models for image tasks).

Go to the “Server” tab (icon looks like <->) in LM Studio.

Select a model to load and click Start Server.

The LM Studio server *must* be running for these ComfyUI nodes to connect and function.

Notes

These nodes use the official lmstudio Python SDK, which handles the connection to your running LM Studio server automatically (typically localhost:1234). There’s no need to input IP/Port in the nodes themselves.

Use the model_key (found in LM Studio, e.g., Org/ModelName-Format) as the identifier in the model_key inputs.

The seed input allows for reproducible outputs. Set to -1 for a random seed on each run.

SDK Usage

These nodes leverage the official lmstudio Python SDK, replacing the previous method of interacting via the OpenAI-compatible API endpoint directly. This provides more robust integration and access to SDK-specific features.

Troubleshooting

If you encounter any issues:

Enable the debug input (set to True) on the relevant node(s).

Check the ComfyUI console for error messages and detailed debug output from the nodes.

Verify that the LM Studio application is running and the Server has been started with a model loaded.

Ensure the model_key you are providing exists in your LM Studio library and is compatible with the node’s task (e.g., a vision model for the I2T node).

Confirm the lmstudio Python package is correctly installed in your ComfyUI environment.

For further assistance, please open an issue on the GitHub repository: https://github.com/mattjohnpowell/comfyui-lmstudio-nodes

License

This project is licensed under the MIT License – see the LICENSE file for details.

Acknowledgments

Built upon the ComfyUI framework.

Utilizes the official LM Studio Python SDK.

Inspired by LM Studio’s capabilities and examples.