# StreamDiffusion: Real-Time Interactive Generation with Diffusion Pipelines

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Source: osrepos.com
Repository profile: https://osrepos.com/repo/cumulo-autumn-streamdiffusion
Generated for open source discovery and AI-assisted research.

StreamDiffusion is a diffusion pipeline designed for real-time interactive generation. It is a pipeline-level solution that speeds up existing diffusion-based image generation, covering both text-to-image and image-to-image, through a set of optimizations aimed at computational efficiency and GPU utilization.

GitHub: https://github.com/cumulo-autumn/StreamDiffusion

## Topics

- Python
- Diffusion Models
- Real-time AI
- Image Generation
- Machine Learning
- Deep Learning
- Computer Vision
- Generative AI

## Repository Information

Last analyzed by OSRepos: Sat Dec 13 2025 08:01:19 GMT+0000 (Western European Standard Time)

## Safety Notice

OSRepos shares public repositories for knowledge and discovery only. Review source code, dependencies, licenses, and security implications before running or installing anything.

## Content

## Introduction
StreamDiffusion is a diffusion pipeline developed by cumulo-autumn that offers a pipeline-level solution for real-time interactive generation. It aims to speed up existing diffusion-based image generation enough for interactive use, such as responsive AI art creation. It was introduced in the paper "StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation" (arXiv:2312.12491).

## Installation
To get started with StreamDiffusion, follow these steps:

### Step 0: Clone the Repository
```bash
git clone https://github.com/cumulo-autumn/StreamDiffusion.git
cd StreamDiffusion
```


### Step 1: Make Environment
You can install StreamDiffusion via `pip`, `conda`, or Docker.

**Using Conda:**
```bash
conda create -n streamdiffusion python=3.10
conda activate streamdiffusion
```

**Using Venv:**
```bash
python -m venv .venv
# Windows
.\.venv\Scripts\activate
# Linux
source .venv/bin/activate
```


### Step 2: Install PyTorch
Select the appropriate version for your system.

**CUDA 11.8:**
```bash
pip3 install torch==2.1.0 torchvision==0.16.0 xformers --index-url https://download.pytorch.org/whl/cu118
```

**CUDA 12.1:**
```bash
pip3 install torch==2.1.0 torchvision==0.16.0 xformers --index-url https://download.pytorch.org/whl/cu121
```

For more details, visit the [PyTorch website](https://pytorch.org/).
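
If you script the setup, the choice between the two commands above can be captured in a small lookup. The following is a minimal sketch, not part of StreamDiffusion: `SUPPORTED_CUDA` and `pytorch_install_command` are hypothetical names, and the only versions it knows are the two listed in this guide.

```python
# Sketch only: SUPPORTED_CUDA and pytorch_install_command are hypothetical
# helper names. The versions and URLs are the ones listed in this guide.
SUPPORTED_CUDA = {"11.8": "cu118", "12.1": "cu121"}


def pytorch_install_command(cuda_version: str) -> str:
    """Return the pip command from this guide for the given CUDA version."""
    try:
        tag = SUPPORTED_CUDA[cuda_version]
    except KeyError:
        raise ValueError(
            f"No pinned wheel listed for CUDA {cuda_version!r}"
        ) from None
    return (
        "pip3 install torch==2.1.0 torchvision==0.16.0 xformers "
        f"--index-url https://download.pytorch.org/whl/{tag}"
    )


if __name__ == "__main__":
    print(pytorch_install_command("12.1"))
```

Any other CUDA version raises `ValueError` rather than guessing an index URL, since this guide only pins wheels for 11.8 and 12.1.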

### Step 3: Install StreamDiffusion

#### For Users
Install StreamDiffusion:
```bash
# For Latest Version (recommended)
pip install git+https://github.com/cumulo-autumn/StreamDiffusion.git@main#egg=streamdiffusion[tensorrt]

# Or for Stable Version
pip install streamdiffusion[tensorrt]
```

Install the TensorRT extension:
```bash
python -m streamdiffusion.tools.install-tensorrt
```

(Windows only) If you installed the Stable Version (`pip install streamdiffusion[tensorrt]`), you may also need to install `pywin32`:
```bash
pip install --force-reinstall pywin32
```

#### For Developers
```bash
python setup.py develop easy_install streamdiffusion[tensorrt]
python -m streamdiffusion.tools.install-tensorrt
```


### Docker Installation (TensorRT Ready)
```bash
git clone https://github.com/cumulo-autumn/StreamDiffusion.git
cd StreamDiffusion
docker build -t stream-diffusion:latest -f Dockerfile .
docker run --gpus all -it -v $(pwd):/home/ubuntu/streamdiffusion stream-diffusion:latest
```


## Examples
StreamDiffusion provides various examples to demonstrate its capabilities, including real-time text-to-image and image-to-image generation. You can find more detailed examples in the [`examples`](https://github.com/cumulo-autumn/StreamDiffusion/tree/main/examples) directory of the repository.

### Image-to-Image Example
This example shows how to use StreamDiffusion for real-time image-to-image generation:

```python
import torch
from diffusers import AutoencoderTiny, StableDiffusionPipeline
from diffusers.utils import load_image

from streamdiffusion import StreamDiffusion
from streamdiffusion.image_utils import postprocess_image

# Load a model with diffusers' StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("KBlueLeaf/kohaku-v2.1").to(
    device=torch.device("cuda"),
    dtype=torch.float16,
)

# Wrap the pipeline in StreamDiffusion
stream = StreamDiffusion(
    pipe,
    t_index_list=[32, 45],
    torch_dtype=torch.float16,
)

# Load and fuse the LCM-LoRA weights into the model
stream.load_lcm_lora()
stream.fuse_lora()
# Swap in the Tiny VAE (TAESD) for faster encoding and decoding
stream.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd").to(device=pipe.device, dtype=pipe.dtype)
# Enable xformers memory-efficient attention
pipe.enable_xformers_memory_efficient_attention()

prompt = "1girl with dog hair, thick frame glasses"
# Prepare the stream with the prompt
stream.prepare(prompt)

# Load the input image and resize it to 512x512
init_image = load_image("assets/img2img_example.png").resize((512, 512))

# Warm up with one call per entry in t_index_list
for _ in range(2):
    stream(init_image)

# Generate repeatedly until the user types 'stop'
while True:
    x_output = stream(init_image)
    postprocess_image(x_output, output_type="pil")[0].show()
    input_response = input("Press Enter to continue or type 'stop' to exit: ")
    if input_response == "stop":
        break
```
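
The warm-up loop in the example discards the first few outputs. One way to picture why, based on the Stream Batch idea described in the StreamDiffusion paper, is a toy simulation in which each batched model call advances every in-flight frame by one denoising step. This is a sketch under that assumption and performs no real denoising. `stream_batch` is a hypothetical name and not a library function.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


def stream_batch(frames: List[str], num_steps: int) -> List[Optional[str]]:
    """Toy model: return what each batched call emits (a frame or None)."""
    in_flight: Deque[Tuple[str, int]] = deque()  # (frame, steps completed)
    outputs: List[Optional[str]] = []
    for frame in frames:
        in_flight.append((frame, 0))  # the newest frame joins the batch
        # One batched call advances every in-flight frame by one step.
        in_flight = deque((f, done + 1) for f, done in in_flight)
        if in_flight[0][1] == num_steps:
            # The oldest frame has completed all of its denoising steps.
            outputs.append(in_flight.popleft()[0])
        else:
            # The batch is still filling, so nothing is finished yet.
            outputs.append(None)
    return outputs


if __name__ == "__main__":
    print(stream_batch(["f0", "f1", "f2", "f3", "f4", "f5"], num_steps=4))
    # [None, None, None, 'f0', 'f1', 'f2']
```

In this toy model, once the batch is full every call finishes exactly one frame, even though each frame still passes through `num_steps` steps. The real pipeline returns tensors rather than `None` during warm-up, so treat this only as an intuition for the staggered batching.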


### Text-to-Image Example
Here is a basic example for real-time text-to-image generation:

```python
import torch
from diffusers import AutoencoderTiny, StableDiffusionPipeline

from streamdiffusion import StreamDiffusion
from streamdiffusion.image_utils import postprocess_image

# Load a model with diffusers' StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("KBlueLeaf/kohaku-v2.1").to(
    device=torch.device("cuda"),
    dtype=torch.float16,
)

# Wrap the pipeline in StreamDiffusion; this example disables guidance (cfg_type="none")
stream = StreamDiffusion(
    pipe,
    t_index_list=[0, 16, 32, 45],
    torch_dtype=torch.float16,
    cfg_type="none",
)

# Load and fuse the LCM-LoRA weights into the model
stream.load_lcm_lora()
stream.fuse_lora()
# Swap in the Tiny VAE (TAESD) for faster decoding
stream.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd").to(device=pipe.device, dtype=pipe.dtype)
# Enable xformers memory-efficient attention
pipe.enable_xformers_memory_efficient_attention()

prompt = "1girl with dog hair, thick frame glasses"
# Prepare the stream with the prompt
stream.prepare(prompt)

# Warm up with one call per entry in t_index_list
for _ in range(4):
    stream()

# Generate repeatedly until the user types 'stop'
while True:
    x_output = stream.txt2img()
    postprocess_image(x_output, output_type="pil")[0].show()
    input_response = input("Press Enter to continue or type 'stop' to exit: ")
    if input_response == "stop":
        break
```
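
To get a rough sense of throughput on your own hardware, you can time repeated calls instead of displaying each image. The helper below is a minimal sketch using only the standard library. `measure_fps` is a hypothetical name, and the stand-in workload would be replaced by something like `stream.txt2img` once a stream has been prepared as in the example above.

```python
import time
from typing import Callable


def measure_fps(
    generate: Callable[[], object], num_frames: int = 100, warmup: int = 4
) -> float:
    """Call `generate` repeatedly and return the average frames per second."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    for _ in range(warmup):  # untimed warm-up calls
        generate()
    start = time.perf_counter()
    for _ in range(num_frames):
        generate()
    elapsed = time.perf_counter() - start
    return num_frames / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a GPU.
    fps = measure_fps(lambda: sum(range(10_000)), num_frames=50)
    print(f"{fps:.1f} fps")
```

GPU work can run asynchronously, so wall-clock timing like this may need a synchronization point to be accurate. Treat the result as an estimate rather than a benchmark.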


### Faster Generation with TensorRT
To achieve even faster generation, you can integrate TensorRT acceleration:

```python
from streamdiffusion.acceleration.tensorrt import accelerate_with_tensorrt

stream = accelerate_with_tensorrt(
    stream, "engines", max_batch_size=2,
)
```


This requires the TensorRT extension and time to build the engine, but it significantly boosts performance.
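
Because the TensorRT extension is an optional install, a script may want to fall back to the unaccelerated stream when it is missing. The wrapper below is a sketch of that pattern. `maybe_accelerate` is a hypothetical helper, and it assumes that a missing extension surfaces as an `ImportError` when the acceleration module is imported.

```python
def maybe_accelerate(stream, engine_dir: str = "engines", max_batch_size: int = 2):
    """Return a TensorRT-accelerated stream, or the original one if unavailable."""
    try:
        from streamdiffusion.acceleration.tensorrt import accelerate_with_tensorrt
    except ImportError:
        # Assumption: a missing TensorRT extension shows up as an ImportError.
        print("TensorRT acceleration unavailable; using the unaccelerated stream.")
        return stream
    # Same call as in the example above, with the arguments passed through.
    return accelerate_with_tensorrt(stream, engine_dir, max_batch_size=max_batch_size)
```

With this in place, `stream = maybe_accelerate(stream)` can replace the direct call, and the rest of the script stays the same either way.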

### Real-Time Demos
StreamDiffusion includes interactive demos for both text-to-image and image-to-image generation.

**Real-Time Txt2Img Demo:**
![Real-Time Txt2Img Demo](https://github.com/cumulo-autumn/StreamDiffusion/raw/main/assets/demo_01.gif "Real-Time Txt2Img Demo")

**Real-Time Img2Img Demo (Webcam/Screen Capture):**
![Real-Time Img2Img Demo](https://github.com/cumulo-autumn/StreamDiffusion/raw/main/assets/img2img1.gif "Real-Time Img2Img Demo")

## Why Use StreamDiffusion
StreamDiffusion focuses on making diffusion models fast enough for real-time applications. Its core strength is a set of pipeline-level features designed to maximize efficiency and speed.

### Key Features
*   **Stream Batch**: Batches the denoising steps of successive inputs, so one model call advances several frames at once instead of finishing one image before starting the next.
*   **Residual Classifier-Free Guidance (RCFG)**: A guidance mechanism that reduces the redundant computation of conventional classifier-free guidance (CFG). It supports "Self-Negative" and "Onetime-Negative" configurations.
*   **Stochastic Similarity Filter**: Improves GPU utilization by skipping processing when consecutive frames change little, which suits video inputs.
*   **IO Queues**: Manages input and output operations efficiently for smoother execution.
*   **Pre-Computation for KV-Caches**: Optimizes caching strategies for accelerated processing.
*   **Model Acceleration Tools**: Uses model optimization tools, including TensorRT integration, to boost performance.
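
To make the Stochastic Similarity Filter more concrete, here is a minimal sketch of the idea as I read it from the StreamDiffusion paper: compare the current frame with a reference frame by cosine similarity, and skip processing with a probability that rises from 0 at a threshold to 1 for identical frames. The function names and the default threshold of 0.98 are illustrative assumptions, not the library's API, and frames are simplified to flat lists of floats.

```python
import math
import random
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def skip_probability(similarity: float, threshold: float) -> float:
    """0 at or below the threshold, rising linearly to 1 at similarity 1."""
    if threshold >= 1.0:
        return 0.0
    return min(1.0, max(0.0, (similarity - threshold) / (1.0 - threshold)))


def should_skip(
    current: Sequence[float],
    reference: Sequence[float],
    threshold: float = 0.98,  # illustrative default, an assumption
    rng: random.Random = random.Random(),
) -> bool:
    """Randomly decide whether to skip this frame, based on its similarity."""
    p = skip_probability(cosine_similarity(current, reference), threshold)
    return rng.random() < p


if __name__ == "__main__":
    still = [3.0, 4.0]
    print(should_skip(still, still))            # identical frames: always skipped
    print(should_skip([1.0, 0.0], [0.0, 1.0]))  # dissimilar frames: never skipped
```

Making the decision probabilistic rather than a hard cutoff means that nearly static input is still processed occasionally, instead of freezing the output entirely.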

### Performance Benchmarks
On an RTX 4090 GPU with a Core i9-13900K CPU running Ubuntu 22.04.3 LTS, StreamDiffusion reports the following frame rates:

| Model | Denoising Steps | FPS (Txt2Img) | FPS (Img2Img) |
| :-------------------------: | :------------: | :------------: | :------------: |
| SD-turbo | 1 | 106.16 | 93.897 |
| LCM-LoRA <br>+<br> KohakuV2 | 4 | 38.023 | 37.133 |

These figures indicate that the pipeline can sustain interactive frame rates on this hardware.
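
For interactive use it is often easier to think in per-frame latency than in frames per second. The short sketch below converts the figures in the table. `frame_latency_ms` is a hypothetical helper, and the conversion assumes frames are produced at a steady rate.

```python
def frame_latency_ms(fps: float) -> float:
    """Convert a frame rate to the average time per frame in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Frame rates copied from the benchmark table above.
benchmarks = {
    ("SD-turbo", "txt2img"): 106.16,
    ("SD-turbo", "img2img"): 93.897,
    ("LCM-LoRA + KohakuV2", "txt2img"): 38.023,
    ("LCM-LoRA + KohakuV2", "img2img"): 37.133,
}

if __name__ == "__main__":
    for (model, mode), fps in benchmarks.items():
        print(f"{model:22s} {mode}: {frame_latency_ms(fps):6.2f} ms/frame")
```

For example, the SD-turbo text-to-image figure works out to roughly 9.4 ms per frame, and the four-step LCM-LoRA configuration to roughly 26 ms.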

## Links
For more information and to explore the project further, please visit the official links:

*   **GitHub Repository**: [https://github.com/cumulo-autumn/StreamDiffusion](https://github.com/cumulo-autumn/StreamDiffusion)
*   **arXiv Paper**: [https://arxiv.org/abs/2312.12491](https://arxiv.org/abs/2312.12491)
*   **Hugging Face Papers**: [https://huggingface.co/papers/2312.12491](https://huggingface.co/papers/2312.12491)