{"name":"StreamDiffusion: Real-Time Interactive Generation with Diffusion Pipelines","description":"StreamDiffusion is an innovative diffusion pipeline designed for real-time interactive generation, significantly enhancing the performance of current diffusion-based image generation techniques. It offers a pipeline-level solution to achieve high-speed image and text-to-image generation, making interactive AI experiences more accessible. This project introduces several key features to optimize computational efficiency and GPU utilization.","github":"https://github.com/cumulo-autumn/StreamDiffusion","url":"https://osrepos.com/repo/cumulo-autumn-streamdiffusion","source":"osrepos.com","sourceDescription":"This repository profile is provided by osrepos.com, an open source repository discovery platform.","repositoryProfile":"https://osrepos.com/repo/cumulo-autumn-streamdiffusion","generatedFor":"open source discovery and AI-assisted research","markdown":"https://osrepos.com/repo/cumulo-autumn-streamdiffusion.md","json":"https://osrepos.com/repo/cumulo-autumn-streamdiffusion.json","topics":["Python","Diffusion Models","Real-time AI","Image Generation","Machine Learning","Deep Learning","Computer Vision","Generative AI"],"keywords":["Python","Diffusion Models","Real-time AI","Image Generation","Machine Learning","Deep Learning","Computer Vision","Generative AI"],"stars":null,"summary":"StreamDiffusion is an innovative diffusion pipeline designed for real-time interactive generation, significantly enhancing the performance of current diffusion-based image generation techniques. It offers a pipeline-level solution to achieve high-speed image and text-to-image generation, making interactive AI experiences more accessible. 
This project introduces several key features to optimize computational efficiency and GPU utilization.","content":"## Introduction\nStreamDiffusion is a diffusion pipeline developed by cumulo-autumn that offers a pipeline-level solution for real-time interactive generation. It aims to significantly improve the performance of existing diffusion-based image generation techniques, enabling faster and more responsive image generation. It was introduced in the paper 'StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation' and combines several optimization strategies, listed under Key Features below.\n\n## Installation\nTo get started with StreamDiffusion, follow these steps:\n\n### Step 0: Clone the Repository\n```bash\ngit clone https://github.com/cumulo-autumn/StreamDiffusion.git\ncd StreamDiffusion\n```\n\n### Step 1: Create an Environment\nYou can install StreamDiffusion via `pip`, `conda`, or Docker.\n\n**Using Conda:**\n```bash\nconda create -n streamdiffusion python=3.10\nconda activate streamdiffusion\n```\n\n**Using Venv:**\n```bash\npython -m venv .venv\n# Windows\n.\\.venv\\Scripts\\activate\n# Linux\nsource .venv/bin/activate\n```\n\n### Step 2: Install PyTorch\nSelect the build that matches your CUDA version.\n\n**CUDA 11.8:**\n```bash\npip3 install torch==2.1.0 torchvision==0.16.0 xformers --index-url https://download.pytorch.org/whl/cu118\n```\n\n**CUDA 12.1:**\n```bash\npip3 install torch==2.1.0 torchvision==0.16.0 xformers --index-url https://download.pytorch.org/whl/cu121\n```\n\nFor more details, visit the [PyTorch website](https://pytorch.org/ \"PyTorch website\").\n\n### Step 3: Install StreamDiffusion\n\n#### For Users\nInstall StreamDiffusion:\n```bash\n# For Latest Version (recommended)\npip install git+https://github.com/cumulo-autumn/StreamDiffusion.git@main#egg=streamdiffusion[tensorrt]\n\n# Or for Stable Version\npip install streamdiffusion[tensorrt]\n```\n\nInstall the TensorRT extension:\n```bash\npython -m streamdiffusion.tools.install-tensorrt\n```\n\n(Windows only) If you installed the Stable Version (`pip install streamdiffusion[tensorrt]`), you may also need to install `pywin32`:\n```bash\npip install --force-reinstall pywin32\n```\n\n#### For Developers\n```bash\npython setup.py develop easy_install streamdiffusion[tensorrt]\npython -m streamdiffusion.tools.install-tensorrt\n```\n\n### Docker Installation (TensorRT Ready)\n```bash\ngit clone https://github.com/cumulo-autumn/StreamDiffusion.git\ncd StreamDiffusion\ndocker build -t stream-diffusion:latest -f Dockerfile .\ndocker run --gpus all -it -v $(pwd):/home/ubuntu/streamdiffusion stream-diffusion:latest\n```\n\n## Examples\nStreamDiffusion provides examples that demonstrate its capabilities, including real-time text-to-image and image-to-image generation. You can find more detailed examples in the [`examples`](https://github.com/cumulo-autumn/StreamDiffusion/tree/main/examples \"StreamDiffusion Examples\") directory of the repository.\n\n### Image-to-Image Example\nThis example shows how to use StreamDiffusion for real-time image-to-image generation:\n\n```python\nimport torch\nfrom diffusers import AutoencoderTiny, StableDiffusionPipeline\nfrom diffusers.utils import load_image\n\nfrom streamdiffusion import StreamDiffusion\nfrom streamdiffusion.image_utils import postprocess_image\n\n# Load the base Stable Diffusion pipeline in half precision on the GPU\npipe = StableDiffusionPipeline.from_pretrained(\"KBlueLeaf/kohaku-v2.1\").to(\n    device=torch.device(\"cuda\"),\n    dtype=torch.float16,\n)\n\n# Wrap the pipeline in StreamDiffusion\nstream = StreamDiffusion(\n    pipe,\n    t_index_list=[32, 45],\n    torch_dtype=torch.float16,\n)\n\n# Load and fuse the LCM-LoRA, then swap in the tiny VAE (TAESD)\nstream.load_lcm_lora()\nstream.fuse_lora()\nstream.vae = AutoencoderTiny.from_pretrained(\"madebyollin/taesd\").to(device=pipe.device, dtype=pipe.dtype)\npipe.enable_xformers_memory_efficient_attention()\n\nprompt = \"1girl with dog hair, thick frame glasses\"\nstream.prepare(prompt)\n\ninit_image = load_image(\"assets/img2img_example.png\").resize((512, 512))\n\n# Warm up the stream before reading outputs\nfor _ in range(2):\n    stream(init_image)\n\n# Generate until the user types 'stop'\nwhile True:\n    x_output = stream(init_image)\n    postprocess_image(x_output, output_type=\"pil\")[0].show()\n    input_response = input(\"Press Enter to continue or type 'stop' to exit: \")\n    if input_response == \"stop\":\n        break\n```\n\n### Text-to-Image Example\nHere is a basic example for real-time text-to-image generation:\n\n```python\nimport torch\nfrom diffusers import AutoencoderTiny, StableDiffusionPipeline\n\nfrom streamdiffusion import StreamDiffusion\nfrom streamdiffusion.image_utils import postprocess_image\n\n# Load the base Stable Diffusion pipeline in half precision on the GPU\npipe = StableDiffusionPipeline.from_pretrained(\"KBlueLeaf/kohaku-v2.1\").to(\n    device=torch.device(\"cuda\"),\n    dtype=torch.float16,\n)\n\n# Wrap the pipeline in StreamDiffusion; cfg_type=\"none\" is used for text-to-image\nstream = StreamDiffusion(\n    pipe,\n    t_index_list=[0, 16, 32, 45],\n    torch_dtype=torch.float16,\n    cfg_type=\"none\",\n)\n\n# Load and fuse the LCM-LoRA, then swap in the tiny VAE (TAESD)\nstream.load_lcm_lora()\nstream.fuse_lora()\nstream.vae = AutoencoderTiny.from_pretrained(\"madebyollin/taesd\").to(device=pipe.device, dtype=pipe.dtype)\npipe.enable_xformers_memory_efficient_attention()\n\nprompt = \"1girl with dog hair, thick frame glasses\"\nstream.prepare(prompt)\n\n# Warm up the stream before reading outputs\nfor _ in range(4):\n    stream()\n\n# Generate until the user types 'stop'\nwhile True:\n    x_output = stream.txt2img()\n    postprocess_image(x_output, output_type=\"pil\")[0].show()\n    input_response = input(\"Press Enter to continue or type 'stop' to exit: \")\n    if input_response == \"stop\":\n        break\n```\n\n### Faster Generation with TensorRT\nTo achieve even faster generation, you can integrate TensorRT acceleration:\n\n```python\nfrom streamdiffusion.acceleration.tensorrt import accelerate_with_tensorrt\n\nstream = accelerate_with_tensorrt(\n    stream, \"engines\", max_batch_size=2,\n)\n```\n\nThis requires the TensorRT extension and extra time to build the engine, but it significantly boosts performance.\n\n### Real-Time Demos\nStreamDiffusion includes interactive demos for both text-to-image and image-to-image generation.\n\n**Real-Time Txt2Img Demo:**\n![Real-Time Txt2Img 
Demo](https://github.com/cumulo-autumn/StreamDiffusion/raw/main/assets/demo_01.gif \"Real-Time Txt2Img Demo\")\n\n**Real-Time Img2Img Demo (Webcam/Screen Capture):**\n![Real-Time Img2Img Demo](https://github.com/cumulo-autumn/StreamDiffusion/raw/main/assets/img2img1.gif \"Real-Time Img2Img Demo\")\n\n## Why Use StreamDiffusion\nStreamDiffusion optimizes diffusion models for real-time applications at the pipeline level. Its core strength is a set of features designed to maximize efficiency and speed.\n\n### Key Features\n*   **Stream Batch**: Streamlined data processing through efficient batch operations.\n*   **Residual Classifier-Free Guidance (RCFG)**: An improved guidance mechanism that reduces the redundant computation of conventional CFG. It supports \"Self-Negative\" and \"Onetime-Negative\" configurations.\n*   **Stochastic Similarity Filter**: Improves GPU utilization by reducing processing when there is little change between frames, which suits video inputs.\n*   **IO Queues**: Efficiently manage input and output operations for smoother execution.\n*   **Pre-Computation for KV-Caches**: Optimizes caching strategies for accelerated processing.\n*   **Model Acceleration Tools**: Uses various tools for model optimization and performance gains, including TensorRT integration.\n\n### Performance Benchmarks\nOn an RTX 4090 GPU with a Core i9-13900K CPU running Ubuntu 22.04.3 LTS, StreamDiffusion achieves the following frame rates:\n\n| Model | Denoising Steps | FPS (Txt2Img) | FPS (Img2Img) |\n| :-------------------------: | :------------: | :------------: | :------------: |\n| SD-turbo | 1 | 106.16 | 93.897 |\n| LCM-LoRA <br>+<br> KohakuV2 | 4 | 38.023 | 37.133 |\n\nThese benchmarks show that StreamDiffusion can sustain interactive frame rates on this hardware, which makes it well suited to interactive AI applications.\n\n## Links\nFor more information and to explore the project 
further, please visit the official links:\n\n*   **GitHub Repository**: [https://github.com/cumulo-autumn/StreamDiffusion](https://github.com/cumulo-autumn/StreamDiffusion \"StreamDiffusion GitHub Repository\")\n*   **arXiv Paper**: [https://arxiv.org/abs/2312.12491](https://arxiv.org/abs/2312.12491 \"StreamDiffusion arXiv Paper\")\n*   **Hugging Face Papers**: [https://huggingface.co/papers/2312.12491](https://huggingface.co/papers/2312.12491 \"StreamDiffusion Hugging Face Papers\")","metrics":{"detailViews":6,"githubClicks":2},"dates":{"published":null,"modified":"2025-12-13T08:01:19.000Z"}}