DragGAN: Interactive Point-Based Image Manipulation with Generative AI
Summary
DragGAN is the official implementation of the SIGGRAPH 2023 paper "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold." This Python repository lets users precisely manipulate generated images by interactively dragging handle points, offering an intuitive way to edit AI-generated content and making complex image transformations accessible.
Introduction
DragGAN presents a groundbreaking approach to interactive image manipulation, as featured in the SIGGRAPH 2023 conference proceedings. This repository provides the official implementation for "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold," allowing users to precisely control the pose, shape, expression, and layout of objects within AI-generated images. By simply "dragging" points on an image, users can achieve complex and realistic transformations, making it a powerful tool for artists, researchers, and developers working with generative models.
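Under the hood, the paper alternates two steps: a motion-supervision loss that optimizes the generator's latent code so the features at each handle point move a small step toward its target, and a point-tracking step that relocates the handles in the updated feature map. The toy snippet below illustrates only the motion-supervision idea on a dummy feature map; it is a conceptual sketch, not the repository's code (the real method updates the StyleGAN latent code rather than the features themselves).

# Toy illustration of DragGAN-style motion supervision (conceptual
# sketch only; the real method optimizes the StyleGAN latent code and
# alternates this loss with a point-tracking step).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
feat = torch.randn(1, 64, 32, 32, requires_grad=True)  # stand-in for generator features
handle = torch.tensor([16.0, 16.0])  # (x, y) handle point in pixel coords
target = torch.tensor([24.0, 16.0])  # (x, y) target point

d = target - handle
d = d / d.norm()  # unit step from handle toward target

def to_grid(p, size=32):
    # grid_sample expects coordinates normalized to [-1, 1]
    return (p / (size - 1)) * 2 - 1

src = to_grid(handle).view(1, 1, 1, 2)      # where the feature is now
dst = to_grid(handle + d).view(1, 1, 1, 2)  # one step toward the target

# The feature one step toward the target is pulled to match the
# (detached) feature currently at the handle, so gradients "drag"
# image content along d.
f_src = F.grid_sample(feat.detach(), src, align_corners=True)
f_dst = F.grid_sample(feat, dst, align_corners=True)
loss = F.l1_loss(f_dst, f_src)
loss.backward()
print("motion-supervision loss:", loss.item())

Repeating such steps, with the handles relocated by point tracking after each update, is what gradually drags content from the handle points to their targets.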
Installation
To get started with DragGAN, follow these installation instructions based on your system configuration.
For CUDA-enabled GPUs:
conda env create -f environment.yml
conda activate stylegan3
pip install -r requirements.txt
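As an optional sanity check (not part of the official instructions), you can confirm that the activated environment's PyTorch build sees your GPU:

# Optional: verify the CUDA build of PyTorch inside the stylegan3 env
import torch
print(torch.__version__, torch.cuda.is_available())  # expect True on a CUDA machine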
For MacOS with Apple Silicon (M1/M2) or CPU-only:
cat environment.yml | \
grep -v -E 'nvidia|cuda' > environment-no-nvidia.yml && \
conda env create -f environment-no-nvidia.yml
conda activate stylegan3
# On MacOS
export PYTORCH_ENABLE_MPS_FALLBACK=1
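An analogous optional check (assuming PyTorch 1.12 or newer) confirms that the Apple MPS backend is visible; the flag above makes PyTorch fall back to CPU for any ops MPS does not support:

# Optional: verify the MPS backend on Apple Silicon
import torch
print(torch.backends.mps.is_available())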
Running with Docker (for Gradio visualizer):
First, clone the repository and download pre-trained models:
python scripts/download_model.py
Then, build and run the Docker container:
docker build . -t draggan:latest
docker run -p 7860:7860 -v "$PWD":/workspace/src -it draggan:latest bash
# For GPU acceleration:
# docker run --gpus all -p 7860:7860 -v "$PWD":/workspace/src -it draggan:latest bash
cd src && python visualizer_drag_gradio.py --listen
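With the flags above, the -p 7860:7860 port mapping and --listen should make the demo reachable at http://localhost:7860 on the host once Gradio reports that it is running.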
Examples
DragGAN offers several ways to interact with its powerful image manipulation capabilities. After installation and downloading pre-trained StyleGAN2 weights (using python scripts/download_model.py), you can run the graphical user interface (GUI) or a Gradio web demo.
Running the GUI:
sh scripts/gui.sh
# For Windows:
# .\scripts\gui.bat
The GUI edits GAN-generated images directly. To edit a real photograph, you must first invert it into the GAN's latent space using a GAN-inversion tool such as PTI.
Running the Gradio Demo:
python visualizer_drag_gradio.py
This provides a web-based interface accessible from any browser, making it easy to experiment with the dragging functionality. Pre-trained models for StyleGAN-Human and Landscapes HQ (LHQ) are also available for download to expand your creative possibilities.
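If you prefer scripting to the demos, the downloaded pickles follow the standard StyleGAN loading convention that this codebase inherits from StyleGAN3. The sketch below assumes you run it from the repository root (so the dnnlib and torch_utils modules referenced by the pickle are importable) and uses an example checkpoint filename; substitute whichever file scripts/download_model.py fetched for you.

# Minimal sketch: load a downloaded generator and sample one image.
# The checkpoint path is an example; run from the repo root.
import pickle
import torch

with open('checkpoints/stylegan2_lions_512_pytorch.pkl', 'rb') as f:
    G = pickle.load(f)['G_ema']  # exponential-moving-average generator weights

z = torch.randn([1, G.z_dim])  # random latent vector
img = G(z, None)               # c=None for unconditional models; NCHW in [-1, 1]
print(img.shape)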
Why Use DragGAN?
DragGAN stands out for its intuitive, precise control over generative adversarial networks (GANs). Instead of tuning abstract parameters, users achieve the desired transformation by dragging points on the image, much as they would in a traditional image editor. This interactive approach lowers the barrier to advanced AI image generation and manipulation, enabling rapid prototyping, artistic creation, and detailed exploration of the latent space of GANs. As the official implementation of a SIGGRAPH 2023 paper, it offers high-quality results and a strong foundation for further work in AI-driven content creation.
Links
- GitHub Repository: XingangPan/DragGAN
- Project Page: DragGAN Project Page
- Paper (PDF): Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
- Open In Colab: Google Colab
- Hugging Face Demo: Hugging Face Space
- OpenXLab Demo: OpenXLab App