WeClone: Create Your AI Digital Twin from Chat History with LLMs

Summary

WeClone is an innovative open-source project that provides a comprehensive solution for creating your personal AI digital twin. It allows users to fine-tune Large Language Models (LLMs) using their chat history, capturing unique communication styles. The resulting AI can then be integrated with various chatbots, bringing your digital self to life.

Repository Info

Updated on May 3, 2026

Introduction

WeClone is an open-source project for building a personal AI digital twin from your chat history. It provides an end-to-end pipeline: export your chat data, preprocess it, fine-tune a Large Language Model (LLM) on your unique communication style, and deploy the result as a chatbot. The project supports multiple chat data sources and deployment platforms, and it emphasizes privacy through localized fine-tuning and control over your own data.

Installation

To get started with WeClone, follow these steps. A CUDA environment (version 12.6 or above) is required.
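Before installing, it can help to sanity-check that your CUDA toolkit meets the 12.6 minimum. A minimal Python sketch (the `release X.Y` banner format parsed here is an assumption about `nvcc --version` output, not something WeClone documents):

```python
import re

def parse_cuda_version(nvcc_output: str) -> tuple:
    """Extract (major, minor) from the output of `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release number found in nvcc output")
    return (int(match.group(1)), int(match.group(2)))

def meets_requirement(version: tuple, minimum: tuple = (12, 6)) -> bool:
    # Tuple comparison is lexicographic, so (12, 6) >= (12, 6) is True.
    return version >= minimum

# Example against a typical nvcc banner line:
sample = "Cuda compilation tools, release 12.6, V12.6.68"
print(meets_requirement(parse_cuda_version(sample)))  # True
```

In practice you would feed `parse_cuda_version` the captured output of `nvcc --version` on your machine.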

  1. Clone the repository:

    git clone https://github.com/xming521/WeClone.git && cd WeClone
    
  2. Set up the environment with uv (recommended):

    uv venv .venv --python=3.12
    source .venv/bin/activate # For Windows: .venv\Scripts\activate
    uv pip install --group main -e .
    
  3. Copy the configuration file:

    cp examples/tg.template.jsonc settings.jsonc
    

    Modify settings.jsonc for your specific needs.

  4. Download models:
    Download the base model from Hugging Face, for example with Git LFS:

    git lfs install
    git clone https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct models/Qwen2.5-VL-7B-Instruct
    

Examples

WeClone provides a clear workflow from data preparation to deployment.

  1. Data Preparation: Export your chat history (e.g., from Telegram Desktop) as JSON and place it in the ./dataset/telegram directory.

  2. Data Preprocessing: Configure settings.jsonc (e.g., language, platform, telegram_args.my_id) and run:

    weclone-cli make-dataset
    

    The project includes privacy filtering for sensitive information.

  3. Fine-tuning the Model: Adjust training parameters in settings.jsonc and execute:

    weclone-cli train-sft
    

    Multi-GPU training is also supported with DeepSpeed.

  4. Inference and Deployment:

    • Webchat Demo: Test your fine-tuned model in a browser:

      weclone-cli webchat-demo
      
    • API Server: Start an API service for integration:

      weclone-cli server
      
    • Deploy to Chatbots: Integrate your AI twin with platforms like AstrBot or LangBot by configuring them to use the WeClone API service.
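Once `weclone-cli server` is running, a chatbot platform or your own script talks to the API service over HTTP. A hedged Python sketch, assuming the server exposes an OpenAI-compatible chat endpoint; the base URL, port, model name, and response schema here are illustrative assumptions, so match them to your `settings.jsonc` and the project's docs:

```python
import json
from urllib import request

def build_chat_payload(message: str, model: str = "weclone") -> dict:
    # OpenAI-style chat payload; the model name "weclone" is a placeholder.
    return {"model": model,
            "messages": [{"role": "user", "content": message}]}

def send_chat(message: str, base_url: str = "http://127.0.0.1:8005/v1") -> str:
    # base_url (host, port, path) is an assumption -- point it at the
    # address your `weclone-cli server` instance actually listens on.
    req = request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_payload(message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumes an OpenAI-style response shape.
    return body["choices"][0]["message"]["content"]
```

This is the same request shape a platform like AstrBot or LangBot would send when configured against the WeClone API service.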

Why Use WeClone?

WeClone stands out as a powerful tool for creating personalized AI avatars due to several key features:

  • End-to-End Solution: It covers every step, from chat data export and preprocessing to model training and deployment.
  • Personalized LLMs: Fine-tune models with your actual chat history, including image modal data, to capture your unique style and "flavor."
  • Privacy and Control: Supports localized fine-tuning and deployment, along with privacy information filtering, ensuring your data remains secure and under your control.
  • Multi-Platform Integration: Easily integrate your digital avatar with popular messaging platforms like Telegram, Discord, Slack, and WeChat.
  • Active Development: The project is in rapid iteration, continuously adding new features and improvements.
