# EasyJailbreak: A Python Framework for Adversarial LLM Jailbreak Prompts

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Repository profile: https://osrepos.com/repo/easyjailbreak-easyjailbreak

GitHub: https://github.com/EasyJailbreak/EasyJailbreak

## Summary

EasyJailbreak is an intuitive Python framework designed for generating adversarial jailbreak prompts for Large Language Models (LLMs). It provides a structured approach to decompose the jailbreaking process into iterative steps, offering components for mutation, attack, and evaluation. This tool is ideal for researchers and developers focused on LLM security and understanding model vulnerabilities.

## Topics

- Python
- Jailbreak
- LLM Security
- Adversarial AI
- AI Safety
- Large Language Model
- Framework

## Repository Information

Last analyzed by OSRepos: Fri Jun 26 2026 16:37:34 GMT+0100 (Western European Summer Time)

## Safety Notice

OSRepos shares public repositories for knowledge and discovery only. Review source code, dependencies, licenses, and security implications before running or installing anything.

## Introduction

EasyJailbreak is a Python framework for researchers and developers working on Large Language Model (LLM) security. It generates adversarial jailbreak prompts by assembling interchangeable methods. The framework decomposes the mainstream jailbreaking process into several iterable steps: initializing mutation seeds, selecting suitable seeds, adding constraints, mutating, attacking, and evaluating. Because each step is a separate component, individual steps can be swapped or extended for research and experimentation in LLM safety and vulnerability.
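The step decomposition described above can be pictured as a generic search loop. The following is a minimal sketch with toy string operations standing in for every component; the names `Candidate` and `run_pipeline`, the callable signatures, and the order in which constraints and mutation are applied are all assumptions made for illustration and are not EasyJailbreak's actual API.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A prompt candidate and the score its response received (hypothetical type)."""
    prompt: str
    score: float = 0.0


def run_pipeline(seeds, select, mutate, constrain, query_target, evaluate, iterations=3):
    """Run a toy seed -> select -> mutate -> constrain -> query -> evaluate loop."""
    pool = [Candidate(seed) for seed in seeds]            # initialize mutation seeds
    for _ in range(iterations):
        parent = select(pool)                             # select a suitable seed
        children = [Candidate(text) for text in mutate(parent.prompt)]  # mutate
        children = [c for c in children if constrain(c.prompt)]         # apply constraints
        for child in children:
            response = query_target(child.prompt)         # query the target model
            child.score = evaluate(response)              # evaluate the response
        pool.extend(children)
    return max(pool, key=lambda c: c.score)


# Toy stand-ins: none of these resemble real mutators, models, or evaluators.
best = run_pipeline(
    seeds=["ab"],
    select=lambda pool: max(pool, key=lambda c: c.score),
    mutate=lambda text: [text + "!", text.upper()],
    constrain=lambda text: len(text) <= 12,
    query_target=lambda text: text[::-1],
    evaluate=lambda response: float(len(response)),
    iterations=2,
)
print(best)
```

Passing each step as a plain callable mirrors the idea that any one step can be replaced without touching the others; the real framework provides its own classes for these roles, documented in the links below.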

For more in-depth information, you can refer to the [official paper](https://arxiv.org/pdf/2403.12171.pdf), explore different LLMs' jailbreak results on the [EasyJailbreak Website](http://easyjailbreak.org/), and consult the [detailed documentation](https://easyjailbreak.github.io/EasyJailbreakDoc.github.io) for API and parameter explanations.

## Installation

To get started with EasyJailbreak, ensure you have `python>=3.9` installed. There are two primary methods for installation:

1.  **For users who only require the collected approaches (recipes):**
    ```bash
    pip install easyjailbreak
    ```

2.  **For users interested in adding new components (e.g., new mutate or evaluate methods):**
    ```bash
    git clone https://github.com/EasyJailbreak/EasyJailbreak.git
    cd EasyJailbreak
    pip install -e .
    ```
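After either installation method, a quick check can confirm that the interpreter meets the `python>=3.9` requirement and that the package is visible to it. This is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical, and the distribution name `easyjailbreak` is taken from the `pip install` command above.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 9)  # minimum version stated in the installation notes


def check_environment(package="easyjailbreak"):
    """Return (python_ok, installed_version); the version is None if not installed."""
    python_ok = sys.version_info >= MIN_PYTHON
    try:
        installed_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed_version = None
    return python_ok, installed_version


python_ok, installed_version = check_environment()
print(f"Python >= 3.9: {python_ok}")
print(f"easyjailbreak version: {installed_version or 'not installed'}")
```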
    

## Examples

EasyJailbreak exposes its pre-implemented attack "recipes" through a small API. The example below runs the `PAIR` recipe with a Vicuna attack model, a GPT-4 target model, and a GPT-4 evaluation model on the `AdvBench` dataset:

```python
from easyjailbreak.attacker.PAIR_chao_2023 import PAIR
from easyjailbreak.datasets import JailbreakDataset
from easyjailbreak.models.huggingface_model import from_pretrained
from easyjailbreak.models.openai_model import OpenaiModel

# First, prepare models and datasets.
attack_model = from_pretrained(model_name_or_path='lmsys/vicuna-13b-v1.5',
                               model_name='vicuna_v1.1')
target_model = OpenaiModel(model_name='gpt-4',
                           api_keys='INPUT YOUR KEY HERE!!!')
eval_model = OpenaiModel(model_name='gpt-4',
                         api_keys='INPUT YOUR KEY HERE!!!')
dataset = JailbreakDataset('AdvBench')

# Then instantiate the recipe.
attacker = PAIR(attack_model=attack_model,
                target_model=target_model,
                eval_model=eval_model,
                jailbreak_datasets=dataset)

# Finally, start jailbreaking.
attacker.attack(save_path='vicuna-13b-v1.5_gpt4_gpt4_AdvBench_result.jsonl')
```
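The example passes the API key as a string literal placeholder. One common alternative is to read the key from an environment variable so that it never lands in source control. The sketch below assumes the variable is named `OPENAI_API_KEY` (a widespread convention, not something EasyJailbreak requires), and `load_api_key` is a hypothetical helper.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in env_var, or raise if it is missing or empty."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {env_var} is not set.")
    return value


# Intended use with the example above (requires easyjailbreak to be installed):
# target_model = OpenaiModel(model_name='gpt-4', api_keys=load_api_key())
```

Failing early with a clear error is usually preferable to sending an empty key and debugging an authentication failure later.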


For more advanced customization, such as loading models, datasets, initializing seeds, and instantiating individual components (Selectors, Mutators, Constraints, Evaluators), refer to the comprehensive [documentation](https://easyjailbreak.github.io/EasyJailbreakDoc.github.io/).
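The `attack` call in the example writes its results to the `.jsonl` file given by `save_path`. The record schema is not described here, so the following sketch makes no assumption about field names: it only parses the file as JSON Lines and reports which top-level keys appear. The helper names `load_jsonl` and `field_counts` are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path):
    """Parse a JSON Lines file into a list of objects, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def field_counts(records):
    """Count how often each top-level key appears across dict records."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts


# Intended use once a result file exists:
# records = load_jsonl('vicuna-13b-v1.5_gpt4_gpt4_AdvBench_result.jsonl')
# print(len(records), dict(field_counts(records)))
```

Inspecting the keys first is a cheap way to learn the actual output layout before writing analysis code against it.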

## Why Use EasyJailbreak?

EasyJailbreak is useful for several reasons:

*   **Ease of Use:** Its Python API wraps the process of generating adversarial prompts, so a full attack recipe can be run in a few lines.
*   **Modular Design:** The framework's decomposition into distinct, iterable steps allows for flexible experimentation and the development of custom attack methods.
*   **Comprehensive Recipes:** It collects and implements attack recipes from relevant papers, providing a ready-to-use toolkit for evaluating LLM vulnerabilities.
*   **LLM Security Focus:** Designed specifically for LLM security research, it helps identify and understand potential weaknesses in large language models.
*   **Extensibility:** Researchers can integrate new components, such as novel mutation techniques or evaluation metrics.

## Links

*   **GitHub Repository:** [https://github.com/EasyJailbreak/EasyJailbreak](https://github.com/EasyJailbreak/EasyJailbreak)
*   **Official Website:** [http://easyjailbreak.org/](http://easyjailbreak.org/)
*   **Documentation:** [https://easyjailbreak.github.io/EasyJailbreakDoc.github.io](https://easyjailbreak.github.io/EasyJailbreakDoc.github.io)
*   **Research Paper:** [https://arxiv.org/pdf/2403.12171.pdf](https://arxiv.org/pdf/2403.12171.pdf)