DeepFabric: High-Quality Synthetic Data for Agentic AI Systems

Summary

DeepFabric is an open-source Python library for generating synthetic training data for language models and agent evaluations. It creates domain-specific datasets that teach models to think, plan, and act, including correct tool usage and adherence to schema structures. The same pipeline also covers training and evaluation.

Introduction

DeepFabric covers the full workflow: it generates domain-specific synthetic datasets, trains models on them, and evaluates the results, with particular attention to tool-calling scenarios. The generated data centers on realistic reasoning traces and tool-calling patterns, which is what helps models learn to think, plan, and act effectively.

Installation

Install DeepFabric with pip:

pip install deepfabric
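
To confirm the install from Python, a minimal sketch using only the standard library is shown below. It assumes the distribution name is deepfabric, as in the pip command; the helper name installed_version is illustrative and not part of DeepFabric.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "deepfabric" is the distribution name used in the pip command above.
found = installed_version("deepfabric")
print(f"deepfabric version: {found}" if found else "deepfabric is not installed")
```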

Examples

DeepFabric can be used via its CLI, as a library, or with YAML configurations. Here's a quick example using the CLI to generate a dataset:

export OPENAI_API_KEY="your-api-key"

deepfabric generate \
  --topic-prompt "Python programming fundamentals" \
  --generation-system-prompt "You are a Python expert" \
  --mode graph \
  --depth 3 \
  --degree 3 \
  --num-samples 9 \
  --batch-size 3 \
  --provider openai \
  --model gpt-4o \
  --output-save-as dataset.jsonl

This command builds a topic graph with 27 unique nodes, then generates 27 training samples and saves them to dataset.jsonl, giving 100% topic coverage. Note that 27 equals --num-samples (9) multiplied by --batch-size (3).
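
The sample count can be sanity-checked after a run. The sketch below is not DeepFabric code. It assumes that 27 corresponds to degree ** depth (3 ** 3), the number of leaf topics in a tree that branches 3 ways for 3 levels; whether DeepFabric counts graph nodes this way is an assumption. The helper names and the placeholder records are hypothetical, and the demo writes a throwaway file so it runs without calling any API.

```python
import json
import tempfile
from pathlib import Path


def expected_leaf_topics(depth: int, degree: int) -> int:
    """Leaf count of a tree where every node has `degree` children for `depth` levels."""
    return degree ** depth


def count_jsonl_records(path: Path) -> int:
    """Count non-empty lines that parse as JSON objects in a JSONL file."""
    count = 0
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)  # raises ValueError on a malformed line
            if isinstance(record, dict):
                count += 1
    return count


# Demo on a temporary file; real samples follow a schema defined by DeepFabric.
expected = expected_leaf_topics(depth=3, degree=3)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "dataset.jsonl"
    # Placeholder records standing in for generated samples (assumption).
    demo_path.write_text(
        "\n".join(json.dumps({"id": i}) for i in range(expected)) + "\n",
        encoding="utf-8",
    )
    found = count_jsonl_records(demo_path)

print(f"expected {expected} samples, found {found}")
```

To check a real run, point count_jsonl_records at the dataset.jsonl written by the command and compare the result with the count you expect.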

After training your model, you can evaluate it with DeepFabric's built-in evaluator:

from datasets import load_dataset
from deepfabric.evaluation import Evaluator, EvaluatorConfig, InferenceConfig

# Load your evaluation dataset (replace the placeholder with your own dataset ID)
dataset = load_dataset("your-username/your-dataset", split="test")

# Point the evaluator at the trained model and choose the inference backend
config = EvaluatorConfig(
    inference_config=InferenceConfig(
        model_path="./output/checkpoint-final",  # Local path or HF Hub ID
        backend="transformers",
    ),
)

# Run the evaluation over the dataset
evaluator = Evaluator(config)
results = evaluator.evaluate(dataset=dataset)

# Report the aggregate score as a percentage
print(f"Overall Score: {results.metrics.overall_score:.2%}")

Why Use It

DeepFabric's topic graph generation algorithms produce synthetic data that is highly diverse yet stays anchored to the target domain. This helps guard against overfitting, a common issue with other tools. A key differentiator is support for real tool execution through the Spin Framework: agents interact with tools running in isolated WebAssembly sandboxes, so the training data records decisions based on actual observations rather than simulated outputs. For evaluation, DeepFabric reports tool selection accuracy, parameter accuracy, and execution success rate, which together give a broad view of how a model performs.
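
To make the three metrics concrete, here is a minimal sketch of how they could be computed from per-call records. It is not DeepFabric's implementation: the ToolCallRecord shape, the summarize function, and the choice to score parameters only on calls that picked the right tool are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToolCallRecord:
    """One evaluated tool call (hypothetical record shape, not DeepFabric's)."""
    expected_tool: str
    predicted_tool: str
    expected_args: Dict[str, Any] = field(default_factory=dict)
    predicted_args: Dict[str, Any] = field(default_factory=dict)
    executed_ok: bool = False


def _ratio(hits: int, total: int) -> float:
    """Safe division that returns 0.0 for an empty denominator."""
    return hits / total if total else 0.0


def summarize(records: List[ToolCallRecord]) -> Dict[str, float]:
    """Compute the three metrics under the assumed definitions."""
    # Calls where the model picked the expected tool.
    correct_tool = [r for r in records if r.expected_tool == r.predicted_tool]
    # Assumption: parameters are only scored when the right tool was chosen.
    param_hits = sum(r.expected_args == r.predicted_args for r in correct_tool)
    exec_hits = sum(r.executed_ok for r in records)
    return {
        "tool_selection_accuracy": _ratio(len(correct_tool), len(records)),
        "parameter_accuracy": _ratio(param_hits, len(correct_tool)),
        "execution_success_rate": _ratio(exec_hits, len(records)),
    }


# Four made-up calls: three pick the right tool, two of those match arguments.
records = [
    ToolCallRecord("get_weather", "get_weather", {"city": "Oslo"}, {"city": "Oslo"}, True),
    ToolCallRecord("get_weather", "get_weather", {"city": "Oslo"}, {"city": "Rome"}, False),
    ToolCallRecord("search", "get_weather", {"q": "news"}, {"city": "news"}, False),
    ToolCallRecord("search", "search", {"q": "news"}, {"q": "news"}, True),
]
metrics = summarize(records)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Other definitions are reasonable too, for example scoring parameter accuracy over all calls rather than only those with the correct tool; the documented metric definitions take precedence over this sketch.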