Instructor: Structured Outputs for LLMs with Pydantic and Python

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary

Instructor is a Python library for obtaining structured outputs from Large Language Models (LLMs). It builds on Pydantic to provide validation, type safety, and IDE support, so you do not have to hand-write JSON parsing, error handling, or retry logic. The same interface works across multiple LLM providers.

Repository Information

Analyzed by OSRepos on November 8, 2025

Use at your own risk

OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of code from these repositories is the user's own responsibility. Always review the repository, source code, dependencies, licenses, and security implications before running or installing anything. OSRepos is not responsible for issues, damages, or losses resulting from third-party repositories.

Introduction

Instructor is a Python library for getting structured, validated data out of Large Language Models (LLMs). You describe the output you want as a Pydantic model, and Instructor takes care of the work that usually surrounds an LLM call: writing the JSON schema, handling validation errors, retrying failed extractions, and parsing unstructured responses. Because the result is a typed Pydantic object, you also get type safety and IDE support, and the same interface works across LLM providers.

Installation

Getting started with Instructor is straightforward. You can install it using pip:

pip install instructor

Alternatively, you can use other package managers like uv or poetry:

uv add instructor
poetry add instructor

Examples

Instructor's core strength lies in its simplicity. Define your desired output structure using a Pydantic BaseModel, and Instructor handles the rest. Below is a basic example demonstrating how to extract user information:

import instructor
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

client = instructor.from_provider("openai/gpt-4o-mini")
user = client.chat.completions.create(
    response_model=User,
    messages=[{"role": "user", "content": "John is 25 years old"}],
)

print(user)  # e.g. name='John' age=25
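
To make concrete what passing a response_model buys you, the sketch below uses only Pydantic (v2 is assumed) to show the two steps that would otherwise be written by hand: producing a JSON schema for the model and validating the JSON that comes back. The raw_reply and bad_reply strings are hand-written stand-ins for LLM responses, not real output, and this is an illustration of the idea rather than Instructor's internal code.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


# The JSON schema that describes the expected output shape.
schema = User.model_json_schema()
required_fields = sorted(schema["required"])

# Hand-written stand-in for an LLM reply (assumption, not real output).
raw_reply = '{"name": "John", "age": 25}'
user = User.model_validate_json(raw_reply)

# A reply with the wrong type is rejected instead of silently accepted.
bad_reply = '{"name": "John", "age": "twenty-five"}'
rejected = False
error_fields = []
try:
    User.model_validate_json(bad_reply)
except ValidationError as exc:
    rejected = True
    # Each error records which field failed validation.
    error_fields = [err["loc"][0] for err in exc.errors()]
```

The validated object is an ordinary typed instance, so user.age is an int that editors and type checkers can reason about.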

Instructor also supports a wide range of LLM providers, allowing you to use the same API for OpenAI, Anthropic, Google, Ollama, and more.

# OpenAI
client = instructor.from_provider("openai/gpt-4o")

# Anthropic
client = instructor.from_provider("anthropic/claude-3-5-sonnet")

# Google
client = instructor.from_provider("google/gemini-pro")

# Ollama (local)
client = instructor.from_provider("ollama/llama3.2")

# All use the same API!
# user = client.chat.completions.create(
#     response_model=User,
#     messages=[{"role": "user", "content": "Extract a user from this text..."}],
# )

Why use Instructor

Instructor stands out by offering a comprehensive solution for structured LLM outputs, addressing many pain points developers face:

  • Automatic Validation and Retries: It validates each output against your Pydantic model and retries the request when validation fails, so malformed responses are caught before they reach your application code.
  • Type Safety and IDE Support: Leveraging Pydantic ensures strong typing and excellent IDE integration, making your code more maintainable and less prone to errors.
  • Provider Agnostic: Use the same clean API across various LLM providers, including OpenAI, Anthropic, Google, and local models like Ollama.
  • Streaming Support: Stream partial objects as they are generated, enabling real-time feedback and more dynamic applications.
  • Complex Data Structures: Easily extract nested objects and lists, simplifying the handling of intricate data models.
  • Simplified Development: Eliminates the need for manual JSON schema writing, parsing, and error handling, allowing you to focus on application logic.
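
The validation-and-retry and nested-structure points above can be sketched without any network access. The example below is a minimal illustration under stated assumptions: fake_llm is a stub that returns canned JSON in place of a real provider, and extract_with_retries is a hypothetical helper written for this sketch, not part of Instructor. It shows the general pattern of validating a reply, feeding the error message back, and asking again.

```python
import json
from typing import Callable, List

from pydantic import BaseModel, ValidationError, field_validator


class Address(BaseModel):
    city: str
    country: str


class Person(BaseModel):
    name: str
    age: int
    addresses: List[Address]  # nested objects inside a list

    @field_validator("age")
    @classmethod
    def age_must_be_plausible(cls, value: int) -> int:
        if not 0 <= value <= 150:
            raise ValueError("age must be between 0 and 150")
        return value


def extract_with_retries(
    ask: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Person:
    """Hypothetical helper: validate each reply, feed errors back, retry."""
    message = prompt
    last_error = None
    for _ in range(max_attempts):
        reply = ask(message)
        try:
            return Person.model_validate_json(reply)
        except ValidationError as exc:
            last_error = exc
            # Tell the model what was wrong so the next attempt can fix it.
            message = f"{prompt}\n\nPrevious reply was invalid:\n{exc}"
    raise RuntimeError(f"no valid reply after {max_attempts} attempts") from last_error


# Stub standing in for an LLM: the first reply is invalid, the second is valid.
canned_replies = iter([
    json.dumps({"name": "Ada", "age": -5, "addresses": []}),
    json.dumps({"name": "Ada", "age": 36,
                "addresses": [{"city": "London", "country": "UK"}]}),
])
prompts_seen = []


def fake_llm(message: str) -> str:
    prompts_seen.append(message)
    return next(canned_replies)


person = extract_with_retries(fake_llm, "Extract the person described in the text.")
```

With Instructor itself you would not write this loop; to my knowledge the create call accepts a max_retries argument for the same purpose, but check the library's documentation for the exact behaviour.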
