# spacy-llm: Integrating LLMs into Structured NLP Pipelines with spaCy

This repository profile is provided by osrepos.com, an open source repository discovery platform.

spacy-llm seamlessly integrates Large Language Models (LLMs) into spaCy, offering a modular system for rapid prototyping and transforming unstructured LLM responses into robust outputs for various NLP tasks. It supports a wide range of LLMs, including OpenAI, Cohere, Anthropic, and open-source models, enabling users to combine the power of LLMs with spaCy's production-ready capabilities. This package allows for quick experimentation and the creation of efficient, reliable, and controlled NLP systems.

GitHub: https://github.com/explosion/spacy-llm
OSRepos URL: https://osrepos.com/repo/explosion-spacy-llm

## Topics

- Python
- NLP
- LLM
- spaCy
- Machine Learning
- Prompt Engineering
- Text Classification
- Named Entity Recognition

## Repository Information

Last analyzed by OSRepos: Wed Jun 24 2026 01:17:52 GMT+0100 (Western European Summer Time)

## Safety Notice

OSRepos shares public repositories for knowledge and discovery only. Review source code, dependencies, licenses, and security implications before running or installing anything.

## Introduction

`spacy-llm` is a Python package that integrates Large Language Models (LLMs) into [spaCy](https://spacy.io), a library for advanced Natural Language Processing. It provides a modular system for **fast prototyping** and **prompting**, and turns unstructured LLM responses into **robust outputs** for a variety of NLP tasks, often **without requiring training data**.

The package features a serializable `llm` component that plugs into your spaCy pipeline, along with modular functions to define specific tasks and models. It interfaces with major LLM APIs such as [OpenAI](https://platform.openai.com/docs/api-reference/), [Cohere](https://docs.cohere.com/reference/generate/), [Anthropic](https://docs.anthropic.com/claude/reference/), [Google PaLM](https://ai.google/discover/palm2/), and [Microsoft Azure AI](https://azure.microsoft.com/en-us/solutions/ai/). `spacy-llm` also supports open-source LLMs hosted on Hugging Face, including Falcon, Dolly, Llama 2, OpenLLaMA, StableLM, and Mistral. It integrates with [LangChain](https://github.com/hwchase17/langchain) as well, so LangChain models and features can be used within `spacy-llm`.

Out of the box, `spacy-llm` provides tasks for Named Entity Recognition, Text Classification, Lemmatization, Relationship Extraction, Sentiment Analysis, Span Categorization, Summarization, Entity Linking, and Translation, plus raw prompt execution for maximum flexibility. Users can also implement their own custom functions for prompting, parsing, and model integrations via [spaCy's registry](https://spacy.io/api/top-level#registry). For prompts that exceed an LLM's context window, a map-reduce approach is available to split prompts and fuse the results.
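The map-reduce idea can be illustrated independently of `spacy-llm`: split a long text into chunks that fit a size budget, process each chunk, then merge the per-chunk results. The sketch below is a plain-Python illustration under stated assumptions, not the package's implementation. The names `split_into_chunks`, `map_reduce`, and `fake_classifier` are hypothetical, and the word-count budget and score averaging are simplifications chosen for the example.

```python
from typing import Callable, Dict, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i : i + max_words])
        for i in range(0, len(words), max_words)
    ]


def map_reduce(
    text: str,
    process: Callable[[str], Dict[str, float]],
    max_words: int,
) -> Dict[str, float]:
    """Run `process` on each chunk (map), then average label scores (reduce)."""
    chunks = split_into_chunks(text, max_words)
    if not chunks:
        return {}
    totals: Dict[str, float] = {}
    for chunk in chunks:
        for label, score in process(chunk).items():
            totals[label] = totals.get(label, 0.0) + score
    # A label missing from a chunk counts as 0.0 for that chunk.
    return {label: total / len(chunks) for label, total in totals.items()}


def fake_classifier(chunk: str) -> Dict[str, float]:
    """Stand-in for an LLM call: scores a chunk with a keyword check."""
    return {"COMPLIMENT": 1.0 if "gorgeous" in chunk else 0.0}
```

Averaging is only one possible reduce step; tasks such as entity recognition would instead need to concatenate spans and adjust their offsets.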

## Installation

To install `spacy-llm`, ensure you have `spacy` [installed](https://spacy.io/usage) in your virtual environment, then run the following command:

```bash
python -m pip install spacy-llm
```

## Examples

Here are a couple of quick examples to get started with `spacy-llm`.

### In Python code

For quick experiments, you can use the following Python code to perform text classification with a GPT model from OpenAI:

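```python
import spacy

# Start from a blank English pipeline and add the LLM-backed classifier.
nlp = spacy.blank("en")
llm = nlp.add_pipe("llm_textcat")

# Register the labels the model should choose between.
llm.add_label("INSULT")
llm.add_label("COMPLIMENT")

doc = nlp("You look gorgeous!")
print(doc.cats)
# {"COMPLIMENT": 1.0, "INSULT": 0.0}
```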

This example uses the `llm_textcat` factory, which leverages the latest version of the built-in text classification task and the default GPT-3.5 model from OpenAI.
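Hosted models need credentials, so it can help to fail fast with a clear message before building the pipeline. The sketch below assumes the OpenAI key is read from the `OPENAI_API_KEY` environment variable, which this profile does not state, so confirm it against the `spacy-llm` documentation. `require_env` is a hypothetical helper, not part of `spacy-llm`.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of environment variable `name`, or raise if unset or empty."""
    # `env` defaults to the real process environment; passing a dict eases testing.
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling `require_env("OPENAI_API_KEY")` before `nlp.add_pipe("llm_textcat")` surfaces a missing key as a readable error rather than a failure deeper in the pipeline.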

### Using a config file

For more control over the parameters of the `llm` pipeline, you can use [spaCy's config system](https://spacy.io/api/data-formats#config). Create a `config.cfg` file like the one below:

```ini
[nlp]
lang = "en"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"

[components.llm.task]
@llm_tasks = "spacy.TextCat.v3"
labels = ["COMPLIMENT", "INSULT"]

[components.llm.model]
@llm_models = "spacy.GPT-4.v2"
```

Then, run the following Python code to load and use your configured pipeline:

```python
from spacy_llm.util import assemble

nlp = assemble("config.cfg")
doc = nlp("You look gorgeous!")
print(doc.cats)
# {"COMPLIMENT": 1.0, "INSULT": 0.0}
```

This approach provides greater flexibility for customizing your LLM-powered NLP workflows.
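In both examples, `doc.cats` is a dictionary mapping each label to a score. A small helper can turn that into a single decision. The sketch below works on a plain dict of the same shape, so it runs without a model; `top_label` and its `threshold` parameter are hypothetical names introduced for illustration.

```python
from typing import Dict, Optional


def top_label(cats: Dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Return the highest-scoring label, or None if no score reaches threshold."""
    if not cats:
        return None
    label = max(cats, key=cats.get)
    return label if cats[label] >= threshold else None


print(top_label({"COMPLIMENT": 1.0, "INSULT": 0.0}))  # COMPLIMENT
```

Returning `None` below the threshold lets downstream code treat low-confidence predictions as "no decision" instead of forcing a label.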

## Why Use spacy-llm?

Large Language Models offer powerful natural language understanding, making them excellent for quickly prototyping custom NLP tasks with few or no examples. However, for production systems, supervised learning models often provide better efficiency, reliability, control, and accuracy for well-defined tasks.

`spacy-llm` offers **the best of both worlds**. You can rapidly initialize pipelines with LLM-powered components for quick experimentation and then seamlessly integrate or replace them with spaCy's traditional supervised learning or rule-based components as your project matures. This allows you to leverage the prototyping speed of LLMs while maintaining the production-readiness, efficiency, and control that spaCy is known for. Even when an LLM is justified for complex tasks, `spacy-llm` enables you to combine it with other spaCy components, such as cheaper text classification models for filtering or rule-based systems for output validation, creating a robust and optimized NLP system.
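The combination described above, a cheap filter in front of the LLM and rule-based validation behind it, can be sketched in plain Python. Everything here is illustrative: `cheap_filter`, `validate`, `classify_with_gate`, and the stand-in `expensive_model` are hypothetical, and in a real pipeline these roles would be played by spaCy components.

```python
from typing import Callable, Dict, Optional, Set

ALLOWED_LABELS: Set[str] = {"COMPLIMENT", "INSULT"}


def cheap_filter(text: str) -> bool:
    """Cheap pre-check: only texts addressing 'you' are sent to the model."""
    return "you" in text.lower().split()


def validate(cats: Dict[str, float]) -> bool:
    """Rule-based check: known labels only, every score within [0, 1]."""
    return set(cats) <= ALLOWED_LABELS and all(
        0.0 <= score <= 1.0 for score in cats.values()
    )


def classify_with_gate(
    text: str, model: Callable[[str], Dict[str, float]]
) -> Optional[Dict[str, float]]:
    """Call the expensive model only if the filter passes; drop invalid output."""
    if not cheap_filter(text):
        return None  # skipped: the model is never called
    cats = model(text)
    return cats if validate(cats) else None


def expensive_model(text: str) -> Dict[str, float]:
    """Stand-in for an LLM-backed classifier."""
    return {"COMPLIMENT": 1.0, "INSULT": 0.0}
```

With this structure, texts rejected by the filter cost nothing, and malformed model output never reaches downstream code.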

## Links

*   **GitHub Repository:** [explosion/spacy-llm](https://github.com/explosion/spacy-llm)
*   **spaCy Documentation:** [spacy.io](https://spacy.io)
*   **PyPI Project Page:** [spacy-llm](https://pypi.org/project/spacy-llm/)