LMQL: A Language for Constraint-Guided and Efficient LLM Programming

Summary

LMQL is an innovative programming language that extends Python, designed for efficient and constraint-guided programming of Large Language Models (LLMs). It allows developers to interweave traditional programming logic with native LLM calls, offering advanced control over model behavior through features like constraints, rich control flow, and optimized runtimes. This makes it easier to build complex LLM-powered applications with greater precision and efficiency.

Repository Info

Updated on January 17, 2026

Introduction

LMQL is an innovative programming language designed for Large Language Models (LLMs), functioning as a superset of Python. It offers a novel approach to interweaving traditional programming logic with native LLM calls directly within your code. Unlike conventional templating languages, LMQL integrates LLM interaction at the program code level, enabling developers to build sophisticated LLM-powered applications with enhanced control and efficiency. It emphasizes constraint-guided programming, allowing precise control over model outputs.

Installation

Getting started with LMQL is straightforward. Ensure you have Python 3.10 or newer installed, then run the following command:

pip install lmql

For users who wish to run models on a local GPU, it is recommended to install LMQL in an environment with a GPU-enabled PyTorch (version >= 1.11) and then install with:

pip install lmql[hf]

After installation, you can also launch the browser-based LMQL playground IDE by running lmql playground, which lets you explore the included examples interactively.

Examples

LMQL programs seamlessly blend Python syntax with LLM queries. Top-level strings are interpreted as query strings, where template variables like [GREETINGS] are automatically completed by the model. The where keyword allows you to specify constraints on the generated text, providing fine-grained control over LLM behavior.

Consider this example from the LMQL documentation:

"Greet LMQL:[GREETINGS]\n" where stops_at(GREETINGS, ".") and not "\n" in GREETINGS

if "Hi there" in GREETINGS:
    "Can you reformulate your greeting in the speech of \
     victorian-era English: [VIC_GREETINGS]\n" where stops_at(VIC_GREETINGS, ".")

"Analyse what part of this response makes it typically victorian:\n"

for i in range(4):
    "-[THOUGHT]\n" where stops_at(THOUGHT, ".")

"To summarize:[SUMMARY]"

This snippet demonstrates how LMQL allows you to combine algorithmic logic (like if statements and for loops) with LLM prompts, guiding the model's reasoning process and constraining its outputs effectively.
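To make the constraint semantics concrete, here is an illustrative plain-Python sketch of what the stops_at and substring constraints in the query above enforce. The helper names (apply_stops_at, check_greeting) are hypothetical and are not part of the LMQL API; this only approximates the behavior described in the documentation.

```python
# Illustrative sketch only: a plain-Python approximation of the
# `stops_at(GREETINGS, ".")` and `not "\n" in GREETINGS` constraints.
# These helpers are hypothetical, not part of the LMQL API.

def apply_stops_at(text: str, stop: str) -> str:
    """Truncate generated text at the first occurrence of the stop
    phrase, keeping the phrase itself."""
    idx = text.find(stop)
    return text if idx == -1 else text[: idx + len(stop)]

def check_greeting(text: str) -> bool:
    """The `not "\\n" in GREETINGS` part: reject any candidate
    containing a newline."""
    return "\n" not in text

candidate = "Hi there! Nice to meet you. And more text after."
greeting = apply_stops_at(candidate, ".")
assert check_greeting(greeting)
print(greeting)  # -> Hi there! Nice to meet you.
```

In actual LMQL, such checks are not applied after the fact: the runtime evaluates constraints during decoding, so invalid continuations are pruned before tokens are ever sampled.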

Why Use LMQL?

LMQL is engineered to make working with language models more efficient and powerful. Its key advantages include:

  • Python Syntax Integration: Write queries using familiar Python syntax, fully integrated with your existing Python environment.
  • Rich Control-Flow: Leverage full Python support for powerful control flow and logic within your prompting.
  • Advanced Decoding: Benefit from sophisticated decoding techniques such as beam search and best_k.
  • Powerful Constraints: Apply precise constraints to model output using logit masking, controlling token length, character-level patterns, data types, and stopping phrases.
  • Optimized Runtime: LMQL employs speculative execution, constraint short-circuiting, and tree-based caching for faster inference and efficient token usage.
  • Multi-Model Support: Seamlessly integrate with OpenAI API, Azure OpenAI, and Hugging Face Transformers models.
  • Extensive Applications: Implement advanced applications like schema-safe JSON decoding, algorithmic prompting, and interactive chat interfaces.
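The "logit masking" mentioned above can be illustrated with a small conceptual sketch. This is not LMQL's actual implementation, and the toy vocabulary and mask_logits helper below are invented for illustration: the idea is simply that a constraint is compiled into a mask that makes violating tokens unsampleable before decoding proceeds.

```python
# Conceptual sketch (not LMQL internals): a constraint such as
# "the output may not contain a newline" can be enforced by setting
# the logit of every violating token to -inf before sampling.
import math

def mask_logits(logits: dict[str, float], allowed) -> dict[str, float]:
    """Return a copy of `logits` where disallowed tokens get -inf,
    so they can never be chosen by the decoder."""
    return {tok: (lp if allowed(tok) else -math.inf)
            for tok, lp in logits.items()}

# Toy vocabulary with unnormalized logits.
logits = {"Hello": 2.0, "\n": 1.5, "world": 1.0, ".": 0.5}

# Constraint: the generated text may not contain a newline.
masked = mask_logits(logits, allowed=lambda tok: "\n" not in tok)

best = max(masked, key=masked.get)
print(best)  # -> Hello
```

Because the mask is applied at every decoding step, constraint violations are ruled out eagerly rather than detected after a full (and wasted) generation, which is part of what makes constraint-guided decoding token-efficient.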

Links