POML: Structured Prompt Engineering for LLMs with a Markup Language

Summary

POML (Prompt Orchestration Markup Language) is a novel markup language developed by Microsoft for advanced prompt engineering. It brings structure, maintainability, and versatility to LLM applications by addressing common challenges like unstructured prompts and complex data integration. Developers can systematically organize prompt components, integrate diverse data types, and manage presentation variations for more sophisticated and reliable LLM interactions.

Introduction

POML (Prompt Orchestration Markup Language) is a markup language developed by Microsoft, designed to bring structure, maintainability, and versatility to advanced prompt engineering for Large Language Models (LLMs). It addresses common challenges in prompt development, such as lack of structure, complex data integration, and format sensitivity. It gives developers a systematic way to organize prompt components, integrate diverse data types, and manage presentation variations. Key features include structured prompting markup, comprehensive data handling, decoupled presentation styling, an integrated templating engine, and a development toolkit.

Installation

POML is available as a Visual Studio Code extension and as SDKs for Node.js and Python.

Visual Studio Code Extension

Install the POML extension directly from the Visual Studio Code Marketplace. After installation, configure your preferred model, API key, and endpoint in the extension settings to enable prompt testing.

Node.js (via npm)

npm install pomljs

Python (via pip)

pip install poml

For more detailed installation instructions, including nightly builds, please refer to the official documentation.
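
After running the pip command above, a quick way to confirm the package landed in the active environment is to ask Python's packaging metadata for its version. This is a minimal sketch that uses only the standard library; the helper name installed_version is hypothetical, and the only POML-specific assumption is the distribution name poml taken from the pip command.

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "poml" is the distribution name used in the pip command above.
    version = installed_version("poml")
    if version:
        print(f"poml {version} is installed")
    else:
        print("poml is not installed in this environment")
```

If the script reports that the package is missing, the most common cause is that pip installed it into a different interpreter or virtual environment than the one running the script.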

Examples

Here is a simple POML example that defines a role and a task, includes an image, and specifies an output format for an LLM.

<poml>
  <role>You are a patient teacher explaining concepts to a 10-year-old.</role>
  <task>Explain the concept of photosynthesis using the provided image as a reference.</task>

  <img src="photosynthesis_diagram.png" alt="Diagram of photosynthesis" />

  <output-format>
    Keep the explanation simple, engaging, and under 100 words.
    Start with "Hey there, future scientist!".
  </output-format>
</poml>

This example illustrates how POML allows for clear separation of concerns, making prompts more readable and manageable. With the POML toolkit, this prompt can be easily rendered and tested with a vision LLM.
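
Because the example above happens to be well-formed XML, its separation of concerns can be inspected with nothing more than Python's standard library. The sketch below lists each top-level component and a short summary of its content. It does not use the POML SDK, and POML's own parser may accept markup that a strict XML parser rejects; the helper name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

# The prompt from the example above, embedded as a string.
PROMPT = """<poml>
  <role>You are a patient teacher explaining concepts to a 10-year-old.</role>
  <task>Explain the concept of photosynthesis using the provided image as a reference.</task>

  <img src="photosynthesis_diagram.png" alt="Diagram of photosynthesis" />

  <output-format>
    Keep the explanation simple, engaging, and under 100 words.
    Start with "Hey there, future scientist!".
  </output-format>
</poml>"""


def outline(markup: str) -> list[tuple[str, str]]:
    """Return (tag, detail) pairs for each top-level component of the prompt."""
    root = ET.fromstring(markup)
    rows = []
    for child in root:
        # Collapse internal whitespace so multi-line text reads as one line.
        text = " ".join((child.text or "").split())
        # Elements without text (such as <img>) are summarized by their src attribute.
        detail = text or child.attrib.get("src", "")
        rows.append((child.tag, detail))
    return rows


if __name__ == "__main__":
    for tag, detail in outline(PROMPT):
        print(f"{tag}: {detail}")
```

Each component comes back as its own entry, which is the property that makes it possible to review or swap the role, task, data, and output format independently.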

Why Use POML

POML offers significant advantages for anyone working with LLMs, from individual developers to large teams:

Structured Prompting: Employs an HTML-like syntax with semantic components, such as <role>, <task>, and <example>, promoting modular design and enhancing prompt readability, reusability, and maintainability.
Comprehensive Data Integration: Incorporates specialized data components like <document>, <table>, and <img> to embed or reference external data sources, including text files, spreadsheets, and images, with customizable formatting options.
Decoupled Presentation Styling: Features a CSS-like styling system that separates content from presentation. This allows developers to modify styling, such as verbosity or syntax format, via <stylesheet> definitions or inline attributes without altering core prompt logic, which helps mitigate LLM format sensitivity.
Dynamic Prompt Generation: Includes a built-in templating engine with support for variables ({{ }}), loops (the for attribute), conditionals (the if attribute), and variable definitions (<let>) for dynamically generating complex, data-driven prompts.
Rich Development Toolkit: Provides a Visual Studio Code extension with syntax highlighting, context-aware auto-completion, real-time previews, inline diagnostics, and integrated interactive testing. SDKs for Node.js and Python integrate POML into application workflows.
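
To make the idea behind dynamic prompt generation concrete, the toy below mimics {{ }} substitution and loop-style repetition in plain Python. It is an illustration of the concept only: it is not POML's templating engine, it does not use the POML SDK, and the helper names fill and repeat are hypothetical.

```python
import re

# Matches placeholders such as {{ name }} or {{name}}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill(template: str, context: dict) -> str:
    """Replace each {{ name }} placeholder with the matching context value."""
    def lookup(match: re.Match) -> str:
        # Raises KeyError if the template references an undefined variable.
        return str(context[match.group(1)])
    return PLACEHOLDER.sub(lookup, template)


def repeat(template: str, name: str, values: list, context: dict) -> list[str]:
    """Render the template once per value, binding each value to `name`."""
    return [fill(template, {**context, name: value}) for value in values]


if __name__ == "__main__":
    context = {"audience": "a 10-year-old"}
    lines = [fill("Explain each topic to {{ audience }}.", context)]
    lines += repeat("- {{ topic }}", "topic", ["sunlight", "water"], context)
    print("\n".join(lines))
```

The point of the sketch is the separation it shows: the template text stays fixed while the data that fills it changes, which is what lets one prompt definition serve many inputs.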