Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Summary

Paper2Code is a multi-agent LLM system that generates code repositories directly from machine learning papers. It runs a three-stage pipeline (planning, analysis, and code generation), with a specialized agent handling each stage. The approach aims for implementations that are faithful to the paper, and it is reported to outperform existing baselines on the Paper2Code and PaperBench benchmarks.

Repository Info

Updated on January 1, 2026

Introduction

Paper2Code turns scientific papers, particularly machine learning papers, into functional code repositories. Its pipeline has three stages, each handled by a specialized agent: planning, analysis, and code generation. On both the Paper2Code and PaperBench datasets it is reported to outperform strong baselines and to produce implementations that stay faithful to the source paper.
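The three-stage flow can be pictured with a short Python sketch. This is not the project's implementation: the `LLM` callable type, the `run_pipeline` function, the prompt wording, the file names, and the assumption that each stage receives the output of the earlier stages are all illustrative choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


@dataclass
class PipelineResult:
    plan: str
    analysis: str
    files: Dict[str, str] = field(default_factory=dict)


def run_pipeline(paper_text: str, llm: LLM, file_names: List[str]) -> PipelineResult:
    """Run planning, analysis, and code generation in order.

    Assumption for this sketch: each stage sees the output of the stages
    before it. The prompt wording is illustrative only.
    """
    plan = llm(f"[planning]\nPaper:\n{paper_text}")
    analysis = llm(f"[analysis]\nPaper:\n{paper_text}\nPlan:\n{plan}")
    files = {
        name: llm(f"[coding]\nFile: {name}\nPlan:\n{plan}\nAnalysis:\n{analysis}")
        for name in file_names
    }
    return PipelineResult(plan=plan, analysis=analysis, files=files)


def echo_llm(prompt: str) -> str:
    """Stand-in model for offline runs: returns the stage tag of the prompt."""
    return prompt.splitlines()[0]


if __name__ == "__main__":
    # File names are hypothetical placeholders.
    result = run_pipeline("paper text goes here", echo_llm, ["model.py", "train.py"])
    print(result.plan)
    print(result.analysis)
    print(sorted(result.files))
```

Passing the model in as a plain callable keeps the sketch independent of any backend, so the same flow could sit in front of either an OpenAI client or a vLLM-served model.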

Read the original paper on arXiv: Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Installation

Paper2Code can run against the OpenAI API or against open-source models served with vLLM. Install the dependency for the backend you plan to use.

For OpenAI API:

pip install openai
export OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"
# Navigate to the scripts directory to run examples
# cd scripts

For Open Source Models with vLLM:

pip install vllm
# Navigate to the scripts directory to run examples
# cd scripts

Alternatively, you can install all dependencies at once:

pip install -r requirements.txt
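Before running the examples, it can help to confirm that the chosen backend is actually usable. The following Python sketch is a hypothetical preflight helper, not part of Paper2Code: the `preflight` function and the backend names `"openai"` and `"vllm"` are assumptions for this example, while the package names and the `OPENAI_API_KEY` variable come from the commands above.

```python
import importlib.util
import os
from typing import Callable, List, Mapping

# Backend name -> Python package it needs (names taken from the pip commands above).
REQUIRED_PACKAGE = {"openai": "openai", "vllm": "vllm"}


def is_installed(package: str) -> bool:
    """Return True if the package can be located without importing it."""
    return importlib.util.find_spec(package) is not None


def preflight(backend: str,
              env: Mapping[str, str] = os.environ,
              installed: Callable[[str], bool] = is_installed) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    if backend not in REQUIRED_PACKAGE:
        return [f"unknown backend: {backend!r}"]
    problems = []
    package = REQUIRED_PACKAGE[backend]
    if not installed(package):
        problems.append(f"missing package: pip install {package}")
    # Only the OpenAI backend needs an API key in the environment.
    if backend == "openai" and not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for issue in preflight("openai"):
        print(issue)
```

The `env` and `installed` parameters exist only so the check can be exercised without touching the real environment.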

Examples

Paper2Code can generate code from a given paper, such as 'Attention Is All You Need'.

Using OpenAI API (PDF-based JSON format):

export OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"
cd scripts
bash run.sh

Using Open Source Models with vLLM (PDF-based JSON format):

cd scripts
bash run_llm.sh

Each run produces artifacts for the planning, analysis, and coding stages, plus a final output repository, for example outputs/Transformer_repo.
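A quick way to inspect what a run produced is to count the generated files by extension. The Python sketch below assumes only that the final repository is an ordinary directory such as outputs/Transformer_repo. The `summarize_repo` helper is hypothetical, and the internal layout of the repository is not assumed.

```python
from collections import Counter
from pathlib import Path
from typing import Dict


def summarize_repo(repo_dir: str) -> Dict[str, int]:
    """Count files under repo_dir by extension ('' for files with no extension)."""
    root = Path(repo_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"output repository not found: {root}")
    counts = Counter(p.suffix for p in root.rglob("*") if p.is_file())
    return dict(counts)


if __name__ == "__main__":
    # Path taken from the example above; adjust it for other papers
    # or if the script is run from a different working directory.
    try:
        for suffix, n in sorted(summarize_repo("outputs/Transformer_repo").items()):
            print(f"{suffix or '(none)'}: {n}")
    except FileNotFoundError as exc:
        print(exc)
```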

Why Use Paper2Code

  • Automated Code Generation: Reduces the manual effort and time needed to translate research papers into executable code.
  • High-Quality Implementations: The multi-agent design aims to keep the generated code faithful to the paper's methodology.
  • Benchmarked Performance: Reported to outperform strong baselines on the Paper2Code and PaperBench benchmarks.
  • Flexibility: Works with both the proprietary OpenAI API and open-source models served through vLLM.
  • Structured Output: Generates a complete code repository along with the artifacts from each stage of the process.
