AI-Scientist-v2: Automated Scientific Discovery via Agentic Tree Search
This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary
AI-Scientist-v2 is an advanced agentic system designed for automated scientific discovery, capable of generating hypotheses, running experiments, analyzing data, and writing scientific manuscripts. This system has successfully produced the first workshop paper written entirely by AI and accepted through peer review, marking a significant step towards fully autonomous research.
Introduction
The AI Scientist-v2, developed by SakanaAI, is a generalized, end-to-end agentic system for automated scientific discovery. It is designed to conduct research autonomously, from generating novel hypotheses to writing the resulting manuscript. It differs from its predecessor in three ways: it removes the reliance on human-authored templates, it generalizes across Machine Learning (ML) domains, and it uses a progressive agentic tree search guided by an experiment manager agent.
The system has already reached a notable milestone: it generated the first workshop paper written entirely by AI and accepted through peer review. Because AI Scientist-v2 takes a broader, more exploratory approach than v1, it is best suited to open-ended scientific exploration rather than to tasks that already have a well-defined template.
Installation
To get started with AI Scientist-v2, you'll need a Linux environment with NVIDIA GPUs, CUDA, and PyTorch. The installation process involves setting up a Conda environment and installing necessary dependencies.
Create a Conda environment:
conda create -n ai_scientist python=3.11
conda activate ai_scientist
Install PyTorch with CUDA support (adjust the pytorch-cuda version for your setup):
conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia
Install PDF and LaTeX tools:
conda install anaconda::poppler
conda install conda-forge::chktex
Install Python package requirements:
pip install -r requirements.txt
Ensure you set up API keys for supported models (OpenAI, Gemini, Claude via AWS Bedrock) and optionally for Semantic Scholar for enhanced literature search. Refer to the official repository for detailed instructions on API key configuration.
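A small pre-flight check can catch a missing key before a long run starts. The sketch below is illustrative only: the environment variable names (OPENAI_API_KEY, GEMINI_API_KEY, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION_NAME, S2_API_KEY) and the missing_keys helper are assumptions, not part of this profile, so confirm the names against the official repository before relying on them.

```python
import os

# Assumed variable names, grouped by provider; verify against the repository README.
PROVIDER_KEYS = {
    "openai": ["OPENAI_API_KEY"],
    "gemini": ["GEMINI_API_KEY"],
    "bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME"],
    "semantic_scholar": ["S2_API_KEY"],  # optional, used for literature search
}


def missing_keys(providers, env=None):
    """Return the variables that are unset or empty for the given providers."""
    env = os.environ if env is None else env
    return [
        name
        for provider in providers
        for name in PROVIDER_KEYS[provider]
        if not env.get(name)
    ]


if __name__ == "__main__":
    missing = missing_keys(["openai"])
    print("missing:", missing if missing else "none")
```

Passing a plain dictionary as env makes the helper easy to try without touching the real environment.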
Examples
The AI Scientist-v2 workflow typically involves two main stages: generating research ideas and then running paper generation experiments.
1. Generate Research Ideas
First, you use the perform_ideation_temp_free.py script to brainstorm and refine research ideas based on a high-level topic description you provide. This script leverages LLMs and tools like Semantic Scholar to check for novelty.
Example Command:
python ai_scientist/perform_ideation_temp_free.py \
--workshop-file "ai_scientist/ideas/my_research_topic.md" \
--model gpt-4o-2024-05-13 \
--max-num-generations 20 \
--num-reflections 5
This will generate a JSON file containing structured research ideas, which will be used in the next step.
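Before launching the pipeline, it can help to skim the generated file. The following sketch assumes the JSON is a list of objects carrying "Name" and "Title" fields; those field names and the summarize_ideas helper are assumptions for illustration, so adjust them to whatever the script actually writes.

```python
import json
from pathlib import Path


def summarize_ideas(ideas):
    """Return one 'name: title' line per idea, tolerating missing fields."""
    lines = []
    for i, idea in enumerate(ideas):
        name = idea.get("Name", f"idea_{i}")     # "Name" is an assumed field
        title = idea.get("Title", "(untitled)")  # "Title" is an assumed field
        lines.append(f"{name}: {title}")
    return lines


if __name__ == "__main__":
    # Same path as passed to --load_ideas in the next step.
    path = Path("ai_scientist/ideas/my_research_topic.json")
    if path.exists():
        for line in summarize_ideas(json.loads(path.read_text())):
            print(line)
    else:
        print(f"{path} not found; run the ideation step first")
```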
2. Run AI Scientist-v2 Paper Generation Experiments
Once you have your research ideas, you can launch the main pipeline to run experiments via agentic tree search, analyze results, and generate a paper draft.
Example Command:
python launch_scientist_bfts.py \
--load_ideas "ai_scientist/ideas/my_research_topic.json" \
--load_code \
--add_dataset_ref \
--model_writeup o1-preview-2024-09-12 \
--model_citation gpt-4o-2024-11-20 \
--model_review gpt-4o-2024-11-20 \
--model_agg_plots o3-mini-2025-01-31 \
--num_cite_rounds 20
After completion, you will find a timestamped log folder in experiments/ containing the tree visualization and, eventually, the generated PDF paper.
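To locate the output of the most recent run, a short helper along these lines may be convenient. It only assumes what the text above states, namely that each run gets its own folder under experiments/; the function names latest_run and find_pdfs are hypothetical, and the exact layout inside a run folder is not assumed, so the PDF search is recursive.

```python
from pathlib import Path


def latest_run(experiments_dir="experiments"):
    """Return the most recently modified subfolder, or None if there is none."""
    root = Path(experiments_dir)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


def find_pdfs(run_dir):
    """List every PDF anywhere under a run folder, sorted by path."""
    return sorted(Path(run_dir).rglob("*.pdf"))


if __name__ == "__main__":
    run = latest_run()
    if run is None:
        print("no runs found under experiments/")
    else:
        print("latest run:", run)
        for pdf in find_pdfs(run):
            print("  pdf:", pdf)
```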
Why Use It
AI Scientist-v2 offers a powerful platform for accelerating scientific discovery and exploring new research frontiers. Its key advantages include:
- Full Autonomy: It automates the entire research lifecycle, from hypothesis generation to paper writing, significantly reducing human effort and time.
- Generalization: Unlike its predecessor, it generalizes across ML domains without human-authored templates, which makes it usable in a wider range of research areas.
- Agentic Tree Search: The progressive agentic tree search, guided by an experiment manager, allows for more exploratory and less template-dependent research.
- Pioneering AI Research: It produced the first workshop paper written entirely by AI and accepted through peer review, a milestone for AI contributions to the scientific literature.
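The tree search idea mentioned in the list above can be illustrated with a generic best-first search. This is a toy sketch under stated assumptions, not the repository's implementation: the function best_first_search, its expand and score callbacks, and the integer example are all hypothetical. In the real system a node would correspond to an experiment attempt and the score would come from its results.

```python
import heapq
import itertools


def best_first_search(root, expand, score, max_steps=10):
    """Repeatedly expand the highest-scoring open node; return the best node seen."""
    counter = itertools.count()  # tie-breaker so the heap never compares nodes
    frontier = [(-score(root), next(counter), root)]  # min-heap on negated score
    best = root
    for _ in range(max_steps):
        if not frontier:
            break
        _, _, node = heapq.heappop(frontier)
        for child in expand(node):
            if score(child) > score(best):
                best = child
            heapq.heappush(frontier, (-score(child), next(counter), child))
    return best


if __name__ == "__main__":
    # Toy problem: nodes are integers, children are n + 1 and n * 2, target is 10.
    def expand(n):
        return [n + 1, n * 2] if n < 20 else []

    def score(n):
        return -abs(10 - n)

    print(best_first_search(1, expand, score))
```

The max_steps budget bounds how many nodes are expanded, which stands in for the compute limit a real experiment search would need.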
Caution: This codebase will execute Large Language Model (LLM)-written code. It is crucial to run this within a controlled sandbox environment (e.g., a Docker container) due to potential risks like dangerous packages or unintended processes.
Links
- GitHub Repository: https://github.com/SakanaAI/AI-Scientist-v2
- Paper: https://pub.sakana.ai/ai-scientist-v2/paper
- Blog Post: https://sakana.ai/ai-scientist-first-publication/
- ICLR2025 Workshop Experiment: https://github.com/SakanaAI/AI-Scientist-ICLR2025-Workshop-Experiment