NVIDIA Isaac GR00T: A Foundation Model for Generalist Robots

Introduction

NVIDIA Isaac GR00T N1.6 is an open vision-language-action (VLA) foundation model for generalized humanoid robot skills. This cross-embodiment model takes multimodal input, including language and images, and executes manipulation tasks across various environments. GR00T N1.6 is trained on a diverse mixture of robot data, encompassing bimanual, semi-humanoid, and extensive humanoid datasets, and it can be adapted to specific embodiments, tasks, and environments through post-training. Compared with N1.5, the N1.6 version brings an updated model architecture with an internal NVIDIA Cosmos-Reason-2B VLM variant, a larger DiT, and enhanced pretraining data.

Installation

To get started with NVIDIA Isaac GR00T, first clone the repository together with its submodules:

git clone --recurse-submodules https://github.com/NVIDIA/Isaac-GR00T
cd Isaac-GR00T

If you've already cloned without submodules, initialize them separately:

git submodule update --init --recursive
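
If it is unclear whether an existing clone already has its submodules, a quick check along the following lines may help. This is a minimal sketch, not part of the repository: the helper name has_uninitialized_submodules is hypothetical, and it relies only on git's convention of prefixing uninitialized entries in `git submodule status` output with "-".

```shell
# Reads `git submodule status` output on stdin and succeeds if any line
# starts with "-", which git uses to mark a submodule that is not initialized.
# (Hypothetical helper name, introduced only for this sketch.)
has_uninitialized_submodules() {
    grep -q '^-'
}

# Only run the check when the current directory is inside a git work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    if git submodule status --recursive | has_uninitialized_submodules; then
        echo "Some submodules are not initialized; run: git submodule update --init --recursive"
    else
        echo "All submodules are initialized"
    fi
fi
```

Run it from the root of the cloned repository; outside a git work tree it prints nothing.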

GR00T uses uv for dependency management. Make sure uv v0.8.4 or later is installed, then set up the environment:

uv sync --python 3.10
uv pip install -e .
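
To confirm that the installed uv meets the 0.8.4 minimum mentioned above, a version comparison such as the one below can be run before syncing. This is a sketch under stated assumptions: the helper version_ge is hypothetical, it depends on a `sort` that supports `-V` (GNU coreutils and recent BSD), and it assumes `uv --version` prints the version number as its second field.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# (Hypothetical helper; uses `sort -V` for version-aware ordering.)
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="0.8.4"
if command -v uv >/dev/null 2>&1; then
    # Assumes output of the form "uv <version> ..."; take the second field.
    installed="$(uv --version | awk '{print $2}')"
    if version_ge "$installed" "$required"; then
        echo "uv $installed satisfies >= $required"
    else
        echo "uv $installed is older than $required; please upgrade" >&2
    fi
else
    echo "uv not found on PATH" >&2
fi
```

A version-aware sort matters here because a plain string comparison would rank 0.10.0 below 0.8.4.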

CUDA 12.4 is recommended, though CUDA 11.8 has also been verified to work with compatible flash-attn versions. For a containerized setup, refer to the Docker Setup Guide.

Examples

NVIDIA Isaac GR00T covers the full robot control workflow, from data preparation to evaluation. To get started quickly, download a pre-trained checkpoint and run the policy server for a specific embodiment, such as GR1:

# On GPU server: Start the policy server
uv run python gr00t/eval/run_gr00t_server.py --embodiment-tag GR1 --model-path nvidia/GR00T-N1.6-3B

The repository provides scripts for validating zero-shot performance, fine-tuning the pre-trained GR00T N1.6 model on custom data, and running inference. Examples for various configurations and robot embodiments are available in the examples/ directory, and detailed guides cover data preparation, inference, and fine-tuning for new embodiments. For evaluation, the project also supports academic simulation benchmarks such as LIBERO, SimplerEnv, and RoboCasa.

Why Use NVIDIA Isaac GR00T

NVIDIA Isaac GR00T N1.6 is aimed at researchers and professionals in robotics. It provides a platform to:

Leverage a pre-trained foundation model for advanced robot control.
Efficiently fine-tune the model on small, custom datasets to adapt to specific tasks.
Deploy the model for inference, enabling real-world robot applications.

The primary focus is flexible fine-tuning, which lets users customize robot behaviors and accelerates research and development in generalist robotics.
