# Agent-S: Open Agentic Framework for Human-like Computer Use

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Source: osrepos.com
Repository profile: https://osrepos.com/repo/simular-ai-agent-s
Generated for open source discovery and AI-assisted research.

Agent-S is an open agentic framework designed to enable autonomous interaction with computers, allowing AI agents to use machines like humans. It provides intelligent GUI agents that learn from past experiences to perform complex tasks. This framework is a cutting-edge solution for AI automation and advanced agent-based systems.

GitHub: https://github.com/simular-ai/Agent-S

## Topics

- AI Agents
- Computer Automation
- GUI Agents
- Python
- Machine Learning
- Agentic Framework
- RAG
- Planning

## Repository Information

Last analyzed by OSRepos: Mon Dec 15 2025 08:01:20 GMT+0000 (Western European Standard Time)

## Safety Notice

OSRepos shares public repositories for knowledge and discovery only. Review source code, dependencies, licenses, and security implications before running or installing anything.

## Introduction

**Agent-S** is an open-source framework from Simular AI that lets AI agents operate a computer autonomously, much as a human user would. It builds GUI agents that learn from past experience and carry out complex tasks on Windows, macOS, and Linux.

The framework has achieved state-of-the-art results on challenging benchmarks like OSWorld, WindowsAgentArena, and AndroidWorld, with its latest iteration, Agent S3, demonstrating performance approaching human-level accuracy. Whether you are interested in advanced AI, automation, or contributing to cutting-edge agent-based systems, Agent-S offers a robust and flexible platform.

For more details, visit the [Agent-S GitHub repository](https://github.com/simular-ai/Agent-S "Agent-S GitHub repository").

## Installation

Getting started with Agent-S is straightforward. Follow these steps to set up the framework on your machine.

### Prerequisites

*   **Single Monitor**: Agent-S is optimized for single monitor setups.
*   **Security**: The agent executes Python code to control your computer, so use it with caution in trusted environments.
*   **Supported Platforms**: Agent-S supports Linux, macOS, and Windows.

### Installation Steps

To install Agent S3 without cloning the repository, use pip:

```bash
pip install gui-agents
```


If you plan to contribute or test changes, clone the repository and install in editable mode:

```bash
# Run from the root of your local clone of the repository
pip install -e .
```


Additionally, `pytesseract` requires Tesseract OCR to be installed:

```bash
# macOS (Homebrew); on other platforms, use your system's package manager
brew install tesseract
```
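The prerequisites above can be checked before the first run. The following is a minimal sketch, not part of Agent-S: `check_prerequisites` and its two parameters are hypothetical names introduced here for illustration. It uses only the standard library to confirm that the platform is one of the three listed and that a `tesseract` binary is on `PATH`.

```python
import platform
import shutil

# Platforms listed under "Supported Platforms", as reported by platform.system().
SUPPORTED_SYSTEMS = {"Linux", "Darwin", "Windows"}  # "Darwin" is macOS


def check_prerequisites(system=None, tesseract_path=None):
    """Return a list of human-readable problems; an empty list means OK.

    Both arguments are hypothetical hooks that make the check testable.
    When left as None, the values are detected from the running machine.
    """
    if system is None:
        system = platform.system()
    if tesseract_path is None:
        # shutil.which returns None when the binary is not on PATH.
        tesseract_path = shutil.which("tesseract")

    problems = []
    if system not in SUPPORTED_SYSTEMS:
        problems.append(f"unsupported platform: {system}")
    if not tesseract_path:
        problems.append("tesseract binary not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

This only checks that the binary can be found; it does not verify the Tesseract version or that `pytesseract` can call it.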


### API Configuration

You need to configure your API keys for the language models. Choose one of the following methods:

#### Option 1: Environment Variables

Add your API keys to your shell configuration file (e.g., `.bashrc` or `.zshrc`):

```bash
export OPENAI_API_KEY="<YOUR_API_KEY>"
export ANTHROPIC_API_KEY="<YOUR_ANTHROPIC_API_KEY>"
export HF_TOKEN="<YOUR_HF_TOKEN>"
```


#### Option 2: Python Script

Set environment variables within your Python script:

```python
import os
os.environ["OPENAI_API_KEY"] = "<YOUR_API_KEY>"
```


Agent-S supports several model providers, including Azure OpenAI, Anthropic, Gemini, OpenRouter, and vLLM inference. The recommended grounding model is [UI-TARS-1.5-7B](https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B "UI-TARS-1.5-7B on Hugging Face").
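Whichever configuration method you choose, it can help to confirm the keys are actually visible to Python before starting the agent. The sketch below is an illustration, not part of the `gui_agents` package: `missing_keys` is a hypothetical helper, and which variables a given provider needs is an assumption you should check against the README. The variable names are the ones from the export lines above.

```python
import os

# Variable names taken from the environment-variable option above.
KNOWN_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "HF_TOKEN")


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # Assumption: an OpenAI main model plus a Hugging Face grounding endpoint.
    for name in missing_keys(["OPENAI_API_KEY", "HF_TOKEN"]):
        print(f"{name} is not set")
```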

## Examples

Agent-S can be run via a command-line interface (CLI) or integrated into your Python projects using its SDK.

### CLI Usage

The recommended setup for Agent S3 involves using **OpenAI gpt-5-2025-08-07** as the main model, paired with **UI-TARS-1.5-7B** for grounding.

Run Agent S3 with the required parameters:

```bash
agent_s \
    --provider openai \
    --model gpt-5-2025-08-07 \
    --ground_provider huggingface \
    --ground_url http://localhost:8080 \
    --ground_model ui-tars-1.5-7b \
    --grounding_width 1920 \
    --grounding_height 1080
```


#### Local Coding Environment (Optional)

For tasks requiring code execution, enable the local coding environment:

```bash
agent_s \
    --provider openai \
    --model gpt-5-2025-08-07 \
    --ground_provider huggingface \
    --ground_url http://localhost:8080 \
    --ground_model ui-tars-1.5-7b \
    --grounding_width 1920 \
    --grounding_height 1080 \
    --enable_local_env
```


**Warning**: The local coding environment executes arbitrary Python and Bash code locally. Use this feature only in trusted environments and with trusted inputs.
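If you launch the CLI from a script, the two invocations above differ only in the `--enable_local_env` flag, so the argument list can be built in one place. This is a rough sketch using only the flags shown in this section; `build_agent_s_command` is a hypothetical helper, not something shipped with Agent-S, and the defaults simply mirror the example values above.

```python
import shlex


def build_agent_s_command(model, ground_url, ground_model,
                          provider="openai", ground_provider="huggingface",
                          width=1920, height=1080, enable_local_env=False):
    """Assemble the agent_s argument list from the flags shown above."""
    argv = [
        "agent_s",
        "--provider", provider,
        "--model", model,
        "--ground_provider", ground_provider,
        "--ground_url", ground_url,
        "--ground_model", ground_model,
        "--grounding_width", str(width),
        "--grounding_height", str(height),
    ]
    if enable_local_env:
        # Off by default: this flag lets the agent run Python and Bash locally.
        argv.append("--enable_local_env")
    return argv


cmd = build_agent_s_command(
    model="gpt-5-2025-08-07",
    ground_url="http://localhost:8080",
    ground_model="ui-tars-1.5-7b",
)
# Print a shell-quoted version for inspection; to launch it, pass `cmd`
# to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Keeping `enable_local_env` as an explicit opt-in argument makes it harder to turn on local code execution by accident.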

### SDK Usage Snippet

Here's a brief example of how to use the `gui_agents` SDK to query the agent:

```python
import pyautogui
import io
from gui_agents.s3.agents.agent_s import AgentS3
from gui_agents.s3.agents.grounding import OSWorldACI

# ... (engine_params and grounding_engine_params setup as per README) ...

grounding_agent = OSWorldACI(
    # ... parameters ...
)

agent = AgentS3(
    # ... parameters ...
)

# Capture the current screen and encode it as PNG bytes.
screenshot = pyautogui.screenshot()
buffered = io.BytesIO()
screenshot.save(buffered, format="PNG")
screenshot_bytes = buffered.getvalue()

obs = {
  "screenshot": screenshot_bytes,
}

instruction = "Close VS Code"
info, action = agent.predict(instruction=instruction, observation=obs)

# Run the first returned action. This executes agent-generated Python code
# on your machine, so use it only in a trusted environment.
exec(action[0])
```
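Because the last step of the snippet passes agent-generated code straight to `exec`, you may want a confirmation step while experimenting. The following is a minimal sketch of one way to do that, not something provided by the SDK: `run_action` and its parameters are hypothetical, and it assumes the action is a string of Python source, as the `exec(action[0])` call suggests.

```python
def run_action(code, confirm=input, namespace=None):
    """Show `code` and execute it only if the user answers 'y'.

    Returns True if the code ran and False if it was skipped.
    `confirm` is any callable that takes a prompt and returns a string.
    """
    print("Proposed action:")
    print(code)
    answer = confirm("Run this action? [y/N] ")
    if answer.strip().lower() != "y":
        return False
    # Execute in the given namespace, or in a fresh one by default.
    exec(code, namespace if namespace is not None else {})
    return True
```

In the snippet above this would replace the final line with `run_action(action[0], namespace=globals())`; passing `globals()` is only needed if the generated code relies on names already imported in your script, such as `pyautogui`. A prompt like this is a convenience for supervised runs, not a sandbox.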


## Why Use Agent-S?

Agent-S stands out as a powerful tool for several reasons:

*   **Human-like Computer Interaction**: It enables AI agents to understand and interact with graphical user interfaces (GUIs) in a way that mimics human behavior, bridging the gap between AI and computer use.
*   **State-of-the-Art Performance**: With Agent S3, the framework achieves leading results on benchmarks like OSWorld, WindowsAgentArena, and AndroidWorld, demonstrating strong generalization capabilities.
*   **Open and Extensible Framework**: Being open-source, Agent-S provides a flexible foundation for researchers and developers to build upon, customize, and integrate into their own projects.
*   **Multi-Platform Support**: It runs seamlessly across Windows, macOS, and Linux, making it versatile for various environments.
*   **Advanced Agentic Capabilities**: Features like reflection agents and an optional local coding environment enhance the agent's ability to plan, execute, and debug complex tasks.
*   **Flexible Model Integration**: Supports a wide range of LLM providers and grounding models, allowing users to choose the best fit for their needs.

## Links

*   **GitHub Repository**: [https://github.com/simular-ai/Agent-S](https://github.com/simular-ai/Agent-S "Agent-S GitHub Repository")
*   **Simular AI Agent-S Page**: [https://www.simular.ai/agent-s](https://www.simular.ai/agent-s "Simular AI Agent-S Page")
*   **Agent S3 Blog Post**: [https://www.simular.ai/articles/agent-s3](https://www.simular.ai/articles/agent-s3 "Agent S3 Blog Post")
*   **Agent S3 Paper (arXiv)**: [https://arxiv.org/abs/2510.02250](https://arxiv.org/abs/2510.02250 "Agent S3 Paper on arXiv")
*   **Agent S2 Paper (arXiv)**: [https://arxiv.org/abs/2504.00906](https://arxiv.org/abs/2504.00906 "Agent S2 Paper on arXiv")
*   **Agent S1 Paper (arXiv)**: [https://arxiv.org/abs/2410.08164](https://arxiv.org/abs/2410.08164 "Agent S1 Paper on arXiv")
*   **Discord Community**: [https://discord.gg/E2XfsK9fPV](https://discord.gg/E2XfsK9fPV "Agent-S Discord Community")
*   **Try Agent S in Simular Cloud**: [https://cloud.simular.ai/](https://cloud.simular.ai/ "Simular Cloud")
*   **UI-TARS-1.5-7B Grounding Model**: [https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B](https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B "UI-TARS-1.5-7B on Hugging Face")