Agent-S: Open Agentic Framework for Human-like Computer Use

This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary

Agent-S is an open agentic framework that lets AI agents operate a computer through its graphical interface, much as a human user would. Its GUI agents learn from past experience to carry out complex, multi-step tasks, which makes the framework a fit for AI automation and other agent-based systems.

Repository Information

Analyzed by OSRepos on December 15, 2025

Use at your own risk

OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of code from these repositories is the user's own responsibility. Always review the repository, source code, dependencies, licenses, and security implications before running or installing anything. OSRepos is not responsible for issues, damages, or losses resulting from third-party repositories.

Introduction

Agent-S is an innovative open-source framework from Simular AI, designed to empower AI agents to interact with computers autonomously, much like a human user. At its core, Agent-S aims to build intelligent GUI agents capable of learning from past experiences and executing complex tasks across various operating systems, including Windows, macOS, and Linux.

The framework has achieved state-of-the-art results on challenging benchmarks like OSWorld, WindowsAgentArena, and AndroidWorld, with its latest iteration, Agent S3, demonstrating performance approaching human-level accuracy. Whether you are interested in advanced AI, automation, or contributing to cutting-edge agent-based systems, Agent-S offers a robust and flexible platform.

For more details, visit the Agent-S GitHub repository.

Installation

Getting started with Agent-S is straightforward. Follow these steps to set up the framework on your machine.

Prerequisites

  • Single Monitor: Agent-S is optimized for single monitor setups.
  • Security: The agent executes Python code to control your computer, so run it only in environments you trust.
  • Supported Platforms: Agent-S supports Linux, macOS, and Windows.

Installation Steps

To install Agent S3 without cloning the repository, use pip:

pip install gui-agents

If you plan to contribute or test changes, clone the repository and, from the repository root, install in editable mode:

pip install -e .

Additionally, pytesseract requires the Tesseract OCR binary to be installed. On macOS with Homebrew:

brew install tesseract
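One way to confirm that the Tesseract binary is reachable before running the agent is to look it up on PATH. The following is a minimal sketch using only the Python standard library; the helper name find_tesseract is hypothetical and is not part of gui-agents.

```python
import shutil


def find_tesseract(binary="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH.

    Hypothetical helper for illustration; not part of gui-agents.
    """
    return shutil.which(binary)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract not found on PATH; install it before using pytesseract.")
    else:
        print(f"Tesseract found at {path}")
```

If the lookup returns None, pytesseract will most likely fail when it tries to call the binary, so installing Tesseract first avoids a harder-to-read error later.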

API Configuration

You need to configure your API keys for the language models. Choose one of the following methods:

Option 1: Environment Variables

Add your API keys to your shell configuration file (e.g., .bashrc or .zshrc):

export OPENAI_API_KEY=<YOUR_API_KEY>
export ANTHROPIC_API_KEY=<YOUR_ANTHROPIC_API_KEY>
export HF_TOKEN=<YOUR_HF_TOKEN>

Option 2: Python Script

Set environment variables within your Python script:

import os
os.environ["OPENAI_API_KEY"] = "<YOUR_API_KEY>"
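Whichever option you choose, it can help to check which keys are actually visible to Python before starting the agent. The sketch below assumes the three variable names shown in Option 1; which of them you really need depends on the providers you use, and the helper name missing_keys is hypothetical.

```python
import os

# Variable names taken from the export lines in Option 1. Treating all three
# as required is an assumption of this sketch, not a rule of the framework.
KEY_NAMES = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "HF_TOKEN")


def missing_keys(names=KEY_NAMES, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All listed API keys are set.")
```

Passing a plain dictionary as env makes the check easy to try without touching the real environment.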

Agent-S supports several model providers, including Azure OpenAI, Anthropic, Gemini, OpenRouter, and vLLM inference. For best performance, UI-TARS-1.5-7B is the recommended grounding model.

Examples

Agent-S can be run via a command-line interface (CLI) or integrated into your Python projects using its SDK.

CLI Usage

The recommended setup for Agent S3 involves using OpenAI gpt-5-2025-08-07 as the main model, paired with UI-TARS-1.5-7B for grounding.

Run Agent S3 with the required parameters:

agent_s \
    --provider openai \
    --model gpt-5-2025-08-07 \
    --ground_provider huggingface \
    --ground_url http://localhost:8080 \
    --ground_model ui-tars-1.5-7b \
    --grounding_width 1920 \
    --grounding_height 1080
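If you launch the agent from a script rather than a shell, the same command can be assembled as an argument list. This is a minimal sketch: the option names and values are copied from the command above, while the helper build_agent_s_command is hypothetical and not part of the package.

```python
import shlex


def build_agent_s_command(options, flags=()):
    """Turn option names and values into an argv list for the agent_s CLI.

    Hypothetical helper: `options` maps option names to values, and `flags`
    lists value-less switches.
    """
    argv = ["agent_s"]
    for name, value in options.items():
        argv += [f"--{name}", str(value)]
    argv += [f"--{flag}" for flag in flags]
    return argv


# Values copied from the CLI example in the text.
options = {
    "provider": "openai",
    "model": "gpt-5-2025-08-07",
    "ground_provider": "huggingface",
    "ground_url": "http://localhost:8080",
    "ground_model": "ui-tars-1.5-7b",
    "grounding_width": 1920,
    "grounding_height": 1080,
}

command = build_agent_s_command(options)
print(shlex.join(command))  # shell-quoted form, handy for logging
# To launch it: import subprocess; subprocess.run(command, check=True)
```

Passing the list to subprocess.run avoids shell quoting issues, and value-less switches such as enable_local_env can be supplied through flags.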

Local Coding Environment (Optional)

For tasks requiring code execution, enable the local coding environment:

agent_s \
    --provider openai \
    --model gpt-5-2025-08-07 \
    --ground_provider huggingface \
    --ground_url http://localhost:8080 \
    --ground_model ui-tars-1.5-7b \
    --grounding_width 1920 \
    --grounding_height 1080 \
    --enable_local_env

Warning: The local coding environment executes arbitrary Python and Bash code locally. Use this feature only in trusted environments and with trusted inputs.

SDK Usage Snippet

Here's a brief example of how to use the gui_agents SDK to query the agent:

import pyautogui
import io
from gui_agents.s3.agents.agent_s import AgentS3
from gui_agents.s3.agents.grounding import OSWorldACI

# ... (engine_params and grounding_engine_params setup as per README) ...

grounding_agent = OSWorldACI(
    # ... parameters ...
)

agent = AgentS3(
    # ... parameters ...
)

# Get screenshot.
screenshot = pyautogui.screenshot()
buffered = io.BytesIO()
screenshot.save(buffered, format="PNG")
screenshot_bytes = buffered.getvalue()

obs = {
  "screenshot": screenshot_bytes,
}

instruction = "Close VS Code"
info, action = agent.predict(instruction=instruction, observation=obs)

# action[0] is Python code produced by the agent; exec() runs it on this machine.
exec(action[0])
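Because the final exec call runs model-generated Python directly, one cautious pattern is to show each action and ask for confirmation before executing it. The sketch below assumes the action is a string of Python source, as the snippet above implies; run_with_confirmation is a hypothetical helper and not part of the gui_agents SDK.

```python
def run_with_confirmation(code, ask=input, namespace=None):
    """Show generated code and execute it only after an explicit 'y' answer.

    Returns True if the code was executed and False if it was skipped.
    Hypothetical helper for illustration; not part of gui_agents.
    """
    print("Proposed action:")
    print(code)
    answer = ask("Execute this action? [y/N] ")
    if answer.strip().lower() != "y":
        return False
    # Execute in a separate namespace so the generated code does not
    # read or overwrite the caller's own variables.
    exec(code, {} if namespace is None else namespace)
    return True
```

In the snippet above this would replace the bare exec call with run_with_confirmation(action[0]). A prompt like this is a convenience rather than a sandbox, so the advice about trusted environments still applies.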

Why Use Agent-S?

Agent-S stands out as a powerful tool for several reasons:

  • Human-like Computer Interaction: It enables AI agents to understand and interact with graphical user interfaces (GUIs) in a way that mimics human behavior, bridging the gap between AI and computer use.
  • State-of-the-Art Performance: With Agent S3, the framework achieves leading results on benchmarks like OSWorld, WindowsAgentArena, and AndroidWorld, demonstrating strong generalization capabilities.
  • Open and Extensible Framework: Being open-source, Agent-S provides a flexible foundation for researchers and developers to build upon, customize, and integrate into their own projects.
  • Multi-Platform Support: It runs on Windows, macOS, and Linux, so the same agent setup can be used across different environments.
  • Advanced Agentic Capabilities: Features like reflection agents and an optional local coding environment enhance the agent's ability to plan, execute, and debug complex tasks.
  • Flexible Model Integration: Supports a wide range of LLM providers and grounding models, allowing users to choose the best fit for their needs.

Links

Source repository

Open the original repository on GitHub.
