{"name":"Agent-S: Open Agentic Framework for Human-like Computer Use","description":"Agent-S is an open agentic framework designed to enable autonomous interaction with computers, allowing AI agents to use machines like humans. It provides intelligent GUI agents that learn from past experiences to perform complex tasks. This framework is a cutting-edge solution for AI automation and advanced agent-based systems.","github":"https://github.com/simular-ai/Agent-S","url":"https://osrepos.com/repo/simular-ai-agent-s","source":"osrepos.com","sourceDescription":"This repository profile is provided by osrepos.com, an open source repository discovery platform.","repositoryProfile":"https://osrepos.com/repo/simular-ai-agent-s","generatedFor":"open source discovery and AI-assisted research","markdown":"https://osrepos.com/repo/simular-ai-agent-s.md","json":"https://osrepos.com/repo/simular-ai-agent-s.json","topics":["AI Agents","Computer Automation","GUI Agents","Python","Machine Learning","Agentic Framework","RAG","Planning"],"keywords":["AI Agents","Computer Automation","GUI Agents","Python","Machine Learning","Agentic Framework","RAG","Planning"],"stars":null,"summary":"Agent-S is an open agentic framework designed to enable autonomous interaction with computers, allowing AI agents to use machines like humans. It provides intelligent GUI agents that learn from past experiences to perform complex tasks. This framework is a cutting-edge solution for AI automation and advanced agent-based systems.","content":"## Introduction\n\n**Agent-S** is an innovative open-source framework from Simular AI, designed to empower AI agents to interact with computers autonomously, much like a human user. 
At its core, Agent-S aims to build intelligent GUI agents capable of learning from past experiences and executing complex tasks across various operating systems, including Windows, macOS, and Linux.\n\nThe framework has achieved state-of-the-art results on challenging benchmarks like OSWorld, WindowsAgentArena, and AndroidWorld, with its latest iteration, Agent S3, demonstrating performance approaching human-level accuracy. Whether you are interested in advanced AI, automation, or contributing to cutting-edge agent-based systems, Agent-S offers a robust and flexible platform.\n\nFor more details, visit the [Agent-S GitHub repository](https://github.com/simular-ai/Agent-S \"Agent-S GitHub repository\" target=\"_blank\").\n\n## Installation\n\nGetting started with Agent-S is straightforward. Follow these steps to set up the framework on your machine.\n\n### Prerequisites\n\n*   **Single Monitor**: Agent-S is optimized for single-monitor setups.\n*   **Security**: The agent executes Python code to control your computer, so use it with caution and only in trusted environments.\n*   **Supported Platforms**: Agent-S supports Linux, macOS, and Windows.\n\n### Installation Steps\n\nTo install Agent S3 without cloning the repository, use pip:\n\n```bash\npip install gui-agents\n```\n\nIf you plan to contribute or test changes, clone the repository and, from its root directory, install in editable mode:\n\n```bash\npip install -e .\n```\n\nAdditionally, `pytesseract` requires Tesseract OCR to be installed separately, for example with Homebrew on macOS:\n\n```bash\nbrew install tesseract\n```\n\n### API Configuration\n\nYou need to configure your API keys for the language models. 
Choose one of the following methods:\n\n#### Option 1: Environment Variables\n\nAdd your API keys to your shell configuration file (e.g., `.bashrc` or `.zshrc`):\n\n```bash\nexport OPENAI_API_KEY=<YOUR_API_KEY>\nexport ANTHROPIC_API_KEY=<YOUR_ANTHROPIC_API_KEY>\nexport HF_TOKEN=<YOUR_HF_TOKEN>\n```\n\n#### Option 2: Python Script\n\nSet environment variables within your Python script:\n\n```python\nimport os\nos.environ[\"OPENAI_API_KEY\"] = \"<YOUR_API_KEY>\"\n```\n\nAgent-S supports several model providers, including Azure OpenAI, Anthropic, Gemini, OpenRouter, and vLLM inference. For best performance, use [UI-TARS-1.5-7B](https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B \"UI-TARS-1.5-7B on Hugging Face\" target=\"_blank\") as the grounding model.\n\n## Examples\n\nAgent-S can be run via a command-line interface (CLI) or integrated into your Python projects using its SDK.\n\n### CLI Usage\n\nThe recommended setup for Agent S3 uses **OpenAI gpt-5-2025-08-07** as the main model, paired with **UI-TARS-1.5-7B** for grounding.\n\nRun Agent S3 with the required parameters:\n\n```bash\nagent_s \\\n    --provider openai \\\n    --model gpt-5-2025-08-07 \\\n    --ground_provider huggingface \\\n    --ground_url http://localhost:8080 \\\n    --ground_model ui-tars-1.5-7b \\\n    --grounding_width 1920 \\\n    --grounding_height 1080\n```\n\n#### Local Coding Environment (Optional)\n\nFor tasks requiring code execution, enable the local coding environment:\n\n```bash\nagent_s \\\n    --provider openai \\\n    --model gpt-5-2025-08-07 \\\n    --ground_provider huggingface \\\n    --ground_url http://localhost:8080 \\\n    --ground_model ui-tars-1.5-7b \\\n    --grounding_width 1920 \\\n    --grounding_height 1080 \\\n    --enable_local_env\n```\n\n**Warning**: The local coding environment executes arbitrary Python and Bash code locally. 
Use this feature only in trusted environments and with trusted inputs.\n\n### SDK Usage Snippet\n\nHere's a brief example of how to use the `gui_agents` SDK to query the agent:\n\n```python\nimport io\n\nimport pyautogui\nfrom gui_agents.s3.agents.agent_s import AgentS3\nfrom gui_agents.s3.agents.grounding import OSWorldACI\n\n# ... (engine_params and grounding_engine_params setup as per README) ...\n\ngrounding_agent = OSWorldACI(\n    # ... parameters ...\n)\n\nagent = AgentS3(\n    # ... parameters ...\n)\n\n# Capture the current screen and encode it as PNG bytes.\nscreenshot = pyautogui.screenshot()\nbuffered = io.BytesIO()\nscreenshot.save(buffered, format=\"PNG\")\nscreenshot_bytes = buffered.getvalue()\n\nobs = {\n    \"screenshot\": screenshot_bytes,\n}\n\n# Ask the agent what to do next for this instruction.\ninstruction = \"Close VS Code\"\ninfo, action = agent.predict(instruction=instruction, observation=obs)\n\n# Run the first predicted action. This executes agent-generated Python code.\nexec(action[0])\n```\n\n## Why Use Agent-S?\n\nAgent-S stands out as a powerful tool for several reasons:\n\n*   **Human-like Computer Interaction**: It enables AI agents to understand and interact with graphical user interfaces (GUIs) in a way that mimics human behavior, bridging the gap between AI and computer use.\n*   **State-of-the-Art Performance**: With Agent S3, the framework achieves leading results on benchmarks like OSWorld, WindowsAgentArena, and AndroidWorld, demonstrating strong generalization capabilities.\n*   **Open and Extensible Framework**: Being open-source, Agent-S provides a flexible foundation for researchers and developers to build upon, customize, and integrate into their own projects.\n*   **Multi-Platform Support**: It runs on Windows, macOS, and Linux, making it versatile for various environments.\n*   **Advanced Agentic Capabilities**: Features like reflection agents and an optional local coding environment enhance the agent's ability to plan, execute, and debug complex tasks.\n*   **Flexible Model Integration**: Supports a wide range of LLM providers and grounding models, allowing users to choose the 
best fit for their needs.\n\n## Links\n\n*   **GitHub Repository**: [https://github.com/simular-ai/Agent-S](https://github.com/simular-ai/Agent-S \"Agent-S GitHub Repository\" target=\"_blank\")\n*   **Simular AI Agent-S Page**: [https://www.simular.ai/agent-s](https://www.simular.ai/agent-s \"Simular AI Agent-S Page\" target=\"_blank\")\n*   **Agent S3 Blog Post**: [https://www.simular.ai/articles/agent-s3](https://www.simular.ai/articles/agent-s3 \"Agent S3 Blog Post\" target=\"_blank\")\n*   **Agent S3 Paper (arXiv)**: [https://arxiv.org/abs/2510.02250](https://arxiv.org/abs/2510.02250 \"Agent S3 Paper on arXiv\" target=\"_blank\")\n*   **Agent S2 Paper (arXiv)**: [https://arxiv.org/abs/2504.00906](https://arxiv.org/abs/2504.00906 \"Agent S2 Paper on arXiv\" target=\"_blank\")\n*   **Agent S1 Paper (arXiv)**: [https://arxiv.org/abs/2410.08164](https://arxiv.org/abs/2410.08164 \"Agent S1 Paper on arXiv\" target=\"_blank\")\n*   **Discord Community**: [https://discord.gg/E2XfsK9fPV](https://discord.gg/E2XfsK9fPV \"Agent-S Discord Community\" target=\"_blank\")\n*   **Try Agent S in Simular Cloud**: [https://cloud.simular.ai/](https://cloud.simular.ai/ \"Simular Cloud\" target=\"_blank\")\n*   **UI-TARS-1.5-7B Grounding Model**: [https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B](https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B \"UI-TARS-1.5-7B on Hugging Face\" target=\"_blank\")","metrics":{"detailViews":7,"githubClicks":5},"dates":{"published":null,"modified":"2025-12-15T08:01:20.000Z"}}