OpenWebAgent: An Open Toolkit for LLM- and LMM-based Web Agents
This repository profile is provided by osrepos.com, an open source repository discovery platform.
Summary
OpenWebAgent is an open toolkit designed to empower model-based web agents, streamlining human-computer interactions by automating tasks on webpages. It offers a convenient framework for developing LLM- and LMM-based web agents, providing both plugin and server source code for easy integration and customization. This project was featured as an ACL'24 Demo, showcasing its innovative approach to web automation.
Introduction
OpenWebAgent is an open toolkit for developing LLM- and LMM-based web agents that automate tasks directly on webpages. The project, featured as an ACL'24 Demo, provides both plugin and server source code, so users can plug their own models into the backend to create a functional web browsing agent. Key features include a high-performance HTML parser, a unique interaction workflow, and a streamlined user interface.
Installation
Setting up OpenWebAgent involves configuring both a browser plugin and a backend server.
Plugin Setup
To get the browser plugin running:
- You can download the extension.zip file and unzip it to add it directly to your Chrome browser.
- Alternatively, if you wish to modify the source code, navigate to the plugin directory and install dependencies:

    cd plugin
    npm install

- Then, build the extension:

    npm run build

  This will create an openwebagent-extension folder, which you can install as an unpacked plugin in Chrome.
- For more detailed instructions, refer to the README.md located in the plugin/ directory.
Server Setup
To set up the backend server:
- Configure config/server_config.yaml to specify your planner arguments and model. For example:

    planner_args:
      provider: "openai"
      model: "gpt-4-turbo-2024-04-09"
      n_workers: 2

- Configure your MongoDB Atlas. You can also save data locally, but remember to update config/mongo_config.yaml accordingly:

    mongo_args:
      base_url: "<your-url>"
      dbname: "<your-db-name>"
      username: "<your-username>"

- Add your API keys to the .env file:

    OPENAI_KEY="<your-token>"
    LOG_DB_PASSWD="<your-db-password>"
    OPENAI_API_URL="<your-openai-url>" # optional

- Download the required server packages:

    cd server
    bash setup.sh

- Finally, start the server:

    python agent/run_server.py

- For further details, consult the README.md in the server/ directory.
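Before starting the server, it can help to confirm that the .env file is actually filled in. The following is a minimal sketch, not part of OpenWebAgent: the key names are taken from the .env example above, while `parse_env` and `missing_keys` are hypothetical helpers written for illustration, and the script assumes it is run from the directory that contains .env.

```python
from pathlib import Path

# Key names come from the .env example; OPENAI_API_URL is optional and not checked.
REQUIRED_KEYS = ("OPENAI_KEY", "LOG_DB_PASSWD")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines, ignoring blank lines and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as `# optional`.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value.strip("\"'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still placeholders."""
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or values[key].startswith("<")
    ]


if __name__ == "__main__":
    env_path = Path(".env")  # assumed location: the current directory
    found = parse_env(env_path.read_text()) if env_path.exists() else {}
    problems = missing_keys(found)
    print("missing or placeholder keys:", problems or "none")
```

A value that still starts with `<` is treated as an unfilled placeholder such as `<your-token>`, which is the most likely mistake when copying the template.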
Examples
Once both the plugin and server are configured, OpenWebAgent can automate multi-step tasks on webpages. The ready-to-use plugin runs inside your browser, and for each task the agent interprets the user's intent, processes the page content, and executes actions on the page.
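The loop described here (interpret intent, process the page, execute an action) can be sketched in a few lines. This is an illustrative outline under stated assumptions, not OpenWebAgent's actual code: `AgentState`, `build_prompt`, `step`, the prompt layout, and the action strings are all hypothetical, and `fake_planner` stands in for a real LLM or LMM call.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Hypothetical container for the user's intent and the actions taken so far."""
    intent: str
    history: list[str] = field(default_factory=list)


def build_prompt(state: AgentState, parsed_html: str) -> str:
    """Combine user intent, action history, and parsed page content into one prompt."""
    past = "\n".join(f"{i}. {a}" for i, a in enumerate(state.history, 1)) or "(none)"
    return (
        f"Task: {state.intent}\n"
        f"Previous actions:\n{past}\n"
        f"Page:\n{parsed_html}\n"
        "Next action:"
    )


def step(state: AgentState, parsed_html: str, planner) -> str:
    """Ask the planner for one action and record it in the history."""
    action = planner(build_prompt(state, parsed_html)).strip()
    state.history.append(action)
    return action


def fake_planner(prompt: str) -> str:
    """Stand-in for a model call: click once, then report completion."""
    return "click [1]" if "Previous actions:\n(none)" in prompt else "done"


if __name__ == "__main__":
    state = AgentState(intent="Search for cats")
    page = "[1] <button> Search"
    print(step(state, page, fake_planner))
    print(step(state, page, fake_planner))
```

Passing the planner in as a plain callable keeps the loop independent of any particular model provider, which mirrors the toolkit's stated goal of letting users swap in their own models.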
Why Use It
OpenWebAgent stands out for several compelling reasons:
- High-Performance HTML Parser: It simplifies complex HTML structures, significantly boosting document processing speed and accuracy for the agent.
- Unique Interaction Workflow: The modular workflow effectively integrates user intent, action history, and parsed HTML, ensuring coherent actions and facilitating easy integration of various models.
- Streamlined User Interface: The toolkit offers an intuitive, ready-to-use interface where users can effortlessly track processes and control tasks with minimal setup.
- Open Toolkit: Its open nature allows developers to easily incorporate their own LLM or LMM models, making it highly adaptable and customizable for specific needs.
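To make the idea of simplifying HTML for an agent concrete, here is a small sketch built on Python's standard `html.parser`. It is not the toolkit's parser, whose design is not described here. The assumption is that a simplified page keeps only interactive elements and labels each with a numeric index the model can refer to; `Simplifier`, `simplify`, the tag set, and the output format are all hypothetical.

```python
from html.parser import HTMLParser

# Assumed set of elements an agent can act on.
INTERACTIVE = {"a", "button", "input", "select", "textarea"}
VOID = {"input"}  # elements with no closing tag and no inner text


class Simplifier(HTMLParser):
    """Collect interactive elements together with a short text label."""

    def __init__(self):
        super().__init__()
        self.lines = []    # [tag, label] pairs in document order
        self._open = None  # index of the element currently collecting text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE:
            return
        attr_map = dict(attrs)
        label = (attr_map.get("aria-label") or attr_map.get("placeholder")
                 or attr_map.get("value") or "")
        self.lines.append([tag, label])
        if tag not in VOID:
            self._open = len(self.lines) - 1

    def handle_data(self, data):
        # Text is kept only while inside an interactive element.
        if self._open is not None and data.strip():
            entry = self.lines[self._open]
            entry[1] = (entry[1] + " " + data.strip()).strip()

    def handle_endtag(self, tag):
        if self._open is not None and self.lines[self._open][0] == tag:
            self._open = None


def simplify(html: str) -> str:
    """Return one indexed line per interactive element."""
    parser = Simplifier()
    parser.feed(html)
    parser.close()
    return "\n".join(
        f"[{i}] <{tag}> {label}".rstrip()
        for i, (tag, label) in enumerate(parser.lines, 1)
    )


if __name__ == "__main__":
    page = ('<div class="x"><script>var a=1;</script><a href="/home">Home</a>'
            '<input placeholder="Search"><button><span>Go</span></button></div>')
    print(simplify(page))
```

Dropping layout wrappers, scripts, and styling attributes shrinks the text a model has to read, which is the general reason a simplified representation can improve both speed and accuracy.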
Links
- GitHub Repository: https://github.com/THUDM/OpenWebAgent
If you find OpenWebAgent useful, please consider citing their paper:
@inproceedings{iong2024openwebagent,
title = {OpenWebAgent: An Open Toolkit to Enable Web Agents on Large Language Models},
author = {Iat Long Iong and Xiao Liu and Yuxuan Chen and Hanyu Lai and Shuntian Yao and Pengbo Shen and Hao Yu and Yuxiao Dong and Jie Tang},
booktitle = {ACL 2024 System Demonstration Track},
year = {2024}
}