GLM-5: Flagship Models for Long-Horizon Agentic Engineering
This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary
GLM-5 is a series of flagship models, including GLM-5.2, GLM-5.1, and GLM-5, developed by zai-org for complex systems engineering and long-horizon agentic tasks. These models offer advanced coding capabilities, impressive context lengths, and state-of-the-art performance on various benchmarks. They are designed to sustain effective problem-solving over extended sessions through iterative reasoning and strategy revision.
Repository Information
Topics
Click on any tag to explore related repositories
Use at your own risk
OSRepos shares public repositories for knowledge and discovery only. Any installation, execution, configuration, or use of code from these repositories is the user's own responsibility. Always review the repository, source code, dependencies, licenses, and security implications before running or installing anything. OSRepos is not responsible for issues, damages, or losses resulting from third-party repositories.
Introduction
The GLM-5 series, developed by zai-org, represents a significant advancement in large language models tailored for complex systems engineering and long-horizon agentic tasks. This repository showcases GLM-5, GLM-5.1, and the latest GLM-5.2, each building upon its predecessor with enhanced capabilities.
GLM-5.2
GLM-5.2 is the latest flagship model, making a substantial leap in long-horizon task capability with a solid 1M-token context. Its new features include robust 1M context stability, advanced coding with flexible effort levels, and an improved architecture featuring IndexShare, which reduces per-token FLOPs by 2.9x at 1M context length. GLM-5.2 demonstrates state-of-the-art performance on coding benchmarks, outperforming other open-source models and closing the gap with frontier closed-source models.
GLM-5.1
GLM-5.1 is designed for agentic engineering, offering significantly stronger coding capabilities. It achieves state-of-the-art performance on SWE-Bench Pro and excels in real-world terminal tasks. A key innovation of GLM-5.1 is its ability to remain effective over much longer horizons, handling ambiguous problems with better judgment and sustaining productivity through iterative reasoning, experimentation, and strategy revision over hundreds of rounds.
GLM-5
GLM-5 targets complex systems engineering and long-horizon agentic tasks. It scales significantly from GLM-4.5, increasing parameters and pre-training data. It integrates DeepSeek Sparse Attention (DSA) to reduce deployment costs while maintaining long-context capacity. GLM-5 also leverages slime, a novel asynchronous RL infrastructure, to improve training throughput and efficiency, leading to best-in-class performance among open-source models across reasoning, coding, and agentic tasks.
Installation
The GLM-5 series models are available for download and local deployment. You can access the models through Hugging Face and ModelScope.
To serve GLM-5 series models locally, several frameworks are supported:
- SGLang (v0.5.13.post1+), see cookbook
- vLLM (v0.23.0+), see recipes
- Transformers (v0.5.12+), see transformers docs
- KTransformers (v0.5.12+), see tutorial
- For deployment on the
Ascend NPUplatform, inference frameworks such as vLLM-Ascend, xLLM, and SGLang are supported, see here.
Examples
GLM-5 models support controlling the thinking budget through the reasoning_effort parameter. This parameter accepts two levels: max (default) and high. If reasoning_effort is unset or set to any value other than high, the model runs at Max. To use the High level, you must explicitly pass reasoning_effort="high". Thinking can be turned off entirely by setting enable_thinking=false.
Why Use GLM-5?
The GLM-5 series offers compelling advantages for developers and researchers working with advanced AI:
- Exceptional Long-Horizon Capability: GLM-5.2 provides a stable 1M-token context, enabling sustained work on complex, long-duration tasks.
- State-of-the-Art Agentic Engineering: GLM-5.1 and GLM-5 excel in agentic tasks, demonstrating superior problem-solving, iterative reasoning, and strategic revision over extended sessions.
- Advanced Coding Performance: The models achieve leading scores on standard coding benchmarks like Terminal-Bench and SWE-bench Pro.
- Efficient Deployment: Features like DeepSeek Sparse Attention in GLM-5 reduce deployment costs while preserving long-context capacity.
- Strong Benchmark Results: Consistent top performance across a wide range of academic and real-world benchmarks, including Vending Bench 2, showcasing robust planning and resource management.
Links
- GitHub Repository: zai-org/GLM-5
- GLM-5.2 Blog: Read the GLM-5.2 blog
- GLM-5 Technical Report: arXiv:2602.15763
- Z.ai API Platform: Use GLM-5.2 API services
- Try GLM-5.2 at Z.ai: Visit z.ai
- Hugging Face: zai-org/GLM-5.2, zai-org/GLM-5.1, zai-org/GLM-5
- ModelScope: ZhipuAI/GLM-5.2, ZhipuAI/GLM-5.1, ZhipuAI/GLM-5
Related repositories
Similar repositories that may be relevant next.
Claude Code System Prompts: Deconstructing Agentic AI Coding Assistants
May 23, 2026
This repository offers a deep dive into the inner workings of modern agentic AI coding assistants. It reconstructs prompt patterns, agent coordination strategies, and security mechanisms, providing insights into how tools like Claude Code operate. The project serves as a valuable resource for understanding the architectural patterns behind these advanced AI systems.

AutoGen: A Programming Framework for Agentic AI
March 30, 2026
AutoGen is a versatile programming framework from Microsoft designed for building multi-agent AI applications. It empowers AI agents to operate autonomously or collaborate seamlessly with human users, streamlining the execution of complex tasks. The framework offers a layered, extensible design, providing both high-level APIs for rapid prototyping and low-level components for fine-grained control.

mini-swe-agent: The Minimal AI Agent for Solving GitHub Issues
March 18, 2026
mini-swe-agent is a remarkably simple yet powerful AI agent, comprising just 100 lines of Python code. It's designed to solve GitHub issues and assist in command-line tasks, achieving over 74% on the SWE-bench verified benchmark. This project offers a radically simple approach to AI-driven software engineering, avoiding complex configurations and large monorepos.

joinly: Make Your Meetings Accessible to AI Agents
January 12, 2026
joinly.ai is an open-source connector middleware designed to integrate AI agents into video calls. It enables agents to actively participate, interact in real-time, and perform tasks during meetings across platforms like Google Meet, Zoom, and Microsoft Teams. The project emphasizes a privacy-first approach, offering self-hosting capabilities and flexibility with various LLM, TTS, and STT providers.
Source repository
Open the original repository on GitHub.