StarRocks: The Fastest Open Query Engine for Data Lakehouse Analytics
This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary
StarRocks is an open-source, high-performance query engine optimized for sub-second analytics across data lakehouses. It delivers best-in-class performance for multi-dimensional, real-time, and ad-hoc queries, making it a versatile solution for complex data analysis. This Linux Foundation project is engineered to provide speed and flexibility for various analytical scenarios.
Introduction
StarRocks describes itself as the world's fastest open query engine, designed for sub-second analytics both on and off the data lakehouse. As a Linux Foundation project, it targets multi-dimensional analytics, real-time analytics, and ad-hoc queries. Its native vectorized SQL engine and compatibility with the MySQL protocol make it accessible from existing MySQL clients and BI tools.
Installation
To get started with StarRocks, refer to the official documentation for installation and deployment guides. It includes quick start tutorials and deployment overviews, with options for setting up a development environment or deploying manually.
Examples
StarRocks excels in various analytical scenarios, from real-time dashboards to complex ad-hoc queries directly on data lakes. It is used by numerous companies for high-performance data processing. You can explore its capabilities through:
- Direct Data Lake Querying: Access data directly from Apache Hive™, Apache Iceberg™, Delta Lake™, and Apache Hudi™ without prior import.
- Real-time Analytics: Power applications requiring immediate insights from constantly updating data.
- Multi-dimensional Analysis: Perform complex analytical queries with sub-second response times.
- Demo Repository: Explore practical examples and use cases in the official StarRocks Demo repository.
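As a rough illustration of the direct data lake querying item above, the following Python sketch only builds the SQL text that a client might send. It assumes StarRocks' `CREATE EXTERNAL CATALOG ... PROPERTIES (...)` statement and three-part `catalog.database.table` names. The catalog name `hive_lake`, the metastore URI, and the `sales.orders` table are hypothetical placeholders, and the exact property keys should be checked against the official documentation for your catalog type.

```python
import re

# Only plain identifiers are accepted, so nothing needs quoting or escaping.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_ident(name: str) -> str:
    """Return the name unchanged, or raise if it is not a plain identifier."""
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def _quote(value: str) -> str:
    """Double-quote a property string, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def external_catalog_ddl(name: str, properties: dict) -> str:
    """Build a CREATE EXTERNAL CATALOG statement (assumed syntax)."""
    props = ",\n".join(
        f"    {_quote(key)} = {_quote(value)}" for key, value in properties.items()
    )
    return f"CREATE EXTERNAL CATALOG {_check_ident(name)}\nPROPERTIES (\n{props}\n)"


def lake_query(catalog: str, database: str, table: str, limit: int = 10) -> str:
    """Build a SELECT that reads an external table in place, without import."""
    path = ".".join(_check_ident(part) for part in (catalog, database, table))
    return f"SELECT * FROM {path} LIMIT {int(limit)}"


# Hypothetical Hive metastore settings; replace with values for your environment.
HIVE_PROPERTIES = {
    "type": "hive",
    "hive.metastore.uris": "thrift://metastore.example.internal:9083",
}

if __name__ == "__main__":
    print(external_catalog_ddl("hive_lake", HIVE_PROPERTIES))
    print(lake_query("hive_lake", "sales", "orders", limit=5))
```

Because StarRocks speaks the MySQL protocol, the generated statements could be sent through any MySQL-compatible client. Keeping the SQL construction separate from the connection makes the sketch easy to inspect without a running cluster.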
Why Use StarRocks?
StarRocks offers several compelling advantages for data analytics:
- Native Vectorized SQL Engine: Leverages CPU parallel computing for 5 to 10 times faster query returns in multi-dimensional analyses.
- Standard SQL & MySQL Compatibility: Supports ANSI SQL syntax and is compatible with the MySQL protocol, allowing integration with various clients and BI tools.
- Smart Query Optimization: Utilizes a Cost-Based Optimizer (CBO) to generate efficient execution plans, significantly improving data analysis efficiency.
- Real-time Update Capabilities: Supports upsert/delete operations based on primary keys, ensuring efficient querying even with concurrent updates.
- Intelligent Materialized Views: Materialized views are automatically updated during data import and intelligently selected during query execution.
- Direct Data Lake Querying: Eliminates the need for data import by directly accessing data in Apache Hive™, Apache Iceberg™, Delta Lake™, and Apache Hudi™.
- Resource Management: Provides features to limit resource consumption for queries and ensure isolation and efficient resource use among tenants.
- Easy to Maintain: Features a streamlined architecture that simplifies deployment, maintenance, and scaling, with agile query plan tuning and automatic data recovery.
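The MySQL compatibility and primary-key upsert points above can be sketched together in Python. This is a minimal sketch under several assumptions, not a reference implementation: it assumes the frontend accepts MySQL-protocol connections on port 9030 with a `root` user and empty password (typical defaults, but verify for your deployment), that the third-party `pymysql` package is installed, and that inserting a row whose key already exists in a Primary Key table replaces the existing row. The `orders` table and the `demo` database are hypothetical.

```python
def connect_kwargs(host="127.0.0.1", port=9030, user="root", password="", database=None):
    """Build keyword arguments for a MySQL-protocol client.

    The defaults are assumptions about a local test deployment.
    """
    kwargs = {"host": host, "port": int(port), "user": user, "password": password}
    if database is not None:
        kwargs["database"] = database
    return kwargs


# Hypothetical table keyed on order_id, using the Primary Key table type.
CREATE_ORDERS = """
CREATE TABLE IF NOT EXISTS orders (
    order_id BIGINT NOT NULL,
    status   VARCHAR(32),
    amount   DECIMAL(10, 2)
)
PRIMARY KEY (order_id)
DISTRIBUTED BY HASH (order_id)
""".strip()

# On a Primary Key table, a row with an existing key is assumed to be
# overwritten, so a plain INSERT acts as an upsert.
UPSERT_ORDER = "INSERT INTO orders (order_id, status, amount) VALUES (%s, %s, %s)"


def upsert_orders(rows, database, **overrides):
    """Create the table if needed, then write rows through the MySQL protocol."""
    import pymysql  # third-party; imported lazily so the helpers above work without it

    conn = pymysql.connect(**connect_kwargs(database=database, **overrides))
    try:
        with conn.cursor() as cur:
            cur.execute(CREATE_ORDERS)
            cur.executemany(UPSERT_ORDER, rows)
        conn.commit()
    finally:
        conn.close()


if __name__ == "__main__":
    # Requires a reachable StarRocks cluster and an existing "demo" database.
    upsert_orders([(1, "paid", "19.99"), (1, "shipped", "19.99")], database="demo")
```

In the usage example, both rows share `order_id` 1, so under the stated assumption the table would end up holding a single row with status `shipped`.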
Links
- GitHub Repository: StarRocks/starrocks
- Official Website: starrocks.io
- Documentation: docs.starrocks.io
- Download: Community Download
- Slack Community: Join StarRocks on Slack
- YouTube Channel: StarRocks Labs
- Contributing Guide: CONTRIBUTING.md