Repository History

Explore all analyzed open-source repositories

Topic: AI
Vexa: Self-Hosted Meeting Intelligence Platform with Real-Time Transcripts

Vexa is an open-source, self-hostable meeting intelligence platform for real-time transcription across Google Meet and Microsoft Teams. It provides a multi-user API that deploys bots to meetings, and self-hosting keeps transcript data under the operator's control while leaving the deployment model flexible. Built with Python, Vexa supports real-time multilingual transcription and translation.

Jan 1, 2026
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper2Code is a multi-agent LLM system that generates code repositories directly from scientific papers in machine learning. It uses a three-stage pipeline of planning, analysis, and code generation, with each stage managed by a specialized agent. The approach is reported to produce faithful implementations and to outperform existing baselines on relevant benchmarks.

Jan 1, 2026
big_vision: Google Research's Codebase for Large-Scale Vision Models

big_vision is Google Research's official codebase for training large-scale vision models using JAX/Flax. It has been used to develop architectures such as Vision Transformer, SigLIP, and MLP-Mixer. The repository gives researchers a starting point for scalable vision experiments on GPUs and Cloud TPUs, from single cores to distributed setups.

Dec 31, 2025
NVIDIA Isaac GR00T: A Foundation Model for Generalist Robots

NVIDIA Isaac GR00T N1.6 is an open vision-language-action (VLA) foundation model designed for generalized humanoid robot skills. It enables robots to perform manipulation tasks in diverse environments by taking multimodal input, including language and images. Researchers and professionals can leverage this model for fine-tuning on custom datasets and deploying it for inference.

Dec 30, 2025
HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation

HunyuanVideo-Avatar is a project by Tencent-Hunyuan for high-fidelity, audio-driven human animation. Using a multimodal diffusion transformer, it generates dynamic, emotion-controllable, multi-character dialogue videos. The system targets three challenges: character consistency, emotion alignment, and multi-character animation. Suggested applications include e-commerce and social media.

Dec 30, 2025
context-engineering-intro: Master AI Coding Assistants with Context Engineering

Context Engineering represents a powerful evolution beyond traditional prompt engineering, focusing on providing comprehensive information to AI coding assistants for end-to-end task completion. The coleam00/context-engineering-intro repository offers a robust template and step-by-step guide to implement this discipline effectively. It enables developers to leverage AI, particularly with tools like Claude Code, to build complex features with greater consistency and fewer failures.

Dec 29, 2025
OmniParser: A Vision-Based Tool for GUI Agent Screen Parsing

OmniParser is a comprehensive tool developed by Microsoft for parsing user interface screenshots into structured, understandable elements. It significantly enhances the ability of vision-based models, such as GPT-4V, to generate accurate actions grounded in specific regions of a GUI. This project aims to advance pure vision-based GUI agents by providing robust screen parsing capabilities.

Dec 28, 2025
Memori: SQL Native Memory Layer for LLMs and AI Agents

Memori is an SQL-native memory layer for LLMs, AI agents, and multi-agent systems. It manages both long-term and short-term memory and is designed to integrate with existing software and infrastructure. The project aims to give AI systems persistent, structured memory so that they retain context across interactions.

Dec 28, 2025
Clarity-Upscaler: Free and Open-Source AI Image Upscaler & Enhancer

Clarity-Upscaler is an open-source AI image upscaler and enhancer, offering a free alternative to tools like Magnific. Built with Python, this repository provides powerful features for high-resolution image generation and enhancement, supporting various integration methods for developers and users alike.

Dec 25, 2025
TextMachina: A Python Framework for MGT Dataset Generation

TextMachina is a modular and extensible Python framework designed for creating high-quality, unbiased datasets for Machine-Generated Text (MGT) tasks. It supports detection, attribution, and boundary detection, offering a user-friendly pipeline with LLM integrations, prompt templating, and bias mitigation. This tool streamlines the process of building robust models for understanding and identifying AI-generated content.

Dec 21, 2025
Local Deep Research: AI-Powered, Privacy-Focused Research Assistant for Academia

Local Deep Research is an AI-powered assistant for deep, iterative research that reports high accuracy on benchmarks. It supports both local and cloud LLMs, searches more than 10 sources, including academic papers and private documents, and protects privacy by running locally with encrypted storage. It is aimed at researchers, students, and professionals who need accurate, transparent, and secure information retrieval.

Dec 21, 2025
DeepScrape: Intelligent Web Scraping & LLM-Powered Data Extraction

DeepScrape is an AI-powered web scraping tool that uses LLMs for data extraction. It relies on Playwright for browser automation and supports both cloud LLMs (OpenAI) and local LLMs (Ollama, vLLM) for transforming web content into structured JSON. It suits modern web applications, RAG pipelines, and other data workflows, and the local LLM option allows privacy-first data processing.

Dec 19, 2025