Repository History

187 repositories tagged with AI

Topic: AI

Optimum: Accelerate Hugging Face Models with Hardware Optimization

Optimum extends Hugging Face Transformers, Diffusers, TIMM, and Sentence-Transformers with a suite of optimization tools. It targets efficient training and inference on specific hardware, so developers can speed up models without hand-tuning each machine learning workflow.

Analyzed Jan 6, 2026

bolt.diy: AI-Powered Full-Stack Web Development with Any LLM in Your Browser

bolt.diy is an open-source project for prompting, running, editing, and deploying full-stack web applications directly in the browser. It supports over 19 Large Language Models (LLMs), so users can choose their preferred model for code generation and development tasks.

Analyzed Jan 5, 2026

notesGPT: AI-Powered Voice Notes with Transcription and Summarization

notesGPT is an open-source project for recording voice notes and using AI to transcribe them, summarize them, and extract actionable tasks. It is built with Convex, Next.js, and Together.ai, and it turns spoken ideas into organized, searchable information.

Analyzed Jan 5, 2026

Vexa: Self-Hosted Meeting Intelligence Platform with Real-Time Transcripts

Vexa is an open-source, self-hostable meeting intelligence platform for real-time transcription across Google Meet and Microsoft Teams. It provides a multi-user API that deploys bots to meetings, and its self-hosted deployment options support data sovereignty for enterprise use. Built with Python, Vexa supports real-time multilingual transcription and translation.

Analyzed Jan 1, 2026

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper2Code is a multi-agent LLM system that generates code repositories directly from machine learning papers. It uses a three-stage pipeline of planning, analysis, and code generation, with each stage handled by a specialized agent. The project aims for faithful implementations and is described as outperforming existing baselines on relevant benchmarks.

Analyzed Jan 1, 2026

big_vision: Google Research's Codebase for Large-Scale Vision Models

big_vision is Google Research's official codebase for training large-scale vision models using JAX/Flax. It has been used to develop architectures such as Vision Transformer, SigLIP, and MLP-Mixer. The repository gives researchers a starting point for scalable vision experiments on GPUs and Cloud TPUs, from single cores up to distributed setups.

Analyzed Dec 31, 2025

NVIDIA Isaac GR00T: A Foundation Model for Generalist Robots

NVIDIA Isaac GR00T N1.6 is an open vision-language-action (VLA) foundation model for generalized humanoid robot skills. It takes multimodal input, including language and images, and enables robots to perform manipulation tasks in diverse environments. The model can be fine-tuned on custom datasets and deployed for inference.

Analyzed Dec 30, 2025

HunyuanVideo-Avatar: High-Fidelity Audio-Driven Human Animation

HunyuanVideo-Avatar is a project by Tencent-Hunyuan for high-fidelity, audio-driven human animation. Using a multimodal diffusion transformer, it generates dynamic, emotion-controllable, multi-character dialogue videos. It addresses character consistency, emotion alignment, and multi-character animation, with applications such as e-commerce and social media.

Analyzed Dec 30, 2025

context-engineering-intro: Master AI Coding Assistants with Context Engineering

Context Engineering goes beyond traditional prompt engineering by giving AI coding assistants the comprehensive information they need to complete tasks end to end. The coleam00/context-engineering-intro repository provides a template and step-by-step guide for applying this discipline. It helps developers use AI, particularly tools like Claude Code, to build complex features with greater consistency and fewer failures.

Analyzed Dec 29, 2025

OmniParser: A Vision-Based Tool for GUI Agent Screen Parsing

OmniParser is a tool developed by Microsoft for parsing user interface screenshots into structured elements. It improves the ability of vision-based models, such as GPT-4V, to generate accurate actions grounded in specific regions of a GUI. The project aims to advance pure vision-based GUI agents through reliable screen parsing.

Analyzed Dec 28, 2025

Memori: SQL Native Memory Layer for LLMs and AI Agents

Memori is an SQL-native memory layer for LLMs, AI agents, and multi-agent systems. It manages both long-term and short-term memory and integrates with existing software and infrastructure. The project aims to give AI systems persistent, structured memory so they can retain context across interactions.

Analyzed Dec 28, 2025

Clarity-Upscaler: Free and Open-Source AI Image Upscaler & Enhancer

Clarity-Upscaler is an open-source AI image upscaler and enhancer, offered as a free alternative to tools like Magnific. Built with Python, it provides high-resolution image generation and enhancement and supports several integration methods for developers and end users.

Analyzed Dec 25, 2025
