Repository History

Explore all analyzed open-source repositories

Topic: Deep Learning
Translation Agent: Agentic Translation with LLM Reflection Workflow

Translation Agent is a Python demonstration of an agentic workflow for machine translation that combines large language models (LLMs) with a reflection step. The LLM translates the text, reflects on its output, and then refines the translation based on its own suggestions, with the aim of improving translation quality. The workflow can be customized for style, idioms, and regional language variations.

Mar 24, 2026

LLMBox: A Comprehensive Python Library for LLM Training and Evaluation

LLMBox is a comprehensive Python library for implementing large language models, offering a unified training pipeline and extensive model evaluation capabilities. It aims to be a one-stop solution for both training and using LLMs, with an emphasis on flexibility and efficiency. It supports diverse training strategies and efficient inference.

Mar 16, 2026

WiFi-3D-Fusion: Real-Time 3D Human Pose Estimation from WiFi Signals

WiFi-3D-Fusion is an open-source research project that uses WiFi CSI (channel state information) signals and deep learning to estimate 3D human pose. It fuses wireless sensing with computer vision techniques and offers real-time motion detection and visualization of human movement in 3D space.

Mar 15, 2026

Magenta RT: Live Music Generation on Your Local Device

Magenta RealTime (Magenta RT) is an open-source Python library for live music audio generation on local devices. It allows users to create music using both text and audio prompts, serving as a powerful tool for real-time creative audio exploration. This library is the on-device companion to Google's MusicFX DJ Mode and the Lyria RealTime API.

Mar 6, 2026

KBLaM: Knowledge Base Augmented Language Models for Enhanced LLMs

KBLaM, developed by Microsoft, is the official implementation of "Knowledge Base Augmented Language Models", presented at ICLR 2025. The method augments large language models by integrating an external knowledge base directly into the model, as an alternative to traditional Retrieval-Augmented Generation (RAG) and in-context learning. It eliminates external retrieval modules, and its computational cost scales linearly with knowledge base size rather than quadratically.

Feb 28, 2026

GigaSLAM: Large-Scale Monocular SLAM with Hierarchical Gaussian Splats

GigaSLAM is a monocular SLAM framework designed for kilometer-scale outdoor environments. It uses hierarchical Gaussian splats and neural networks for efficient, scalable mapping and high-fidelity rendering. The system addresses the challenges of large-scale tracking and mapping using only RGB input, extending Gaussian Splatting SLAM to unbounded outdoor scenes.

Feb 28, 2026

FlashAttention: Fast and Memory-Efficient Exact Attention

FlashAttention is a library from Dao-AILab that provides fast and memory-efficient exact attention for deep learning models. It accelerates transformer training and inference by reducing memory usage and improving computational speed, which makes it useful for researchers and developers working with large-scale models.

Feb 18, 2026

AIMET: Advanced Quantization and Compression for Neural Networks

AIMET, the AI Model Efficiency Toolkit, is an open-source Python library developed by Qualcomm Innovation Center, Inc. It provides advanced techniques for quantizing and compressing trained deep learning models. This toolkit helps improve runtime performance and reduce memory footprint, making models more efficient for deployment on edge devices while minimizing accuracy loss.

Feb 3, 2026

LLM Reasoners: Advanced Library for Large Language Model Reasoning

LLM Reasoners is a Python library for improving the complex reasoning capabilities of large language models. It offers a suite of search algorithms, visualization tools, and optimizations for efficient LLM inference. The library prioritizes rigorous implementation and reproducibility, making it a reliable tool for researchers and developers.

Feb 2, 2026

awesome-AI-books: A Curated Collection of AI and Machine Learning Resources

The awesome-AI-books repository by zslucky is a comprehensive collection of AI-related books and PDFs, designed for learning and research. It offers a wide range of resources, from introductory theory and mathematics to advanced topics like deep learning and quantum AI. This repository also includes links to various AI playground models and research organizations, making it an invaluable hub for anyone interested in artificial intelligence.

Jan 31, 2026

parakeet-mlx: Nvidia's Parakeet ASR Models on Apple Silicon with MLX

parakeet-mlx is an open-source project that implements Nvidia's advanced Automatic Speech Recognition (ASR) Parakeet models for Apple Silicon, leveraging the MLX framework for optimized performance. This Python library offers both a command-line interface and a flexible Python API, enabling efficient transcription of audio files, including real-time streaming capabilities. It provides a powerful solution for developers and researchers working with speech processing on Apple hardware.

Jan 29, 2026

CSM: A Conversational Speech Generation Model by SesameAILabs

CSM (Conversational Speech Model) is a speech generation model from SesameAILabs that generates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder that produces Mimi audio codes, enabling context-aware speech synthesis. The model is now natively available in Hugging Face Transformers, making it accessible to researchers and developers.

Jan 11, 2026