Repository History
41 repositories tagged with Deep Learning

AudioSep: Foundation Model for Open-Domain Sound Separation with Language Queries
AudioSep is a foundation model for open-domain sound separation that lets users isolate specific sounds using natural language descriptions. It shows strong performance and zero-shot generalization across tasks including audio event, musical instrument, and speech separation. Its text-based queries simplify otherwise complex audio processing.

JAX: Composable Transformations for Python+NumPy Programs
JAX is a Python library for high-performance numerical computing and large-scale machine learning. It offers composable function transformations such as automatic differentiation, JIT compilation to accelerators (GPU/TPU), and auto-vectorization. Because these transformations compose, developers can write numerical programs that are both flexible and efficient.

Translation Agent: Agentic Translation with LLM Reflection Workflow
Translation Agent is a Python demonstration of an agentic workflow for machine translation that combines large language models (LLMs) with a reflection process. The approach aims to improve translation quality by having the LLM translate, reflect on its output, and then refine the translation based on its own suggestions. It can be customized for style, idioms, and regional language variations, making it a promising direction for future translation technologies.
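
The translate, reflect, refine loop described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `llm` callable, the function name `translate_with_reflection`, and the prompt wording are all hypothetical stand-ins for whatever model client and prompts a real setup would use.

```python
from typing import Callable, Dict

# Hypothetical LLM interface: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


def translate_with_reflection(
    text: str, source_lang: str, target_lang: str, llm: LLM, style_hint: str = ""
) -> Dict[str, str]:
    """Run one translate -> reflect -> refine pass and return all three stages."""
    # Step 1: produce an initial draft translation.
    draft = llm(
        f"Translate the following text from {source_lang} to {target_lang}. "
        f"{style_hint}\n\nText:\n{text}"
    )

    # Step 2: ask the model to critique its own draft.
    critique = llm(
        f"Here is a {source_lang} text and a draft {target_lang} translation. "
        f"List concrete suggestions to improve accuracy, fluency, and style. "
        f"{style_hint}\n\nText:\n{text}\n\nDraft:\n{draft}"
    )

    # Step 3: rewrite the draft using the critique.
    final = llm(
        f"Rewrite the draft {target_lang} translation, applying the suggestions.\n\n"
        f"Text:\n{text}\n\nDraft:\n{draft}\n\nSuggestions:\n{critique}"
    )
    return {"draft": draft, "critique": critique, "final": final}
```

Passing the style or regional preference as `style_hint` (for example, "Use Mexican Spanish.") is one simple way to expose the customizability mentioned above; the real project may structure this differently.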

LLMBox: A Comprehensive Python Library for LLM Training and Evaluation
LLMBox is a Python library for implementing Large Language Models, offering a unified training pipeline and extensive model evaluation capabilities. It aims to be a one-stop solution for both training and using LLMs, with an emphasis on flexibility and efficiency. Developers can draw on its diverse training strategies and fast inference for their LLM projects.

WiFi-3D-Fusion: Real-Time 3D Human Pose Estimation from WiFi Signals
WiFi-3D-Fusion is an open-source research project that uses WiFi CSI (channel state information) signals and deep learning to estimate 3D human pose. It fuses wireless sensing with computer vision techniques to provide spatial awareness. The project offers real-time motion detection and visualization of human movement in 3D space.

Magenta RT: Live Music Generation on Your Local Device
Magenta RealTime (Magenta RT) is an open-source Python library for live music audio generation on local devices. It allows users to create music using both text and audio prompts, serving as a powerful tool for real-time creative audio exploration. This library is the on-device companion to Google's MusicFX DJ Mode and the Lyria RealTime API.

KBLaM: Knowledge Base Augmented Language Models for Enhanced LLMs
KBLaM, developed by Microsoft, is the official implementation of "Knowledge Base Augmented Language Models", presented at ICLR 2025. The method enhances Large Language Models by directly integrating external knowledge bases, offering an efficient alternative to traditional Retrieval-Augmented Generation (RAG) and in-context learning. It needs no external retrieval module, and its computational cost scales linearly with knowledge base size rather than quadratically.
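
To make the linear-versus-quadratic contrast concrete, the sketch below counts query-key pairs under a deliberately simplified cost model. It is not KBLaM's implementation. It assumes that in-context learning places all knowledge items in the prompt, so full self-attention covers every pair of tokens, and that the linear alternative lets only the prompt tokens attend to the knowledge items. Masking and constant factors are ignored, and both function names are hypothetical.

```python
def in_context_pairs(prompt_tokens: int, kb_items: int) -> int:
    """Query-key pairs when knowledge items sit in the context with full self-attention."""
    total = prompt_tokens + kb_items
    return total * total  # every token attends to every token


def rectangular_pairs(prompt_tokens: int, kb_items: int) -> int:
    """Query-key pairs when only prompt tokens attend (to themselves and the knowledge items)."""
    return prompt_tokens * (prompt_tokens + kb_items)


if __name__ == "__main__":
    for kb in (1_000, 2_000, 4_000):
        print(kb, in_context_pairs(100, kb), rectangular_pairs(100, kb))
```

Under this model, each extra knowledge item adds a fixed `prompt_tokens` pairs in the second case, while in the first case the cost of adding an item grows with the amount of knowledge already present.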

GigaSLAM: Large-Scale Monocular SLAM with Hierarchical Gaussian Splats
GigaSLAM is a monocular SLAM framework designed for kilometer-scale outdoor environments. It uses hierarchical Gaussian splats and neural networks for efficient, scalable mapping and high-fidelity rendering. The system addresses large-scale tracking and mapping from RGB input alone, extending Gaussian Splatting SLAM to unbounded outdoor scenes.

FlashAttention: Fast and Memory-Efficient Exact Attention
FlashAttention is a library from Dao-AILab that provides fast and memory-efficient exact attention for deep learning models. It accelerates transformer training and inference by reducing the memory used by the attention computation and speeding it up, without approximating the result. This makes it a widely useful tool for researchers and developers working with large-scale AI models.
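
One way attention can be both exact and memory-efficient is to process keys and values in blocks while keeping a running softmax, so the full row of scores never has to be stored at once. The pure-Python sketch below illustrates that general idea for a single query vector. It is an illustration only, not the library's GPU kernel or API, and the function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def _score(q: Vector, k: Vector) -> float:
    # Scaled dot product between one query and one key.
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))


def attention_reference(q: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Standard softmax attention for one query: materializes every score first."""
    scores = [_score(q, k) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))]


def attention_blockwise(q: Vector, keys: List[Vector], values: List[Vector],
                        block_size: int = 2) -> Vector:
    """Same result, computed one block of keys/values at a time with a running softmax."""
    running_max = -math.inf
    running_sum = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), block_size):
        block_scores = [_score(q, k) for k in keys[start:start + block_size]]
        new_max = max(running_max, max(block_scores))
        # Rescale what was accumulated so far to the new maximum (factor is 0.0 on the first block).
        rescale = math.exp(running_max - new_max)
        running_sum *= rescale
        acc = [a * rescale for a in acc]
        for s, v in zip(block_scores, values[start:start + block_size]):
            w = math.exp(s - new_max)
            running_sum += w
            for d in range(len(acc)):
                acc[d] += w * v[d]
        running_max = new_max
    return [a / running_sum for a in acc]
```

The two functions agree up to floating-point rounding for any block size, which is what "exact" means here: the blockwise version changes the order of computation, not the result.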

AIMET: Advanced Quantization and Compression for Neural Networks
AIMET, the AI Model Efficiency Toolkit, is an open-source Python library developed by Qualcomm Innovation Center, Inc. It provides advanced techniques for quantizing and compressing trained deep learning models. This toolkit helps improve runtime performance and reduce memory footprint, making models more efficient for deployment on edge devices while minimizing accuracy loss.
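
As a rough picture of what quantizing a trained model involves, the sketch below applies generic 8-bit affine quantization to a list of weights. It is a textbook-style illustration and does not use or reproduce AIMET's API. The function names are hypothetical, and real toolkits add calibration, per-channel scales, and other refinements.

```python
from typing import List, Tuple


def quantize_params(values: List[float], num_bits: int = 8) -> Tuple[float, int]:
    """Pick a scale and zero point that map the value range onto unsigned integers."""
    qmax = 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax or 1.0  # fall back to 1.0 if all values are zero
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(values: List[float], scale: float, zero_point: int, num_bits: int = 8) -> List[int]:
    """Map floats to integers in [0, 2**num_bits - 1], clipping out-of-range results."""
    qmax = 2 ** num_bits - 1
    return [min(max(round(v / scale) + zero_point, 0), qmax) for v in values]


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate float values from the integers."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
    scale, zp = quantize_params(weights)
    restored = dequantize(quantize(weights, scale, zp), scale, zp)
    print(max(abs(a - b) for a, b in zip(weights, restored)))
```

Storing 8-bit integers instead of 32-bit floats cuts weight memory by about 4x, and the round-trip error per weight stays within one quantization step, which is the trade-off between footprint and accuracy that the description refers to.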

LLM Reasoners: Advanced Library for Large Language Model Reasoning
LLM Reasoners is a Python library designed to improve the complex reasoning capabilities of Large Language Models. It offers a suite of search algorithms, visualization tools, and optimized performance for efficient LLM inference. The library prioritizes rigorous implementation and reproducibility, making it a reliable tool for researchers and developers in the AI field.

awesome-AI-books: A Curated Collection of AI and Machine Learning Resources
The awesome-AI-books repository by zslucky is a collection of AI-related books and PDFs for learning and research. It covers a wide range of resources, from introductory theory and mathematics to advanced topics such as deep learning and quantum AI. The repository also links to various AI playground models and research organizations, making it a useful hub for anyone interested in artificial intelligence.