Repository History
81 repositories tagged with Machine Learning

KBLaM: Knowledge Base Augmented Language Models for Enhanced LLMs
KBLaM, developed by Microsoft, is the official implementation of "Knowledge Base Augmented Language Models", presented at ICLR 2025. The method augments Large Language Models by integrating an external knowledge base directly into the model, offering an efficient alternative to traditional Retrieval-Augmented Generation (RAG) and in-context learning. It eliminates external retrieval modules, and its computational cost scales linearly with knowledge base size rather than quadratically.

Verifiers: Environments for LLM Reinforcement Learning and Evaluation
Verifiers is a Python library by Prime Intellect AI for building environments to train and evaluate Large Language Models (LLMs). It enables the creation of custom environments with datasets, model harnesses, and reward functions, supporting reinforcement learning, capability evaluation, and synthetic data generation. This library is tightly integrated with the Prime Intellect ecosystem, including their Environments Hub and training framework.

TEN VAD: Low-Latency, High-Performance Voice Activity Detector
TEN VAD is a low-latency, high-performance, and lightweight Voice Activity Detector (VAD) designed for real-time enterprise use. It provides accurate frame-level speech activity detection, outperforming common alternatives like WebRTC VAD and Silero VAD. In conversational AI pipelines, it helps reduce end-to-end latency and improves speech segment extraction.

LLMSanitize: An Open-Source Library for Contamination Detection in NLP and LLM Datasets
LLMSanitize is an open-source Python library designed for detecting contamination in NLP datasets and Large Language Models (LLMs). It offers a comprehensive suite of methods, ranging from string matching to model likelihood and embedding similarity, to ensure data integrity. This tool is crucial for researchers and developers working with LLMs to maintain the reliability of their models and evaluations.
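The string-matching end of that range can be illustrated with a word n-gram overlap check. This is a minimal sketch of the general idea only, not LLMSanitize's API; the function names `ngrams` and `overlap_ratio`, the whitespace tokenization, and the default n-gram size are assumptions made for illustration.

```python
def ngrams(text, n):
    """Return the set of word n-grams in text (lowercased, whitespace-tokenized)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(eval_sample, training_corpus, n=3):
    """Fraction of the eval sample's n-grams that also occur in the training corpus."""
    sample_grams = ngrams(eval_sample, n)
    if not sample_grams:
        # Sample is shorter than n words, so there is nothing to compare.
        return 0.0
    corpus_grams = set()
    for doc in training_corpus:
        corpus_grams |= ngrams(doc, n)
    return len(sample_grams & corpus_grams) / len(sample_grams)


# Hypothetical toy data: one eval sample is copied from the corpus, one is not.
corpus = [
    "the quick brown fox jumps over the lazy dog",
    "a completely unrelated sentence about cooking pasta",
]
leaked = "the quick brown fox jumps over the lazy dog"
clean = "neural networks approximate functions with layered units"

leaked_score = overlap_ratio(leaked, corpus)
clean_score = overlap_ratio(clean, corpus)
```

A high ratio suggests the evaluation sample may have been seen during training; where to set the threshold is a judgment call that depends on the dataset.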

AIMET: Advanced Quantization and Compression for Neural Networks
AIMET, the AI Model Efficiency Toolkit, is an open-source Python library developed by Qualcomm Innovation Center, Inc. It provides advanced techniques for quantizing and compressing trained deep learning models. This toolkit helps improve runtime performance and reduce memory footprint, making models more efficient for deployment on edge devices while minimizing accuracy loss.
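As a rough illustration of what quantizing a trained model involves, the sketch below maps floating-point weights to 8-bit integers with a scale and zero point, then maps them back. It shows generic affine quantization under simple min/max range assumptions, not AIMET's API or its more advanced techniques; `quantize` and `dequantize` are hypothetical helper names.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers using an affine (scale + zero point) scheme."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    # Round to the nearest integer level and clamp to the valid range.
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the integer codes."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical weights for illustration.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now occupies one byte instead of four, and the reconstruction error per weight stays within about one quantization step (`scale`). That error is the accuracy cost that quantization toolkits try to minimize.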

LLM Reasoners: Advanced Library for Large Language Model Reasoning
LLM Reasoners is a Python library for improving the complex reasoning capabilities of Large Language Models. It offers a suite of search algorithms, visualization tools, and optimized LLM inference. The library prioritizes rigorous implementation and reproducibility, making it a reliable tool for researchers and developers in the AI field.

AI Engineering Toolkit: 100+ Libraries for LLM Development
The AI Engineering Toolkit is a curated list of over 100 libraries and frameworks for AI engineers. It provides battle-tested tools, frameworks, and reference implementations to develop, deploy, and optimize applications built with Large Language Models. This resource aims to help engineers build production-ready LLM apps faster.

awesome-AI-books: A Curated Collection of AI and Machine Learning Resources
The awesome-AI-books repository by zslucky is a comprehensive collection of AI-related books and PDFs, designed for learning and research. It offers a wide range of resources, from introductory theory and mathematics to advanced topics like deep learning and quantum AI. This repository also includes links to various AI playground models and research organizations, making it an invaluable hub for anyone interested in artificial intelligence.

Faiss: Efficient Similarity Search and Clustering for Dense Vectors
Faiss is a library developed by Meta's Fundamental AI Research (FAIR) group, designed for efficient similarity search and clustering of dense vectors. It offers a comprehensive suite of algorithms capable of handling vector sets of any size, including those that exceed RAM capacity. With complete wrappers for Python/numpy and GPU implementations, Faiss provides robust solutions for various vector comparison tasks.
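For readers new to similarity search, the underlying task can be sketched as an exhaustive nearest-neighbor scan over dense vectors. The example below is plain Python and does not use the Faiss API; `squared_l2` and `search` are hypothetical names, and a brute-force scan like this is only the baseline that optimized and approximate indexes improve on.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(database, query, k):
    """Return the k nearest database vectors as (distance, index) pairs, closest first."""
    scored = ((squared_l2(vec, query), idx) for idx, vec in enumerate(database))
    return heapq.nsmallest(k, scored)


# Hypothetical 2-dimensional vectors for illustration.
database = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
results = search(database, [1.0, 1.0], k=2)
nearest_ids = [idx for _, idx in results]
```

The scan compares the query against every stored vector, so its cost grows linearly with the size of the collection, which is why specialized index structures matter for large vector sets.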

vLLM CLI: A Powerful Command-Line Interface for Serving LLMs with vLLM
vLLM CLI is an intuitive command-line interface tool designed to simplify serving Large Language Models using vLLM. It offers both interactive and direct CLI modes, enabling efficient model management, real-time server monitoring, and advanced configuration. This tool streamlines the deployment and management of LLMs, making it accessible for various use cases.

lagent: A Lightweight Framework for Building LLM-Based Agents
lagent is a lightweight, open-source framework developed by InternLM, designed for efficiently building large language model (LLM)-based agents. It provides a PyTorch-inspired design philosophy, making it intuitive for developers to create and manage multi-agent applications. This framework simplifies the process of agent communication, memory management, and tool integration.

CSM: A Conversational Speech Generation Model by SesameAILabs
CSM (Conversational Speech Model) is a speech generation model from SesameAILabs that generates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder that produces Mimi audio codes, enabling high-quality, context-aware speech synthesis. The model is also natively available in Hugging Face Transformers, making it accessible to researchers and developers.