Repository History

80 repositories tagged with Machine Learning

Topic: Machine Learning

TRELLIS: Structured 3D Latents for Scalable and Versatile 3D Generation

TRELLIS is the official repository for the CVPR'25 Spotlight paper "Structured 3D Latents for Scalable and Versatile 3D Generation." This Microsoft project introduces a model that generates high-quality 3D assets from text or image prompts. It can output Radiance Fields, 3D Gaussians, and meshes, and it supports flexible editing of the generated assets.

Analyzed Apr 1, 2026

Jieba: The Leading Python Library for Chinese Text Segmentation

Jieba is a widely used Python library for Chinese text segmentation. It offers three cutting modes (accurate, full, and search engine), each suited to different NLP tasks. It also supports custom dictionaries and part-of-speech tagging.

Analyzed Mar 31, 2026

AudioSep: Foundation Model for Open-Domain Sound Separation with Language Queries

AudioSep is a foundation model for open-domain sound separation that lets users isolate specific sounds by describing them in natural language. It shows strong performance and zero-shot generalization across audio event, musical instrument, and speech separation tasks.

Analyzed Mar 30, 2026

JAX: Composable Transformations for Python+NumPy Programs

JAX is a Python library for high-performance numerical computing and large-scale machine learning. It offers composable function transformations: automatic differentiation, JIT compilation to accelerators (GPU/TPU), and auto-vectorization. Because these transformations compose with one another, developers can write numerical programs that are both flexible and efficient.

Analyzed Mar 26, 2026

index-tts-lora: High-Quality Speech Synthesis with LoRA Fine-tuning

index-tts-lora applies LoRA fine-tuning to the index-tts framework for high-quality speech synthesis. It improves prosody and naturalness for both single-speaker and multi-speaker voices, and it provides practical methods for training and inference.

Analyzed Mar 23, 2026

Infinity: High-Throughput, Low-Latency Serving for Text Embeddings and Reranking

Infinity is a high-throughput, low-latency REST API for serving text embedding, reranking, and multi-modal models. It supports deploying any model from HuggingFace, with fast inference backends optimized for diverse accelerators.

Analyzed Mar 17, 2026

LLMBox: A Comprehensive Python Library for LLM Training and Evaluation

LLMBox is a Python library for implementing Large Language Models that combines a unified training pipeline with extensive model evaluation. It aims to be a one-stop solution for both training and using LLMs, with an emphasis on flexibility and efficiency. It offers diverse training strategies and fast inference.

Analyzed Mar 16, 2026

Rio: Build Web and Desktop Apps in Pure Python, No JavaScript Needed

Rio is a Python framework for building web and desktop applications in pure Python, with no HTML, CSS, or JavaScript required. It takes a modern, declarative approach to UI and ships with over 50 built-in components. Applications written with Rio are type-safe and run across different environments.

Analyzed Mar 9, 2026

Magenta RT: Live Music Generation on Your Local Device

Magenta RealTime (Magenta RT) is an open-source Python library for live music audio generation on local devices. Users can steer the generated music with both text and audio prompts, which makes it suited to real-time creative audio exploration. The library is the on-device companion to Google's MusicFX DJ Mode and the Lyria RealTime API.

Analyzed Mar 6, 2026

NUDGE: Lightweight Non-Parametric Embedding Fine-Tuning for Retrieval

NUDGE is a lightweight, non-parametric tool for fine-tuning pre-trained embeddings to improve retrieval and RAG pipelines. Rather than modifying model parameters, it adjusts the data embeddings directly to maximize accuracy. This approach often improves retrieval accuracy by more than 10% and runs in minutes.
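The idea of adjusting data embeddings directly, rather than model parameters, can be illustrated with a toy sketch. This is not NUDGE's actual algorithm or API: the function names (`nudge_embeddings`, `retrieve`), the `step` parameter, and the update rule (a bounded move toward the training queries, followed by renormalization) are all assumptions made for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def nudge_embeddings(data_embs, train_queries, train_labels, step=0.5):
    """Hypothetical update rule: move each data embedding a bounded step
    toward the training queries it should answer. No model weights change."""
    dim = len(data_embs[0])
    # Sum the query vectors assigned to each data item.
    pulls = [[0.0] * dim for _ in data_embs]
    for query, label in zip(train_queries, train_labels):
        for i in range(dim):
            pulls[label][i] += query[i]
    nudged = []
    for emb, pull in zip(data_embs, pulls):
        if any(pull):  # only items with at least one training query move
            direction = normalize(pull)
            emb = [e + step * d for e, d in zip(emb, direction)]
        nudged.append(normalize(emb))
    return nudged

def retrieve(query, data_embs):
    """Return the index of the data embedding with the highest dot product."""
    scores = [dot(query, e) for e in data_embs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy data: two documents and one training query whose correct answer is document 1.
docs = [normalize([1.0, 0.0]), normalize([0.0, 1.0])]
queries = [normalize([0.6, 0.4])]
labels = [1]

before = retrieve(queries[0], docs)
after = retrieve(queries[0], nudge_embeddings(docs, queries, labels, step=2.0))
print(before, after)
```

In this toy setup the query initially retrieves document 0, and after the nudge it retrieves document 1. Because only the stored vectors change, an update of this kind needs no gradient passes through the embedding model.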

Analyzed Mar 4, 2026

maestro: Streamlining Fine-Tuning for Multimodal Models like PaliGemma 2 and Florence-2

maestro is a tool for accelerating the fine-tuning of multimodal models. It encapsulates best practices for configuration, data loading, reproducibility, and training loop setup. The project currently offers ready-to-use recipes for popular vision-language models, including Florence-2, PaliGemma 2, and Qwen2.5-VL.

Analyzed Mar 2, 2026

KBLaM: Knowledge Base Augmented Language Models for Enhanced LLMs

KBLaM, developed by Microsoft, is the official implementation of "Knowledge Base Augmented Language Models," presented at ICLR 2025. The method enhances Large Language Models by integrating external knowledge bases directly into the model, offering an efficient alternative to traditional Retrieval-Augmented Generation (RAG) and in-context learning. It requires no external retrieval module, and its computational cost scales linearly with knowledge base size rather than quadratically.
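The linear-versus-quadratic contrast can be made concrete with a rough count of attention score computations. This is a back-of-the-envelope sketch, not KBLaM's code: it assumes that in-context learning serializes each knowledge base entry into `tokens_per_entry` tokens that all attend to one another, and that the integrated approach stores one key-value pair per entry that only the prompt tokens attend to. The function names and parameter values are hypothetical.

```python
def in_context_pairs(num_entries, tokens_per_entry, prompt_len):
    """Full self-attention over the serialized knowledge base plus the prompt:
    every token attends to every token, so cost grows with the square of length."""
    total_tokens = num_entries * tokens_per_entry + prompt_len
    return total_tokens * total_tokens

def integrated_pairs(num_entries, prompt_len):
    """Assumed integrated scheme: only prompt tokens issue queries, and each
    attends to one key per knowledge base entry plus the prompt itself."""
    return prompt_len * (num_entries + prompt_len)

# Compare the two counts as the knowledge base grows tenfold at each step.
for num_entries in (100, 1_000, 10_000):
    quadratic = in_context_pairs(num_entries, tokens_per_entry=10, prompt_len=50)
    linear = integrated_pairs(num_entries, prompt_len=50)
    print(f"{num_entries:>6} entries: in-context={quadratic:>15,} integrated={linear:>10,}")
```

Under these assumptions, doubling the number of entries roughly quadruples the in-context count, while the integrated count grows by a fixed amount per added entry.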

Analyzed Feb 28, 2026
