Repository History
2 repositories tagged with Embeddings

Infinity: High-Throughput, Low-Latency Serving for Text Embeddings and Reranking
Infinity is a high-throughput, low-latency REST API for serving text embedding, reranking, and multi-modal models. It can deploy any model from HuggingFace and runs it on fast inference backends optimized for a range of accelerators. This simplifies deploying and using these models for developers.

NUDGE: Lightweight Non-Parametric Embedding Fine-Tuning for Retrieval
NUDGE is a lightweight, non-parametric tool for fine-tuning pre-trained embeddings to improve retrieval and RAG pipelines. Rather than modifying model parameters, it adjusts the data embeddings directly to maximize retrieval accuracy. This approach often improves retrieval accuracy by more than 10% and runs in minutes.
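To make the idea of adjusting data embeddings directly more concrete, here is a minimal toy sketch in plain Python. It is not NUDGE's actual algorithm or API. The function names (`nudge_embeddings`, `retrieve`), the `step` parameter, and the update rule (move each data embedding toward the mean of the training queries it should answer, then renormalize) are all assumptions chosen for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve(data_embs, query):
    """Return the index of the data embedding with the highest dot product."""
    scores = [dot(e, query) for e in data_embs]
    return max(range(len(scores)), key=scores.__getitem__)


def nudge_embeddings(data_embs, queries, labels, step=1.0):
    """Hypothetical update rule: shift each data embedding toward the mean of
    the training queries labeled with it, then renormalize. The embedding
    model itself is never touched; only the stored vectors change."""
    dim = len(data_embs[0])
    sums = [[0.0] * dim for _ in data_embs]
    counts = [0] * len(data_embs)
    for q, idx in zip(queries, labels):
        counts[idx] += 1
        for d in range(dim):
            sums[idx][d] += q[d]

    tuned = []
    for emb, s, c in zip(data_embs, sums, counts):
        if c == 0:
            # No training query points at this item: leave its direction as is.
            tuned.append(normalize(emb))
        else:
            moved = [e + step * (sd / c) for e, sd in zip(emb, s)]
            tuned.append(normalize(moved))
    return tuned


# Toy data: two items and one training query whose correct answer is item 0.
data = [[1.0, 0.0], [0.0, 1.0]]
query = [0.6, 0.8]

before = retrieve(data, query)
tuned = nudge_embeddings(data, [query], [0], step=1.0)
after = retrieve(tuned, query)
```

In this toy setup the query initially scores higher against item 1, and after the update it scores higher against item 0, its labeled answer. The real tool optimizes the adjustment to maximize accuracy, so treat this only as an illustration of the non-parametric idea.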