Repository History

Explore all analyzed open-source repositories

Topic: Natural Language Processing
Shortest: AI-Powered Natural Language End-to-End Testing Framework

Shortest is an AI-powered end-to-end testing framework that uses natural language for test creation and execution. Built on Playwright and the Anthropic Claude API, it lets users define tests in plain English, which simplifies the QA process. It fits into existing development workflows and includes features such as GitHub 2FA support and email validation.

May 8, 2026
rag-from-scratch: Building Retrieval Augmented Generation Systems

This repository by LangChain AI offers a comprehensive guide to understanding and implementing Retrieval Augmented Generation (RAG) from scratch. It includes a series of Jupyter notebooks and an accompanying video playlist, making complex RAG concepts accessible for practical application. The resource highlights RAG's advantages over fine-tuning for factual recall in Large Language Models (LLMs).

Apr 30, 2026
asta-paper-finder: A Frozen-in-Time Agent for Reproducing Paper Finder Evaluations

asta-paper-finder is a standalone, "frozen-in-time" version of the AllenAI Paper Finder agent. This repository provides the code specifically for reproducing evaluation results, allowing researchers to locate sets of papers based on content and metadata criteria. It offers a stable snapshot of the agent's core paper-finding capabilities.

Apr 24, 2026
Jieba: The Leading Python Library for Chinese Text Segmentation

Jieba is a highly popular and efficient Python library designed for Chinese text segmentation. It offers various cutting modes, including accurate, full, and search engine modes, making it versatile for different NLP tasks. With features like custom dictionaries and part-of-speech tagging, Jieba provides a comprehensive solution for processing Chinese text.

Mar 31, 2026
AudioSep: Foundation Model for Open-Domain Sound Separation with Language Queries

AudioSep is a foundation model for open-domain sound separation that lets users isolate specific sounds using natural language descriptions. It demonstrates strong performance and zero-shot generalization across tasks including audio event, musical instrument, and speech separation. Text-based queries replace much of the manual setup that audio separation otherwise requires.

Mar 30, 2026
Translation Agent: Agentic Translation with LLM Reflection Workflow

Translation Agent is a Python demonstration of an agentic workflow for machine translation that uses large language models (LLMs) and a reflection process. The approach aims to improve translation quality by having the LLM translate, reflect on its output, and then refine the translation based on its own suggestions. It can be customized for style, idioms, and regional language variations.

Mar 24, 2026
KBLaM: Knowledge Base Augmented Language Models for Enhanced LLMs

KBLaM, developed by Microsoft, is the official implementation of "Knowledge Base Augmented Language Models", presented at ICLR 2025. The method enhances Large Language Models by directly integrating external knowledge bases, offering an efficient alternative to traditional Retrieval-Augmented Generation (RAG) and in-context learning. It eliminates external retrieval modules, and its computational cost scales linearly with knowledge base size rather than quadratically.

Feb 28, 2026
Qwen3-Coder: Alibaba Cloud's Agentic Code LLM for Advanced Development

Qwen3-Coder is a large language model series from Alibaba Cloud's Qwen team, designed specifically for agentic coding. It offers strong performance on coding and agentic tasks, long-context capabilities, and support for a wide range of programming languages. The series sets new state-of-the-art results among open models, with performance comparable to leading commercial alternatives.

Jan 3, 2026
Apple Health MCP: Query Your Apple Health Data with Natural Language and SQL

The `apple-health-mcp` project is an MCP (Model Context Protocol) server designed for querying Apple Health data. It allows users to analyze their health metrics using natural language or direct SQL queries. This server integrates with clients like Claude Desktop, providing powerful tools for health data analysis.

Dec 5, 2025
Plexe: Build Machine Learning Models from Natural Language Prompts

Plexe is a Python library that lets developers build machine learning models from natural language descriptions. It automates the entire model creation process, from intent to deployment, through a multi-agent architecture. This allows for rapid development and experimentation.

Oct 14, 2025