OSRepos

Repository History

Explore all analyzed open source repositories

Topic: Dataset Quality
LLMSanitize: An Open-Source Library for Contamination Detection in NLP and LLM Datasets

LLMSanitize is an open-source Python library for detecting contamination in NLP datasets and large language models (LLMs). Its methods range from string matching to model likelihood and embedding similarity. It is aimed at researchers and developers who need to check whether evaluation data overlaps with training data, so that model evaluations remain reliable.
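
To make the string-matching end of that range concrete, here is a minimal sketch of an n-gram overlap check. It is an illustration only and does not use or reproduce the LLMSanitize API: the function names `ngrams` and `overlap_ratio`, the whitespace tokenization, and the default n-gram size are all assumptions made for this example.

```python
def ngrams(text, n):
    """Return the set of word n-grams in text (lowercased, whitespace-tokenized)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(eval_text, train_texts, n=3):
    """Fraction of the evaluation text's n-grams that also occur in the training texts.

    Hypothetical helper for illustration; not part of LLMSanitize.
    """
    eval_grams = ngrams(eval_text, n)
    if not eval_grams:
        # Text shorter than n tokens has no n-grams to compare.
        return 0.0
    train_grams = set()
    for doc in train_texts:
        train_grams |= ngrams(doc, n)
    return len(eval_grams & train_grams) / len(eval_grams)


if __name__ == "__main__":
    train = ["Yesterday the cat sat on the mat quietly"]
    print(overlap_ratio("the cat sat on the mat", train))
    print(overlap_ratio("a completely different sentence here", train))
```

A ratio close to 1 suggests the evaluation sample appears nearly verbatim in the training text, while a ratio close to 0 suggests no surface overlap. Exact matching of this kind misses paraphrased contamination, which is presumably why likelihood-based and embedding-based methods are offered alongside it.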

Feb 9, 2026

Analysis and discovery of open source repositories. Find interesting projects and follow their updates.
