OSRepos

Repository History

Explore all analyzed open source repositories

Topic: Spark
Data Prep Kit: Accelerating Data Preparation for GenAI and LLM Applications

Data Prep Kit is an open-source project designed to accelerate unstructured data preparation for GenAI and LLM applications. It provides a set of modules and transforms to cleanse, transform, and enrich data for pre-training, fine-tuning, or instruct-tuning LLMs, and for building Retrieval-Augmented Generation (RAG) applications. The kit scales from a laptop to a data center using Python, Ray, and Spark runtimes.

Nov 1, 2025

Analysis and discovery of open source repositories. Find interesting projects and follow their updates.

© 2025 OSRepos. Built with Nuxt 3 and lots of ❤️