Repository History
83 repositories tagged with LLM

PyPriompt: Python Library for Priority-Based Prompt Design
PyPriompt is a Python library for designing prompts, inspired by web design libraries like React and FastHTML. It intelligently manages context windows by using priorities to decide what information to include in the prompt, ensuring efficient use of token limits. This tool helps developers create dynamic and adaptable prompts for large language models.

Attachments: The Python Funnel for LLM Context and Multimodal Data
Attachments simplifies providing context to Large Language Models by transforming various file types into model-ready text and images. This Python library acts as a universal funnel, enabling developers to integrate diverse data sources like PDFs, images, web content, and even entire code repositories with just a few lines of code. It supports popular LLM APIs and frameworks, making multimodal AI development more accessible.
CORE: A Unified Memory System for Your AI Applications
CORE by RedPlanetHQ is an open-source project designed to provide a persistent, unified memory layer for AI applications. It leverages a temporal knowledge graph to prevent context loss across various AI tools, ensuring LLMs retain past conversations, preferences, and project history. This system significantly enhances AI interactions by making context available across different sessions and platforms.
txtinstruct: Building Instruction-Tuned Models with Custom Data
txtinstruct is a Python framework designed for training instruction-tuned models. It focuses on supporting open data and models, enabling users to build their own instruction-following datasets and train models without licensing ambiguity. This project simplifies the process of creating custom instruction-tuned solutions.

Open R1: An Open-Source Reproduction of DeepSeek-R1 for Advanced LLM Training
Open R1 is a Hugging Face project dedicated to creating a fully open reproduction of DeepSeek-R1, a powerful reasoning language model. This initiative provides comprehensive tools and recipes for training, evaluating, and generating data for large language models. It fosters community collaboration in AI research, enabling developers to build upon and understand the complex R1 pipeline.

llama-cpp-python: Python Bindings for llama.cpp
llama-cpp-python provides robust Python bindings for the popular llama.cpp library, enabling efficient local inference with large language models. It offers a high-level API compatible with OpenAI's API, facilitating easy integration into existing applications. The project also includes a powerful web server for local deployment and supports various hardware acceleration backends.

FuncVul: Function-Level Vulnerability Detection with LLMs and Code Chunks
FuncVul is an innovative model designed to detect vulnerabilities at the function level in C/C++ and Python code, addressing a critical gap in software supply chain security. By leveraging large language models (LLMs) and a code chunk-based approach, FuncVul significantly improves the precision of vulnerability identification. The model demonstrates superior performance compared to existing state-of-the-art methods, achieving high accuracy and F1 scores across various datasets.

Marker: High-Accuracy Document Conversion to Markdown and JSON
Marker is an open-source Python tool designed for high-accuracy conversion of documents like PDFs, images, and office files into Markdown, JSON, and HTML. It excels at preserving complex formatting, extracting images, and can leverage LLMs for even greater precision. This makes Marker a powerful solution for structured document intelligence.

Text Generation Inference: High-Performance LLM Serving by Hugging Face
Text Generation Inference (TGI) is a robust toolkit from Hugging Face designed for deploying and serving Large Language Models (LLMs) with high performance. It powers Hugging Face's production services, including Hugging Chat and their Inference API. TGI offers optimized text generation, supporting popular open-source LLMs and implementing advanced features for efficient and scalable inference.

Weave by Weights & Biases: A Toolkit for AI-Powered Applications
Weave is an open-source toolkit developed by Weights & Biases designed for building and managing AI-powered applications. It provides robust features for logging, debugging, and evaluating language model inputs and outputs, streamlining the development workflow for generative AI. Weave aims to bring rigor and best practices to the experimental process of AI software development.

Data Prep Kit: Accelerating Data Preparation for GenAI and LLM Applications
Data Prep Kit is an open-source project designed to accelerate unstructured data preparation for GenAI and LLM applications. It provides a comprehensive set of modules and transforms to cleanse, transform, and enrich data for pre-training, fine-tuning, instruct-tuning LLMs, or building Retrieval Augmented Generation (RAG) applications. The kit is highly scalable, supporting processing from a laptop to data center scale using Python, Ray, and Spark runtimes.

Magic: The All-in-One Open-Source AI Productivity Platform
Magic is the first open-source all-in-one AI productivity platform, integrating a generalist AI agent, workflow engine, instant messaging, and collaborative office system. It aims to help enterprises build and deploy AI applications to achieve significant productivity increases, offering a comprehensive suite of tools for intelligent automation and collaboration.