Giskard-OSS: Open-Source Evaluation & Testing Library for LLM Agents
This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary
Giskard-OSS is an open-source Python library for evaluating and testing AI systems, both LLM-based applications and traditional ML models. It automatically detects performance, bias, and security issues. The library includes a RAG Evaluation Toolkit (RAGET) for assessing Retrieval Augmented Generation applications.
Introduction
Giskard-OSS is an open-source Python library for evaluating and testing AI systems. It provides tools for identifying and mitigating performance, bias, and security risks in AI applications ranging from LLM agents to traditional machine learning models. Developers and researchers can use it to check that their AI systems behave as expected.
Installation
Install the latest version from PyPI with pip. Giskard officially supports Python 3.9, 3.10, and 3.11.
pip install "giskard[llm]" -U
Examples
Giskard-OSS can scan AI models for issues and generate evaluation datasets. You wrap your LLM agent and run Giskard's scan, which automatically looks for problems such as hallucinations, harmful content generation, prompt injection, and bias.
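The wrap-and-scan workflow described above can be sketched as follows. This is a minimal illustration that assumes the giskard 2.x API (giskard.Model and giskard.scan), which may differ in the version you install. The answer_question function, the model name, and the description are hypothetical placeholders for your own agent. The scan's LLM-assisted detectors also need an LLM provider configured, which is not shown here.

```python
# Sketch: wrap a question-answering agent and run Giskard's scan.
# Assumes the giskard 2.x API; verify against the version you installed.


def answer_question(question: str) -> str:
    """Hypothetical stand-in for a call to your real LLM agent."""
    return f"(placeholder answer to: {question})"


def predict(df):
    """Prediction function handed to Giskard.

    Giskard passes a pandas DataFrame with one column per feature;
    the function returns one output string per row.
    """
    return [answer_question(q) for q in df["question"]]


def run_scan(report_path: str = "scan_results.html"):
    # Imported lazily so this module loads even without giskard installed.
    import giskard

    model = giskard.Model(
        model=predict,
        model_type="text_generation",
        name="Demo QA agent",  # illustrative name
        description="Answers user questions about a product",  # illustrative
        feature_names=["question"],
    )
    results = giskard.scan(model)  # requires a configured LLM provider
    results.to_html(report_path)   # write the scan report to disk
    return results


# Usage (not executed here, since it calls an external LLM):
# results = run_scan()
```

The name and description passed to giskard.Model are worth writing carefully, since the scan relies on them to generate probes relevant to the agent's domain.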
For RAG applications, the library provides a specialized RAG Evaluation Toolkit (RAGET) that can automatically generate evaluation datasets, including questions, reference answers, and contexts, from your knowledge base. This allows for a granular assessment of each RAG component, such as the Generator, Retriever, Rewriter, Router, and Knowledge Base.
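A RAGET run along these lines might look like the sketch below. It assumes the giskard.rag module from giskard 2.x (KnowledgeBase, generate_testset, evaluate) and should be checked against your installed version. The answer_fn body, the "text" column name, the output file name, and the agent description are hypothetical. Test set generation calls an LLM, so a provider must be configured separately.

```python
# Sketch: generate a RAGET test set from a knowledge base and evaluate an agent.
# Assumes the giskard 2.x giskard.rag API; verify against your version.


def answer_fn(question: str, history=None) -> str:
    """Hypothetical stand-in for the RAG agent under test."""
    return f"(placeholder answer to: {question})"


def run_raget(documents, num_questions: int = 10):
    # Imported lazily so this module loads even without giskard installed.
    import pandas as pd
    from giskard.rag import KnowledgeBase, evaluate, generate_testset

    # One row per document chunk; "text" is an illustrative column name.
    df = pd.DataFrame({"text": documents})
    knowledge_base = KnowledgeBase.from_pandas(df, columns=["text"])

    # Questions, reference answers, and contexts are generated from the
    # knowledge base.
    testset = generate_testset(
        knowledge_base,
        num_questions=num_questions,
        language="en",
        agent_description="A support chatbot for a product",  # illustrative
    )
    testset.save("raget_testset.jsonl")  # keep the test set for later reuse

    # The report scores the agent's answers against the reference answers.
    return evaluate(answer_fn, testset=testset, knowledge_base=knowledge_base)


# Usage (not executed here, since it calls an external LLM):
# report = run_raget(["First document chunk.", "Second document chunk."])
```

Saving the generated test set lets you rerun the evaluation on the same questions after changing the agent.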
A quick way to explore Giskard-OSS in action is through their Colab notebook.
Why Use Giskard-OSS
Giskard-OSS is useful for anyone developing or deploying AI systems. It automates the detection of performance, bias, and security issues, which helps catch failures before they become costly or damage a project's reputation. Its RAG evaluation toolkit scores individual components of a RAG application, which makes it easier to see which part is responsible for a weak answer. Because the library is open source, its methods are transparent and open to community contributions, and it is designed to fit into existing development workflows.
Links
Source repository
Open the original repository on GitHub.