Giskard-OSS: Open-Source Evaluation & Testing Library for LLM Agents

Introduction

Giskard-OSS is an open-source Python library for evaluating and testing AI systems. It provides tools for identifying and mitigating risks related to performance, bias, and security in AI applications ranging from LLM agents to traditional machine learning models. With Giskard-OSS, developers and researchers can check that their AI systems are trustworthy and perform as expected.

Installation

Getting started with Giskard-OSS is straightforward. You can install the latest version directly from PyPI using pip. Giskard officially supports Python 3.9, 3.10, and 3.11.

pip install "giskard[llm]" -U

Examples

Giskard-OSS offers powerful functionalities to scan your AI models for issues and generate evaluation datasets. You can easily wrap your LLM agent and run Giskard's scan to automatically detect problems like hallucinations, harmful content generation, prompt injection, and bias.
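The wrap-and-scan workflow described above can be sketched as follows. This is a minimal sketch, not a definitive recipe: it follows the giskard.Model and giskard.scan calls from the library's published quickstart, which may differ between versions. The my_agent function, the model name, and the description are hypothetical placeholders. The scan itself needs giskard installed and an LLM provider configured, so the giskard import is kept inside run_scan.

```python
# Hypothetical stand-in for your own agent; replace with a real LLM call.
def my_agent(question: str) -> str:
    return f"(stub answer to: {question})"


def model_predict(df):
    """Giskard passes a pandas DataFrame; return one output string per row."""
    return [my_agent(question) for question in df["question"]]


def run_scan(report_path: str = "scan_results.html"):
    # Imported lazily so the wrapper above can be exercised without giskard.
    import giskard

    giskard_model = giskard.Model(
        model=model_predict,
        model_type="text_generation",
        name="Demo question answering agent",  # illustrative name (assumption)
        description="Answers user questions about a product.",  # illustrative
        feature_names=["question"],
    )
    # LLM-assisted detectors call an LLM provider, so credentials must be
    # configured before this line runs.
    scan_results = giskard.scan(giskard_model)
    scan_results.to_html(report_path)  # write the findings as an HTML report
    return scan_results
```

Calling run_scan() would run the scan and write the report. The description argument matters because the scan uses it to tailor its probes to the agent's intended use.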

For RAG applications, the library provides a specialized RAG Evaluation Toolkit (RAGET) that can automatically generate evaluation datasets, including questions, reference answers, and contexts, from your knowledge base. This allows for a granular assessment of each RAG component, such as the Generator, Retriever, Rewriter, Router, and Knowledge Base.
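Generating a test set from a knowledge base might look like the sketch below. It assumes the KnowledgeBase.from_pandas and generate_testset functions documented for giskard.rag, which should be checked against the installed version. The passages, the "text" column name, the agent description, and the output path are illustrative assumptions. As with the scan, an LLM provider must be configured before the generation step runs.

```python
def to_records(passages):
    """Drop empty passages and shape the rest as one-column rows."""
    return [{"text": p.strip()} for p in passages if p.strip()]


def generate_rag_testset(passages, num_questions=10, output_path="testset.jsonl"):
    # Imported lazily: these need pandas and giskard[llm] installed.
    import pandas as pd
    from giskard.rag import KnowledgeBase, generate_testset

    df = pd.DataFrame(to_records(passages))
    # "text" is the column holding the knowledge base content (assumed name).
    knowledge_base = KnowledgeBase.from_pandas(df, columns=["text"])

    testset = generate_testset(
        knowledge_base,
        num_questions=num_questions,
        language="en",
        # Illustrative description; it helps generate realistic questions.
        agent_description="A support assistant answering product questions",
    )
    testset.save(output_path)  # keep the generated questions for later runs
    return testset
```

A small num_questions keeps the first run cheap, since each generated question involves LLM calls. The number can be raised once the pipeline works end to end.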

A quick way to see Giskard-OSS in action is the project's Colab notebook.

Why Use Giskard-OSS

Giskard-OSS is useful to anyone developing or deploying AI systems. It automates the detection of critical issues, helping to prevent costly failures and reputational damage. Its RAG evaluation capabilities are particularly beneficial for complex LLM applications, because they report on the performance of individual components. Being open-source, it fosters community collaboration and transparency, and its integrations with other tools make it adaptable to diverse development workflows. By using Giskard-OSS, you can build more reliable, fair, and secure AI applications.
