LangTest: A Comprehensive Library for Safe & Effective Language Models

Introduction

LangTest, developed by PacificAI, is an open-source Python library for delivering safe and effective language models. It provides a framework for testing and improving the quality of both traditional NLP models and Large Language Models (LLMs). With its focus on responsible AI, LangTest helps developers identify and mitigate issues related to robustness, bias, fairness, and accuracy.

Installation

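Getting started with LangTest is straightforward. Install it with pip; the transformers extra adds the optional dependencies for Hugging Face Transformers models. The leading ! runs the command from a notebook cell, so drop it in a regular shell: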

!pip install langtest[transformers]

Examples

LangTest simplifies the process of generating and running tests. Here's a quick example demonstrating how to use it for a Named Entity Recognition (NER) task:

# Import and create a Harness object
from langtest import Harness
h = Harness(task="ner", model={"model": "dslim/bert-base-NER", "hub": "huggingface"})

# Generate test cases, run them and view a report
h.generate().run().report()
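The chained call can also be split into separate steps, which makes it easier to narrow the run to a few tests. The following is a minimal sketch, not a definitive recipe: the helper names build_config and run_harness are hypothetical, and the configuration layout (a "tests" mapping with "defaults" and per-category test names such as "uppercase" and "lowercase") is an assumption that should be checked against the LangTest documentation for your installed version.

```python
from typing import Any, Dict


def build_config(min_pass_rate: float = 0.65) -> Dict[str, Any]:
    """Build a configuration limited to two robustness tests.

    Hypothetical helper; the key layout is assumed, not taken from this page.
    """
    return {
        "tests": {
            "defaults": {"min_pass_rate": min_pass_rate},
            "robustness": {
                "uppercase": {"min_pass_rate": min_pass_rate},
                "lowercase": {"min_pass_rate": min_pass_rate},
            },
        }
    }


def run_harness(config: Dict[str, Any]):
    """Run the NER example step by step.

    Requires langtest[transformers] to be installed and downloads the model,
    so it is defined here but not called.
    """
    # Imported lazily so build_config stays usable without langtest installed.
    from langtest import Harness

    h = Harness(task="ner", model={"model": "dslim/bert-base-NER", "hub": "huggingface"})
    h.configure(config)  # assumed configuration layout, see the note above
    h.generate()         # create perturbed test cases
    h.run()              # run the model on the test cases
    return h.report()    # summarize pass rates per test
```

Calling run_harness(build_config()) should then perform the same generate, run, and report sequence as the one-line version, restricted to the tests listed in the configuration.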

For more detailed examples and documentation, visit the official LangTest website.

Why Use LangTest?

LangTest stands out as an essential tool for anyone developing or deploying language models due to its extensive capabilities:

Comprehensive Testing: Generate and execute over 60 distinct types of tests with just one line of code, covering robustness, bias, representation, fairness, and accuracy.
Automated Data Augmentation: Automatically augment training data based on test results for select models, improving their performance.
Broad Framework Support: Supports popular NLP frameworks like Spark NLP, Hugging Face, and Transformers for tasks such as NER, Translation, and Text Classification.
LLM Evaluation: Offers robust support for testing LLMs from providers like OpenAI, Cohere, AI21, Hugging Face Inference API, and Azure-OpenAI. It covers critical LLM aspects including question answering, toxicity, clinical tests, legal support, factuality, sycophancy, and summarization.
Responsible AI Mission: The project is driven by a mission to make safe, robust, and fair AI models an everyday reality, providing tools that are often missing for data scientists.
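To make the idea of a robustness test concrete, here is a small self-contained sketch of the general technique: perturb each input, rerun the model, and measure how often the prediction stays the same. It does not use LangTest and is not its implementation; the toy model, the sample sentences, and the function names are all invented for illustration.

```python
from typing import Callable, Dict, List


def toy_sentiment_model(text: str) -> str:
    """Hypothetical stand-in model: case-sensitive keyword matching."""
    return "positive" if "good" in text else "negative"


# Each perturbation maps an input string to a modified version of it.
PERTURBATIONS: Dict[str, Callable[[str], str]] = {
    "uppercase": str.upper,
    "lowercase": str.lower,
}


def robustness_report(
    model: Callable[[str], str], samples: List[str]
) -> Dict[str, float]:
    """Return, per perturbation, the share of samples whose prediction is unchanged."""
    report = {}
    for name, perturb in PERTURBATIONS.items():
        passed = sum(model(s) == model(perturb(s)) for s in samples)
        report[name] = passed / len(samples)
    return report


samples = [
    "The service was good",
    "The food was bad",
    "A good experience overall",
    "Good value",
]

report = robustness_report(toy_sentiment_model, samples)
print(report)  # {'uppercase': 0.5, 'lowercase': 0.75}
```

The toy model fails half of the uppercase cases because its keyword match is case-sensitive. A pass rate below a chosen threshold is the kind of signal that would mark a test as failed in a report.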
