BrowserAI: Run Local LLMs Directly in Your Browser with WebGPU

Summary

BrowserAI is an innovative open-source project that enables running large language models (LLMs) directly within your web browser. Leveraging WebGPU for accelerated performance, it offers a private, cost-free, and offline-capable solution for integrating AI into web applications. Developers can easily build powerful, privacy-conscious AI experiences without server-side infrastructure.

Repository Info

Updated on November 21, 2025

Introduction

BrowserAI is an open-source project that changes how developers integrate large language models (LLMs) into web applications. It lets you run models such as Llama and DeepSeek-Distill, along with the Kokoro text-to-speech model, entirely within the user's browser. This approach keeps all data on-device for full privacy, eliminates server costs, and provides offline capability once models are downloaded, making it well suited to a wide range of AI-powered web solutions. The project leverages WebGPU for near-native inference performance, offering a fast and efficient way to deploy AI directly on the client side.

Installation

Getting started with BrowserAI is straightforward. You can install it using npm or yarn:

npm install @browserai/browserai

OR

yarn add @browserai/browserai

Examples

Basic Usage

import { BrowserAI } from '@browserai/browserai';

const browserAI = new BrowserAI();

// Load model with progress tracking
await browserAI.loadModel('llama-3.2-1b-instruct', {
  quantization: 'q4f16_1',
  onProgress: (progress) => console.log('Loading:', progress.progress + '%')
});

// Generate text
const response = await browserAI.generateText('Hello, how are you?');
console.log(response.choices[0].message.content);

Chat with System Prompt

const ai = new BrowserAI();
await ai.loadModel('gemma-2b-it');

const response = await ai.generateText([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What is WebGPU?' }
]);
console.log(response.choices[0].message.content);

Structured Output Generation

const response = await browserAI.generateText('List 3 colors', {
  json_schema: {
    type: "object",
    properties: {
      colors: {
        type: "array",
        items: {
          type: "object",
          properties: {
            name: { type: "string" },
            hex: { type: "string" }
          }
        }
      }
    }
  },
  response_format: { type: "json_object" }
});
console.log(response.choices[0].message.content);
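Because the structured result arrives as a JSON string inside message.content, it needs to be parsed before use. A small helper (hypothetical, assuming only the OpenAI-style response shape shown in the examples above) keeps malformed model output from crashing the app:

```javascript
// Hypothetical helper: safely extract and parse the JSON payload from a
// chat-style response ({ choices: [{ message: { content: string } }] }).
// Returns null instead of throwing when the model emits invalid JSON.
function parseStructuredResponse(response) {
  try {
    const content = response?.choices?.[0]?.message?.content;
    return content ? JSON.parse(content) : null;
  } catch {
    return null;
  }
}
```

A null check on the result is then the only error handling the caller needs.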

Why Use BrowserAI?

  • 100% Private: All AI processing occurs locally in the user's browser, ensuring data privacy and security.
  • Zero Server Costs: Eliminate the need for expensive server infrastructure for AI inference, reducing operational costs significantly.
  • Offline Capable: Once models are downloaded, applications can function without an internet connection, enhancing accessibility and reliability.
  • WebGPU Accelerated: Benefit from near-native performance thanks to WebGPU acceleration, providing a fast and responsive user experience.
  • Developer Friendly: A simple SDK and API make it easy to integrate various LLMs, speech recognition, and text-to-speech capabilities into web projects.
  • Production Ready: Utilizes pre-optimized popular models, ready for deployment in real-world applications.
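Since WebGPU acceleration is central to the performance point above, it is worth feature-detecting before attempting to load a model. A minimal sketch using the standard navigator.gpu API (the function takes the navigator object as a parameter so it can run outside a browser; the reason strings are illustrative):

```javascript
// Check whether the WebGPU API is exposed and an adapter is actually
// available; navigator.gpu can exist while requestAdapter() resolves null
// (e.g. on machines with no eligible GPU).
async function checkWebGPU(nav = globalThis.navigator) {
  if (!nav?.gpu) return { supported: false, reason: 'navigator.gpu missing' };
  const adapter = await nav.gpu.requestAdapter();
  if (!adapter) return { supported: false, reason: 'no GPU adapter' };
  return { supported: true };
}
```

An app would call this once at startup and fall back to a server-backed path, or show an unsupported-browser notice, when supported is false.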
