Pipet: A Swiss-Army Tool for Web Scraping and Data Extraction
This repository profile is provided by osrepos.com, an open source repository discovery platform.

Summary
Pipet is a versatile command-line web scraper designed for hackers, enabling efficient data extraction from various online assets. It supports HTML parsing, JSON parsing, and client-side JavaScript evaluation, leveraging existing tools like `curl` and `playwright` for powerful and flexible scraping operations. This tool is ideal for tracking information, monitoring changes, and automating data collection tasks.
Introduction
Pipet is a flexible command-line web scraper, often described as a "swiss-army tool" for extracting data from online assets. Built with hackers in mind, it supports three modes of operation: HTML parsing, JSON parsing, and client-side JavaScript evaluation. Rather than reimplementing HTTP and browser handling, Pipet builds on existing tools like `curl` and `playwright`, and it uses Unix pipes to extend its built-in capabilities. Typical uses include tracking shipments, monitoring stock prices, and getting notified about concert tickets.
Installation
Getting started with Pipet is straightforward, with several installation options available:
Pre-built Binaries
The easiest way to install is by downloading the latest release for your operating system from the official Releases page. After downloading, make the binary executable with `chmod +x pipet` and run `./pipet`.
Compile from Source
If you have Go installed on your system, you can compile and install Pipet directly:
    go install github.com/bjesus/pipet/cmd/pipet@latest

Alternatively, you can run it without a full installation using `go run`.
Package Managers
Pipet is also available through various package managers; the GitHub repository lists them.
Examples
Pipet's strength lies in its intuitive .pipet files, which define how and where to extract data. Here's a quick example to scrape Hacker News:
- Create a file named `hackernews.pipet` with the following content:

    curl https://news.ycombinator.com/
    .title .titleline
      span > a
      .sitebit a

- Run Pipet:

    go run github.com/bjesus/pipet/cmd/pipet@latest hackernews.pipet
    # Or, if installed:
    pipet hackernews.pipet

This will display the latest Hacker News titles and their associated domains directly in your terminal.
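To call Pipet from a script rather than a terminal, one option is to wrap the command in a small helper. The sketch below is an illustration under stated assumptions, not part of Pipet: the helper names `pipet_command` and `run_pipet` are hypothetical, a `pipet` binary is assumed to be on the PATH, and flags are assumed to be accepted before the file name.

```python
import subprocess


def pipet_command(spec_path, as_json=False, binary="pipet"):
    """Build the argument list for one Pipet run (hypothetical helper).

    Assumes flags are accepted before the .pipet file name.
    """
    args = [binary]
    if as_json:
        # --json is the flag this profile names for JSON output.
        args.append("--json")
    args.append(spec_path)
    return args


def run_pipet(spec_path, as_json=False):
    """Run Pipet and return its standard output as text.

    Raises subprocess.CalledProcessError if Pipet exits with an error,
    and FileNotFoundError if the binary is not installed.
    """
    result = subprocess.run(
        pipet_command(spec_path, as_json=as_json),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With Pipet installed, `run_pipet("hackernews.pipet")` would return the same text the terminal command prints, ready for further processing in Python.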
Pipet offers many advanced features, including:
- Custom Separators: Use the `--separator` flag to format output.
- JSON Output: Get results as a clean JSON structure with the `--json` flag.
- Templating: Render results into custom HTML or text templates.
- Unix Pipes Integration: Extend functionality by piping data to other command-line tools like `wc` or `htmlq`.
- Monitoring: Set intervals and commands to run on changes, allowing you to track dynamic content.
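The exact shape of the `--json` output is not described here, so a consumer that does not depend on it is a reasonable starting point. The sketch below walks any decoded JSON value and yields every string it finds. The function name `iter_strings` and the sample payload are hypothetical and not taken from Pipet.

```python
import json


def iter_strings(node):
    """Yield every string leaf in a decoded JSON value, in document order."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    # Numbers, booleans and null are skipped on purpose.


# Made-up sample standing in for real output; the true structure may differ.
sample = '[[["Example story", "example.com"], ["Another story", "example.org"]]]'
titles_and_domains = list(iter_strings(json.loads(sample)))
```

In practice the input would come from standard input, for example by piping `pipet --json hackernews.pipet` into a script that calls `json.load(sys.stdin)`.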
Why Use Pipet?
Pipet stands out for its versatility and hacker-friendly design. Because it handles HTML, JSON, and JavaScript-rendered content, it covers static pages, JSON APIs, and sites that render in the browser. It relies on `curl` for complex HTTP requests and `playwright` for headless browser automation, so it gains their capabilities without reinventing the wheel. Unix pipes let it slot into existing workflows and custom data processing, and its monitoring features make it useful for staying updated on online information, from personal alerts to business intelligence.
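The monitoring idea boils down to polling and comparing. A minimal sketch, assuming nothing about how Pipet itself implements it: fetch the content, fingerprint it, and invoke a callback only when the fingerprint differs from the previous one. The names `fingerprint` and `watch` are hypothetical.

```python
import hashlib
import time


def fingerprint(text):
    """Return a stable digest of the scraped text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def watch(fetch, on_change, interval=60.0, rounds=None, sleep=time.sleep):
    """Poll fetch() and call on_change(new_text) whenever the result changes.

    The first result only sets the baseline and does not trigger the callback.
    rounds limits the number of polls (None means run forever); sleep is
    injectable so the loop can be exercised without waiting.
    """
    previous = None
    polls = 0
    while rounds is None or polls < rounds:
        text = fetch()
        current = fingerprint(text)
        if previous is not None and current != previous:
            on_change(text)
        previous = current
        polls += 1
        # Wait between polls, but not after the final one.
        if rounds is None or polls < rounds:
            sleep(interval)
```

Here `fetch` could be any function that runs a scrape and returns its output as text, and `on_change` could send a notification or write to a log.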
Links
- GitHub Repository: https://github.com/bjesus/pipet
- Go Reference: https://pkg.go.dev/github.com/bjesus/pipet
- Releases Page: https://github.com/bjesus/pipet/releases/