Repository History
3 repositories tagged with scraper

Douyin_TikTok_Download_API: High-Performance Scraper for Social Media Videos
Douyin_TikTok_Download_API is a high-performance, asynchronous Python tool for crawling data from Douyin, TikTok, and Bilibili. It supports API calls, online batch parsing, and watermark-free video downloads, covering both extraction and management of social media data.

Newspaper3k: Advanced News and Article Extraction in Python
Newspaper3k is a Python 3 library for extracting news articles, full text, and article metadata. Inspired by the simplicity of `requests` and the speed of `lxml`, it provides tools for scraping and curating articles from various sources. It suits developers who need to gather and process news content programmatically, with built-in NLP features such as keyword extraction and summarization.

Pipet: A Swiss-Army Tool for Web Scraping and Data Extraction
Pipet is a command-line web scraper made for hackers who need to extract data from online assets. It supports HTML parsing, JSON parsing, and client-side JavaScript evaluation, building on existing tools such as `curl` and `playwright`. Typical uses include tracking information, monitoring pages for changes, and automating data collection.