rupert-web-scraper

v1.0.0

Extract and export structured web data (text, tables, images, and more) in JSON, CSV, Markdown, or SQL format, following ethical and legal scraping practices.

Sourced from ClawHub, Authored by rupertnt034

Installation

Install the skill `rupert-web-scraper` from the SkillHub official store:

npx skills add rupertnt034/rupert-web-scraper

Web Scraper Skill

Overview

Extract data from websites efficiently and ethically.

Capabilities

1. Data Extraction

Extract text content
Pull structured data
Capture tables
Get images/media
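
Table capture can be illustrated with Python's standard library alone. The sketch below is an assumption about how such extraction might work, not the skill's actual implementation; `TableExtractor`, `extract_tables`, and `SAMPLE` are hypothetical names, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects every <table> as a list of rows, each a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished and in-progress tables
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Return all tables found in an HTML string (hypothetical helper)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


# Illustrative input, not taken from a real site.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Product A</td><td>$19.99</td></tr>
</table>
"""
tables = extract_tables(SAMPLE)
```

Here `tables` holds one table whose first row is the header row, which makes it straightforward to turn the remaining rows into records.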

2. Formats

JSON output
CSV export
Markdown
SQL inserts
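
To make the four formats concrete, here is a minimal sketch that serializes the same records in each of them. The function names and the `products` table name are hypothetical, and the skill's real output may differ in layout. The SQL helper assumes trusted table and column names and quotes every value as a string.

```python
import csv
import io
import json

# Sample records, mirroring the product example used in this document.
ROWS = [
    {"name": "Product A", "price": "$19.99"},
    {"name": "Product B", "price": "$29.99"},
]


def to_json(rows):
    return json.dumps(rows, indent=2)


def to_csv(rows):
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_markdown(rows):
    # Note: pipe characters inside cell values are not escaped in this sketch.
    headers = list(rows[0])
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)


def to_sql_inserts(rows, table):
    def quote(value):
        # Double any single quotes so the literal stays well-formed.
        return "'" + str(value).replace("'", "''") + "'"

    statements = []
    for row in rows:
        columns = ", ".join(row)
        values = ", ".join(quote(v) for v in row.values())
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return "\n".join(statements)
```

All four helpers assume a non-empty list of dictionaries that share the same keys.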

3. Features

Rate limiting
Caching
Retry logic
Error handling
Proxy support
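
Retry logic is commonly implemented as exponential backoff. The sketch below shows one way it could look; `fetch_with_retry` and its parameters are hypothetical and not taken from the skill. The `fetch` callable is supplied by the caller, so the example runs without network access.

```python
import time


def fetch_with_retry(fetch, url, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url), retrying on OSError with exponentially growing delays.

    urllib.error.URLError and socket timeouts are OSError subclasses, so a
    urllib-based fetch function would be covered by this handler.
    """
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except OSError:
            if attempt == retries:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...


# Usage with a stand-in fetch function that fails twice, then succeeds.
attempts = []
recorded_delays = []


def flaky_fetch(url):
    attempts.append(url)
    if len(attempts) < 3:
        raise OSError("temporary failure")
    return "ok"


result = fetch_with_retry(flaky_fetch, "https://example.com", sleep=recorded_delays.append)
```

Passing `sleep` as a parameter keeps the backoff testable: here the delays are recorded instead of actually waited out.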

4. Ethical Scraping

Respect robots.txt
Rate limits
User agent rotation
Legal compliance
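
Respecting robots.txt can be checked with Python's standard `urllib.robotparser` module. This is a minimal sketch under assumptions: the `is_allowed` helper, the sample rules, and the user agent string `rupert-web-scraper` are illustrative, and the skill's actual behavior is not documented here.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative rules; a real scraper would download /robots.txt from the site.
ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

AGENT = "rupert-web-scraper"  # hypothetical user agent name

public_ok = is_allowed(ROBOTS, AGENT, "https://example.com/products")
private_ok = is_allowed(ROBOTS, AGENT, "https://example.com/private/data")

# The parser also exposes any Crawl-delay the site requests.
delay_parser = RobotFileParser()
delay_parser.parse(ROBOTS.splitlines())
crawl_delay = delay_parser.crawl_delay(AGENT)
```

When a site declares a crawl delay, using the larger of that value and the configured request interval is one conservative policy.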

Usage

Commands

scrape [URL] for [data]
extract [element] from [URL]
get table from [URL]
crawl [website] depth [n]
export [URL] to [format]
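
The command patterns above can be read as a small grammar. The following sketch parses them with regular expressions, purely to illustrate their structure. How the skill actually interprets commands is not specified, and `PATTERNS` and `parse_command` are hypothetical.

```python
import re

# One pattern per command form listed above; bracketed slots become named groups.
PATTERNS = [
    ("scrape", re.compile(r"^scrape (?P<url>\S+) for (?P<data>.+)$")),
    ("extract", re.compile(r"^extract (?P<element>.+) from (?P<url>\S+)$")),
    ("table", re.compile(r"^get table from (?P<url>\S+)$")),
    ("crawl", re.compile(r"^crawl (?P<url>\S+) depth (?P<depth>\d+)$")),
    ("export", re.compile(r"^export (?P<url>\S+) to (?P<format>\w+)$")),
]


def parse_command(text):
    """Return a dict describing the command, or None if no pattern matches."""
    for action, pattern in PATTERNS:
        match = pattern.match(text.strip())
        if match:
            return {"action": action, **match.groupdict()}
    return None


parsed = parse_command("scrape example.com for product names and prices")
```

Captured values are returned as strings, so a caller would convert `depth` to an integer before use.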

Examples

Input: "scrape example.com for product names and prices"

Output:

{
  "products": [
    {"name": "Product A", "price": "$19.99"},
    {"name": "Product B", "price": "$29.99"}
  ]
}

Configuration

Rate Limits

Default: 1 request/second
Configurable: 0.1 to 10 requests/second
Respect site limits
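
The rate settings above translate into a minimum interval between requests. The sketch below is one possible interpretation, assuming out-of-range values are clamped to the documented 0.1 to 10 requests/second range. `RateLimiter` and `clamp_rate` are hypothetical names.

```python
import time

MIN_RATE, MAX_RATE, DEFAULT_RATE = 0.1, 10.0, 1.0  # requests per second


def clamp_rate(rate):
    """Force a requested rate into the documented 0.1 to 10 req/s range."""
    return max(MIN_RATE, min(MAX_RATE, rate))


class RateLimiter:
    """Spaces calls to wait() at least 1/rate seconds apart."""

    def __init__(self, rate=DEFAULT_RATE, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / clamp_rate(rate)
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval


limiter = RateLimiter()  # default: 1 request/second
# Typical use: call limiter.wait() immediately before each request.
```

A default limiter has an interval of one second; a rate of 0.1 yields one request every ten seconds.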

Output Options

JSON (default)
CSV
Markdown
SQL
Custom template

Best Practices

Always identify yourself
Cache responses
Handle errors gracefully
Stay within legal bounds
Don't overwhelm servers
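
Two of these practices, identifying yourself and caching responses, can be sketched briefly. The user agent string, `build_request`, and `CachedFetcher` are hypothetical illustrations rather than part of the skill. No request is sent here; the fetch function is supplied by the caller.

```python
import urllib.request

# Hypothetical identifying user agent with a contact URL for site operators.
USER_AGENT = "example-scraper/1.0 (+https://example.com/contact)"


def build_request(url):
    """Create a request object that carries an identifying User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


class CachedFetcher:
    """Wraps a fetch function so each URL is fetched at most once."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}

    def get(self, url):
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]


# Usage with a stand-in fetch function that records how often it is called.
fetched = []


def fake_fetch(url):
    fetched.append(url)
    return "<html>stub</html>"


fetcher = CachedFetcher(fake_fetch)
first = fetcher.get("https://example.com/")
second = fetcher.get("https://example.com/")  # served from the cache
```

An in-memory dictionary is the simplest cache; a longer-running scraper would likely add expiry so stale pages are eventually refetched.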
