name: defuddle-web-cleaner
description: Extract clean article content from web pages using Defuddle. Use when a user provides a URL or HTML and wants the readable article text, a Markdown version, or structured metadata. Helpful for web scraping, research workflows, note-taking, Obsidian clipping, and converting web pages to Markdown.

Defuddle Web Cleaner

Extract the main readable content from a web page.

This skill removes unnecessary elements such as:

- navigation bars
- sidebars
- ads
- comments
- footers
- social buttons

The result is clean article content.

Supported Inputs

URL
Raw HTML
Web page text
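
The three input kinds can be told apart with a small heuristic. The following is a minimal sketch, not part of Defuddle itself: the function name `detect_input_type`, the labels `"url"`, `"html"`, and `"text"`, and the list of tags used to spot HTML are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical heuristic: a handful of common tags is enough to flag raw HTML.
_HTML_TAG = re.compile(
    r"<\s*(?:!doctype|html|head|body|article|div|p)\b", re.IGNORECASE
)


def detect_input_type(value: str) -> str:
    """Classify input as "url", "html", or "text" (labels are illustrative)."""
    text = value.strip()

    # A URL is a single token with an http(s) scheme and a host.
    if len(text.split()) == 1:
        try:
            parsed = urlparse(text)
        except ValueError:
            parsed = None  # malformed input such as a broken IPv6 host
        if parsed and parsed.scheme in ("http", "https") and parsed.netloc:
            return "url"

    # Anything containing a recognizable tag is treated as raw HTML.
    if _HTML_TAG.search(text):
        return "html"

    # Everything else falls back to plain web page text.
    return "text"
```

For instance, `detect_input_type("https://example.com/blog/ai")` is classified as a URL, while a string that starts with `<html>` is classified as raw HTML.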

Output Format

Default output:

Title
Author
Site
Published date

Markdown article content

Alternative output (JSON):

{ title, author, site, description, published, content, contentMarkdown }
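
Both output formats can be produced from the same result object. The sketch below assumes the result is a plain dictionary keyed by the JSON field names listed above; the helper names `to_default_output` and `to_json_output`, the `Label: value` header layout, and the choice to skip empty fields are illustrative assumptions rather than behavior defined by Defuddle.

```python
import json
from typing import Dict

# Field names taken from the JSON shape described above.
FIELDS = (
    "title", "author", "site", "description",
    "published", "content", "contentMarkdown",
)

# Header labels for the default output (layout is an assumption).
_LABELS = (
    ("Title", "title"),
    ("Author", "author"),
    ("Site", "site"),
    ("Published", "published"),
)


def to_default_output(result: Dict[str, str]) -> str:
    """Render metadata lines followed by the Markdown article body."""
    # Skip metadata fields that are missing or empty.
    header = [f"{label}: {result[key]}" for label, key in _LABELS if result.get(key)]
    return "\n".join(header) + "\n\n" + result.get("contentMarkdown", "")


def to_json_output(result: Dict[str, str]) -> str:
    """Serialize every expected field, using an empty string when absent."""
    return json.dumps({key: result.get(key, "") for key in FIELDS}, indent=2)
```

Filling absent fields with empty strings keeps the JSON shape stable for downstream tools, at the cost of not distinguishing "missing" from "empty".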

Processing Steps

Detect input type
Load page HTML
Run Defuddle parser
Extract metadata
Convert to Markdown if requested
Return clean content
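
The steps above can be sketched as a small pipeline. This is a sketch under stated assumptions, not Defuddle's API: `clean_page`, `fetch`, and `parse` are hypothetical names, and the actual page download and Defuddle call are passed in as callables so that the example stays self-contained.

```python
from typing import Callable, Dict

Result = Dict[str, str]


def clean_page(
    source: str,
    fetch: Callable[[str], str],     # hypothetical: downloads HTML for a URL
    parse: Callable[[str], Result],  # hypothetical: wraps the Defuddle parser
    as_markdown: bool = True,
) -> Result:
    """Run the processing steps on a URL or on raw HTML."""
    text = source.strip()

    # Steps 1-2: detect the input type and load the page HTML.
    is_url = text.startswith(("http://", "https://")) and len(text.split()) == 1
    html = fetch(text) if is_url else text

    # Steps 3-4: run the parser and collect the metadata it returns.
    parsed = parse(html)

    # Step 5: pick the Markdown body only when it was requested.
    body_key = "contentMarkdown" if as_markdown else "content"

    # Step 6: return the clean content with a stable set of keys.
    return {
        "title": parsed.get("title", ""),
        "author": parsed.get("author", ""),
        "site": parsed.get("site", ""),
        "published": parsed.get("published", ""),
        "content": parsed.get(body_key, ""),
    }
```

Injecting `fetch` and `parse` keeps the control flow testable without network access and leaves the choice of HTTP client and parser binding to the caller.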

Example

Input:

https://example.com/blog/ai

Output:

Title: AI is Changing Everything
Author: Jane Smith
Site: Example Blog

Markdown:

AI is Changing Everything

Artificial intelligence is transforming industries...

Tips

Use this skill when:

- saving articles to Obsidian
- building research datasets
- cleaning web pages for LLM processing
- summarizing articles