SkillHub

nutrient-document-processing-universal

v1.1.2

Universal (non-OpenClaw) Nutrient DWS document-processing skill for Agent Skills-compatible products. Best for Claude Code, Codex CLI, Gemini CLI, Cursor, Windsurf, OpenCode, and Copilot environments. Supports full workflows: convert PDF/Office/images, OCR, extract text/tables/key-values, redact PII...

Sourced from ClawHub, Authored by Jonathan Rhyne

Installation

Please help me install the skill `nutrient-document-processing-universal` from SkillHub official store. npx skills add jdrhyne/nutrient-document-processing-universal

Nutrient Document Processing (Universal Agent Skill)

Best for Claude Code/Codex/Gemini/Cursor/Windsurf and other non-OpenClaw agents. Process, convert, extract, OCR, redact, sign, and manipulate documents using the Nutrient DWS Processor API.

Setup

You need a Nutrient DWS API key. Get one free at https://dashboard.nutrient.io/sign_up/?product=processor.

If your agent supports MCP (Model Context Protocol), use the Nutrient DWS MCP Server. It provides all operations as native tools.

Configure your MCP client (e.g., claude_desktop_config.json or .mcp.json):

{
  "mcpServers": {
    "nutrient-dws": {
      "command": "npx",
      "args": ["-y", "@nutrient-sdk/dws-mcp-server"],
      "env": {
        "NUTRIENT_DWS_API_KEY": "YOUR_API_KEY",
        "SANDBOX_PATH": "/path/to/working/directory"
      }
    }
  }
}

Then use the MCP tools directly (e.g., convert_to_pdf, extract_text, redact, etc.).

Option 2: Direct API (curl)

For agents without MCP support, call the API directly:

export NUTRIENT_API_KEY="your_api_key_here"

All requests go to https://api.nutrient.io/build as multipart POST with an instructions JSON field.

Safety Boundaries

  • This skill sends documents to the Nutrient DWS API (api.nutrient.io) for processing. Documents may contain sensitive data — ensure your Nutrient account's data handling policies are acceptable.
  • It does NOT access local files beyond those explicitly passed for processing.
  • It does NOT store API keys or credentials beyond the current session.
  • MCP server mode (npx @nutrient-sdk/dws-mcp-server) downloads the official Nutrient MCP server package from npm at runtime.
  • All API calls require an explicit API key — no anonymous access is possible.

Operations

1. Convert Documents

Convert between PDF, DOCX, XLSX, PPTX, HTML, and image formats.

HTML to PDF:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"html":"index.html"}]}' 
  -o output.pdf

DOCX to PDF:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.docx"}]}' 
  -o output.pdf

PDF to DOCX/XLSX/PPTX:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"docx"}}' 
  -o output.docx

Image to PDF:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"image.jpg"}]}' 
  -o output.pdf

2. Extract Text and Data

Extract plain text:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"text"}}' 
  -o output.txt

Extract tables (as JSON, CSV, or Excel):

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"xlsx"}}' 
  -o tables.xlsx

Extract key-value pairs:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"extraction","strategy":"key-values"}]}' 
  -o result.json

3. OCR Scanned Documents

Apply OCR to scanned PDFs or images, producing searchable PDFs with selectable text.

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"scanned.pdf"}],"actions":[{"type":"ocr","language":"english"}]}' 
  -o searchable.pdf

Supported languages: english, german, french, spanish, italian, portuguese, dutch, swedish, danish, norwegian, finnish, polish, czech, turkish, japanese, korean, chinese-simplified, chinese-traditional, arabic, hebrew, thai, hindi, russian, and more.

4. Redact Sensitive Information

Pattern-based redaction (preset patterns):

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"preset","preset":"social-security-number"}]}' 
  -o redacted.pdf

Available presets: social-security-number, credit-card-number, email-address, north-american-phone-number, international-phone-number, date, url, ipv4, ipv6, mac-address, us-zip-code, vin, time.

Regex-based redaction:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"regex","regex":"\b[A-Z]{2}\d{6}\b"}]}' 
  -o redacted.pdf

AI-powered PII redaction:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"ai_redaction","criteria":"All personally identifiable information"}]}' 
  -o redacted.pdf

The criteria field accepts natural language (e.g., "Names and phone numbers", "Protected health information", "Financial account numbers").

5. Add Watermarks

Text watermark:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"watermark","text":"CONFIDENTIAL","fontSize":48,"fontColor":"#FF0000","opacity":0.5,"rotation":45,"width":"50%","height":"50%"}]}' 
  -o watermarked.pdf

Image watermark:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"watermark","imagePath":"logo.png","width":"30%","height":"30%","opacity":0.3}]}' 
  -o watermarked.pdf

6. Digital Signatures

Sign a PDF with CMS signature:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"sign","signatureType":"cms","signerName":"John Doe","reason":"Approval","location":"New York"}]}' 
  -o signed.pdf

Sign with CAdES-B-LT (long-term validation):

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"sign","signatureType":"cades","cadesLevel":"b-lt","signerName":"Jane Smith"}]}' 
  -o signed.pdf

7. Form Filling (Instant JSON)

Fill PDF form fields using Instant JSON format:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"form.pdf"}],"actions":[{"type":"fillForm","fields":[{"name":"firstName","value":"John"},{"name":"lastName","value":"Doe"},{"name":"email","value":"[email protected]"}]}]}' 
  -o filled.pdf

8. Merge and Split PDFs

Merge multiple PDFs:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"doc1.pdf"},{"file":"doc2.pdf"}]}' 
  -o merged.pdf

Extract specific pages:

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf","pages":{"start":0,"end":4}}]}' 
  -o pages1-5.pdf

9. Render PDF Pages as Images

curl -X POST https://api.nutrient.io/build 
  -H "Authorization: Bearer $NUTRIENT_API_KEY" 
  -F "[email protected]" 
  -F 'instructions={"parts":[{"file":"document.pdf","pages":{"start":0,"end":0}}],"output":{"type":"png","dpi":300}}' 
  -o page1.png

10. Check Credits

curl -X GET https://api.nutrient.io/credits 
  -H "Authorization: Bearer $NUTRIENT_API_KEY"

Best Practices

  1. Use the MCP server when your agent supports it — it handles file I/O, error handling, and sandboxing automatically.
  2. Set SANDBOX_PATH to restrict file access to a specific directory.
  3. Check credit balance before batch operations to avoid interruptions.
  4. Use AI redaction for complex PII detection; use preset/regex redaction for known patterns (faster, cheaper).
  5. Chain operations — the API supports multiple actions in a single call (e.g., OCR then redact).

Troubleshooting

Issue Solution
401 Unauthorized Check your API key is valid and has credits
413 Payload Too Large Files must be under 100 MB
Slow AI redaction AI analysis takes 60–120 seconds; this is normal
OCR quality poor Try a different language parameter or improve scan quality
Missing text in extraction Run OCR first on scanned documents

More Information

  • Full API reference — Detailed endpoints, parameters, and error codes
  • API Playground — Interactive API testing
  • API Documentation — Official guides
  • MCP Server repo — Source code and issues