SkillHub

easydoc-parse

v1.0.1

Use when tasks need EasyDoc REST API to convert unstructured documents into structured JSON or markdown on either China EasyLink platform or global EasyDoc platform. Trigger for requests about POST /v1/easydoc/parse and GET /v1/easydoc/parse/{task_id} (cn), POST /api/v1/parse and GET /api/v1/parse/{...

Sourced from ClawHub, Authored by Sycamore

Installation

Please help me install the skill `easydoc-parse` from the SkillHub official store.

npx skills add sycamore792/easydoc-parse

EasyLink EasyDoc Parse

Overview

Use this skill to call EasyDoc async parsing APIs and return stable structured output. Always follow the same lifecycle: select platform, validate inputs, submit task, poll result, normalize output.

RAG Retrieval

If the parsed output is being used for RAG, do not load the entire JSON file into context by default.

  1. Use grep-style search first
     • If the host agent provides a text-search tool such as Grep, Search, or equivalent "search within file content" capability, use that tool first.
     • Prefer grep-style search to locate candidate passages, headings, node ids, table markers, or metadata fields inside parsed JSON.
     • Search for user query terms, entity names, date ranges, section headers, and node type values before opening any large file.
     • Do not introduce a custom in-skill Python search script for this retrieval path.
     • Do not shell out to grep or rg if the host agent already exposes an equivalent search tool.

  2. Read only local slices
     • After the search tool identifies relevant hits, read only the matching lines or a narrow surrounding window.
     • Extract only the needed nodes, sections, or pages for downstream summarization or embedding.

  3. Escalate to full-load only when necessary
     • Load the full JSON only when the task truly requires global document structure, full-tree reconstruction, or complete export.
     • If full-load is required, say why.

Onboarding

If the user has no API key, walk them through key creation first:

  1. cn platform key flow
     • Open https://platform.easylink-ai.com
     • Register or sign in
     • Open the API key management page and create a key
     • Store it as EASYLINK_API_KEY

  2. global platform key flow
     • Open https://platform.easydoc.sh
     • Register or sign in
     • Open the API key management page and create a key
     • Store it as EASYDOC_API_KEY

When the user does not specify a platform, first ask whether they want cn or global.
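The key lookup described above can be sketched as follows. This is a minimal illustration, not the bundled script's actual code: the function name `resolve_api_key` and both dictionaries are hypothetical, and only the environment variable names and sign-up URLs are taken from the onboarding steps.

```python
import os

# Environment variable names and sign-up URLs come from the onboarding steps.
ENV_VAR_BY_PLATFORM = {"cn": "EASYLINK_API_KEY", "global": "EASYDOC_API_KEY"}
SIGNUP_URL_BY_PLATFORM = {
    "cn": "https://platform.easylink-ai.com",
    "global": "https://platform.easydoc.sh",
}


def resolve_api_key(platform, explicit_key=None, env=None):
    """Return the API key for a platform, or raise with onboarding guidance.

    Hypothetical helper: an explicit key wins, then the platform's env var.
    """
    if platform not in ENV_VAR_BY_PLATFORM:
        raise ValueError("platform must be 'cn' or 'global'; ask the user which one")
    env = os.environ if env is None else env
    key = explicit_key or env.get(ENV_VAR_BY_PLATFORM[platform])
    if not key:
        raise LookupError(
            f"No API key found. Create one at {SIGNUP_URL_BY_PLATFORM[platform]} "
            f"and store it as {ENV_VAR_BY_PLATFORM[platform]}."
        )
    return key
```

Raising with the platform-specific URL and variable name keeps the error message usable as the onboarding prompt itself.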

Platform Selection

Choose platform before calling any endpoint:

  1. cn platform
     • Base URL: https://api.easylink-ai.com
     • Submit: POST /v1/easydoc/parse
     • Poll: GET /v1/easydoc/parse/{task_id}
     • File form field: files
     • Recommended modes: easydoc-parse-flash, easydoc-parse-premium

  2. global platform
     • Base URL: https://api.easydoc.sh
     • Submit: POST /api/v1/parse
     • Poll: GET /api/v1/parse/{task_id}/result
     • File form field: file
     • Recommended mode: lite
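One way to keep these per-platform differences in a single place is a small lookup table. The sketch below is illustrative only: `PlatformConfig`, `PLATFORMS`, `submit_url`, and `poll_url` are hypothetical names, the URLs, paths, and field names are copied from the list above, and the default modes are the script defaults named in the Workflow section.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class PlatformConfig:
    base_url: str
    submit_path: str
    poll_path: str      # contains a {task_id} placeholder
    file_field: str     # multipart form field that carries the upload
    default_mode: str


PLATFORMS = {
    "cn": PlatformConfig(
        base_url="https://api.easylink-ai.com",
        submit_path="/v1/easydoc/parse",
        poll_path="/v1/easydoc/parse/{task_id}",
        file_field="files",
        default_mode="easydoc-parse-premium",
    ),
    "global": PlatformConfig(
        base_url="https://api.easydoc.sh",
        submit_path="/api/v1/parse",
        poll_path="/api/v1/parse/{task_id}/result",
        file_field="file",
        default_mode="lite",
    ),
}


def submit_url(platform):
    """Full URL for the POST that creates a parse task."""
    cfg = PLATFORMS[platform]
    return cfg.base_url + cfg.submit_path


def poll_url(platform, task_id):
    """Full URL for the GET that fetches a task's status or result."""
    cfg = PLATFORMS[platform]
    # Percent-encode the id so it cannot alter the path.
    return cfg.base_url + cfg.poll_path.format(task_id=quote(task_id, safe=""))
```

Note that the two platforms differ in both the poll path suffix (`/result` on global only) and the form field name (`files` versus `file`), which is why the table carries them explicitly.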

Workflow

  1. Validate request inputs
     • Require api-key from user input or secure environment variable.
     • Require parse mode when needed; if omitted in script mode, use platform default (cn: easydoc-parse-premium, global: lite).
     • Validate file type and size (<= 100MB) using platform-specific extension list.
     • If key is missing, return platform-specific onboarding steps and expected env var name.

  2. Submit async parse task
     • Use platform-specific submit URL and form-data file field.
     • Include mode.
     • Read task_id from response.

  3. Poll task status
     • Use platform-specific result endpoint.
     • Continue polling while task is pending or processing.
     • Stop on terminal status (SUCCESS, ERROR, FAILED, COMPLETED, DONE) or timeout.

  4. Normalize output
     • Keep raw response as raw.
     • Return stable envelope for downstream consumers: task_id, status, files.

  5. Handle failures predictably
     • Include task_id in error reports when available.
     • Report HTTP status and response body for API errors.
     • For parse failures, suggest mode switch or resubmission.

  6. Apply RAG-safe retrieval
     • When parsed JSON is large, use the host agent's text-search tool or equivalent grep-style retrieval before any full read.
     • Avoid pasting or loading entire parsed payloads into context unless the task depends on full-document traversal.
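The validation and polling steps of this workflow might look roughly like the sketch below. It rests on several assumptions that the text above does not confirm: that 100MB means 100 * 1024 * 1024 bytes, that the poll response is a dict with a `status` key, and that the caller supplies the platform-specific extension list. `validate_file`, `poll_until_terminal`, and `fetch_status` are hypothetical names, not functions of the bundled script.

```python
import os
import time

MAX_FILE_BYTES = 100 * 1024 * 1024  # assumption: "100MB" read as 100 MiB
TERMINAL_STATUSES = {"SUCCESS", "ERROR", "FAILED", "COMPLETED", "DONE"}


def validate_file(path, allowed_extensions):
    """Check extension, then size, before submitting; raise ValueError on failure."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in allowed_extensions:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_FILE_BYTES:
        raise ValueError(f"file is {size} bytes; limit is {MAX_FILE_BYTES}")


def poll_until_terminal(fetch_status, task_id, interval=2.0, timeout=300.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status(task_id) until a terminal status or the timeout.

    fetch_status is a caller-supplied callable that performs the GET and
    returns the decoded response as a dict (assumed to carry "status").
    """
    deadline = clock() + timeout
    while True:
        response = fetch_status(task_id)
        status = str(response.get("status", "")).upper()
        if status in TERMINAL_STATUSES:
            return response
        if clock() >= deadline:
            # Keep task_id in the message so failures stay traceable.
            raise TimeoutError(
                f"task {task_id} still {status or 'UNKNOWN'} after {timeout}s"
            )
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting, and leaving the HTTP call to `fetch_status` keeps the loop independent of which platform endpoint is being polled.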

Quick Commands

China platform:

curl -X POST "https://api.easylink-ai.com/v1/easydoc/parse" \
  -H "api-key: $EASYLINK_API_KEY" \
  -F "files=@demo_document.pdf" \
  -F "mode=easydoc-parse-premium"

Global platform:

curl -X POST "https://api.easydoc.sh/api/v1/parse" \
  -H "api-key: $EASYDOC_API_KEY" \
  -F "file=@demo_document.pdf" \
  -F "mode=lite"
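When the same submit call has to be issued from a program, the curl argument list can be assembled as data instead of a shell string. The following is a sketch under assumptions: `build_curl_argv` is a hypothetical helper, and the header name, form fields, and URLs simply mirror the two commands above.

```python
import shlex


def build_curl_argv(submit_url, file_field, file_path, mode, api_key):
    """Return the curl argument list for one multipart submit request.

    Hypothetical helper; building a list avoids shell-quoting mistakes.
    """
    return [
        "curl", "-X", "POST", submit_url,
        "-H", f"api-key: {api_key}",
        "-F", f"{file_field}=@{file_path}",  # "files" on cn, "file" on global
        "-F", f"mode={mode}",
    ]


cn_argv = build_curl_argv(
    "https://api.easylink-ai.com/v1/easydoc/parse",
    "files", "demo_document.pdf", "easydoc-parse-premium", "YOUR_CN_KEY",
)
global_argv = build_curl_argv(
    "https://api.easydoc.sh/api/v1/parse",
    "file", "demo_document.pdf", "lite", "YOUR_GLOBAL_KEY",
)

# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(cn_argv))
print(shlex.join(global_argv))
```

The list can be handed to `subprocess.run(cn_argv, check=True)` as-is. Keep in mind that a key passed on the command line is visible to other local processes, so the environment-variable route shown for the bundled helper is usually preferable.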

Bundled Python helper:

python3 scripts/easydoc_parse.py --platform cn --api-key "$EASYLINK_API_KEY" \
  --mode easydoc-parse-premium --file ./document.pdf --save ./result-cn.json

python3 scripts/easydoc_parse.py --platform global --api-key "$EASYDOC_API_KEY" \
  --mode lite --file ./document.pdf --save ./result-global.json

# key can come from environment if --api-key is omitted
export EASYLINK_API_KEY="your-cn-key"
python3 scripts/easydoc_parse.py --platform cn --file ./document.pdf --save ./result-cn.json

export EASYDOC_API_KEY="your-global-key"
python3 scripts/easydoc_parse.py --platform global --file ./document.pdf --save ./result-global.json

References And Scripts

  • Read references/easydoc-rest-api.md for endpoint-level differences between cn and global.
  • Use scripts/easydoc_parse.py for deterministic submit and polling.
  • Script default output is normalized; use --output-format raw for raw payload only.
  • In RAG workflows, prefer the host agent's built-in content search tool on saved JSON results before opening large file sections.

Output Contract

{
  "task_id": "string",
  "status": "SUCCESS|ERROR|PENDING|PROCESSING|FAILED|COMPLETED|DONE",
  "files": [
    {
      "file_name": "string",
      "markdown": "string or null",
      "nodes": []
    }
  ],
  "raw": {}
}
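A normalizer that produces this envelope could be sketched as below. The input shape is an assumption made purely for illustration: the sketch pretends the raw response already uses the keys `task_id`, `status`, `files`, `file_name`, `markdown`, and `nodes`, which the real payloads of either platform may not do (see references/easydoc-rest-api.md for the actual shapes). The function name `normalize` is hypothetical.

```python
def normalize(raw, task_id=None):
    """Map a raw API response (dict) onto the stable output envelope.

    Assumption: raw uses the same key names as the envelope; a real
    implementation would translate platform-specific field names here.
    """
    files = []
    for item in raw.get("files") or []:
        files.append({
            "file_name": str(item.get("file_name", "")),
            "markdown": item.get("markdown"),        # stays None when absent
            "nodes": list(item.get("nodes") or []),  # always a list
        })
    return {
        "task_id": str(raw.get("task_id") or task_id or ""),
        "status": str(raw.get("status", "")).upper(),
        "files": files,
        "raw": raw,  # untouched original payload
    }
```

For example, `normalize({"status": "success", "files": [{"file_name": "a.pdf"}]}, task_id="t1")` yields an envelope whose status is `SUCCESS` and whose single file entry has `markdown` set to `None` and `nodes` set to an empty list.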