mineru-ocr-local-api
v1.1.4
Parse complex PDFs and document images with MinerU through either the hosted MinerU API or the local open-source MinerU runtime. Use when Codex, OpenClaw, Claude Code, or similar coding agents need MinerU-based OCR, layout-aware Markdown extraction, formula extraction, local-file upload to the Miner...
MinerU OCR Local API
Hard Rules
- Use python scripts/mineru_caller.py for every MinerU request.
- Do not parse the document yourself as a fallback.
- If MinerU returns an error, show the error and stop.
- Treat the saved JSON envelope and generated artifact files as the source of truth.
- Prefer the top-level text field when the user asks for the full extracted document.
Choose The Mode
- Use --mode api when the user wants the hosted MinerU service, already has MINERU_API_TOKEN, needs URL input, or wants the official cloud API workflow.
- Use --mode local when the user wants the open-source MinerU runtime from https://github.com/opendatalab/MinerU, wants data to stay local, or explicitly asks for local parsing.
- Use --mode auto only when the user does not care which mode is used. In auto mode the script prefers the API when MINERU_API_TOKEN is configured and falls back to local only for local files.
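The selection rules above can be sketched as a small function. This is an illustration of the behavior described in this section, not the code inside scripts/mineru_caller.py; the function name resolve_mode and its parameters are hypothetical.

```python
def resolve_mode(requested: str, has_token: bool, is_url: bool) -> str:
    """Return "api" or "local" for a requested mode of api, local, or auto.

    Hypothetical helper: mirrors the rules described in the text only.
    """
    if requested == "api":
        if not has_token:
            raise ValueError("API mode needs MINERU_API_TOKEN")
        return "api"
    if requested == "local":
        if is_url:
            raise ValueError("local mode only supports --file-path")
        return "local"
    if requested == "auto":
        # auto prefers the hosted API whenever a token is configured.
        if has_token:
            return "api"
        # Without a token, only local files can fall back to local mode.
        if not is_url:
            return "local"
        raise ValueError("URL input needs the API; set MINERU_API_TOKEN")
    raise ValueError(f"unknown mode: {requested!r}")
```

For example, resolve_mode("auto", has_token=False, is_url=False) picks local, while the same call with is_url=True raises because a URL cannot be parsed locally.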
Standard Workflow
- For a hosted API parse from URL:
bash
python scripts/mineru_caller.py --mode api --file-url "https://example.com/paper.pdf" --pretty
- For a hosted API parse from a local file:
bash
python scripts/mineru_caller.py --mode api --file-path "C:/docs/paper.pdf" --pretty
- For a local open-source MinerU parse:
bash
python scripts/mineru_caller.py --mode local --file-path "C:/docs/paper.pdf" --pretty
- When the input is a local file and the user will need an IDE-accessible path, prefer saving beside the source file:
bash
python scripts/mineru_caller.py --mode local --file-path "C:/docs/paper.pdf" --download-dir "C:/docs/paper.mineru" --pretty
- Read these output fields before answering:
  - mode: actual execution mode used
  - text: complete document Markdown from full.md or <file_stem>.md
  - result.submit: raw task-creation response for API URL parsing
  - result.batch: raw upload-batch response for API local-file parsing
  - result.poll: final API task-status payload
  - result.local: local MinerU CLI invocation summary
  - artifacts.full_md_path: absolute path to the main Markdown file
  - artifacts.local_parse_dir: local MinerU parse directory when using --mode local
  - artifacts.downloaded_zip: downloaded MinerU archive when using --mode api
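As a rough sketch, the fields listed above could be pulled out of the envelope like this. The field names come from the list; the helper name summarize_envelope and the sample envelope are invented for illustration, and the real envelope may carry more keys (see references/output_schema.md).

```python
def summarize_envelope(envelope: dict) -> dict:
    """Pick out the fields worth reading before answering the user."""
    artifacts = envelope.get("artifacts") or {}
    return {
        "mode": envelope.get("mode"),
        "text": envelope.get("text", ""),
        "full_md_path": artifacts.get("full_md_path"),
        # Usually only one of these is present, depending on the mode used.
        "local_parse_dir": artifacts.get("local_parse_dir"),
        "downloaded_zip": artifacts.get("downloaded_zip"),
    }


# Hypothetical envelope, invented for illustration only.
sample = {
    "mode": "local",
    "text": "# Title\n\nBody",
    "artifacts": {
        "full_md_path": "C:/docs/paper.mineru/full.md",
        "local_parse_dir": "C:/docs/paper.mineru",
    },
}
summary = summarize_envelope(sample)
print(summary["mode"], summary["full_md_path"])
```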
Useful Flags
- --mode api|local|auto: choose hosted API, local runtime, or automatic selection
- --no-wait: return after submission without polling; API mode only
- --no-download: skip downloading full_zip_url; API mode only
- --download-dir DIR: store API downloads or local MinerU outputs in a specific directory
- --language en: pass a language hint
- --ocr: force OCR mode
- --disable-formula: turn off formula extraction
- --local-cmd PATH: explicit path to mineru.exe or mineru
- --local-python PATH: explicit Python path for python -m mineru.cli.client
- --local-backend pipeline: choose the local MinerU backend
- --local-method auto|txt|ocr: choose the local MinerU parse method
- --local-model-source modelscope: useful in environments where Hugging Face access is restricted
- --local-device cpu: force a local inference device when needed
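When an agent assembles the call programmatically, building the argument list explicitly avoids shell-quoting problems with paths. The sketch below uses only flags listed in this document; the helper name build_command and its parameters are hypothetical, and it covers only a subset of the flags.

```python
def build_command(mode, file_path=None, file_url=None,
                  download_dir=None, language=None, ocr=False):
    """Build an argv list for scripts/mineru_caller.py (hypothetical helper)."""
    # Exactly one input source must be given.
    if (file_path is None) == (file_url is None):
        raise ValueError("pass exactly one of file_path or file_url")

    cmd = ["python", "scripts/mineru_caller.py", "--mode", mode]
    if file_path is not None:
        cmd += ["--file-path", file_path]
    else:
        cmd += ["--file-url", file_url]
    if download_dir:
        cmd += ["--download-dir", download_dir]
    if language:
        cmd += ["--language", language]
    if ocr:
        cmd.append("--ocr")
    cmd.append("--pretty")
    return cmd


cmd = build_command("local", file_path="C:/docs/paper.pdf", ocr=True)
print(cmd)
```

The resulting list can be handed to subprocess.run(cmd, check=True) without going through a shell.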
Present Results
- If the user asks for all text, show the top-level text field.
- If the user asks where files were saved, report the paths in artifacts.
- If the output is large, give the saved file paths first and then the requested excerpt or summary.
- If API mode completed but archive download failed, report artifacts.full_zip_url.
Configuration
For API mode:
MINERU_API_TOKEN
MINERU_API_BASE_URL=https://mineru.net
MINERU_API_TIMEOUT=60
MINERU_API_POLL_TIMEOUT=900
MINERU_API_POLL_INTERVAL=5
Windows PowerShell examples:
$env:MINERU_API_TOKEN="YOUR_MINERU_TOKEN"
setx MINERU_API_TOKEN "YOUR_MINERU_TOKEN"
Use the first command for the current terminal only. Use setx to persist it for future terminals, then restart Codex/Cursor or open a new terminal.
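A minimal sketch of reading the API settings from the environment follows. It assumes the values shown above (https://mineru.net, 60, 900, 5) act as defaults when a variable is unset; that is an assumption, and the helper name load_api_config is hypothetical.

```python
import os


def load_api_config(env=None):
    """Read the MinerU API settings from environment variables.

    Assumption: the values listed in the document are the defaults.
    """
    env = os.environ if env is None else env
    token = (env.get("MINERU_API_TOKEN") or "").strip()
    if not token:
        # Matches the rule: show the configuration error and stop.
        raise RuntimeError("MINERU_API_TOKEN is not set")
    return {
        "token": token,
        "base_url": env.get("MINERU_API_BASE_URL", "https://mineru.net").rstrip("/"),
        "timeout": float(env.get("MINERU_API_TIMEOUT", "60")),
        "poll_timeout": float(env.get("MINERU_API_POLL_TIMEOUT", "900")),
        "poll_interval": float(env.get("MINERU_API_POLL_INTERVAL", "5")),
    }
```

Passing a plain dict as env, for example load_api_config({"MINERU_API_TOKEN": "YOUR_MINERU_TOKEN"}), makes the function easy to check without touching the real environment.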
For local mode, configure at least one runtime entry point:
MINERU_LOCAL_CMD=C:\path\to\mineru.exe
MINERU_LOCAL_PYTHON=C:\path\to\python.exe
MINERU_LOCAL_BACKEND=pipeline
MINERU_LOCAL_METHOD=auto
MINERU_LOCAL_LANG=ch
MINERU_LOCAL_MODEL_SOURCE=modelscope
MINERU_LOCAL_DEVICE_MODE=cpu
MINERU_LOCAL_TIMEOUT=3600
Local mode only supports --file-path. It does not accept --file-url.
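The two local-mode constraints can be checked before any parse is attempted. The sketch below assumes that MINERU_LOCAL_CMD and MINERU_LOCAL_PYTHON are the runtime entry points meant above; the helper name check_local_request is hypothetical, and the real script may detect the runtime in other ways.

```python
def check_local_request(env, file_path=None, file_url=None):
    """Validate a local-mode request and return the configured entry points."""
    if file_url is not None:
        raise ValueError("local mode only supports --file-path, not --file-url")
    if not file_path:
        raise ValueError("local mode needs --file-path")

    # Assumption: these two variables are the runtime entry points.
    names = ("MINERU_LOCAL_CMD", "MINERU_LOCAL_PYTHON")
    entry_points = {name: env[name] for name in names if env.get(name)}
    if not entry_points:
        raise RuntimeError("set MINERU_LOCAL_CMD or MINERU_LOCAL_PYTHON")
    return entry_points
```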
Error Handling
- Missing API token: show the configuration error, tell the user to set MINERU_API_TOKEN, and include one-shot plus persistent commands.
- Missing local runtime: show the configuration error and stop.
- Failed task or failed local CLI run: show the error and stop.
- Poll timeout: tell the user the task id and that polling timed out.
- API archive download TLS error: rely on the built-in curl fallback before reporting failure.
- Missing expected output files: return any artifact paths that do exist and report the missing output.
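The poll-timeout and failed-task cases can be pictured with a generic polling loop. This is a sketch of the pattern, not the script's implementation: fetch_status is a hypothetical callable supplied by the caller, and the "state" and "err_msg" keys and the "done" and "failed" values are placeholders that should be checked against the actual status payload.

```python
import time


class PollTimeout(Exception):
    """Raised when a task is still unfinished after the poll timeout."""

    def __init__(self, task_id, waited):
        super().__init__(f"task {task_id} still pending after {waited:.0f}s")
        self.task_id = task_id


def poll_until_done(task_id, fetch_status, timeout=900.0, interval=5.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Call fetch_status(task_id) until the task finishes, fails, or times out."""
    start = clock()
    while True:
        status = fetch_status(task_id)
        state = status.get("state")  # placeholder key name
        if state == "done":
            return status
        if state == "failed":
            # Failed task: surface the error and stop.
            detail = status.get("err_msg", "unknown error")
            raise RuntimeError(f"task {task_id} failed: {detail}")
        waited = clock() - start
        if waited >= timeout:
            # The exception carries the task id so it can be shown to the user.
            raise PollTimeout(task_id, waited)
        sleep(interval)
```

The clock and sleep parameters exist only so the loop can be exercised with a fake clock instead of real waiting.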
References
references/output_schema.md: JSON envelope and artifact layout for both modes.
Load the reference file when:
- You need to explain which saved files matter.
- You need to inspect mode-specific artifacts such as downloaded_zip, local_parse_dir, middle_json, or content_list.
Verification
Validate the skill folder:
python /path/to/quick_validate.py /path/to/mineru-ocr-api-local
Run a local runtime check against a sample PDF:
python scripts/mineru_caller.py --mode local --file-path "D:/path/to/file.pdf" --pretty --stdout
Run an API runtime check against a remote PDF:
python scripts/mineru_caller.py --mode api --file-url "https://example.com/file.pdf" --pretty --stdout