PDF2Markdown CLI

Convert PDF and image documents to Markdown. The CLI is available under two command names: pdf2markdown and pdf2md.

Run pdf2markdown --help or pdf2md <command> --help for options.

Prerequisites

Install the CLI and authenticate before converting anything. Verify both with pdf2markdown --status.

pdf2markdown login
# or set PDF2MARKDOWN_API_KEY

If the CLI is not installed or not authenticated, see rules/install.md. For output handling, see rules/security.md.
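
The readiness check above can be wrapped in a small guard for scripts. This is a minimal sketch: the function name check_pdf2markdown_ready is hypothetical, and it assumes pdf2markdown --status exits non-zero when the CLI is not authenticated, which the text above does not confirm.

```shell
# Hypothetical guard: succeed only if the CLI is on PATH and reports ready.
# Assumption: `pdf2markdown --status` exits non-zero when not authenticated.
check_pdf2markdown_ready() {
  if ! command -v pdf2markdown >/dev/null 2>&1; then
    echo "pdf2markdown not found; see rules/install.md" >&2
    return 1
  fi
  if ! pdf2markdown --status >/dev/null 2>&1; then
    echo "not ready; run 'pdf2markdown login' or set PDF2MARKDOWN_API_KEY" >&2
    return 1
  fi
  return 0
}
```

Calling check_pdf2markdown_ready || exit 1 at the top of a script stops it before any conversion is attempted.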

Workflow

| Need | Command | When |
| --- | --- | --- |
| Convert PDF/image | `parse` | File under ~30MB, given as a local path or URL |
| Large file (async) | `parse-async` | File over ~30MB, or sync returns a file_too_large error |
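
For local files, the size rule in the table above could be automated along these lines. This is only a sketch: choose_command is a hypothetical helper, and the 30 MB default mirrors the approximate limit quoted above rather than a documented exact cutoff.

```shell
# Hypothetical helper: print "parse" or "parse-async" for a local file.
# $1 = file path, $2 = optional size limit in bytes (default ~30 MB, an
# approximation of the sync limit; the exact cutoff is an assumption).
choose_command() {
  local file="$1"
  local limit_bytes="${2:-$((30 * 1024 * 1024))}"
  local size_bytes
  [ -f "$file" ] || return 1
  # Arithmetic expansion strips the padding some wc implementations emit.
  size_bytes=$(( $(wc -c < "$file") ))
  if [ "$size_bytes" -gt "$limit_bytes" ]; then
    echo "parse-async"
  else
    echo "parse"
  fi
}
```

The result can be used to pick the subcommand, for example cmd=$(choose_command report.pdf), adding --wait when it returns parse-async and the result is needed immediately.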

Quick start

Parse (sync, files under ~30MB):

pdf2markdown document.pdf -o .pdf2markdown/output.md
pdf2markdown parse --url "https://example.com/doc.pdf" -o .pdf2markdown/doc.md
pdf2markdown parse file1.pdf file2.png -o .pdf2markdown/

# JSON output
pdf2markdown parse document.pdf --format json -o .pdf2markdown/result.json

Parse-async (large files, up to 100MB):

# Submit and wait
pdf2markdown parse-async large.pdf --wait -o .pdf2markdown/output.md
pdf2markdown parse-async --url "https://cdn.example.com/big.pdf" --wait -o .pdf2markdown/doc.md

# Submit only (poll later)
pdf2markdown parse-async large.pdf  # returns task_id
pdf2markdown parse-async <task_id> --status
pdf2markdown parse-async <task_id> --result -o .pdf2markdown/output.md
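
When the file size is not known up front (for example, with a URL), one option is to try the sync command first and fall back to parse-async only on a file_too_large error. The sketch below assumes the CLI exits non-zero on that error and includes the string file_too_large somewhere in its output; convert_with_fallback is a hypothetical name.

```shell
# Hypothetical wrapper: sync parse first, async with --wait as a fallback.
# Assumption: on an oversized file the CLI exits non-zero and its output
# contains the text "file_too_large".
convert_with_fallback() {
  local input="$1"
  local output="$2"
  local log
  if log=$(pdf2markdown parse "$input" -o "$output" 2>&1); then
    return 0
  fi
  case "$log" in
    *file_too_large*)
      pdf2markdown parse-async "$input" --wait -o "$output"
      ;;
    *)
      # Any other failure is reported unchanged.
      printf '%s\n' "$log" >&2
      return 1
      ;;
  esac
}
```

Usage: convert_with_fallback large.pdf .pdf2markdown/output.md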

Options

| Command | Key options |
| --- | --- |
| `parse` | `-u, --url`, `-o, --output`, `-f, --format` (markdown, json, all), `--page-images`, `--json`, `--pretty` |
| `parse-async` | `-u, --url`, `-o, --output`, `--wait`, `--status`, `--result`, `--poll-interval`, `--timeout` |

Run pdf2markdown <command> --help for full details.

Output & Organization

Write results to .pdf2markdown/ with -o. Add .pdf2markdown/ to .gitignore.

pdf2markdown document.pdf -o .pdf2markdown/doc.md
pdf2markdown parse file1.pdf file2.pdf -o .pdf2markdown/

Naming: .pdf2markdown/{name}.md. For large outputs, use grep, head, or incremental reads rather than loading the whole file. Always quote URLs, because the shell treats ? and & as special characters.
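
The output conventions above could be scripted roughly as follows. Both helpers are hypothetical, and the sketch assumes {name} means the input's base name without its extension, which the naming rule does not spell out.

```shell
# Hypothetical helper: list .pdf2markdown/ in .gitignore exactly once.
ignore_output_dir() {
  local gitignore="${1:-.gitignore}"
  touch "$gitignore"
  if ! grep -qxF '.pdf2markdown/' "$gitignore"; then
    # Add a newline first if the file does not already end with one.
    if [ -n "$(tail -c1 "$gitignore")" ]; then
      echo >> "$gitignore"
    fi
    echo '.pdf2markdown/' >> "$gitignore"
  fi
}

# Hypothetical helper: map an input path or URL to .pdf2markdown/{name}.md.
# Assumption: {name} is the base name without its extension.
output_path_for() {
  local name
  name=$(basename "${1%%[?]*}")   # drop any URL query string, then directories
  echo ".pdf2markdown/${name%.*}.md"
}
```

With these in place, a quoted URL stays intact all the way through: out=$(output_path_for "$url") followed by pdf2markdown parse --url "$url" -o "$out". Afterwards, head -n 40 "$out" or grep -n "keyword" "$out" gives a quick look at a large result without reading it all.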

Documentation

PDF2Markdown API Docs