markitdown-skill
v1.0.1
OpenClaw agent skill for converting documents to Markdown. Documentation and utilities for Microsoft's MarkItDown library. Supports PDF, Word, PowerPoint, Excel, images (OCR), audio (transcription), HTML, and YouTube.
MarkItDown Skill
Documentation and utilities for converting documents to Markdown using Microsoft's MarkItDown library.
Note: This skill provides documentation and a batch script. The actual conversion is done by the markitdown CLI/library, which is installed via pip.
When to Use
Use markitdown for:
- 📄 Fetching documentation (README, API docs)
- 🌐 Converting web pages to Markdown
- 📝 Document analysis (PDFs, Word, PowerPoint)
- 🎬 YouTube transcripts
- 🖼️ Image text extraction (OCR)
- 🎤 Audio transcription
Quick Start
# Convert file to markdown
markitdown document.pdf -o output.md
# Convert URL
markitdown https://example.com/docs -o docs.md
Supported Formats
| Format | Features |
|---|---|
| PDF | Text extraction, structure |
| Word (.docx) | Headings, lists, tables |
| PowerPoint | Slides, text |
| Excel | Tables, sheets |
| Images | OCR + EXIF metadata |
| Audio | Speech transcription |
| HTML | Structure preservation |
| YouTube | Video transcription |
Installation
The skill requires Microsoft's markitdown CLI:
pip install 'markitdown[all]'
Or install specific formats only:
pip install 'markitdown[pdf,docx,pptx]'
Common Patterns
Fetch Documentation
markitdown https://github.com/user/repo/blob/main/README.md -o readme.md
Convert PDF
markitdown document.pdf -o document.md
Batch Convert
# Using included script
python ~/.openclaw/skills/markitdown/scripts/batch_convert.py docs/*.pdf -o markdown/ -v
# Or shell loop
for file in docs/*.pdf; do
    markitdown "$file" -o "${file%.pdf}.md"
done
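The shell loop above can also be written in Python. The following is a minimal sketch and not the code of the bundled `batch_convert.py`; the names `batch_convert` and `convert` are hypothetical. The `convert` argument is any callable that turns a source path into Markdown text, so with the library installed you could pass `lambda p: MarkItDown().convert(p).text_content`.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def batch_convert(
    sources: Iterable[str],
    out_dir: str,
    convert: Callable[[str], str],
) -> Tuple[List[Path], List[Tuple[str, str]]]:
    """Convert each source with `convert` and write <stem>.md into out_dir.

    Hypothetical helper: `convert` maps a source path to Markdown text.
    Returns (written_paths, failures), where failures holds (source, error).
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    written: List[Path] = []
    failures: List[Tuple[str, str]] = []
    for source in sources:
        try:
            markdown = convert(source)
        except Exception as exc:  # one bad file should not abort the batch
            failures.append((source, str(exc)))
            continue
        out_path = target / (Path(source).stem + ".md")
        out_path.write_text(markdown, encoding="utf-8")
        written.append(out_path)
    return written, failures
```

Unlike the shell loop, which writes each `.md` next to its source, this sketch collects output in one directory, so two sources with the same file stem would overwrite each other.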
Python API
from markitdown import MarkItDown
md = MarkItDown()
result = md.convert("document.pdf")
print(result.text_content)
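To save the result from Python rather than print it, one option is a small wrapper like the sketch below. `convert_to_file` and `markitdown_text` are hypothetical names; only `MarkItDown().convert(...)` and `.text_content` come from the snippet above. The library import is deferred so that the helper can also be exercised with a stand-in converter.

```python
from pathlib import Path
from typing import Callable


def markitdown_text(source: str) -> str:
    """Return Markdown for `source` using the calls shown in the snippet above."""
    # Imported lazily so this module loads even if markitdown is not installed.
    from markitdown import MarkItDown

    return MarkItDown().convert(source).text_content


def convert_to_file(
    source: str,
    output: str,
    convert: Callable[[str], str] = markitdown_text,
) -> Path:
    """Convert `source` and write the Markdown to `output` as UTF-8."""
    out_path = Path(output)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(convert(source), encoding="utf-8")
    return out_path
```

With the library installed, `convert_to_file("document.pdf", "document.md")` would mirror the CLI call `markitdown document.pdf -o document.md`.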
Troubleshooting
"markitdown not found"
pip install 'markitdown[all]'
OCR Not Working
# Ubuntu/Debian
sudo apt-get install tesseract-ocr
# macOS
brew install tesseract
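Both items above come down to whether an executable can be found on `PATH`. As a rough diagnostic, and assuming (as this section implies) that OCR depends on the `tesseract` binary, a check could look like the sketch below. `locate_tools` and `report` are hypothetical names; `shutil.which` is from the Python standard library.

```python
import shutil
from typing import Dict, Iterable, Optional


def locate_tools(
    names: Iterable[str] = ("markitdown", "tesseract"),
) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


def report(found: Dict[str, Optional[str]]) -> str:
    """Format one line per tool: its path, or a 'not found' message."""
    lines = []
    for name, path in found.items():
        lines.append(f"{name}: {path}" if path else f"{name}: not found on PATH")
    return "\n".join(lines)
```

Running `print(report(locate_tools()))` lists both tools. If `markitdown` is still reported missing right after a successful `pip install`, a likely cause is that pip installed into a different Python environment or that its scripts directory is not on `PATH`.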
What This Skill Provides
| Component | Source |
|---|---|
| markitdown CLI | Microsoft's pip package |
| markitdown Python API | Microsoft's pip package |
| scripts/batch_convert.py | This skill (utility) |
| Documentation | This skill |
See Also
- USAGE-GUIDE.md - Detailed examples
- reference.md - Full API reference
- Microsoft MarkItDown - Upstream library