SkillHub

markitdown-skill

v1.0.1

OpenClaw agent skill for converting documents to Markdown. Documentation and utilities for Microsoft's MarkItDown library. Supports PDF, Word, PowerPoint, Excel, images (OCR), audio (transcription), HTML, YouTube.

Sourced from ClawHub, Authored by karmanverma

Installation

Install the skill `markitdown-skill` from the SkillHub official store:

npx skills add karmanverma/markitdown-skill

MarkItDown Skill

Documentation and utilities for converting documents to Markdown using Microsoft's MarkItDown library.

Note: This skill provides documentation and a batch script. The actual conversion is done by the markitdown CLI/library installed via pip.

When to Use

Use markitdown for:

  • 📄 Fetching documentation (README, API docs)
  • 🌐 Converting web pages to markdown
  • 📝 Document analysis (PDFs, Word, PowerPoint)
  • 🎬 YouTube transcripts
  • 🖼️ Image text extraction (OCR)
  • 🎤 Audio transcription

Quick Start

# Convert file to markdown
markitdown document.pdf -o output.md

# Convert URL
markitdown https://example.com/docs -o docs.md

Supported Formats

Format          Features
PDF             Text extraction, structure
Word (.docx)    Headings, lists, tables
PowerPoint      Slides, text
Excel           Tables, sheets
Images          OCR + EXIF metadata
Audio           Speech transcription
HTML            Structure preservation
YouTube         Video transcription

Installation

The skill requires Microsoft's markitdown CLI:

pip install 'markitdown[all]'

Or install specific formats only:

pip install 'markitdown[pdf,docx,pptx]'

Common Patterns

Fetch Documentation

markitdown https://github.com/user/repo/blob/main/README.md -o readme.md

Convert PDF

markitdown document.pdf -o document.md

Batch Convert

# Using included script
python ~/.openclaw/skills/markitdown/scripts/batch_convert.py docs/*.pdf -o markdown/ -v

# Or shell loop
for file in docs/*.pdf; do
  markitdown "$file" -o "${file%.pdf}.md"
done
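
The shell loop above writes each `.md` file next to its PDF. As a rough sketch of the same loop in Python that collects output in a separate directory, the following assumes only the `markitdown <input> -o <output>` invocation shown in Quick Start. The helper names `output_path_for` and `batch_convert` are hypothetical and are not taken from the included `batch_convert.py`, whose internals are not documented here.

```python
import subprocess
from pathlib import Path


def output_path_for(src: Path, out_dir: Path) -> Path:
    """Map docs/report.pdf to <out_dir>/report.md (hypothetical helper)."""
    return out_dir / (src.stem + ".md")


def batch_convert(sources, out_dir: Path) -> list:
    """Run the markitdown CLI once per file and return the inputs that failed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for src in map(Path, sources):
        dest = output_path_for(src, out_dir)
        # Same invocation as the shell loop: markitdown <input> -o <output>
        proc = subprocess.run(["markitdown", str(src), "-o", str(dest)])
        if proc.returncode != 0:
            failed.append(src)
    return failed
```

A call such as `batch_convert(Path("docs").glob("*.pdf"), Path("markdown"))` would then return the list of files the CLI could not convert, which is easier to act on than a loop that stops silently on the first error.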

Python API

from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("document.pdf")
print(result.text_content)
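
The snippet above prints the result to stdout. A minimal sketch of saving it to a file instead, with basic failure handling, might look like the following. It relies only on `MarkItDown().convert(...)` and `.text_content` as shown above; `convert_to_file`, `markitdown_converter`, and the `converter` parameter are hypothetical names introduced for illustration. The converter is passed in as a callable so the file-writing logic can be exercised without markitdown installed.

```python
from pathlib import Path
from typing import Callable, Optional


def convert_to_file(src: str, dest: str, converter: Callable[[str], str]) -> Optional[Path]:
    """Convert src with `converter` and write the Markdown to dest.

    Returns the output path, or None if the converter raised an exception.
    """
    try:
        markdown = converter(src)
    except Exception as exc:  # markitdown's specific exception types are not assumed
        print(f"could not convert {src}: {exc}")
        return None
    out = Path(dest)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(markdown, encoding="utf-8")
    return out


def markitdown_converter() -> Callable[[str], str]:
    """Build a converter from the API shown above (requires markitdown)."""
    from markitdown import MarkItDown  # imported lazily so the helper stays optional

    md = MarkItDown()
    return lambda path: md.convert(path).text_content
```

With markitdown installed, usage would be `convert_to_file("document.pdf", "markdown/document.md", markitdown_converter())`.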

Troubleshooting

"markitdown not found"

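The CLI is missing from the active Python environment, or its scripts directory is not on PATH. Install it with:

pip install 'markitdown[all]'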

OCR Not Working

# Ubuntu/Debian
sudo apt-get install tesseract-ocr

# macOS
brew install tesseract
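
Both troubleshooting cases above come down to a missing executable or package. As a quick diagnostic sketch, the following checks whether the `markitdown` and `tesseract` commands are on PATH and whether the `markitdown` pip package is installed. It uses only the Python standard library; the function names are hypothetical, and the default tool list simply mirrors the two commands named in this section.

```python
import shutil
from importlib import metadata


def find_missing_tools(tools=("markitdown", "tesseract")) -> list:
    """Return the names in `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def installed_version(package: str = "markitdown"):
    """Return the installed version of a pip package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

If `find_missing_tools()` returns `["tesseract"]`, for example, the OCR install commands above are the likely fix; if `installed_version()` returns `None`, the pip install step has not been run in the current environment.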

What This Skill Provides

Component                   Source
markitdown CLI              Microsoft's pip package
markitdown Python API       Microsoft's pip package
scripts/batch_convert.py    This skill (utility)
Documentation               This skill

See Also

  • USAGE-GUIDE.md - Detailed examples
  • reference.md - Full API reference
  • Microsoft MarkItDown - Upstream library