pdfmux

@nameet

The smart PDF-to-Markdown router. One command, zero config.

$ pip install pdfmux

Usage

# convert a pdf to markdown
$ pdfmux invoice.pdf
✓ invoice.pdf → invoice.md (2 pages, 100% confidence)

# json output with metadata
$ pdfmux report.pdf -f json

# batch convert
$ pdfmux ./docs/ -o ./output/

# start mcp server for ai agents
$ pdfmux serve
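
To call the same commands from a script, a thin wrapper around the CLI is usually enough. The sketch below only uses the invocations shown above (`pdfmux <file>` and `-f json`). The helper names `build_command` and `convert` are hypothetical, and it assumes the CLI exits with a non-zero status on failure, which is not confirmed here.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, output_format=None):
    """Build the argv list for one pdfmux invocation (hypothetical helper)."""
    cmd = ["pdfmux", str(pdf_path)]
    if output_format is not None:
        # "-f json" is the only format flag shown in the usage section.
        cmd += ["-f", output_format]
    return cmd


def convert(pdf_path, output_format=None):
    """Run pdfmux on a single file and return the completed process."""
    if shutil.which("pdfmux") is None:
        raise RuntimeError("pdfmux is not on PATH; run `pip install pdfmux` first")
    # Assumption: a failed conversion yields a non-zero exit code,
    # which check=True turns into CalledProcessError.
    return subprocess.run(
        build_command(Path(pdf_path), output_format),
        check=True,
        capture_output=True,
        text=True,
    )
```

Calling `convert("report.pdf", "json")` runs the equivalent of `pdfmux report.pdf -f json`.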

How it works

We don't convert PDFs. We route them to whichever tool converts them best.

PDF Type   Extractor      Speed        Cost
Digital    PyMuPDF        0.01s/pg     Free
Tables     Docling        0.3-3s/pg    Free
Scanned    Surya OCR      1-5s/pg      Free
Complex    Gemini Flash   2-5s/pg      ~$0.01
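
The routing idea in the table can be sketched as a small decision function. This is an illustration only, not pdfmux's actual code: the `PageTraits` fields, the `choose_extractor` name, and the priority order between categories are all assumptions, and real detection of tables or scans is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageTraits:
    """Hypothetical summary of what a classifier found in a PDF."""
    has_text_layer: bool   # False for scanned, image-only documents
    has_tables: bool
    complex_layout: bool   # e.g. mixed columns, figures, handwriting


def choose_extractor(traits):
    """Map traits to an extractor name from the table (assumed priority)."""
    if traits.complex_layout:
        return "Gemini Flash"   # paid fallback, slowest
    if not traits.has_text_layer:
        return "Surya OCR"      # scanned pages need OCR
    if traits.has_tables:
        return "Docling"
    return "PyMuPDF"            # plain digital PDFs: fastest, free


print(choose_extractor(PageTraits(has_text_layer=True, has_tables=False, complex_layout=False)))
```

Checking the expensive case first and the cheap, fast extractor last is one possible ordering; the key point is that each document goes to exactly one tool.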

Stats

Digital PDFs: 0.01s/page
Table accuracy: 97.9%
Cost (90% of PDFs): $0
Output format: Markdown

MCP Server

Give your AI agent the ability to read PDFs:

{
  "mcpServers": {
    "pdfmux": {
      "command": "pdfmux",
      "args": ["serve"]
    }
  }
}
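
If your MCP client already has a config file with other servers in it, the entry can be merged in rather than pasted over. A minimal sketch, assuming the client reads a JSON file with a top-level `mcpServers` object. The function name `add_pdfmux_server` is hypothetical, and the config path depends on your client, so it is left as a parameter.

```python
import json
from pathlib import Path

# The server entry exactly as shown in the config above.
PDFMUX_ENTRY = {"command": "pdfmux", "args": ["serve"]}


def add_pdfmux_server(config_path):
    """Add or overwrite the pdfmux entry, keeping other servers intact."""
    path = Path(config_path)
    if path.exists():
        config = json.loads(path.read_text(encoding="utf-8"))
    else:
        config = {}
    config.setdefault("mcpServers", {})["pdfmux"] = PDFMUX_ENTRY
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Restart the client after editing its config so it picks up the new server.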

Optional extractors

$ pip install "pdfmux[tables]"  # Docling
$ pip install "pdfmux[ocr]"     # Surya OCR
$ pip install "pdfmux[llm]"     # Gemini Flash
$ pip install "pdfmux[all]"     # everything