pdf-to-md
approvedby kkbin505
This plugin has not been manually reviewed by Obsidian staff. Convert handwritten PDFs to Markdown with LaTeX formulas using AI (OpenAI GPT or Alibaba Qwen).
In my experience, writing by hand follows the flow of thought more naturally than typing.
This tool converts handwritten notes to Markdown using AI, designed as a seamless Obsidian Plugin.
🎉 Obsidian Plugin
An all-in-one Obsidian plugin that converts handwritten PDFs to Markdown in a single click!
Key Features:
- 📄 Right-click any PDF → "Convert to Markdown"
- 🤖 Support for GPT-4o, GPT-5.4, Alibaba Qwen (千问), and Google Gemini
- 📊 Real-time progress tracking with visual progress bar
- 🔐 Secure API key management (keys are read from environment variables only; the settings page shows a read-only status check)
- ⚙️ Configurable DPI, timeout, retry, and file conflict handling
Plugin Installation
Method 1: Obsidian Plugin Marketplace (Recommended)
- Open Obsidian → Settings → Community Plugins
- Search for "pdf-to-md"
- Click Install and Enable
Method 2: Manual Installation
- Download the latest release from GitHub Releases
- Extract files to your Vault:
<your-vault>/.obsidian/plugins/pdf-to-md/
├── main.js
├── pdf.worker.min.js
└── manifest.json
- Restart Obsidian and enable the plugin
Plugin Quick Start
1️⃣ Configure Environment Variables
Important: pdf-to-md reads API keys from environment variables only. The plugin never writes API keys to disk, which keeps them out of your vault and its plugin settings files.
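To make the read-only check concrete, here is a minimal sketch of how a key status lookup could work. The variable names (`DASHSCOPE_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`) are the ones used in the setup steps in this section; the function name, the provider labels, and the shape of the result are assumptions made for illustration and are not taken from the plugin's source.

```typescript
// Environment variable names come from the setup steps in this README.
// The provider labels and everything else here are hypothetical.
const KEY_VARS = {
  qwen: "DASHSCOPE_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GEMINI_API_KEY",
} as const;

type Provider = keyof typeof KEY_VARS;
type KeyStatus = Record<Provider, boolean>;

// Report which providers have a non-empty key.
// Only booleans are returned, so the key values never leave this function.
function detectKeyStatus(env: Record<string, string | undefined>): KeyStatus {
  const result = {} as KeyStatus;
  for (const provider of Object.keys(KEY_VARS) as Provider[]) {
    const value = env[KEY_VARS[provider]];
    result[provider] = typeof value === "string" && value.trim().length > 0;
  }
  return result;
}

// Example with a fake environment; a desktop plugin would pass process.env.
const keyStatus = detectKeyStatus({ DASHSCOPE_API_KEY: "sk-xxx", OPENAI_API_KEY: "" });
console.log(keyStatus);
```

Because the lookup only reads the environment, a key that was set after Obsidian started will not be seen until the application is fully restarted.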
Get Your API Keys:
- Alibaba Qwen (Recommended): https://dashscope.console.aliyun.com/apiKey
- OpenAI: https://platform.openai.com/api-keys
- Google Gemini: https://aistudio.google.com/
Set Environment Variables:
Windows (PowerShell; the 'User' scope used below does not require Administrator rights):
# Alibaba Qwen
[System.Environment]::SetEnvironmentVariable('DASHSCOPE_API_KEY', 'sk-xxx...', 'User')
# OpenAI
[System.Environment]::SetEnvironmentVariable('OPENAI_API_KEY', 'sk-proj-xxx...', 'User')
# Google Gemini
[System.Environment]::SetEnvironmentVariable('GEMINI_API_KEY', 'AIzaSyxxx...', 'User')
Mac/Linux:
# Edit ~/.bashrc or ~/.zshrc (Mac users use ~/.zprofile), add:
export DASHSCOPE_API_KEY='sk-xxx...'
export OPENAI_API_KEY='sk-proj-xxx...'
export GEMINI_API_KEY='AIzaSyxxx...'
# Save and reload:
source ~/.bashrc # or source ~/.zshrc
⚠️ Restart Obsidian after setting environment variables (complete restart required, not just reload).
2️⃣ Select AI Provider
Open Obsidian Settings → PDF to Markdown:
- Select the AI model from the unified Model dropdown, which lists the supported GPT-4o, GPT-5.4, Qwen, and Gemini models.
3️⃣ Convert PDF
- Find the PDF in the Obsidian file browser
- Right-click → "Convert to Markdown"
- Wait for the conversion to finish (the progress bar shows the status)
- The converted .md file is saved automatically
Example:
Input: my_notes.pdf
Output: my_notes_qwen.md (if using Qwen)
my_notes_gpt-5.4.md (if using GPT 5.4)
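For readers curious about what happens during the conversion step, the sketch below shows one plausible way to package a rendered page for a vision model. It uses the message format of the OpenAI Chat Completions API (a text part plus an `image_url` part carrying a base64 data URL). Whether the plugin builds its requests this way is an assumption; the function name, the prompt text, and the placeholder image data are hypothetical, and other providers may expect a different payload.

```typescript
// Request body shape for an OpenAI-style chat completion with one image.
interface VisionRequest {
  model: string;
  messages: Array<{
    role: "user";
    content: Array<
      | { type: "text"; text: string }
      | { type: "image_url"; image_url: { url: string } }
    >;
  }>;
}

// Build the JSON body for one rendered PDF page (hypothetical helper).
// pageBase64Png is the page image as base64-encoded PNG data.
function buildVisionRequest(model: string, pageBase64Png: string, prompt: string): VisionRequest {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: "data:image/png;base64," + pageBase64Png } },
        ],
      },
    ],
  };
}

// Example: the prompt wording and the image data are placeholders, not the plugin's own.
const pageRequest = buildVisionRequest(
  "gpt-4o",
  "iVBORw0KGgo=",
  "Transcribe this handwritten page to Markdown. Use $...$ and $$...$$ for formulas."
);
console.log(JSON.stringify(pageRequest, null, 2));
```

Sending one page per request keeps each call small, which fits the per-page progress bar and the per-request timeout and retry settings.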

[Screenshots: Qwen output and GPT output]

Supported AI Models
| Provider | Model | Cost/Page | Speed | Quality |
|---|---|---|---|---|
| Alibaba Qwen 🏆 | qwen-vl-max | ¥0.00345 | 15-30s | Excellent |
| OpenAI | gpt-5.4-mini | $0.003 | 5-10s | Excellent+ |
Plugin Settings
| Option | Default | Description |
|---|---|---|
| Model | Qwen VL Max | Select the AI model from the unified list |
| API Key Status | Auto-detect | Shows environment variable status (read-only) |
| PDF Rendering DPI | 200 | Higher DPI = better quality but slower (100-400) |
| API Timeout | 60s | Maximum wait time for API response |
| Max Retries | 3 | Number of retry attempts on failure |
| File Conflict Handling | Model-based naming | How to handle existing output files |
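As a rough illustration of the DPI trade-off, the sketch below clamps the DPI to the documented 100-400 range and converts it to a render scale. PDF page sizes are measured in points (72 per inch), so a common convention is scale = DPI / 72. The defaults mirror the table above; the interface, the function names, and the assumption that the plugin maps DPI to scale this way are illustrative only.

```typescript
// Defaults taken from the settings table; the field names are hypothetical.
interface ConversionSettings {
  model: string;
  dpi: number;
  timeoutSeconds: number;
  maxRetries: number;
}

const DEFAULT_SETTINGS: ConversionSettings = {
  model: "qwen-vl-max",
  dpi: 200,
  timeoutSeconds: 60,
  maxRetries: 3,
};

// Keep the DPI inside the documented 100-400 range.
function clampDpi(dpi: number): number {
  if (!isFinite(dpi)) return DEFAULT_SETTINGS.dpi;
  return Math.min(400, Math.max(100, Math.round(dpi)));
}

// PDF units are points (1/72 inch), so a scale of 1 corresponds to 72 DPI.
function dpiToScale(dpi: number): number {
  return clampDpi(dpi) / 72;
}

// Pixel size of a rendered page, given its size in points.
function renderedSize(widthPt: number, heightPt: number, dpi: number): { width: number; height: number } {
  const scale = dpiToScale(dpi);
  return { width: Math.round(widthPt * scale), height: Math.round(heightPt * scale) };
}

// An A4 page is roughly 595 x 842 points.
console.log(renderedSize(595, 842, DEFAULT_SETTINGS.dpi)); // { width: 1653, height: 2339 }
```

Doubling the DPI quadruples the pixel count per page, which is why higher settings are slower to render and upload.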
File Conflict Strategies:
- Overwrite: Replace existing file (⚠️ loses previous content)
- Skip: Don't generate if file exists
- Add Timestamp: Append timestamp to filename
- Model-based Naming (Recommended): Append model name (e.g., my_notes_qwen.md)
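The four strategies can be sketched as a single path-resolution function. This is a hypothetical reading of the list above, not the plugin's actual code: the function and type names are invented, the timestamp format is a guess, and it is assumed that the Overwrite, Skip, and Add Timestamp strategies only differ once the plain output file already exists.

```typescript
type ConflictStrategy = "overwrite" | "skip" | "timestamp" | "model";

// Returns the output path, or null when the conversion should be skipped.
// `exists` is injected so the logic can be exercised without a real vault.
function resolveOutputPath(
  pdfPath: string,
  strategy: ConflictStrategy,
  modelTag: string,
  exists: (path: string) => boolean,
  now: Date = new Date()
): string | null {
  const base = pdfPath.replace(/\.pdf$/i, "");

  // Model-based naming always appends the model tag, e.g. my_notes_qwen.md.
  if (strategy === "model") return base + "_" + modelTag + ".md";

  const plain = base + ".md";
  if (!exists(plain)) return plain; // no conflict, nothing to resolve

  if (strategy === "overwrite") return plain;
  if (strategy === "skip") return null;

  // Timestamp: YYYYMMDDHHMMSS in UTC (the format is an assumption).
  const stamp = now.toISOString().replace(/\D/g, "").slice(0, 14);
  return base + "_" + stamp + ".md";
}

// Example: the plain output file is treated as already present.
const alwaysExists = (_path: string) => true;
console.log(resolveOutputPath("my_notes.pdf", "model", "qwen", alwaysExists)); // my_notes_qwen.md
console.log(resolveOutputPath("my_notes.pdf", "skip", "qwen", alwaysExists));  // null
```

Model-based naming has the side benefit that outputs from different models sit next to each other, which makes them easy to compare.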
📊 Performance
Real Results
See actual output from different models:
- Qwen Result - Best cost-effectiveness
- OpenAI Result - Highest accuracy
- Original PDF - Input example
📖 Background
While studying control theory, I fell in love with the handwriting experience of the iFlytek Smart Notebook. However, organizing notes in Obsidian proved frustrating: the native OCR was terrible at recognizing mathematical formulas.
I developed this plugin to solve that problem. It leverages powerful Vision Language Models (Qwen-VL, GPT-4o/GPT-5.4, and Gemini) to provide:
- Accurate Mixed Recognition: Seamlessly handles text and complex formulas
- LaTeX Math Formulas: Converts equations into clear Obsidian-renderable LaTeX ($...$ and $$...$$)
- Cost-Effective & Flexible: Choose cheap, fast, or ultra-accurate models inside Obsidian
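Obsidian renders math written between `$...$` (inline) and `$$...$$` (display). Vision models sometimes answer with `\(...\)` and `\[...\]` instead, so a post-processing pass like the one below is one way to normalize the output. This is an illustrative sketch only; the function name is hypothetical, and it is not known whether the plugin performs such a step or relies on prompting alone.

```typescript
// Convert \[...\] to $$...$$ and \(...\) to $...$ (hypothetical helper).
// Replacer functions are used so that "$" is inserted literally.
function normalizeMathDelimiters(markdown: string): string {
  return markdown
    // Display math may span several lines, hence [\s\S].
    .replace(/\\\[([\s\S]+?)\\\]/g, (_match: string, body: string) => "$$" + body.trim() + "$$")
    // Inline math is assumed to stay on one line.
    .replace(/\\\((.+?)\\\)/g, (_match: string, body: string) => "$" + body.trim() + "$");
}

const modelOutput = "Energy \\(E = mc^2\\) and \\[a^2 + b^2 = c^2\\]";
console.log(normalizeMathDelimiters(modelOutput));
// Energy $E = mc^2$ and $$a^2 + b^2 = c^2$$
```

A simple pass like this would also rewrite delimiters inside code blocks, so a real implementation would need to skip those regions.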
🤝 Contributing
Issues and Pull Requests are welcome!
📄 License
MIT License
Enjoy pdf-to-md! If you find it helpful, please consider giving it a Star ⭐ on GitHub!