Knowledge AI
approvedby david4liu
This plugin has not been manually reviewed by Obsidian staff. AI for your vault: grounded RAG chat with citations, hybrid BM25 + vector + summary retrieval, multi-format indexing (MD/PDF/DOCX/XLSX/PPTX), image OCR/Vision, and artifact generation.
Knowledge AI for Obsidian
简体中文 · English
An AI assistant for your Obsidian vault: ask questions and get cited answers grounded in your notes, generate study guides / timelines / mind maps / slide decks from any folder.
Inspired by Google NotebookLM but fully local-first and provider-agnostic — bring your own API key, or run embeddings entirely on-device.
Status: pre-1.0. Public release of a previously private project; expect rough edges. Issues and PRs welcome.
Features
- Grounded RAG chat with inline
[N]citations that jump straight to the source chunk in your vault. - Hybrid retrieval — BM25 (Chinese-tokenized via
segmentWords) + vector embeddings + document-level summary search, merged with Reciprocal Rank Fusion. - Multi-format indexing — Markdown, PDF (text layer), DOCX, XLSX, PPTX. Extraction runs in a Web Worker so the UI stays responsive on large vaults.
- Image understanding — Tesseract OCR (
chi_sim+engby default) and optional LLM vision for image-bearing notes. - Structured artifact generation — summary, study guide, timeline, FAQ, briefing, mind map, and slide deck (PowerPoint export).
- Provider-agnostic — any OpenAI-compatible endpoint (DeepSeek, Moonshot/Kimi, GLM, Qwen, OpenAI, …). Local embeddings via
@xenova/transformers(multilingual-e5-small by default). - Save chat answers as notes — one click turns a cited answer into a permanent vault note.
- Internationalization — Chinese (zh-CN) and English (en) UI; auto-detects Obsidian's language.
Screenshots
Screenshots coming with v0.4.0.
Installation
From Obsidian Community Plugins
Passed Obsidian's automated review; the listing is live at community.obsidian.md. In-app availability follows once the community registry syncs. Once it appears in the in-app browser:
- Open Obsidian → Settings → Community plugins → Browse.
- Search "Knowledge AI" and install.
- Enable the plugin.
Manual install (current method)
- Download
main.js,manifest.json,styles.css, and the twoort-wasm*.wasmfiles from the latest release. - Copy them to
<your-vault>/.obsidian/plugins/notebook-ai/. - Reload Obsidian and enable Knowledge AI under Community plugins.
The plugin is desktop-only (mobile is out of scope due to embedding worker + WASM requirements).
Quick start
- Add a provider — Settings tab → Providers → Add. Paste your OpenAI-compatible base URL and API key. For DeepSeek use
https://api.deepseek.com/v1; for Moonshot usehttps://api.moonshot.cn/v1; for OpenAI usehttps://api.openai.com/v1. - Assign tasks — Settings → Task assignment. Pick a chat model, optionally an embedding model, optionally a summary model. Tasks default to your first provider's default model.
- (Optional) Enable vector retrieval — Settings → Vector retrieval. Toggle on; pick local (default: multilingual-e5-small, downloaded on first use, ~110 MB) or external API embeddings.
- Create a Notebook — Settings → Notebooks → Add, pick a folder. Or right-click any folder in the file explorer → "Create Notebook from this folder".
- Chat — Click the ribbon icon (or open command palette → "Open Notebook AI chat"). Ask questions, follow citations to the source.
- Generate artifacts — Switch to the "Artifacts" tab in the chat view; pick a kind (summary / study guide / timeline / …) → it streams.
See docs/USAGE.md for advanced topics (query expansion, summary backfill, OCR, custom system prompts, multi-folder notebooks).
Configuration tips
- Large vaults (> 2 000 documents) — turn on document-level summaries: open a notebook card → click "Backfill summaries". This precomputes a per-document summary so the retriever can pick the right document first, then dig into chunks. Default concurrency is 1 (kind to rate-limited Chinese providers); raise to 3-4 for DeepSeek/OpenAI.
- Chinese content — leave the default tokenizer alone; it falls back to character n-grams for CJK so BM25 still works without a Chinese segmenter dependency.
- Cost control — vector retrieval is local by default. The only API calls are: chat completions, optional reranking, optional document-summary generation, and optional API embeddings.
Privacy and data
- No telemetry. The plugin never phones home.
- All indexes and chunks live under
<vault>/.obsidian/plugins/knowledge-ai/data/(JSONL files; compacted on plugin reload). - API requests are sent directly to the provider you configure. Your API key is stored in the same plugin folder.
Development
npm install
npm run dev # incremental build (watch)
npm run build # production build
npm test # vitest
npm run type-check # tsc --noEmit
To test the build against a real vault:
node scripts/deploy.mjs <path-to-vault>
See CONTRIBUTING.md for the contribution workflow.
Acknowledgements
- Obsidian — the platform.
@xenova/transformersand ONNX Runtime Web — on-device embeddings.pdfjs-dist,mammoth,xlsx,@xmldom/xmldom— file extraction.MiniSearch— BM25 retrieval.Tesseract.js— OCR.- Google NotebookLM — UX inspiration.
License
MIT — see file for full text.
For plugin developers
Search results and similarity scores are powered by semantic analysis of your plugin's README. If your plugin isn't appearing for searches you'd expect, try updating your README to clearly describe your plugin's purpose, features, and use cases.