Scholia

by shashanyu

Import academic PDFs and precompute context-aware hover glossary explanations.

License: MIT

An Obsidian desktop plugin for importing academic PDFs, preparing them for close reading, and serving cached glossary explanations on hover.

How to use it

The main entry point is Obsidian's Command Palette:

  1. Open Obsidian.
  2. Press Cmd+P on macOS or Ctrl+P on Windows/Linux.
  3. Type part of the action name, such as Import, Glossary, or Explain.
  4. Pick the action you want.

The most important commands are:

  • Import PDF and prepare for reading
  • Convert current PDF to Markdown only
  • Rebuild glossary for current paper
  • Extract terms and explain from current Markdown
  • Highlight key sentences for current paper
  • Explain term now

The normal workflow is:

  1. Convert a digital PDF into Markdown.
  2. Optionally highlight key sentences.
  3. Precompute context-aware glossary entries.
  4. Read the paper in Obsidian with instant hover explanations from cache.

This plugin is designed for PDFs with a text layer, such as papers in philosophy, logic, and adjacent humanities fields. Scanned documents that depend on OCR are out of scope for the current MVP.

What Scholia does

  • Imports a PDF into a per-paper folder inside your vault.
  • Converts the PDF to Markdown with a configurable backend.
  • Writes import diagnostics and warnings into _source/.
  • Optionally highlights key sentences after import.
  • Discovers likely technical terms and explains them in background batches.
  • Stores glossary entries as normal Markdown files under _glossary/.
  • Shows cached explanations on hover in the editor.
  • Falls back to an explicit Explain term now action when a term has not been prepared yet.
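
The background batching mentioned above could look roughly like the sketch below. It is an illustration only, not Scholia's implementation: planBatches and batchSize are hypothetical names, and maxTerms merely mirrors the Max precomputed terms setting.

```typescript
// Hypothetical sketch: cap the discovered terms, then split them into batches.
// `maxTerms` mirrors the "Max precomputed terms" setting; `batchSize` is an
// assumed parameter, not a documented Scholia option.
function planBatches(terms: string[], maxTerms: number, batchSize: number): string[][] {
  if (maxTerms < 0 || batchSize < 1) {
    throw new RangeError("maxTerms must be >= 0 and batchSize must be >= 1");
  }
  // Drop duplicates while keeping first-seen order, then apply the cap.
  const unique = Array.from(new Set(terms));
  const capped = unique.slice(0, maxTerms);

  // Slice the capped list into consecutive groups of at most `batchSize` terms.
  const batches: string[][] = [];
  for (let i = 0; i < capped.length; i += batchSize) {
    batches.push(capped.slice(i, i + batchSize));
  }
  return batches;
}
```

Each inner array would then be sent to the configured provider as one request, so a long paper does not block on a single large call.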

Reading model

Scholia is intentionally cache-first:

  • Hover does not call the LLM live.
  • Term discovery and explanation happen after import.
  • Glossary notes stay in your vault as inspectable Markdown artifacts.
  • Definitions are grounded in the paper's local usage, not just generic dictionary glosses.
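
The cache-first rule above can be sketched as a lookup that never reaches the LLM. This is a minimal illustration under assumptions, not the plugin's actual code: GlossaryCache, lookupHover, and the case-insensitive matching rule are all hypothetical.

```typescript
// Hypothetical sketch of a cache-first hover lookup; names are illustrative only.
type HoverResult =
  | { kind: "cached"; explanation: string }
  | { kind: "missing"; hint: string };

// Assumption: terms are matched case-insensitively after trimming whitespace.
function normalizeTerm(term: string): string {
  return term.trim().toLowerCase();
}

class GlossaryCache {
  private entries = new Map<string, string>();

  // Filled by background preprocessing or by an explicit "Explain term now" run.
  set(term: string, explanation: string): void {
    this.entries.set(normalizeTerm(term), explanation);
  }

  get(term: string): string | undefined {
    return this.entries.get(normalizeTerm(term));
  }
}

// Hover only reads from the cache; it never triggers a network request.
function lookupHover(cache: GlossaryCache, term: string): HoverResult {
  const explanation = cache.get(term);
  if (explanation !== undefined) {
    return { kind: "cached", explanation };
  }
  return { kind: "missing", hint: "Run 'Explain term now' to prepare this term." };
}
```

Returning an explicit "missing" result, rather than fetching on demand, keeps hover latency predictable and makes every LLM call a deliberate user action.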

The default pipeline is:

PDF import -> key sentence highlighting (optional) -> glossary preprocessing -> hover from cache

PDF import

Scholia uses Paper2MDViaLLM as the normal PDF-to-Markdown path. Older Scholar-MD and Marker paths are kept internal for comparison and debugging, but they are not part of the normal settings flow.

Vault output layout

Importing a paper creates a folder like this:

my-paper/
  my-paper.pdf
  my-paper.md
  _source/
    import-quality.json
    import-warnings.md
    key-sentences.json
    paper2mdviallm.md
  _glossary/
    _status.md
    term-a.md
    term-b.md

Possible backend-specific artifacts include:

  • _source/paper2mdviallm.md
  • _source/scholar-md.md
  • _source/scholar-md.diagnostics.json

These files are meant to be inspectable. If import quality is rated medium or high risk, check _source/import-warnings.md before trusting formulas or symbols.
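
For scripts that post-process an imported paper, the layout above can be expressed as a small path helper. This is a sketch under assumptions: paperPaths is a hypothetical function, the paper folder is assumed to sit at the vault root, and _glossary is assumed to be the default value of the Glossary folder name setting.

```typescript
// Hypothetical helper that mirrors the folder layout shown above.
interface PaperPaths {
  pdf: string;
  note: string;
  importQuality: string;
  importWarnings: string;
  keySentences: string;
  glossaryStatus: string;
}

// Paths are vault-relative and use forward slashes.
// Assumption: the paper folder lives at the vault root.
function paperPaths(paperName: string, glossaryFolder = "_glossary"): PaperPaths {
  const source = `${paperName}/_source`;
  const glossary = `${paperName}/${glossaryFolder}`;
  return {
    pdf: `${paperName}/${paperName}.pdf`,
    note: `${paperName}/${paperName}.md`,
    importQuality: `${source}/import-quality.json`,
    importWarnings: `${source}/import-warnings.md`,
    keySentences: `${source}/key-sentences.json`,
    glossaryStatus: `${glossary}/_status.md`,
  };
}
```

Calling paperPaths("my-paper") yields the same file names as the example tree, which makes it easy to locate import-warnings.md before reading.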

Requirements

  • Obsidian Desktop >= 1.5.0
  • A local filesystem vault
  • Node.js, only if you are building Scholia from source
  • Miniconda or Anaconda for the default PDF converter
  • At least one API key:
    • OpenAI for GPT models
    • Anthropic for Claude models

Notes:

  • The plugin is desktop-only.
  • Paper2MDViaLLM needs an API key because conversion itself is LLM-backed.
  • Glossary generation (which produces the explanations shown on hover), key-sentence selection, and Explain term now also use the configured provider.
  • API keys are currently stored in the plugin's Obsidian data.json so the plugin can stay compatible with Obsidian 1.5.0+. If minAppVersion is raised later, migrating the keys to SecretStorage would be the cleaner approach.

Disclosures

  • Scholia makes network requests to OpenAI and/or Anthropic when you run import, glossary generation, key-sentence selection, or Explain term now.
  • SEP enrichment, when enabled, also fetches public Stanford Encyclopedia of Philosophy pages.
  • The plugin reads PDFs and Markdown files from your vault and writes derived artifacts such as Markdown notes, _source/*, and _glossary/* back into the same vault.
  • The plugin can execute the configured local paper2mdviallm tool.
  • The plugin does not include telemetry or ads.

Install and configure

This repo is currently set up like a local plugin project rather than a packaged community release.

1. Install the plugin

Put Scholia under your vault's plugin directory:

YOUR_VAULT/.obsidian/plugins/scholia/

On Windows this looks like:

YOUR_VAULT\.obsidian\plugins\scholia\

If you downloaded a packaged build, the folder should contain:

manifest.json
main.js
styles.css

If you downloaded the source repo, place or clone it at the same scholia plugin path, then run:

npm install
npm run build

2. Enable it in Obsidian

  1. Open Obsidian.
  2. Go to Settings -> Community plugins.
  3. Enable Scholia.

3. Install the default PDF converter

In Terminal, PowerShell, or Anaconda Prompt:

conda create -n scholia python=3.11
conda activate scholia
pip install paper2mdviallm

4. Set the paper2mdviallm CLI path

In Obsidian, open Settings -> Community plugins -> Scholia.

Set CLI path for paper2mdviallm to the conda environment folder:

macOS/Linux: /Users/YOUR_NAME/miniconda3/envs/scholia
Windows:     C:\Users\YOUR_NAME\miniconda3\envs\scholia

If you use Anaconda instead of Miniconda, replace miniconda3 with anaconda3.

Scholia will resolve the executable inside that environment:

macOS/Linux: bin/paper2mdviallm
Windows:     Scripts\paper2mdviallm.exe

Use the environment folder path above, not a shell fragment like conda run -n scholia paper2mdviallm.
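
The resolution rule above can be sketched as follows. This is an illustration, not the plugin's source: resolvePaper2mdCli and the Platform type are hypothetical, and the check for a conda run fragment is an assumed convenience.

```typescript
// Hypothetical sketch of resolving the CLI inside a conda environment folder.
type Platform = "win32" | "darwin" | "linux";

function resolvePaper2mdCli(envRoot: string, platform: Platform): string {
  const trimmed = envRoot.trim();
  // Reject shell fragments such as "conda run -n scholia paper2mdviallm".
  if (/^conda\s+run\b/.test(trimmed)) {
    throw new Error("Expected a conda environment folder, not a shell command.");
  }
  // Strip trailing separators so joining does not produce doubled slashes.
  const root = trimmed.replace(/[\\/]+$/, "");
  if (platform === "win32") {
    return `${root}\\Scripts\\paper2mdviallm.exe`;
  }
  return `${root}/bin/paper2mdviallm`;
}
```

Passing the platform explicitly keeps the function easy to test on any machine; a real plugin would more likely read it from the runtime.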

To find your conda environment path:

conda env list

5. Add an API key

In Settings -> Community plugins -> Scholia, enter at least one key:

  • OpenAI key for GPT models
  • Anthropic key for Claude models

The default Markdown generation model is gpt-5.4-mini, so fill in the OpenAI key unless you change that model to a Claude model.

6. Run the first import

  1. Open a PDF in your vault.
  2. Press Cmd+P or Ctrl+P.
  3. Type Scholia or Import.
  4. Choose Import PDF and prepare for reading.
  5. Open the generated Markdown note and hover on prepared terms.

Plugin settings

The settings tab is split into four groups.

Markdown generation

  • CLI path for paper2mdviallm
  • Markdown generation model

Provider choice is inferred from the model name. The plugin injects the global OpenAI or Anthropic API key into the CLI environment. The CLI path should usually be a conda environment root such as /Users/YOUR_NAME/miniconda3/envs/scholia or C:\Users\YOUR_NAME\miniconda3\envs\scholia.
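
One way to picture the inference and key injection is the sketch below. Everything in it is an assumption for illustration: the prefix rule (gpt for OpenAI, claude for Anthropic), the function names, and the environment variable names OPENAI_API_KEY and ANTHROPIC_API_KEY, which follow the official SDK conventions but are not confirmed for paper2mdviallm.

```typescript
// Hypothetical sketch of inferring the provider and building the CLI environment.
type Provider = "openai" | "anthropic";

// Assumption: names starting with "gpt" are OpenAI, names starting with "claude" are Anthropic.
function inferProvider(model: string): Provider {
  const name = model.trim().toLowerCase();
  if (name.startsWith("gpt")) return "openai";
  if (name.startsWith("claude")) return "anthropic";
  throw new Error(`Cannot infer provider from model name: ${model}`);
}

// Assumption: the CLI reads the key from the SDK-style variable for its provider.
function buildCliEnv(
  model: string,
  keys: { openai?: string; anthropic?: string },
): Record<string, string> {
  const provider = inferProvider(model);
  const key = keys[provider];
  if (!key) {
    throw new Error(`Missing ${provider} API key for model ${model}`);
  }
  // Only the key for the inferred provider is exposed to the child process.
  return provider === "openai" ? { OPENAI_API_KEY: key } : { ANTHROPIC_API_KEY: key };
}
```

Failing early on a missing key gives a clearer error than letting the conversion tool fail midway through a paper.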

API keys

  • OpenAI key
  • Anthropic key

These are currently stored in the plugin's Obsidian data.json for compatibility with older supported Obsidian versions.

Reading prep

  • Reading prep provider
  • Reading prep model
  • Auto highlight key sentences after import
  • Key sentence density

These settings control post-import prep: key-sentence selection plus glossary discovery and explanation.

Glossary

  • Max precomputed terms
  • Glossary folder name
  • Glossary explanation length
  • Hover delay

Glossary entries are written as Markdown files inside the paper folder, not hidden plugin storage.
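
Because entries are plain files, a term has to map to a file name such as term-a.md. The sketch below shows one plausible mapping; the slug rule and the names glossaryFileName and glossaryNotePath are assumptions, not the plugin's documented behavior.

```typescript
// Hypothetical sketch: derive a glossary file name from a term.
// Assumed rule: strip diacritics, lowercase, and join word runs with hyphens.
function glossaryFileName(term: string): string {
  const slug = term
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // remove combining diacritical marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything else into a single hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  // Fall back to a generic name if nothing usable is left.
  return `${slug || "term"}.md`;
}

// Vault-relative path of the glossary note inside the paper folder.
function glossaryNotePath(paperFolder: string, term: string, glossaryFolder = "_glossary"): string {
  return `${paperFolder}/${glossaryFolder}/${glossaryFileName(term)}`;
}
```

A deterministic mapping like this lets a hover find the note for a term without keeping a separate index.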

Typical workflow

One-click reading prep

  1. Open a PDF in Obsidian.
  2. Press Cmd+P or Ctrl+P, type Import, and run Import PDF and prepare for reading.
  3. Wait for background preprocessing to finish.
  4. Open the generated Markdown note.
  5. Hover over prepared terms to read cached explanations.

Markdown-only import

  1. Open a PDF.
  2. Press Cmd+P or Ctrl+P, type Convert, and run Convert current PDF to Markdown only.
  3. Inspect the note and _source/import-warnings.md.
  4. Run Extract terms and explain from current Markdown when ready.

Manual term explanation

  1. Select a term in a Markdown note.
  2. Press Cmd+P or Ctrl+P, type Explain, and run Explain term now.
  3. The plugin writes a glossary note and future hovers use the cached result.

Key sentence highlighting

When enabled, import runs key-sentence highlighting before glossary preprocessing.

Details:

  • Highlights are written with Obsidian's native ==...== syntax.
  • Plugin-managed highlight bookkeeping lives in _source/key-sentences.json.
  • Re-runs remove only the highlights previously managed by the plugin.
  • Manual highlights outside that sidecar are left alone.
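
The re-run behavior above can be sketched with a sidecar that records which sentences the plugin wrapped. This is a simplified illustration: the HighlightSidecar shape and the function names are hypothetical, and the real key-sentences.json format is not described in this README.

```typescript
// Hypothetical sidecar shape; the real key-sentences.json format may differ.
interface HighlightSidecar {
  // Sentences the plugin wrapped in ==...== during the previous run.
  managed: string[];
}

interface HighlightRun {
  markdown: string;
  sidecar: HighlightSidecar;
}

// Unwrap only the highlights recorded in the sidecar.
function removeManaged(markdown: string, sidecar: HighlightSidecar): string {
  let text = markdown;
  for (const sentence of sidecar.managed) {
    text = text.split(`==${sentence}==`).join(sentence);
  }
  return text;
}

// Wrap the first occurrence of each sentence, skipping ones already highlighted.
function applyHighlights(markdown: string, sentences: string[]): HighlightRun {
  let text = markdown;
  const managed: string[] = [];
  for (const sentence of sentences) {
    const index = text.indexOf(sentence);
    if (index === -1) continue;
    const end = index + sentence.length;
    const before = text.slice(Math.max(0, index - 2), index);
    const after = text.slice(end, end + 2);
    // An existing (for example manual) highlight is left alone and not recorded.
    if (before === "==" && after === "==") continue;
    text = `${text.slice(0, index)}==${sentence}==${text.slice(end)}`;
    managed.push(sentence);
  }
  return { markdown: text, sidecar: { managed } };
}

// A re-run first removes the previous managed highlights, then applies the new set.
function rerunHighlights(markdown: string, previous: HighlightSidecar, sentences: string[]): HighlightRun {
  return applyHighlights(removeManaged(markdown, previous), sentences);
}
```

Since only sentences listed in the sidecar are unwrapped, a highlight added by hand survives any number of re-runs.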

Development

Install JavaScript dependencies:

npm install

Build the plugin:

npm run build

Run the plugin test suite:

npm test

Run the Python tool tests:

npm run test:paper2mdviallm
npm run test:scholar-md

Run the PDF evaluation harness:

npm run eval:pdf-tools

Useful optional variants:

PDF_EVAL_DOCS=anaf011,logic-of-provability npm run eval:pdf-tools
PDF_EVAL_MINERU=1 PDF_EVAL_DOCS=anaf011 npm run eval:pdf-tools

Repo layout

src/                  Obsidian plugin runtime
prompts/              LLM prompts for term discovery, explanation, and key sentences
tests/                Node-side tests
docs/                 design and pipeline notes
eval/pdf-tools/       PDF backend comparison harness
tools/paper2mdviallm/ LLM-native PDF to Markdown CLI
tools/scholar-md/     lightweight digital PDF to Markdown CLI

Known boundaries

  • The MVP targets digital PDFs with an existing text layer.
  • Formula-heavy papers still need manual scrutiny.
  • Import warnings are advisory, not proof of correctness.
  • Hover is cache-only by design.
  • SEP enrichment is not part of the first-pass import and hover workflow.
