Wasm OCR

by Kuro Latency

Offline OCR using NCNN and WebAssembly.

6 stars · Updated 5mo ago · GPL-3.0

Obsidian Wasm OCR

A high-performance, offline, client-side OCR (Optical Character Recognition) plugin for Obsidian, powered by NCNN and WebAssembly.

Features

  • Offline Privacy: All processing happens locally on your device. No data leaves your machine.
  • High Performance: Runs the PP-OCRv5 model in WebAssembly on Web Workers, keeping inference off the UI thread so the interface stays responsive.
  • Multi-Image Support: Batch analyze images from your clipboard, current note, or selection.
  • Interactive Analysis:
    • Visual Selection: Drag to select text blocks, or draw marquees to select multiple regions.
    • Zoom & Pan: Inspect high-resolution scans with ease.
    • Smart Merge: Automatically merges broken lines into coherent paragraphs.

Usage Guide

Getting Started

  1. Paste & Auto-OCR: Paste an image into your note. If "Auto-OCR on Paste" is enabled in settings, the image is analyzed automatically.
  2. Context Menu: Right-click any image in your note and select "Analyze Image".
  3. Command Palette: Use commands like "Analyze Current Image" or "Analyze All Images in Note".

Interactive Viewer

When the analysis panel opens, you can interact with the image and results:

Action | Interaction
Select Text | Left Click + Drag over text.
Select Block | Double Click on a text box.
Multi-Select | Ctrl/Cmd + Drag to draw a selection box (Add). Shift + Drag to draw a selection box (Remove). Ctrl/Cmd + Click to toggle individual boxes.
Pan Image | Middle Mouse Button Drag.
Zoom | Mouse Wheel (Zooms to canvas center).
Navigate | Left/Right Arrow Keys, or drag/click the top progress bar.
Reset View | Click the Reset icon (top-right of image).
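
The multi-select gestures can be read as three operations on a set of selected box IDs. The following is a minimal sketch of that idea, not the plugin's actual code: the `Rect` and `Box` types and the `applyMarquee` and `toggle` helpers are hypothetical names introduced for illustration, and it assumes a marquee "hits" a box whenever the two rectangles overlap.

```typescript
// Hypothetical types: the plugin's real data model is not documented here.
interface Rect { x: number; y: number; w: number; h: number; }
interface Box extends Rect { id: number; }
type MarqueeMode = "add" | "remove";

// True when two axis-aligned rectangles overlap.
function intersects(a: Rect, b: Rect): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w &&
         a.y < b.y + b.h && b.y < a.y + a.h;
}

// Ctrl/Cmd + Drag ("add") selects every box the marquee touches;
// Shift + Drag ("remove") deselects them. Returns a new set.
function applyMarquee(
  selected: Set<number>,
  boxes: Box[],
  marquee: Rect,
  mode: MarqueeMode
): Set<number> {
  const next = new Set(selected);
  for (const box of boxes) {
    if (!intersects(box, marquee)) continue;
    if (mode === "add") next.add(box.id);
    else next.delete(box.id);
  }
  return next;
}

// Ctrl/Cmd + Click flips the selection state of a single box.
function toggle(selected: Set<number>, id: number): Set<number> {
  const next = new Set(selected);
  if (!next.delete(id)) next.add(id);
  return next;
}
```

Returning a fresh `Set` on every gesture, rather than mutating in place, tends to fit state stores such as Zustand, where updates are detected by reference change.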

Text Management

  • Copy: Click the "Copy" button at the bottom. The copied text follows your current selection and sort order.
  • Merge Lines: Toggle the "Merge Lines" switch to combine broken text into paragraphs automatically.
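
To illustrate what line merging might involve, here is a minimal sketch under stated assumptions. The plugin's actual heuristic is not described in this README; the `OcrLine` shape, the `mergeLines` function, and the `gapFactor` parameter are all hypothetical. The sketch assumes lines belong to the same paragraph when the vertical gap between them is small relative to the line height.

```typescript
// Hypothetical shape of one recognized line (pixel coordinates).
interface OcrLine { text: string; top: number; height: number; }

// Join lines into paragraphs. A vertical gap larger than
// `gapFactor` times the previous line's height starts a new paragraph.
function mergeLines(lines: OcrLine[], gapFactor: number = 0.8): string[] {
  // Sort a copy top-to-bottom so the input array is not mutated.
  const sorted = lines.slice().sort((a, b) => a.top - b.top);
  const paragraphs: string[] = [];
  let prev: OcrLine | undefined;
  for (const line of sorted) {
    const gap = prev ? line.top - (prev.top + prev.height) : 0;
    if (prev && gap <= gapFactor * prev.height) {
      // Close enough to the previous line: continue the paragraph.
      paragraphs[paragraphs.length - 1] += " " + line.text;
    } else {
      paragraphs.push(line.text);
    }
    prev = line;
  }
  return paragraphs;
}
```

Joining with a space suits space-delimited scripts; text in scripts written without spaces, such as Chinese or Japanese, would need the lines concatenated directly instead.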

Settings

  • Text Confidence Threshold: Adjust the slider (0.0 to 1.0) to filter out low-confidence text detections.
  • Auto-OCR: Enable/Disable automatic analysis on paste.
  • Auto-Open Panel: Choose whether the side panel opens automatically when analysis starts.
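
As a rough illustration of how these settings could be modeled, the sketch below shows a settings object and a confidence filter. The field names, the `Detection` type, and the choice to keep detections whose score is greater than or equal to the threshold are assumptions made for the example, not the plugin's documented behavior.

```typescript
// Hypothetical settings shape mirroring the three options above.
interface OcrSettings {
  confidenceThreshold: number; // 0.0 to 1.0
  autoOcrOnPaste: boolean;
  autoOpenPanel: boolean;
}

// Hypothetical detection result: recognized text plus a score in [0, 1].
interface Detection { text: string; confidence: number; }

// Keep a value inside the slider's 0.0 to 1.0 range.
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

// Drop detections scoring below the configured threshold.
function filterByConfidence(
  detections: Detection[],
  settings: OcrSettings
): Detection[] {
  const threshold = clamp01(settings.confidenceThreshold);
  return detections.filter((d) => d.confidence >= threshold);
}
```

With a threshold of 0.5, a detection scored 0.5 is kept and one scored 0.2 is dropped; raising the threshold trades recall for fewer spurious fragments.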

Technical Details

  • Core: C++ with NCNN inference engine.
  • Model: PP-OCRv5 (Quantized).
  • Frontend: React + Zustand.

License

GPLv3
