Snap2Note

unlisted

by bkrtura

Snap a photo with the native camera, let Gemini Vision recognize the text, and insert formatted Markdown at your cursor. The image is never saved to the vault.

1 starsUpdated 1mo agoMIT
View on GitHub

Snap2Note

One tap → capture → AI recognizes → Markdown inserted. Image never saved.

A mobile-first Obsidian plugin powered by Google Gemini Vision. Snap a photo with the native camera, let Gemini Vision recognize the text, and insert formatted Markdown at your cursor. The image is never saved to the vault.


Features

  • One-tap capture from the ribbon or the editor header button (top-right of each note)
  • Native camera on iOS and Android; file picker fallback on desktop
  • Gemini Vision AI — fast, accurate, multilingual, generous free tier
  • Markdown output — preserves headings, lists, tables; formulas in LaTeX
  • Ephemeral images — photo is never written to your vault
  • Customizable prompt — switch between OCR / translation / summarization
  • Model choice — Gemini 2.5 Flash (default), 1.5 Flash, 2.5 Pro
  • Client-side compression — resize to 1600 px before upload to save bandwidth
  • Configurable insert position — at cursor, new line, or end of document
  • Timeout control — abort slow requests on mobile networks
  • Debug log — timing info, never logs API key or image data

Installation

Manual (all platforms)

  1. Download the latest release: main.js, manifest.json, styles.css
  2. Create the folder <your-vault>/.obsidian/plugins/snap2note/
  3. Copy the three files into that folder
  4. Open Obsidian → Settings → Community plugins → reload → enable Snap2Note

Build from source

git clone https://github.com/sasyouhei/snap2note.git
cd snap2note
npm install
npm run build   # outputs main.js

Setup

  1. Get a free API key at Google AI Studio
  2. Obsidian → Settings → Snap2Note → paste the key
  3. Optionally adjust model, prompt, or compression

Usage

MethodAction
Camera icon (ribbon)Tap in the left sidebar — best for mobile
Camera icon (editor header)Tap in the top-right of any open note
Command paletteSnap2Note: Capture and insert
Command paletteSnap2Note: Pick image from file and insert

Flow: open a note → position your cursor → tap the icon → take photo → text is inserted automatically.


Settings

SettingDefaultDescription
Gemini API KeyRequired. From Google AI Studio.
Modelgemini-2.5-flashAlso: gemini-1.5-flash, gemini-2.5-pro
System PromptOCR → MarkdownFully customizable
Compress imageOnMax edge 1600 px, JPEG 0.8
Insert positionCursorCursor / new line / end of doc
Request timeout60 sAbort after N seconds
Debug logOffConsole output (never logs key or image)

Default system prompt:

You are a document recognition expert. Extract all text from the image and format it as Markdown. Use LaTeX for formulas. Output only the recognized content.


Privacy

  • Your image is sent only to Google's Gemini API (generativelanguage.googleapis.com) over HTTPS.
  • The image is held in memory as base64, sent, then the reference is immediately released. It is never written to your Obsidian vault.
  • Your API key is stored in the plugin's data.json (standard Obsidian local storage, not synced to Obsidian's servers).
  • No analytics, no telemetry, no third-party services beyond Gemini.

Troubleshooting

SymptomFix
Plugin not in list after installReload plugins (Settings → Community plugins → ⟳) or restart Obsidian
Camera icon missing on iPhoneSwipe from left edge, or tap ··· in the editor header
"Invalid API key"Re-paste the key from Google AI Studio (starts with AIza)
"Request timed out"Increase timeout in settings or check network
"Rate limit reached"Free tier — wait ~1 min and retry
"Image too large"Enable compression; or shoot at lower resolution
"No text recognized"Check lighting/focus; adjust system prompt
Camera won't open on mobileCheck Obsidian has camera permission in iOS/Android system settings

Development

npm run dev    # watch mode — auto-rebuilds on save
npm run build  # production build → main.js

Stack: TypeScript + esbuild + Obsidian API + Gemini REST API

src/
├── main.ts       Plugin entry — ribbon, commands, editor header action, main flow
├── capture.ts    Hidden <input> to trigger native camera / file picker
├── gemini.ts     Gemini Vision REST client with timeout and error mapping
├── settings.ts   Settings tab UI (7 settings)
├── types.ts      Shared types and GeminiError class
└── utils.ts      Image compression, base64 encoding, debug helpers

License

MIT — free to use, modify, and distribute.

For plugin developers

Search results and similarity scores are powered by semantic analysis of your plugin's README. If your plugin isn't appearing for searches you'd expect, try updating your README to clearly describe your plugin's purpose, features, and use cases.