Vision Insights
by Patrick Metzdorf
AI-powered image analysis for your notes with contextual insights
Obsidian Vision Insights
AI-powered image analysis for Obsidian with contextual insights using OpenAI's Vision API.
Overview
Vision Insights is an Obsidian plugin that brings AI-powered image analysis directly into your note-taking workflow. Right-click any image link in your notes to get insights, summaries, text extraction, and detailed analysis from OpenAI's vision models.
Features
10 Specialized Analysis Actions
- Smart Summary - Focused 2-3 sentence summaries capturing core messages and key takeaways
- Extract Key Facts - Organized bulleted lists of specific data points, metrics, and actionable items
- Generate Description - Comprehensive visual descriptions for accessibility and archival purposes
- Identify Text (OCR) - Complete text extraction with preserved formatting and structure
- Analyze Structure - Detailed breakdown of organizational patterns and information architecture
- Quick Insights - 4-6 actionable insights that go beyond surface-level observations
- Analyze Data Visualization - Specialized analysis for charts, graphs, and data visualizations
- Extract Meeting Participants - List all visible meeting participants, names, avatars, and roles from screenshots
- Analyze Meeting Content - Comprehensive analysis of meeting screenshots, shared content, context, and action items
- Custom Vision Prompt - Provide your own prompt; output is guaranteed Obsidian Markdown
Smart Integration
- Context Menu Integration - Right-click any image for instant analysis
- Universal Image Support - Works with ![[image.png]] embeds, standard Markdown image links such as ![alt](image.png), and <img> syntax
- External Image Support - Analyze both vault images and external URLs
- Multiple Insertion Modes - Insert results at cursor, as blockquotes, callouts, new notes, daily notes, above/below the image, or replacing the image with a callout
Performance & Efficiency
- Intelligent Caching - Avoid repeat API calls with configurable cache duration and LRU cap
- Rate Limiting - Built-in request throttling to avoid hitting API rate limits
- Model Selection - Choose between GPT-5 Mini, GPT-5 Nano, and GPT-5.2
- Batch Processing - Analyze all images in the current note and aggregate results
- Optional Downscaling - Reduce large image sizes before upload to save cost/time
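The caching behaviour described above (a configurable duration plus an LRU cap) can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual cache-manager.ts; the class name `ResultCache` and the parameters `maxEntries` and `ttlMs` are hypothetical.

```typescript
// Hypothetical sketch of a cache with a time-to-live and an LRU cap.
// None of these names come from the plugin's source.
class ResultCache<V> {
  // A Map keeps insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;

  constructor(maxEntries: number, ttlMs: number) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert so this entry becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    // Evict least recently used entries until the cap is respected.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A cache key would presumably combine the image identity with the chosen action and model, so that a Smart Summary and an OCR pass on the same image do not overwrite each other.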
Installation
This plugin is currently installed manually (not in the Community Plugin store yet).
Easiest: Install via BRAT (recommended)
- In Obsidian, install BRAT (Beta Reviewers Auto-update Tester) from Community Plugins
- Open BRAT settings and click Add Beta plugin
- Paste this repository URL: https://github.com/batjko/obsidian-vision-insights
- Choose vision-insights, then enable it in Community Plugins
This gives users one-click updates from new GitHub releases.
Manual Installation (zip package)
- Download vision-insights-<version>.zip from the latest release
- Unzip it
- Copy manifest.json, main.js, and the optional styles.css into: VaultFolder/.obsidian/plugins/vision-insights/
- Reload Obsidian and enable Vision Insights in Community Plugins
- Add your OpenAI API key in plugin settings
Manual Installation (direct files)
If you prefer not to use the zip package, download these release assets directly:
- manifest.json
- main.js
- styles.css (if present)
Copy them into VaultFolder/.obsidian/plugins/vision-insights/ and reload Obsidian.
Updating to a New Version
For manual installs, replace files in VaultFolder/.obsidian/plugins/vision-insights/ with the latest release assets and reload Obsidian.
Development Installation
# Clone the repository
git clone https://github.com/batjko/obsidian-vision-insights.git
cd obsidian-vision-insights
# Install dependencies
npm install
# Build for development
npm run dev
# Build for production
npm run build
Setup
1. Get OpenAI API Key
- Visit OpenAI Platform
- Create an account or sign in
- Navigate to API Keys section
- Generate a new API key
2. Configure Plugin
- Open Obsidian Settings
- Go to Vision Insights in the plugin settings
- Enter your OpenAI API key
- Click Test Connection to verify
- Customize your preferred settings
Usage
Basic Usage
- Right-click on any image link in your notes
- Select an analysis action from the Vision Insights menu
- Wait for analysis (usually 3-10 seconds)
- Review results in the popup modal
- Insert content using your preferred format
Supported Image Formats
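- Vault Images: ![[image.png]] or a standard Markdown image link such as ![alt](image.png)
- External URLs: Markdown image links that point to an http(s) URL
- HTML Images: <img src="image.png" alt="description">
- File Types: PNG, JPG, JPEG, GIF, WebP, BMP, TIFF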
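To make the three supported syntaxes concrete, here is a rough sketch of how image references could be located in a note. The regular expressions, the `ImageRef` type, and `findImageRefs` are assumptions for illustration; the plugin's image-handler.ts may work quite differently.

```typescript
// Illustrative only: these patterns are assumptions, not the plugin's parser.
type ImageRef = {
  syntax: "wikilink" | "markdown" | "html";
  target: string;
  external: boolean;
};

// File types taken from the list above.
const SUPPORTED_EXTENSIONS = ["png", "jpg", "jpeg", "gif", "webp", "bmp", "tiff"];

function findImageRefs(note: string): ImageRef[] {
  const patterns: Array<[ImageRef["syntax"], RegExp]> = [
    // ![[image.png]] or ![[image.png|300]]
    ["wikilink", /!\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g],
    // ![alt](image.png) or ![alt](https://host/image.png "title")
    ["markdown", /!\[[^\]]*\]\(([^)\s]+)[^)]*\)/g],
    // <img src="image.png" alt="description">
    ["html", /<img\b[^>]*\bsrc=["']([^"']+)["'][^>]*>/gi],
  ];
  const refs: ImageRef[] = [];
  for (const [syntax, pattern] of patterns) {
    for (const match of note.matchAll(pattern)) {
      const target = match[1].trim();
      // Ignore query strings and fragments when reading the extension.
      const extension = target.split(/[?#]/)[0].split(".").pop()?.toLowerCase() ?? "";
      if (!SUPPORTED_EXTENSIONS.includes(extension)) continue;
      refs.push({ syntax, target, external: /^https?:\/\//i.test(target) });
    }
  }
  return refs;
}
```

Results are grouped by syntax rather than by position in the note, and URLs without a file extension are skipped; a real implementation would likely handle both cases more carefully.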
Insertion Modes
- At Cursor Position - Insert directly where your cursor is located
- As Blockquote - Wrap content in > blockquote formatting
- As Callout - Create Obsidian callout blocks with analysis results
- Create New Note - Generate a dedicated note for the analysis
- Append to Daily Note - Add to today's daily note
- Insert Above/Below Image - Place content adjacent to the image
- Replace Image with Callout - Swap the image for a formatted callout
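As an illustration of how the text-based modes differ, the sketch below formats one analysis result as plain text, a blockquote, or an Obsidian callout. The function name `formatResult`, the mode names, and the `info` callout type are hypothetical choices, not taken from the plugin.

```typescript
// Hypothetical formatter for three of the insertion modes listed above.
type TextMode = "cursor" | "blockquote" | "callout";

function formatResult(result: string, mode: TextMode, title = "Vision Insights"): string {
  const lines = result.trim().split("\n");
  // Prefix every line with ">" so multi-paragraph results stay in one block.
  const quoted = lines.map((line) => (line ? `> ${line}` : ">"));
  switch (mode) {
    case "cursor":
      return lines.join("\n");
    case "blockquote":
      return quoted.join("\n");
    case "callout":
      // Obsidian callouts are blockquotes whose first line is "> [!type] Title".
      return [`> [!info] ${title}`, ...quoted].join("\n");
  }
}
```

The note-level modes (new note, daily note, above/below the image) would reuse the same formatted text and differ only in where it is written.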
Configuration Options
API Settings
- OpenAI API Key - Your OpenAI API key for vision analysis
- Preferred Model - Choose between GPT-5 Mini (recommended), GPT-5 Nano, and GPT-5.2
- Test Connection - Verify API key validity
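For orientation, this is roughly what a vision request body looks like when an image is sent through OpenAI's Chat Completions endpoint. Whether the plugin uses this endpoint is an assumption, as are the function name `buildVisionRequest` and the model identifier used in the usage note; the sketch only builds the payload and sends nothing.

```typescript
// Sketch under assumptions: the README does not say which endpoint the plugin calls.
type ImageDetail = "low" | "high" | "auto";

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string; detail: ImageDetail } };

interface VisionRequest {
  model: string;
  messages: Array<{ role: "user"; content: ContentPart[] }>;
}

function buildVisionRequest(
  model: string,
  prompt: string,
  imageUrl: string, // an https URL, or a data: URL for a local vault image
  detail: ImageDetail = "auto",
): VisionRequest {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageUrl, detail } },
        ],
      },
    ],
  };
}
```

For example, `buildVisionRequest("gpt-5-mini", "Summarize this image in 2-3 sentences.", "https://example.com/chart.png")` yields a body that could be posted as JSON with the API key in the `Authorization: Bearer` header.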
Analysis Actions
Enable/disable specific analysis types:
- Smart Summary
- Extract Key Facts
- Generate Description
- Identify Text (OCR)
- Analyze Structure
- Quick Insights
- Analyze Data Visualization
Performance Settings
- Caching - Cache results to avoid repeat API calls (configurable duration: 1-168 hours), with a cap on the number of cached entries
- Rate Limiting - Minimum delay between requests (100-2000ms)
- Default Insertion Mode - Choose how results are inserted by default
- Downscaling - Toggle downscaling and set the maximum dimension for local images
- Per-Action Overrides - Model, temperature, image detail, and insertion mode per action
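The documented ranges (1-168 hours for caching, 100-2000 ms between requests) suggest validation along these lines. The field names and the fallback defaults of 24 hours and 500 ms are assumptions made for this sketch, not values read from the plugin.

```typescript
// Hypothetical settings shape; only the ranges come from the README.
interface PerformanceSettings {
  cacheHours: number; // allowed range: 1-168
  minDelayMs: number; // allowed range: 100-2000
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Fill in missing values (assumed defaults) and force both into range.
function normalizeSettings(input: Partial<PerformanceSettings>): PerformanceSettings {
  return {
    cacheHours: clamp(Math.round(input.cacheHours ?? 24), 1, 168),
    minDelayMs: clamp(Math.round(input.minDelayMs ?? 500), 100, 2000),
  };
}

// Milliseconds to wait before the next request may start.
function delayBeforeNextRequest(lastRequestAt: number, now: number, minDelayMs: number): number {
  return Math.max(0, lastRequestAt + minDelayMs - now);
}
```

With a 500 ms minimum delay, a request attempted 200 ms after the previous one would wait another 300 ms, while one attempted a second later would start immediately.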
Examples
Smart Summary
This chart shows quarterly revenue growth from Q1 2023 to Q4 2023, with a notable 23% increase in Q4. The data indicates strong momentum in the enterprise segment, particularly in SaaS subscriptions which grew 45% year-over-year.
Extract Key Facts
• Q4 2023 revenue: $2.4M (23% increase)
• SaaS subscriptions: 45% YoY growth
• Enterprise segment: 67% of total revenue
• Customer acquisition cost: $1,200 (down 15%)
• Monthly recurring revenue: $890K
Quick Insights
• The sharp Q4 uptick suggests successful holiday marketing campaigns
• Enterprise focus is paying off with higher-value, stickier customers
• Declining CAC indicates improving marketing efficiency and brand recognition
• SaaS growth outpacing overall revenue suggests successful product-market fit
• Strong momentum entering 2024 with recurring revenue foundation
Development
Project Structure
obsidian-vision-insights/
├── src/
│   ├── types.ts           # TypeScript interfaces and types
│   ├── settings.ts        # Plugin settings UI and management
│   ├── image-handler.ts   # Image detection and processing
│   ├── openai-client.ts   # OpenAI API integration
│   ├── results-modal.ts   # Results display modal
│   ├── cache-manager.ts   # Caching system
│   └── utils.ts           # Utility functions
├── main.ts                # Main plugin entry point
├── manifest.json          # Plugin metadata
├── package.json           # Dependencies and scripts
└── esbuild.config.mjs     # Build configuration
Key Technologies
- TypeScript - Type-safe development
- OpenAI API v5 - Vision analysis capabilities
- Obsidian API - Native plugin integration
- esbuild - Fast compilation and bundling
Build Scripts
npm run dev # Development build with watch mode
npm run build # Production build
npm run release # Create release package
Release Process
This plugin uses semantic versioning. To create a new release:
Interactive Release
npm run release
This will prompt you to select the type of version bump (patch, minor, or major).
Direct Release
npm run release:patch # For bug fixes (1.0.0 → 1.0.1)
npm run release:minor # For new features (1.0.0 → 1.1.0)
npm run release:major # For breaking changes (1.0.0 → 2.0.0)
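The three bump types follow the usual semantic versioning rules, which can be sketched as below. The function `bumpVersion` is a hypothetical illustration and not the logic in scripts/release.mjs; it handles plain MAJOR.MINOR.PATCH strings only.

```typescript
// Hypothetical helper showing what each bump type does to a version string.
type Bump = "patch" | "minor" | "major";

function bumpVersion(version: string, bump: Bump): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`Not a plain MAJOR.MINOR.PATCH version: ${version}`);
  }
  const [major, minor, patch] = match.slice(1).map(Number);
  switch (bump) {
    case "major":
      return `${major + 1}.0.0`; // breaking changes reset minor and patch
    case "minor":
      return `${major}.${minor + 1}.0`; // new features reset patch
    case "patch":
      return `${major}.${minor}.${patch + 1}`; // bug fixes
  }
}
```

Pre-release or build suffixes such as `1.0.0-beta.1` are rejected here on purpose to keep the sketch short.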
The release script will:
- Bump the version in package.json, manifest.json, and versions.json
- Build the plugin
- Create installable assets (main.js, manifest.json, optional styles.css, and vision-insights-<version>.zip)
- Commit and tag the changes
- Push to GitHub
- Create/update a GitHub release with those assets
Pre-flight checks:
- Requires a clean working tree (no uncommitted changes)
- Fails if local branch is behind remote
- Requires authenticated GitHub CLI (gh auth status)
Manual Release
If you need to release the current version without bumping:
node scripts/release.mjs skip
Contributing
- The usual: Fork and submit a pull request
- Try to stick to the existing conventions.
- KISS
Troubleshooting
Common Issues
Plugin won't load
- Check Obsidian version compatibility (minimum 0.15.0)
- Verify all files are in the correct plugin directory
- Check the developer console (Ctrl+Shift+I, or Cmd+Option+I on macOS) for error messages
API key errors
- Ensure API key is correctly entered without extra spaces
- Verify API key has sufficient credits and permissions
- Test connection using the built-in test button
Image detection fails
- Check image file exists and is accessible
- Verify image format is supported
- Try different image syntax formats
Slow analysis
- Check internet connection
- Consider using a faster model (GPT-5 Nano)
- Adjust rate limiting settings if needed
Privacy & Security
- API Key Storage - Stored locally in Obsidian's plugin data
- Image Processing - Images are sent to OpenAI for analysis and are handled under OpenAI's privacy policy
- No Telemetry - Plugin doesn't collect or transmit usage data
Acknowledgments
- Built for the Obsidian knowledge management platform
- Powered by OpenAI's Vision API
- Inspired by the Obsidian community's innovation in knowledge tools