STT & LLM

pending

by kevinraymond

Local speech-to-text transcription and LLM-powered text operations (summarize, auto-tag, custom prompts).

Updated 3mo agoMITDiscovered via Obsidian Unofficial Plugins

View on GitHub

STT & LLM

Local speech-to-text transcription and LLM-powered text operations for Obsidian.

Features

Speech-to-Text Recording: Transcribe voice to text directly into your notes using a local Whisper-based server
Summarize Selection: Generate summaries of selected text using your local LLM
Auto-Tag Notes: Automatically generate relevant tags for your notes
Custom Prompts: Send selected text to your LLM with custom instructions

All processing happens locally on your machine—no cloud services required.

Requirements

STT Server (Required for voice transcription)

This plugin requires a companion server for speech-to-text functionality:

stt-server - A local WebSocket server using OpenAI's Whisper model.

Quick Setup with Docker (Recommended)

The easiest way to run the STT server:

# CPU version
docker run -p 8765:8765 ghcr.io/kevinraymond/stt-server:cpu

# GPU version (NVIDIA)
docker run --gpus all -p 8765:8765 ghcr.io/kevinraymond/stt-server:gpu

The server will be available at ws://localhost:8765.

Manual Setup

If you prefer not to use Docker:

Prerequisites: Python 3.10+, uv, ffmpeg

git clone https://github.com/kevinraymond/stt-server.git
cd stt-server
uv sync
uv run obsidian-stt-server --auto

See the stt-server repository for advanced configuration.

LLM API (Optional, for text operations)

For summarization, auto-tagging, and custom prompts, you need a local LLM server with an OpenAI-compatible API:

LM Studio
Ollama
llama.cpp server
Any other OpenAI-compatible endpoint

Installation

From Community Plugins (Recommended)

Open Obsidian Settings
Go to Community Plugins and disable Safe Mode
Click Browse and search for "STT & LLM"
Install and enable the plugin

Manual Installation

Download main.js, manifest.json, and styles.css from the latest release
Create a folder obsidian-stt-llm in your vault's .obsidian/plugins/ directory
Copy the downloaded files into this folder
Reload Obsidian and enable the plugin in Settings → Community Plugins

Configuration

Open Settings → STT & LLM to configure:

Speech-to-Text

Server URL: WebSocket URL for the STT server (default: ws://localhost:8765)

LLM Settings

API URL: Your LLM server endpoint (default: http://localhost:11434)
Model: The model to use (e.g., llama3.2, mistral)
Summarization Prompt: Template for summarization requests
Tagging Prompt: Template for auto-tag generation
Custom Prompt Default: Default prompt for custom operations

Usage

Voice Recording

Click the microphone icon in the ribbon, or use the command "Toggle STT recording"
Speak into your microphone
Click again to stop—the transcription will be inserted at your cursor

LLM Operations

Once LLM is configured, additional controls appear:

Summarize: Select text → right-click → "Summarize Selection" (or use command palette)
Auto-Tag: Open a note → right-click → "Generate Tags for Note"
Custom Prompt: Select text → right-click → "Send with Custom Prompt"

Commands

Command	Description
Toggle STT recording	Start/stop voice recording
Summarize selection	Summarize the selected text
Generate tags for current note	Auto-generate tags based on note content
Send selection with custom prompt	Process selection with a custom instruction

Network Usage Disclosure

This plugin makes local network connections only:

STT Server: WebSocket connection to ws://localhost:8765 (configurable) for speech-to-text transcription
LLM API: HTTP requests to http://localhost:11434 (configurable) for text operations

No data is sent to external cloud services. All processing occurs on your local machine.

License

MIT

Support

If you find this plugin useful, consider sponsoring the development.

For plugin developers

Search results and similarity scores are powered by semantic analysis of your plugin's README. If your plugin isn't appearing for searches you'd expect, try updating your README to clearly describe your plugin's purpose, features, and use cases.

STT & LLM

STT & LLM

Features

Requirements

STT Server (Required for voice transcription)

Quick Setup with Docker (Recommended)

Manual Setup

LLM API (Optional, for text operations)

Installation

From Community Plugins (Recommended)

Manual Installation

Configuration

Speech-to-Text

LLM Settings

Usage

Voice Recording

LLM Operations

Commands

Network Usage Disclosure

License

Support

Voice Notes (Local Whisper)

AI Notes

Local Whisper Transcriber

Meeting Scribe

Vocalog

Whisper

Phonolite

Steno

Scribe

Say

For plugin developers