Seshat VTT

pending

by Matthias

Transcribe audio files with multiple STT providers and insert transcripts beneath audio links in your notes.

Updated 1mo ago | MIT | Discovered via Obsidian Unofficial Plugins

seshat-vtt

Transcribe audio files from Obsidian notes using multiple providers and insert transcript text directly under the audio reference.

Supported providers

OpenAI (/v1/audio/transcriptions)
Google Gemini (/v1beta/models/{model}:generateContent)
Groq (/openai/v1/audio/transcriptions)
Deepgram (/v1/listen)
AssemblyAI (/v2/upload + /v2/transcript)
Rev AI (/speechtotext/v1/jobs)
Speechmatics (/v2/jobs)
OpenAI-compatible custom endpoint ({base}/audio/transcriptions)
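
As a rough illustration of how the two templated paths above could be filled in, here is a minimal TypeScript sketch. The helper names `customEndpointUrl` and `geminiPath` are hypothetical and not taken from the plugin's source; only the path templates come from the list above.

```typescript
// Suffix appended to a user-supplied base URL, as listed for the custom endpoint.
const OPENAI_COMPATIBLE_SUFFIX = "/audio/transcriptions";

// Hypothetical helper: join {base} with the documented suffix.
// Trailing slashes on the base are dropped so the result has no "//" in the path.
function customEndpointUrl(base: string): string {
  const trimmed = base.trim().replace(/\/+$/, "");
  return `${trimmed}${OPENAI_COMPATIBLE_SUFFIX}`;
}

// Hypothetical helper: fill the {model} placeholder in the Gemini path.
function geminiPath(model: string): string {
  return `/v1beta/models/${encodeURIComponent(model)}:generateContent`;
}
```

For example, a base of `http://localhost:8080/v1/` would resolve to `http://localhost:8080/v1/audio/transcriptions`. Whether the plugin normalizes trailing slashes this way is an assumption.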

Usage

1. Open plugin settings and select your active provider.
2. Configure only the fields shown for that provider.
3. Open the markdown note you want to process.
4. Click the ribbon audio icon (Transcribe audio in current note).
5. The plugin scans only the currently open markdown note, transcribes audio references in that note, and inserts transcripts below each matching audio reference.

If the current note has no supported audio links, no transcripts are added.
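
The scan-and-insert behavior described above can be sketched roughly as follows. This is not the plugin's actual implementation: the extension list, the link patterns (`![[file.mp3]]` embeds and `[text](file.mp3)` links), and the names `audioPathInLine` and `insertTranscripts` are all assumptions made for illustration. The sketch also does not guard against inserting a transcript twice on repeated runs.

```typescript
// Assumed set of audio extensions; the plugin's real list may differ.
const AUDIO_EXT = "mp3|wav|m4a|ogg|flac|webm";

// Matches ![[file.mp3]] (optionally with |alias) or [text](file.mp3) / ![text](file.mp3).
const AUDIO_LINK = new RegExp(
  `!\\[\\[([^\\]|]+\\.(?:${AUDIO_EXT}))(?:\\|[^\\]]*)?\\]\\]` +
    `|!?\\[[^\\]]*\\]\\(([^)]+\\.(?:${AUDIO_EXT}))\\)`,
  "i"
);

// Returns the audio path referenced on this line, or null if there is none.
function audioPathInLine(line: string): string | null {
  const m = AUDIO_LINK.exec(line);
  if (!m) return null;
  return m[1] || m[2] || null;
}

// Copies the note line by line and adds a transcript under each audio reference.
// transcriptFor stands in for the provider call and is synchronous here for brevity.
function insertTranscripts(
  note: string,
  transcriptFor: (path: string) => string
): string {
  const out: string[] = [];
  for (const line of note.split("\n")) {
    out.push(line);
    const path = audioPathInLine(line);
    if (path !== null) out.push(transcriptFor(path));
  }
  return out.join("\n");
}
```

A note without any matching link passes through unchanged, which mirrors the "no transcripts are added" case.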

Community Submission Disclosures

This plugin sends audio content (and optional prompt/language hints) to the configured third-party transcription provider over the network.
Using this plugin typically requires provider accounts and API keys, and may incur provider charges.
The plugin itself does not include telemetry, ads, or a self-update mechanism.
Provider API keys are stored in the plugin's local Obsidian data file (data.json) on your device. Do not commit that file to Git.

Notes

Request/poll timing is fixed to reasonable defaults in code and is no longer exposed in settings.
Use the Default language and Prompt settings as hints for providers that support them.
Settings now show only options for the currently selected provider.
Dynamic model dropdowns auto-refresh for OpenAI, Gemini, Groq, and OpenAI-compatible providers.
The ribbon action processes only the currently open markdown note.
Repeated references to the same audio file in a run reuse the same transcript API result to avoid duplicate charges.
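
One way the per-run reuse of transcripts might work is a small cache keyed by audio path. The sketch below is an assumption about the approach, not the plugin's code; `Transcriber` and `withRunCache` are hypothetical names.

```typescript
// A function that sends one audio file to a provider and resolves to its transcript.
type Transcriber = (audioPath: string) => Promise<string>;

// Wraps a transcriber so each distinct path triggers at most one request.
// Create a fresh wrapper per run so results are not reused across runs.
function withRunCache(transcribe: Transcriber): Transcriber {
  const inFlight = new Map<string, Promise<string>>();
  return (audioPath) => {
    let pending = inFlight.get(audioPath);
    if (pending === undefined) {
      pending = transcribe(audioPath);
      inFlight.set(audioPath, pending);
    }
    return pending;
  };
}
```

Caching the promise rather than the resolved text means a second reference that arrives while the first request is still in flight also reuses it. A side effect of this simple design is that a failed request is reused as well, so a real implementation might evict rejected entries.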

Release

GitHub Actions release workflow: .github/workflows/release.yml
Required release assets: main.js, manifest.json, and styles.css
Tag name must exactly match manifest.json.version (no v prefix)
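
The tag rule above could be checked before pushing a tag with something like the following sketch. `tagMatchesManifest` is a hypothetical helper and is not part of the release workflow.

```typescript
// Returns true only when the tag equals manifest.json's version string exactly,
// so "v1.2.3" does not match a manifest version of "1.2.3".
function tagMatchesManifest(tag: string, manifestJson: string): boolean {
  const manifest = JSON.parse(manifestJson) as { version?: unknown };
  return typeof manifest.version === "string" && tag === manifest.version;
}
```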

