Seshat VTT


by Matthias

Transcribe audio files with multiple STT providers and insert transcripts beneath audio links in your notes.

Updated 1mo ago. License: MIT. Discovered via Obsidian Unofficial Plugins.

seshat-vtt

Transcribe audio files from Obsidian notes using multiple providers and insert transcript text directly under the audio reference.

Supported providers

  • OpenAI (/v1/audio/transcriptions)
  • Google Gemini (/v1beta/models/{model}:generateContent)
  • Groq (/openai/v1/audio/transcriptions)
  • Deepgram (/v1/listen)
  • AssemblyAI (/v2/upload + /v2/transcript)
  • Rev AI (/speechtotext/v1/jobs)
  • Speechmatics (/v2/jobs)
  • OpenAI-compatible custom endpoint ({base}/audio/transcriptions)
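
As a rough illustration of how the endpoint paths listed above might be organized, the sketch below collects them in a lookup table and fills in the two templated paths. The names `ENDPOINT_PATHS`, `geminiPath`, and `customEndpoint` are hypothetical and are not taken from the plugin's source. Host names are left out because the list above gives paths only.

```typescript
// Paths copied from the provider list above. Everything else here
// (type names, function names, table layout) is an assumption.
type BuiltInProvider =
  | "openai" | "gemini" | "groq" | "deepgram"
  | "assemblyai" | "revai" | "speechmatics";

const GEMINI_TEMPLATE = "/v1beta/models/{model}:generateContent";

// Hypothetical lookup table: one or more request paths per provider.
const ENDPOINT_PATHS: Record<BuiltInProvider, string[]> = {
  openai: ["/v1/audio/transcriptions"],
  gemini: [GEMINI_TEMPLATE],
  groq: ["/openai/v1/audio/transcriptions"],
  deepgram: ["/v1/listen"],
  assemblyai: ["/v2/upload", "/v2/transcript"], // upload first, then request a transcript
  revai: ["/speechtotext/v1/jobs"],
  speechmatics: ["/v2/jobs"],
};

// Substitute the selected model into the Gemini path template.
function geminiPath(model: string): string {
  return GEMINI_TEMPLATE.replace("{model}", encodeURIComponent(model));
}

// Build {base}/audio/transcriptions for an OpenAI-compatible endpoint,
// tolerating trailing slashes on the configured base URL.
function customEndpoint(base: string): string {
  return base.replace(/\/+$/, "") + "/audio/transcriptions";
}
```

For example, `customEndpoint("https://example.com/v1/")` yields `https://example.com/v1/audio/transcriptions`; the host here is a placeholder.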

Usage

  1. Open plugin settings and select your active provider.
  2. Configure only the fields shown for that provider.
  3. Open the markdown note you want to process.
  4. Click the ribbon audio icon (Transcribe audio in current note).
  5. The plugin scans only that note, transcribes each audio reference it finds, and inserts each transcript directly below the matching reference.

If the current note has no supported audio links, no transcripts are added.
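
The scan-and-insert behavior described in the steps above could look roughly like the following sketch. It is not the plugin's actual code: the link patterns, the list of audio extensions, and the names `findAudioTargets` and `insertTranscripts` are all assumptions made for illustration, and the transcripts are passed in as a plain map rather than fetched from a provider.

```typescript
// Assumed set of audio extensions; the plugin's real list may differ.
const AUDIO_EXT = /\.(mp3|wav|m4a|ogg|flac|webm)$/i;

// Matches Obsidian embeds such as ![[memo.mp3]] and Markdown links such as
// [memo](memo.mp3). Group 1 or group 2 holds the link target.
const LINK = /!?\[\[([^\]|#]+)[^\]]*\]\]|!?\[[^\]]*\]\(([^)\s]+)\)/;

// Return every audio link target found on one line of the note.
function findAudioTargets(line: string): string[] {
  const targets: string[] = [];
  const re = new RegExp(LINK.source, "g"); // fresh regex, so lastIndex starts at 0
  let m: RegExpExecArray | null;
  while ((m = re.exec(line)) !== null) {
    const target = (m[1] || m[2] || "").trim();
    if (AUDIO_EXT.test(target)) targets.push(target);
  }
  return targets;
}

// Insert each known transcript on its own line below the line that
// references the audio file. Lines without audio links pass through unchanged.
function insertTranscripts(note: string, transcripts: Map<string, string>): string {
  const out: string[] = [];
  for (const line of note.split("\n")) {
    out.push(line);
    for (const target of findAudioTargets(line)) {
      const text = transcripts.get(target);
      if (text !== undefined) out.push(text);
    }
  }
  return out.join("\n");
}
```

A note with no matching audio links comes back unchanged, which mirrors the behavior stated above.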

Community Submission Disclosures

  • This plugin sends audio content (and optional prompt/language hints) to the configured third-party transcription provider over the network.
  • Using this plugin typically requires provider accounts and API keys, and may incur provider charges.
  • The plugin itself does not include telemetry, ads, or a self-update mechanism.
  • Provider API keys are stored in the plugin's local Obsidian data file (data.json) on your device. Do not commit that file to Git.

Notes

  • Request and poll timing uses fixed defaults set in code and is not exposed in settings.
  • Use the Default language and Prompt settings as hints for providers that support them.
  • Settings now show only options for the currently selected provider.
  • Dynamic model dropdowns auto-refresh for OpenAI, Gemini, Groq, and OpenAI-compatible providers.
  • The ribbon action processes only the currently open markdown note.
  • Repeated references to the same audio file within one run reuse a single transcription result, which avoids duplicate charges.
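
One way the per-run reuse of transcription results could work is a small cache keyed by audio path. This is a minimal sketch under that assumption, not the plugin's implementation. `Transcriber` and `withRunCache` are hypothetical names.

```typescript
// Hypothetical signature for a function that sends one audio file to a provider.
type Transcriber = (audioPath: string) => Promise<string>;

// Wrap a transcriber so each audio path is sent at most once per run.
// The cache stores the promise itself, so a second reference made while the
// first request is still in flight also reuses it.
function withRunCache(transcribe: Transcriber): Transcriber {
  const cache = new Map<string, Promise<string>>();
  return (audioPath: string): Promise<string> => {
    let pending = cache.get(audioPath);
    if (pending === undefined) {
      pending = transcribe(audioPath);
      cache.set(audioPath, pending);
    }
    return pending;
  };
}
```

Creating a fresh wrapper for each run keeps the cache scoped to that run. Note that in this sketch a failed request is cached too, so a retry policy would need extra handling.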

Release

  • GitHub Actions release workflow: .github/workflows/release.yml
  • Required release assets: main.js, manifest.json, and styles.css
  • Tag name must exactly match manifest.json.version (no v prefix)
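
The tag rule above can be checked before publishing with a few lines of code. The sketch below is an assumption about how such a check might be written and is not taken from release.yml. `checkReleaseTag` is a hypothetical helper that accepts either a bare tag or a Git ref of the form `refs/tags/<tag>`.

```typescript
// Returns null when the tag matches manifest.json's version exactly,
// otherwise a short description of the mismatch.
function checkReleaseTag(ref: string, manifestJson: string): string | null {
  // Accept "refs/tags/1.2.3" as well as "1.2.3".
  const tag = ref.replace(/^refs\/tags\//, "");
  const version = (JSON.parse(manifestJson) as { version?: unknown }).version;
  if (typeof version !== "string") return "manifest.json has no version field";
  if (tag === version) return null;
  if (tag === "v" + version) return `tag "${tag}" has a v prefix; use "${version}"`;
  return `tag "${tag}" does not match manifest version "${version}"`;
}
```

With a manifest version of `1.2.3`, the tag `1.2.3` passes and `v1.2.3` is rejected.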
