Semantic Search

approved

by bbawj

Semantic search for files using OpenAI's text embeddings.

148 stars | 6,753 downloads | Updated 9mo ago | GPL-3.0

Semantic Search for Obsidian

Find what you are looking for based on what you mean. A new file switcher built using WASM and Rust.

Quickstart

  1. Set up the API configuration in the plugin settings
  2. Run the Generate Input command
  3. Run the Generate Embedding command
  4. Run the Open Query Modal command and start semantic searching!

Demo

https://user-images.githubusercontent.com/53790951/231014867-ce37c097-3b22-412a-9b1a-74204b0f167c.mp4

Commands

| Command | Description |
| --- | --- |
| Generate Input | Generates an input CSV based on sections of your notes. Currently, sections are defined as text blocks between headings. The prepared input is saved as input.csv in your root folder. |
| Generate Embedding | Obtains embeddings via the configured API URL (this requires that the Generate Input command was executed successfully). The generated embeddings are saved as embedding.csv in your root folder. |
| Open Query Modal | Semantic search through your notes using the generated embeddings. |
| Recommend links using current selection | Uses the current editor selection as the query input and automatically creates a markdown link to the result you choose. Can also be triggered from the right-click context menu. |
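
The commands above form a small pipeline: split notes into sections, embed each section, then compare a query embedding against the stored ones. The following is a minimal Python sketch of the first and last steps, not the plugin's actual Rust/WASM code. The function names are hypothetical, the heading regex is the ^#{1,6} example from the configuration table, and cosine similarity is an assumed ranking metric (the README does not say which one the plugin uses).

```python
import math
import re

# Assumption: a line starts a new section when it matches this regex.
# This is the heading example given in the README, not a confirmed default.
HEADING = re.compile(r"^#{1,6}")


def split_sections(note_text):
    """Split a note into sections, starting a new one at each heading line."""
    sections, current = [], []
    for line in note_text.splitlines():
        if HEADING.match(line) and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())
    # Drop sections that contain only whitespace.
    return [s for s in sections if s]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sections(query_vec, embedded_sections, top_k=3):
    """Return the top_k (name, score) pairs, most similar first.

    embedded_sections is a list of (section_name, vector) pairs, standing in
    for the rows of embedding.csv (hypothetical in-memory layout).
    """
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in embedded_sections]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In this sketch, the embedding vectors themselves would come from the Generate Embedding step; the toy code only shows how sections and scores relate to each other.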

Configuration

| Setting | Description |
| --- | --- |
| API URL | Any URL endpoint for obtaining embeddings. Make sure the response JSON format is supported (select it under "API response type"). Examples: OpenAI: https://api.openai.com/v1/embeddings. Ollama: http://localhost:11434/api/embed |
| API Key | Optional API key, sent in the Bearer Auth HTTP header. Like all Obsidian plugin settings data, it is stored in data.json, so make sure you do not commit this file to a repository. |
| Model | The model ID, passed in the "model" key of the request. |
| API response type | The type of response JSON expected to be returned from the URL. |
| Section Delimiters | Regex used to determine whether the current line is the start of a new section. Sections are used to group related content together. Defaults to ., meaning every line starts a new section. Example that matches every heading: ^#{1,6} |
| Folders to ignore | Folders to ignore when generating input. Enter folder paths separated by newlines. |
| Number of batches | Number of batches used to call OpenAI's endpoint. If you have a lot of data and are getting invalid request errors, try increasing this number. |
| Enable link recommendation using {{}} | Use {{}} as a way to trigger semantic search suggestions for file linking. |
| Enable cost estimation | Turn on/off input cost estimation, which is based on a flat rate of $0.0004 / 1000 tokens. |
| Enable debug mode logging | Turn on/off more verbose logging. |
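
To make the API URL, API Key, Model, Number of batches, and cost estimation settings more concrete, here is a rough Python sketch of how they could fit together. It is an illustration under assumptions, not the plugin's implementation: the function names are hypothetical, the "input" key follows the request shape of OpenAI's embeddings endpoint (the README only confirms the "model" key), and the token count is taken as a given number because the README does not say how tokens are counted. No network request is made.

```python
import json


def split_into_batches(items, num_batches):
    """Split items into at most num_batches contiguous, near-equal batches."""
    if num_batches < 1:
        raise ValueError("num_batches must be at least 1")
    size = -(-len(items) // num_batches)  # ceiling division
    if size == 0:
        return []
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_request(api_url, model, inputs, api_key=None):
    """Describe one embeddings request as a plain dict (nothing is sent)."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # The API key is optional; when set, it goes into a Bearer auth header.
        headers["Authorization"] = f"Bearer {api_key}"
    # Assumption: the body uses "model" and "input", as OpenAI's endpoint does.
    body = json.dumps({"model": model, "input": inputs})
    return {"url": api_url, "headers": headers, "body": body}


def estimate_cost(total_tokens, usd_per_1000_tokens=0.0004):
    """Flat-rate cost estimate, using the rate quoted in the settings table."""
    return total_tokens / 1000 * usd_per_1000_tokens
```

Increasing the number of batches makes each request body smaller, which is presumably why it helps with invalid request errors on large vaults.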

Installing

From Obsidian v1.0.0, this plugin can be activated from within Obsidian:

  1. Open Settings > Third-party plugins
  2. Make sure Safe mode is off
  3. Click Browse community plugins
  4. Search for "Semantic Search"
  5. Click the "Install" button
  6. Once installed, close the community plugins window
  7. Under the "Installed plugins" section, enable Semantic Search

From GitHub

  1. Download the latest release distribution
  2. Extract the contents of the distribution zip file to your vault's plugins folder: /.obsidian/plugins/ Note: On macOS, the .obsidian folder may be hidden by default.
  3. Reload Obsidian
  4. Open Settings > Third-party plugins, make sure Safe mode is off, and enable "Semantic Search" from there.

Contributing

Contributions are welcome!

Dependencies

  1. Rust and cargo
  2. wasm-pack

Getting Started

  1. Clone the repo
  2. cd into the newly created folder and run yarn install
  3. Run yarn run dev

Note

This plugin is still experimental; use it at your own risk. Testing is done on Windows.

Thanks to Robert's blog post for the idea and inspiration!
