The OCR Extractor plugin turns embedded documents and images into searchable text using optical character recognition. It processes attachments already present in notes and converts the extracted content into clean Markdown, placing it directly below the original file inside a collapsible callout. The raw files stay untouched while their contents become visible, searchable, and indexable by both internal search and system-level tools. The plugin supports batch extraction for a single note, a folder, or the entire vault, with progress shown in the status bar and the option to cancel midway. Text extraction uses Tesseract by default and can optionally use Mistral OCR, which handles complex layouts better than basic OCR engines.

OCR Extractor is an Obsidian plugin that uses OCR to extract text from PDFs, documents, images, etc. embedded in your notes. Different OCR engines (free or paid, local or cloud-based) are available, depending on your needs.

Following Obsidian's philosophy of storing data in an open, future-proof file format, the extracted text is added below the embedded attachment as an expandable callout. This means that the text will be searchable via Obsidian's built-in search, other search plugins, and even your operating system's native file search.
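Because the extracted text lives in the note file itself, any plain-text tool can find it. As a rough illustration, the sketch below searches a vault folder with `grep`. The helper name `find_notes` and the vault path are hypothetical, and the search term is only an example.

```shell
# List Markdown notes under a vault folder that contain a phrase.
# find_notes is a hypothetical helper name, not something the plugin provides.
find_notes() {
  vault="$1"
  phrase="$2"
  # -r: recurse, -i: ignore case, -l: print matching file names only,
  # -F: treat the phrase as a literal string rather than a regex
  grep -rilF --include='*.md' -- "$phrase" "$vault"
}

# Example (the path is an assumption): find_notes "$HOME/Vault" "invoice"
```

A note whose attachment has not been processed yet will not show up, since the text only exists in the file after extraction.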

Installation

Install from Obsidian Community, or go to Settings → Community plugins → Browse and search for "OCR Extractor".

Usage

Click on the ribbon icon (or use the command palette) and select one of the options:

Extract text in active note
Extract text in folder
Extract text in all notes

You can also right-click on notes, folders, or a selection of notes to extract only those files. On mobile, text can only be extracted from the active note.

When extracting from multiple notes, you can track progress in the status bar and click it to cancel (or use the Cancel extraction command).

Additional options are available in the plugin settings, including Auto-extract attachments (automatically extract text when a new attachment is added to a note) and Prefer embedded PDF text (use text already embedded in a PDF instead of extracting with OCR).

OCR engines

Depending on your needs, you can choose which OCR engine to use. Select the OCR engine in the plugin settings and follow the setup steps below.

Tesseract

Tesseract (the default option) is a popular open source OCR engine. It has some limitations: it only supports English text, can only process PDFs and images, and is often less accurate. However, it's completely free and runs locally, so your data is never sent to a third-party provider. This option requires no additional setup.

Mistral OCR

Mistral OCR is a powerful AI model for quickly extracting text from complex documents (including handwriting) and converting it to Markdown. It supports many different languages and file types. This option requires a paid Mistral AI account (at the time of writing, it costs $4 per 1000 pages processed). Attachments are sent to Mistral's OCR service for text extraction (see their privacy policy).

First, you need to create a Mistral AI account. Follow their Quickstart guide:

Create an account
Add payment information
Recommended: Set a monthly spending limit, to avoid any unexpected charges
Create an API key

Then, enter your API key in the plugin settings.
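To help pick a monthly spending limit, the quoted rate can be turned into a quick estimate. This sketch hard-codes the $4 per 1000 pages figure mentioned above, which may have changed since. `estimate_cost` is a hypothetical helper name.

```shell
# Estimate the OCR cost in US dollars for a number of pages.
# RATE_PER_1000 is taken from the price quoted above and may be out of date.
RATE_PER_1000=4

estimate_cost() {
  pages="$1"
  awk -v pages="$pages" -v rate="$RATE_PER_1000" \
    'BEGIN { printf "%.2f\n", pages * rate / 1000 }'
}

# Example: estimate_cost 2500
```

At that rate, 2500 pages come to about $10, so a vault with a few thousand pages of attachments would likely stay within a modest limit.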

OpenAI-compatible API

This option allows you to use any AI model (LLM), either locally (e.g. with Ollama or LM Studio), or via a cloud provider like OpenRouter. This requires more setup, has higher system requirements, and is often slower, but, when used with a local model, it can allow you to get great results without ever sending attachments to a third-party service.

Example (Ollama with glm-ocr):

Download and install Ollama
Download a vision-capable model compatible with your hardware (e.g. glm-ocr):
```
ollama pull glm-ocr
```
In plugin settings, set OCR engine to OpenAI-compatible API
Set Base URL to the Ollama server's URL: http://localhost:11434/v1
Set Model to glm-ocr
Click Test to confirm the connection works
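If the Test button reports a failure, it can help to check the server outside Obsidian first. The sketch below assumes the server exposes the OpenAI-style model-listing route (`/models`) under the base URL, which Ollama's OpenAI-compatible endpoint is expected to do. `models_url` and `check_server` are hypothetical helper names, and this check is independent of whatever the plugin's own test does.

```shell
# Build the model-listing URL from a base URL, tolerating a trailing slash.
models_url() {
  printf '%s/models\n' "${1%/}"
}

# Ask the server for its model list; a non-zero exit status means the
# server could not be reached or returned an HTTP error.
check_server() {
  curl -fsS --max-time 5 "$(models_url "$1")"
}

# Example: check_server http://localhost:11434/v1
```

If `glm-ocr` appears in the returned list, the Base URL and Model values entered in the settings should match what the server offers.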

Custom command

For advanced use cases, you can provide a custom command that will be used to process attachments. For example, a custom command can call a third-party API that isn't supported by the plugin, run Tesseract with a custom configuration, use native OS OCR options, or run a script that does custom preprocessing or postprocessing. Custom commands are not supported on mobile, where the plugin uses Tesseract instead.

Enter your command in the Command field in the plugin settings, using {input} as a placeholder for the path to the input attachment file and {output} as a placeholder for the path where your command should write the Markdown or text file containing the extracted text. To skip an unsupported attachment, don't create the output file.

Click Test to run the command on a sample image and confirm it correctly extracts the text. If the custom command only supports images, enable Convert PDFs to images.
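As a rough sketch of the script use case, the wrapper below runs the Tesseract command-line tool with a non-default language and tidies the result. The file name `ocr-wrapper.sh`, the `OCR_LANG` variable, and the function names are hypothetical, and the script assumes the `tesseract` CLI and the chosen language data are installed. It also assumes the plugin passes each path to the command as a single argument.

```shell
#!/bin/sh
# Hypothetical wrapper script: ocr-wrapper.sh INPUT OUTPUT
# Assumes the tesseract CLI and the chosen language data are installed.
OCR_LANG="${OCR_LANG:-deu}"  # illustrative default (German), not a plugin setting

# Run Tesseract and print the recognized text to standard output.
run_ocr() {
  tesseract "$1" stdout -l "$OCR_LANG"
}

extract() {
  input="$1"
  output="$2"
  case "$input" in
    *.png|*.jpg|*.jpeg|*.tif|*.tiff)
      # Only write the output file if OCR succeeded.
      if text="$(run_ocr "$input")"; then
        # Postprocessing: strip trailing whitespace from every line.
        printf '%s\n' "$text" | sed 's/[[:space:]]*$//' > "$output"
      fi
      ;;
    *)
      # Unsupported type: create no output file so the attachment is skipped.
      ;;
  esac
}

# Run only when called with both paths, so the functions can also be sourced.
if [ "$#" -eq 2 ]; then
  extract "$1" "$2"
fi
```

With the script saved somewhere on disk, the Command setting could then be something like `sh /path/to/ocr-wrapper.sh {input} {output}` (the path is a placeholder). Since this wrapper only handles lower-case image extensions, Convert PDFs to images would need to be enabled for PDFs to be processed.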

Example (native OCR on macOS with macOCR):

macOCR (a third-party tool, review before installing) allows you to easily use Apple's built-in Vision OCR engine (which runs locally and is more accurate than Tesseract).

Install macOCR
Set Command in plugin settings:
```
ocr -i {input} > {output}
```
Enable Convert PDFs to images
Click Test to confirm the command works

Examples

The following examples show text extracted from three sample documents processed with each OCR engine: a study guide (a straightforward typed document with headers and bullet points), an academic paper (a complex multi-column document with equations and charts), and handwritten meeting notes (a photo of handwritten text). Each link opens a note (using Obsidian Publish) showing the original attachment alongside the extracted text, so you can see exactly what the plugin produces:

Tesseract: Study guide · Academic paper · Meeting notes
Mistral OCR: Study guide · Academic paper · Meeting notes
OpenAI-compatible API (GLM-OCR): Study guide · Academic paper · Meeting notes
Custom command (macOCR): Study guide · Academic paper · Meeting notes

Contributing

For details on how to report a bug, share a feature request, or contribute code, see the Contribution Guidelines. To report a security issue, see the Security Policy.

Translations

OCR Extractor is available in several languages. To request a new language (or to suggest an improvement for an existing translation), start a discussion.

License

OCR Extractor is licensed under the MIT License.
