2026-06-02 · Google Drive · AI document search

Build a local Google Drive index for AI agent document search

Most people already have a useful knowledge base: a messy cloud drive full of PDFs, forms, receipts, spreadsheets, and exported documents. The hard part is making that material searchable by an AI assistant without pasting private files into every conversation.

A local Google Drive index fills that middle layer. It syncs document metadata and extracted text into a local search database, then lets the assistant answer with filenames, paths, links, and snippets instead of guessing.

Project link: the implementation is published on GitHub as gregoryhorn/hermes-drive-index, a local Google Drive full-text search index and Hermes Agent plugin powered by SQLite FTS5.

Figure: infographic showing Google Drive files flowing into a local SQLite full-text index and AI agent search results.
A local Drive index turns private files into fast, cited search results for an assistant.

The problem: cloud search is not agent context

Drive's own search is useful for a human clicking around. An AI agent needs a different shape of result: a ranked list with document names, paths, stable links, and short snippets that can be cited in a response. It also needs to know when a match is only metadata, not full text.

Without that boundary, assistants either ask the user to upload files manually or over-claim from filenames. Neither is good enough for operational work.

The architecture

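The tool follows a simple local-first pattern: crawl Drive metadata, extract text locally, fall back to OCR where extraction finds nothing, index the result in SQLite FTS, and answer with citations.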

Figure: flow diagram showing Drive metadata crawl, text extraction, OCR fallback, SQLite FTS indexing, and cited answers.
The useful part is not just indexing; it is returning grounded, citeable results.
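
The indexing and search steps can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the schema used by hermes-drive-index: the table name `docs`, the column layout, and the `match_type` values are assumptions made for the example, and it requires an SQLite build with FTS5 enabled.

```python
import sqlite3


def build_index(conn, docs):
    """Create a hypothetical FTS5 table and load (name, path, link, match_type, body) rows."""
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS docs USING fts5("
        "name, path, link UNINDEXED, match_type UNINDEXED, body)"
    )
    conn.executemany(
        "INSERT INTO docs (name, path, link, match_type, body) VALUES (?, ?, ?, ?, ?)",
        docs,
    )
    conn.commit()


def search(conn, query, limit=5):
    """Return ranked rows: name, path, link, match_type, and a snippet from the body column."""
    sql = (
        "SELECT name, path, link, match_type, "
        "snippet(docs, 4, '[', ']', '...', 12) AS snip "  # column 4 is body
        "FROM docs WHERE docs MATCH ? "
        "ORDER BY bm25(docs) "  # lower bm25 scores rank first
        "LIMIT ?"
    )
    return conn.execute(sql, (query, limit)).fetchall()
```

A metadata-only document is stored with an empty body, so it can still match on its filename or path while the stored `match_type` tells the assistant that no contents were read. The query string is passed to MATCH as FTS5 query syntax, so arbitrary user input would need quoting or escaping first.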

Why local indexing helps

Local indexing keeps search fast and controllable. A scheduled incremental update can compare a manifest of Drive files and avoid downloading unchanged documents. The assistant can search in milliseconds during a conversation while the slower cloud sync happens in the background.
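
The manifest comparison can be sketched as a pure function over two snapshots. This assumes each manifest maps a Drive file ID to a change fingerprint such as the file's `modifiedTime` or `md5Checksum`; the function name and the returned dictionary shape are hypothetical, and the repository's actual manifest format may differ.

```python
def plan_incremental_update(previous, current):
    """Compare two manifests of {file_id: fingerprint} and decide what to sync.

    Returns the IDs to download (new or changed), the IDs to drop from the
    index (no longer present in Drive), and a count of unchanged files.
    """
    to_fetch = sorted(
        file_id
        for file_id, fingerprint in current.items()
        if previous.get(file_id) != fingerprint
    )
    to_delete = sorted(file_id for file_id in previous if file_id not in current)
    unchanged = len(current) - len(to_fetch)
    return {"fetch": to_fetch, "delete": to_delete, "unchanged": unchanged}
```

Only the IDs under `fetch` need a download and re-extraction, which is what keeps a scheduled update cheap when most of the drive has not changed.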

It also creates a clean privacy boundary. The public write-up can explain the architecture, but the actual file names, snippets, Drive links, and personal documents remain private.

What good results should include

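A document-search tool should not just say "found it." Useful results include the filename, the path, a stable link, a short snippet, and a flag showing whether the match came from full text or from metadata only.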
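
One way to make that result shape explicit is a small record type plus a citation formatter. The class name `SearchHit`, its fields, and the two `match_type` values are assumptions for illustration rather than the plugin's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchHit:
    """Hypothetical shape of one search result handed to the assistant."""
    name: str
    path: str
    link: str
    snippet: str
    match_type: str  # "full_text" or "metadata_only"


def format_citation(hit):
    """Render a hit so the assistant cannot over-claim from a filename alone."""
    if hit.match_type == "full_text":
        evidence = f'snippet: "{hit.snippet}"'
    else:
        evidence = "metadata-only match; contents not read"
    return f"{hit.name} ({hit.path}) <{hit.link}> [{evidence}]"
```

Carrying the match type through to the citation keeps the boundary visible in the final answer: a metadata-only hit says so instead of implying that the document was read.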

Copyable Hermes setup prompt

If you want a human or another Hermes instance to set up the repository as a working local Drive-search tool, copy this prompt into Hermes. It includes the practical installation path, privacy boundaries, and verification checkpoints.

Install and configure the Hermes Drive Index tool for this Hermes Agent instance.

Repository:
https://github.com/gregoryhorn/hermes-drive-index

Goal:
Set up a private local Google Drive document search index for Hermes Agent, backed by SQLite FTS5, so Hermes can use:
- drive_index_search
- drive_index_status
- drive_index_update

Important privacy rule:
Do not print, commit, expose, or summarize private Google Drive folder IDs, document links, snippets, tokens, local DB contents, or indexed file contents unless I explicitly approve. Treat the local index as private user data.

Steps:

1. Inspect the current Hermes install. Determine whether Hermes is installed via pipx, source checkout, or another Python environment. Identify the Python environment into which the package should be installed. Do not guess; verify with commands.

2. Clone the repository. Suggested location: ~/src/hermes-drive-index. If the directory already exists, inspect it instead of recloning blindly.

3. Install the package into the same environment Hermes uses. If Hermes is pipx-installed, prefer:

   pipx inject --editable hermes-agent ~/src/hermes-drive-index

   Otherwise, from the repo directory use:

   python -m pip install -e '.[test]'

   Verify the CLI is available:

   hermes-drive-index --help

4. Create the local config file at ~/.hermes/drive_index/config.toml using this shape, replacing placeholders with real local values:

   root_folder_name = "Personal Files"
   root_folder_id = "MY_GOOGLE_DRIVE_FOLDER_ID"
   base_dir = "/home/MY_USER/.hermes/drive_index/personal_files"
   db_path = "/home/MY_USER/.hermes/drive_index/personal_files/index.db"
   ocr_enabled = false
   ocr_image_enabled = false

   If you do not know my Google Drive folder ID, ask me for it. Do not invent one.

5. Enable the Hermes plugin in ~/.hermes/config.yaml by adding drive_index while preserving existing plugins:

   plugins:
     enabled:
       - drive_index

6. Run the health check:

   hermes-drive-index doctor --json

   If authentication or Google Drive access is missing, stop and tell me exactly what is needed.

7. Build the first local index:

   hermes-drive-index build --mode weekly_full --json

   Summarize only safe metrics: files scanned, indexed, full-text indexed, metadata-only indexed, skipped, failed, and duration. Do not paste private filenames, snippets, or Drive links unless I approve.

8. Verify CLI search and status:

   hermes-drive-index status --json
   hermes-drive-index search "test" --json

   For search results, only confirm that results are returned or not returned. Do not expose private snippets or links.

9. Verify Hermes plugin availability. Start a fresh Hermes session, or tell me if the gateway needs to be restarted before tools appear. Confirm these tools are available: drive_index_search, drive_index_status, and drive_index_update.

10. Final report. Include repo path, install method, config path, doctor status, index build metrics, CLI status/search result, Hermes tool availability, and any remaining manual steps.

Optional OCR:
Do not enable OCR by default. If I explicitly ask for OCR later, PDF OCR uses ocrmypdf and image OCR uses tesseract. Image OCR should stay off unless folder-scoped, to avoid indexing personal photos. Missing OCR tools should be treated as non-fatal and should fall back to metadata-only indexing.

Success condition:
Do not claim setup is complete until the package is installed, the config exists, doctor passes or the missing prerequisite is clearly identified, the index has been built or a specific blocker is documented, and the Hermes plugin tools are verified or the required restart/new-session step is clearly stated.
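
Outside the prompt itself, the optional OCR rule can be sketched as a small decision function. It reuses the `ocr_enabled` and `ocr_image_enabled` config keys and the `ocrmypdf` and `tesseract` tool names from the prompt; the function name, the returned labels, and the injectable `which` parameter are assumptions for the example, not the tool's real code.

```python
import shutil


def choose_ocr_fallback(mime_type, ocr_enabled=False, ocr_image_enabled=False,
                        which=shutil.which):
    """Pick a fallback when plain text extraction produced nothing.

    A missing OCR binary is non-fatal: the file is indexed as metadata only.
    """
    if mime_type == "application/pdf" and ocr_enabled and which("ocrmypdf"):
        return "ocr_pdf"
    if mime_type.startswith("image/") and ocr_image_enabled and which("tesseract"):
        return "ocr_image"
    return "metadata_only"
```

Both flags default to off, matching the prompt's rule that OCR is opt-in and that image OCR in particular should stay disabled unless it is deliberately scoped.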

Short design prompt

Design a local document index for an AI assistant. Separate cloud crawl, local extraction, OCR fallback, SQLite full-text search, result ranking, citations, incremental updates, and privacy boundaries. The assistant must distinguish full-text matches from metadata-only matches and must cite filenames, paths, links, and snippets.

Rule of thumb: private documents should be indexed locally, cited carefully, and summarized publicly only after they are sanitized.

Google Drive index · AI document search · SQLite FTS · OCR