Local Document Research with Claude Code and SQLite FTS5

People: David

Idea: A tool for searching a local collection of documents (the example here is a collection from CTAHR) with natural-language queries. No web search is involved, just Claude Code querying a SQLite FTS5 index.
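
To make the idea concrete, here is a minimal sketch of an FTS5 index and a query that tries several term variations at once. It is not the tool's actual code: the tool's schema is not described here, so the table name `docs`, the column names, and the helpers `build_index` and `search` are all hypothetical. It uses Python's built-in sqlite3 module only to stay self-contained, and assumes the local SQLite build includes FTS5.

```python
import sqlite3


def build_index(docs):
    """Create an in-memory FTS5 index from (path, text) pairs.

    Hypothetical schema: `path` is stored but not searched, `body` is searched.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(path UNINDEXED, body)")
    conn.executemany("INSERT INTO docs (path, body) VALUES (?, ?)", docs)
    return conn


def search(conn, terms, limit=5):
    """Return paths matching any of the given terms, best match first."""
    # Quote each term so FTS5 reads it as a literal string or phrase,
    # not as query syntax; embedded double quotes are doubled.
    query = " OR ".join('"' + t.replace('"', '""') + '"' for t in terms)
    rows = conn.execute(
        "SELECT path FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    conn = build_index([
        ("a.pdf", "Taro irrigation schedule for wetland fields"),
        ("b.docx", "Coffee berry borer management notes"),
    ])
    # The agent would supply variations like these for one plain-English question.
    print(search(conn, ["taro", "kalo", "wetland irrigation"]))
```

In this sketch the agent's job is only to choose the list of terms; the ranking comes from FTS5's built-in `rank` column.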

Details:

  • The tool indexes local PDFs, DOCX, PPTX, and XLSX files into a single SQLite database with full-text search
  • You ask a question in plain English and the AI agent figures out the right search terms and variations to run
  • Everything runs with a coding agent on local files - no browser, no API keys for external search services, no fetching of remote data
  • Uses pdf-parse for PDFs, mammoth for DOCX, and officeparser for PPTX/XLSX extraction
  • The index supports incremental updates so you only re-process new or changed files
  • For large collections you can spin up triage sub-agents that process batches of documents in parallel
  • Sub-agents score and tier-rank documents by relevance, then merge findings into a shared state file
  • Extracted text gets cached locally so you don't re-parse the same document twice
  • Scanned PDFs without an OCR text layer are still a weak spot - there is little or no text to extract from them
  • The whole thing is built around a research brief workflow so queries and results stay organized and reproducible
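
The incremental-update point can be sketched as follows. This is one common approach (comparing a content hash per file), not necessarily how the tool decides what changed; the `files` and `docs` tables and the helpers `open_index` and `index_file` are hypothetical, and `extract_text` stands in for whatever extractor handles the file type. Python's sqlite3 is used only for self-containedness, assuming an SQLite build with FTS5.

```python
import hashlib
import sqlite3


def open_index(db_path=":memory:"):
    """Open (or create) the index: a hash table for files plus an FTS5 table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS docs USING fts5(path UNINDEXED, body)"
    )
    return conn


def index_file(conn, path, raw_bytes, extract_text):
    """Index one file. Returns True if it was (re)processed, False if skipped."""
    digest = hashlib.sha256(raw_bytes).hexdigest()
    row = conn.execute("SELECT sha256 FROM files WHERE path = ?", (path,)).fetchone()
    if row is not None and row[0] == digest:
        return False  # content unchanged since the last run: skip extraction

    text = extract_text(raw_bytes)  # the expensive step, run only when needed
    with conn:  # one transaction: replace old text and record the new hash
        conn.execute("DELETE FROM docs WHERE path = ?", (path,))
        conn.execute("INSERT INTO docs (path, body) VALUES (?, ?)", (path, text))
        conn.execute(
            "INSERT OR REPLACE INTO files (path, sha256) VALUES (?, ?)",
            (path, digest),
        )
    return True
```

Calling `index_file` twice with the same bytes runs the extractor once; calling it again with different bytes for the same path replaces the old indexed text.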
