Like zipping a project, but keeping it searchable by meaning. Ask questions, then open only the files that matter.
Install.rag is a binary file format. It stores original file bytes (zstd-compressed), per-file metadata, and a semantic embedding table in a single archive. Files are preserved byte-for-byte and extractable individually.
rag pack walks a directory, extracts text from each file, computes 384-dimensional embeddings using a local ONNX model, compresses everything, and writes a single .rag file.
rag query reads the manifest and embeddings (typically a few hundred KB), classifies the query, and routes to the cheapest retrieval path. Metadata queries never decompress blobs. Semantic queries decompress only the top-K matching files.
dotRAG is like zipping a project, but making it searchable by meaning.
The original files are still inside the archive. Nothing gets flattened into a vague summary.
When you ask a question, dotRAG opens only the relevant files, so the model sees the same code and docs without scanning the whole folder.
179 MB → 440 KB. 171 MB was node_modules, excluded by default. The 440 KB archive includes real semantic embeddings — that overhead is the cost of searchability.
Tested across 4 real-world repositories (Django, rust-analyzer, Rails, Loopsy) with 14 retrieval queries and 6 MCP agent tasks. All results independently verified — not self-reported.
| Category | Queries | Recall@5 | MRR |
|---|---|---|---|
| Architecture | 4 | 100% | 0.75 |
| Symbol lookup (literal) | 4 | 100% | 1.00 |
| Semantic concept | 3 | 67% | 0.33 |
| Metadata | 3 | 0% | 0.00 |
| Overall (14 queries) | 14 | 79% | 0.57 |
| Task | dotRAG | Filesystem | Ratio |
|---|---|---|---|
| File discovery | 1,914 | 68,073 | 35.6× |
| Semantic search | 592 | 5,418 | 9.2× |
| File read | 343 | 332 | 1.0× |
| Conceptual query | 503 | 1,429 | 2.8× |
| Exact symbol lookup | 432 | 525 | 1.2× |
| Multi-step workflow | 593 | 3,052 | 5.1× |
Early beta lost symbol lookup to grep. Literal mode was added — dotRAG now wins or ties all 6 MCP tasks. Average 4.5× fewer tokens.
Semantic mode finds files by meaning via cosine similarity over 384-dim embeddings. Literal mode finds exact symbols with line numbers. Both run locally — no network, no API keys.
Respects .gitignore. Default exclusions for node_modules, .git, build artifacts, vendor dirs. Configurable include/exclude patterns and max file size.
Per-file blob addressing via byte offsets. Metadata queries read only the manifest. Semantic queries decompress only top-K blobs.
Original file bytes stored with zstd compression. Extracted files are identical to the source. SHA-256 content hashes for verification.
Local embedding model (~80 MB, cached after first download). No network, no API keys, no external services at query time.
Exposes six tools over stdio transport: search, read_file, list_files, list_archives, inspect, extract. Compatible with Claude Code, Cursor, Windsurf.
dotRAG implements the Model Context Protocol over stdio. It exposes six tools: dotrag_search, dotrag_read_file, dotrag_list_files, dotrag_list_archives, dotrag_inspect, dotrag_extract. Configuration for each client below.
~/Library/Application Support/Claude/claude_desktop_config.json
~/Library/Application Support/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json
~/.codex/config.toml
.continue/mcpServers/dotrag.json
~/.cursor/mcp.json or .cursor/mcp.json
Settings → Tools → AI Assistant → Model Context Protocol (MCP) → Add
.roo/mcp.json
.vscode/mcp.json
~/.codeium/windsurf/mcp_config.json
~/.config/zed/settings.json
rag must be on $PATH. If not, use the absolute path (e.g. /usr/local/bin/rag) in the command field.
The .rag format is a binary archive with five sections laid out sequentially. All section offsets are stored in a fixed 512-byte header. No external indexes, no sidecar files.
| Decision | Rationale |
|---|---|
| Embeddings in binary float32, not JSON | 3× smaller archives. A 384-dim vector is 1,536 bytes in binary vs ~4,600 bytes in JSON. |
| Manifest separate from embeddings | Metadata queries (file counts, types, summaries) load only the manifest. Fast, no vector math needed. |
| Per-file blob compression | Random access. Decompress one file without touching the rest. Critical for selective query routing. |
| Fixed 512-byte header | All section offsets readable in a single seek. No scanning, no variable-length preamble. |
| Zstd compression throughout | Best ratio-to-speed tradeoff for mixed content. Decompression is ~1 GB/s on modern hardware. |
| Content hash in header | Integrity verification without reading the full archive. Detect corruption or tampering early. |
When you run rag query, the engine classifies your question and picks the cheapest path:
| Route | What loads | When used |
|---|---|---|
| MANIFEST_ONLY | Header + manifest | "How many files?" / "What types?" / "When was this packed?" |
| SELECTIVE_DECOMPRESS | Header + manifest + embeddings + top-K blobs | "How does auth work?" / "Find the database layer" |
| LITERAL_MATCH | Header + all blobs (text scan) | "class PeerRegistry" / exact symbol names / code identifiers |
| FULL_SCAN | Everything | Broad questions that need cross-file context |
The format is open. The archive layout and routing model are documented in this section so the file structure is inspectable without any external service. View full spec on GitHub →
A native desktop app for creating, searching, and extracting .rag archives. Drag-and-drop folders to compress, search across all archives, and browse file contents with syntax highlighting.
Double-click any .rag file to open it directly. The desktop app registers as a file handler during install.
No runtime dependencies. The embedding model (~80 MB ONNX) downloads on first use and caches at ~/.cache/huggingface/.
| rag pack <dir> | Create a .rag archive from a directory |
| rag query <archive> <q> | Semantic search within an archive |
| rag query-all <q> | Search across all registered archives |
| rag inspect <archive> | View metadata and verify integrity |
| rag extract <archive> | Extract files to disk |
| rag list-files <archive> | List files with summaries |
| rag index | Rebuild the machine-wide archive registry |
| rag mcp | Launch MCP server for AI agent integration |
Not a compression tool. A .rag archive is larger than a .tar.zst of the same files because it includes per-file embeddings and metadata. The overhead is the cost of searchability.
Archives are point-in-time snapshots. Source changes make the archive stale. Repack when the directory changes.
Semantic search works best on natural-language queries ("how does auth work?"). For exact identifiers, use --mode literal. Auto mode now falls back to literal when semantic returns no results.
--include-everything exists for forensic use. On real codebases it degrades search quality because generated files dilute the embedding space.