File System Agent — Feature Specification
Branch: feature/chat-agent-file-navigation
Date: 2026-03-09
Status: Draft (v2 — post architecture review)
Owner: GAIA Team
1. Executive Summary
Enhance the GAIA Chat/RAG agent with a production-grade file system agent capable of browsing, searching, indexing, and deeply understanding a user’s PC file system. The goal is to provide Claude Code-caliber file navigation combined with persistent semantic indexing, giving the agent a “mental map” of the user’s machine that improves over time. This spec draws on analysis of 11 leading AI file system agents (Claude Code, Cursor, Copilot, Aider, Open Interpreter, Everything, MCP Filesystem, Anthropic Cowork, Windsurf, Cline, Devin) and maps their best capabilities onto GAIA’s existing infrastructure.
2. Problem Statement
The current GAIA chat agent has solid foundational file tools (search_file, search_directory, read_file, search_file_content) and a mature RAG pipeline (FAISS + embeddings). However, it lacks:
| Gap | Impact |
|---|---|
| No persistent file system index/map | Agent forgets file locations between sessions |
| No structural understanding of the file system | Can’t answer “what projects do I have?” or “where are my tax docs?” |
| No metadata-aware search (size, date, type) | Can’t find “large files modified this week” |
| No file system statistics/dashboard | Can’t summarize disk usage or folder sizes |
| No bookmark/favorite system | User must re-navigate to the same places repeatedly |
| No file preview for rich formats | Limited to text content, no image/media metadata |
| No tree visualization | Hard to understand deep directory structures |
| No incremental index updates | Must re-index everything on changes |
| Limited content extraction | No DOCX, PPTX, XLSX content extraction |
3. Competitive Analysis Summary
3.1 Approaches Compared
| Agent | Strategy | Strengths | Weaknesses |
|---|---|---|---|
| Claude Code | Agentic search (Glob->Grep->Read, no index) | Highest precision, zero setup, fresh results | Token-heavy, no persistence |
| Cursor | Merkle tree + embeddings + AST | Fast incremental re-index, semantic search | Server-side processing, scales poorly >500K LOC |
| Aider | Repo map via tree-sitter AST + graph ranking | Elegant “table of contents” of codebase | Language-limited to tree-sitter support |
| Everything (voidtools) | NTFS MFT + change journal | Indexes millions of files in seconds | Name-only (no content search) |
| OpenAI File Search | Hosted RAG (auto chunk/embed) | 100M file scale, zero setup | Cloud-only, cost per query |
| MCP Filesystem | Structured tools with access control | Standard protocol, security annotations | Basic — no indexing or search intelligence |
| Windsurf | Codemaps + dependency graph + real-time flow | Deep cross-file understanding | Complex, code-focused |
| Open Interpreter | Code generation (Python/shell) | Full OS capability | No structure, high risk |
3.2 Key Insight: Hybrid Agentic + Indexed
The emerging consensus (2026) is that agentic search and RAG indexing serve different needs:
- Agentic search (like Claude Code): Best for precision, freshness, ad-hoc exploration
- Persistent indexing (like Cursor/OpenAI): Best for repeated access, semantic queries, large collections
4. Architecture
4.1 Three-Layer Design
4.2 Component Diagram
4.3 Existing Tool Disposition
Critical decision: The existing FileSearchToolsMixin tools are replaced, not duplicated.
| Existing Tool | Disposition | Rationale |
|---|---|---|
| search_file() | Replaced by find_files() | find_files() subsumes all search_file functionality plus adds index lookup, metadata filters, and smart scoping |
| search_directory() | Replaced by find_files(search_type="name") | Directory search is a subset of unified find |
| read_file() | Enhanced and moved to FileSystemToolsMixin | Add format support for DOCX, XLSX, images; keep same tool name for LLM familiarity |
| search_file_content() | Enhanced and moved to FileSystemToolsMixin | Add context lines, exclusion patterns, result grouping |
FileSearchToolsMixin import is removed from ChatAgent and replaced with FileSystemToolsMixin. The old mixin remains available for other agents that don’t need the full file system feature set.
5. Feature Specification
5.1 Layer 1: File System Navigator
These tools give the agent the ability to browse and understand the file system interactively.
IMPORTANT: Tool Decorator Pattern. GAIA’s @tool decorator (src/gaia/agents/base/tools.py) extracts descriptions from docstrings, not from a description= parameter. All tool code examples below use the correct pattern.
IMPORTANT: Path Validation. Every tool that accepts a path parameter MUST validate it through PathValidator.is_path_allowed() before any filesystem access. This is enforced at the mixin level via a _validate_path() helper.
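The path validation rule can be illustrated with a minimal sketch. The real PathValidator API is not reproduced in this spec, so AllowlistValidator below is a hypothetical stand-in that only mirrors the is_path_allowed() method name, and validate_path() is an illustrative counterpart to the _validate_path() helper.

```python
from pathlib import Path


class AllowlistValidator:
    """Hypothetical stand-in for GAIA's PathValidator (real API not shown here)."""

    def __init__(self, allowed_roots):
        self.allowed_roots = [Path(root).resolve() for root in allowed_roots]

    def is_path_allowed(self, path) -> bool:
        # Resolve first so "..", symlinks, and relative paths cannot escape a root.
        resolved = Path(path).resolve()
        return any(resolved.is_relative_to(root) for root in self.allowed_roots)


def validate_path(validator, path) -> Path:
    """Return the resolved path, or raise before any filesystem access happens."""
    if not validator.is_path_allowed(path):
        raise PermissionError(f"Access denied: {path}")
    return Path(path).resolve()
```

Resolving before comparing is the important design choice here: a check on the raw string would accept a path such as allowed_root/../secret. Path.is_relative_to() requires Python 3.9 or later.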
5.1.1 browse_directory(path, show_hidden, sort_by, filter_type)
Browse a directory with rich metadata display.
5.1.2 tree(path, max_depth, show_sizes, include_pattern, exclude_pattern)
Generate a tree visualization of directory structure.
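As a rough illustration of the output this tool is meant to produce, here is a minimal sketch of a depth-limited tree renderer. The function name render_tree and its default exclusion list are assumptions; show_sizes and the include/exclude pattern parameters from the tool signature are left out for brevity.

```python
from pathlib import Path


def render_tree(root, max_depth=3, exclude=(".git", "node_modules", "__pycache__")):
    """Return an indented text tree of `root`, at most `max_depth` levels deep."""
    lines = [Path(root).name + "/"]

    def walk(directory, prefix, depth):
        if depth > max_depth:
            return
        try:
            # Directories first, then files, each group sorted by name.
            entries = sorted(directory.iterdir(), key=lambda p: (p.is_file(), p.name.lower()))
        except PermissionError:
            lines.append(prefix + "[permission denied]")
            return
        entries = [e for e in entries if e.name not in exclude]
        for index, entry in enumerate(entries):
            last = index == len(entries) - 1
            connector = "└── " if last else "├── "
            suffix = "/" if entry.is_dir() else ""
            lines.append(prefix + connector + entry.name + suffix)
            # Do not descend into symlinked directories, to avoid cycles.
            if entry.is_dir() and not entry.is_symlink():
                walk(entry, prefix + ("    " if last else "│   "), depth + 1)

    walk(Path(root), "", 1)
    return "\n".join(lines)
```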
5.1.3 file_info(path)
Get detailed information about a file or directory.
- Full path (resolved via pathlib.Path)
- File type (detected by mimetypes stdlib, with optional python-magic enhancement)
- Size (human-readable)
- Created / Modified dates
- MIME type
- Encoding detection (for text files, via charset-normalizer)
- Line count (for text files)
- Image dimensions (for images, via PIL if available)
- PDF page count (for PDFs)
- For directories: item count, total size, file type breakdown
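A minimal, stdlib-only sketch of how a few of the fields listed above could be gathered. The function names basic_file_info and human_size are hypothetical; encoding detection, image dimensions, PDF page counts, and directory breakdowns are omitted, and creation time is left out because its meaning differs between platforms.

```python
import mimetypes
from datetime import datetime
from pathlib import Path


def human_size(num_bytes):
    """Format a byte count using 1024-based units, e.g. 1536 becomes '1.5 KB'."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024:
            return f"{size:.0f} {unit}" if unit == "B" else f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TB"


def basic_file_info(path):
    """Collect path, size, modification time, MIME type, and line count for text files."""
    resolved = Path(path).resolve()
    stat = resolved.stat()
    mime, _encoding = mimetypes.guess_type(resolved.name)
    info = {
        "path": str(resolved),
        "is_dir": resolved.is_dir(),
        "size": human_size(stat.st_size),
        "modified": datetime.fromtimestamp(stat.st_mtime).isoformat(timespec="seconds"),
        "mime_type": mime or "application/octet-stream",
    }
    if resolved.is_file() and (mime or "").startswith("text/"):
        # Count lines in binary mode so undecodable bytes cannot raise.
        with resolved.open("rb") as handle:
            info["line_count"] = sum(1 for _ in handle)
    return info
```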
5.1.4 read_file(path, lines, encoding) (ENHANCED existing tool)
Read file contents with smart formatting. Replaces the existing read_file() from FileSearchToolsMixin.
5.1.5 bookmark(action, path, label)
Manage file/directory bookmarks for quick access.
5.1.6 find_files(query, ...) (REPLACES search_file + search_directory)
Unified intelligent file search — the primary search entry point.
With search_type="auto":
- Check persistent index first (instant, if available)
- If query looks like a glob pattern -> use glob matching
- If query looks like a file name -> use name search
- If query contains content-like terms -> use content search
- Apply metadata filters (size, date, type) on results
- Current working directory (deepest)
- Home directory common locations
- All indexed directories
- Full drive search (only if scope="everywhere" explicitly)
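The query classification in the auto strategy could look roughly like the sketch below. The function name classify_query and the specific heuristics (glob characters, a name.ext shape, presence of spaces) are assumptions made for illustration; the index lookup and metadata filtering steps are not shown.

```python
import re

# Characters that suggest the query is a glob pattern.
GLOB_CHARS = re.compile(r"[*?\[\]]")
# A single token that ends in a short extension, such as "report.docx".
FILE_NAME = re.compile(r"[\w.\-]+\.\w{1,8}")


def classify_query(query):
    """Pick a search strategy for a raw query string: 'glob', 'name', or 'content'."""
    query = query.strip()
    if GLOB_CHARS.search(query):
        return "glob"
    if FILE_NAME.fullmatch(query) or " " not in query:
        return "name"
    # Multi-word queries without a file-name shape are treated as content searches.
    return "content"
```

For example, classify_query("*.pdf") returns "glob", classify_query("report.docx") returns "name", and classify_query("quarterly revenue forecast") returns "content".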
5.2 Deferred Tools (Phase 3+)
The following tools are deferred to reduce initial tool count and LLM confusion. They will be added after core tools are stable:
| Tool | Phase | Rationale |
|---|---|---|
| disk_usage(path, depth, top_n) | Phase 3 | Requires index to be performant |
| compare_files(path1, path2) | Phase 4 | Niche use case, diff library needed |
| find_duplicates(directory, method) | Phase 4 | Requires content hashing (opt-in) |
| recent_files(days, file_type, directory) | Phase 3 | Can be done via find_files(date_range="this-week") |
| find_by_metadata(criteria) | Merged | Absorbed into find_files() metadata parameters |
5.3 Layer 3: Persistent Knowledge Base (File System Index)
A SQLite-backed persistent index that gives the agent a lasting understanding of the user’s file system.
5.3.1 Index Schema
- Added schema_version table for migrations
- Added PRAGMA journal_mode=WAL for concurrent read/write
- Removed accessed_at column (privacy-invasive, often inaccurate)
- Made content_hash DEFAULT NULL (opt-in, not computed during quick scan)
- Removed last_accessed from bookmarks (unnecessary)
- Added ON DELETE CASCADE to foreign keys
- Added conditional index on content_hash (only indexes non-null values)
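The schema changes listed above can be illustrated with a small sketch. The table and column names below (directories, files, modified_at, and so on) are hypothetical and far smaller than a real index schema; only schema_version, content_hash, WAL mode, ON DELETE CASCADE, and the conditional index come from this spec.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS schema_version (
    version INTEGER NOT NULL,
    applied_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS directories (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS files (
    id INTEGER PRIMARY KEY,
    directory_id INTEGER NOT NULL REFERENCES directories(id) ON DELETE CASCADE,
    name TEXT NOT NULL,
    size INTEGER NOT NULL,
    modified_at REAL NOT NULL,
    content_hash TEXT DEFAULT NULL  -- opt-in, left NULL by the quick scan
);
-- Partial index: rows without a hash take no space in the index.
CREATE INDEX IF NOT EXISTS idx_files_hash
    ON files(content_hash) WHERE content_hash IS NOT NULL;
"""


def open_index(db_path):
    """Open (or create) the index database with WAL and foreign keys enabled."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    # SQLite does not enforce foreign keys (or cascades) unless this is set per connection.
    conn.execute("PRAGMA foreign_keys=ON")
    conn.executescript(SCHEMA)
    return conn
```

Note that ON DELETE CASCADE only takes effect when PRAGMA foreign_keys=ON is set on each connection, and that WAL mode applies to file-backed databases, not to in-memory ones.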
5.3.2 Schema Migration Strategy
5.3.3 FileSystemIndexService Class
5.3.4 File System Map (LLM Context)
A condensed representation of the file system designed to fit in LLM context. Inspired by Aider’s repo map concept.
5.3.5 Incremental Updates via Existing FileWatcher
Decision: Reuse the existing FileWatcher and FileChangeHandler from src/gaia/utils/file_watcher.py instead of creating a parallel watcher.
5.3.6 Initial Scan Strategy
The initial full scan needs to handle large file systems efficiently:
5.4 Enhanced Document Indexing (RAG Upgrades)
5.4.1 New File Type Support
Extend RAGSDK.index_document() to support:
| Format | Library | Extraction |
|---|---|---|
| DOCX | python-docx | Paragraphs, tables, headers, metadata |
| PPTX | python-pptx | Slide text, notes, speaker notes |
| XLSX | openpyxl | Sheet data, formulas (evaluated), headers |
| HTML | beautifulsoup4 | Visible text, headings, links |
| EPUB | ebooklib | Chapters, metadata |
| RTF | striprtf | Plain text extraction |
5.4.2 Smarter Chunking
Current chunking is line/character-based. Upgrade to content-aware chunking:
- Max chunk size: 800 tokens
- Overlap: 200 tokens (25%)
- Preserve semantic boundaries (paragraph, function, section)
- Include parent context (file name, section header) in each chunk
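The chunking parameters above can be sketched as follows. This is an illustration under stated assumptions, not the planned chunker: tokens are approximated by whitespace-separated words, only paragraph boundaries are honored, and the function name chunk_paragraphs and the bracketed context prefix are hypothetical.

```python
def chunk_paragraphs(text, max_tokens=800, overlap_tokens=200, context=""):
    """Split text into chunks on paragraph boundaries, carrying a token overlap.

    Tokens are approximated by whitespace-separated words (an assumption).
    Requires overlap_tokens < max_tokens.
    """
    paragraphs = [p.split() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], []
    for words in paragraphs:
        if current and len(current) + len(words) > max_tokens:
            chunks.append(current)
            # Carry the tail of the finished chunk into the next one.
            current = current[-overlap_tokens:]
        current = current + words
        # A paragraph that still does not fit is split hard at max_tokens.
        while len(current) > max_tokens:
            chunks.append(current[:max_tokens])
            current = current[max_tokens - overlap_tokens:]
    if current:
        chunks.append(current)
    # Prefix each chunk with its parent context (file name or section header).
    prefix = f"[{context}] " if context else ""
    return [prefix + " ".join(words) for words in chunks]
```

With three 400-word paragraphs, this yields two chunks: the first holds 800 words and the second starts with the last 200 words of the first.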
5.4.3 Incremental Indexing with Metadata Change Detection
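The decisions log describes change detection by comparing size and mtime instead of hashing content. A minimal sketch of that comparison follows; snapshot() and diff_snapshots() are hypothetical helper names, and a real implementation would compare against rows stored in the index, not an in-memory dictionary.

```python
import os


def snapshot(root):
    """Map each file path under root to its (size, mtime_ns) pair."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                stat = os.stat(path)
            except OSError:
                continue  # File vanished or is unreadable; skip it.
            state[path] = (stat.st_size, stat.st_mtime_ns)
    return state


def diff_snapshots(old, new):
    """Return (added, modified, deleted) path lists without reading file contents."""
    added = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return added, modified, deleted
```

Only files reported as added or modified would need to be re-extracted and re-embedded. The trade-off is that an edit preserving both size and mtime goes unnoticed, which is the case opt-in content hashing would cover.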
5.5 Layer 4: Data Scratchpad (SQLite Working Memory)
The critical missing piece for multi-document analysis. Gives the agent a structured working memory where it can accumulate, transform, and query extracted data using SQL.
Key insight: LLMs are bad at math but great at extracting structured data from unstructured text. SQLite is perfect at math but can’t read PDFs. Combining them creates an agent that can process 12 months of credit card statements, extract every transaction, and produce perfect aggregations, something neither can do alone.
5.5.1 Why a Scratchpad?
| Without Scratchpad | With Scratchpad |
|---|---|
| Must fit all data in LLM context window | Process documents one at a time, accumulate in DB |
| LLM does math (inaccurate) | SQL does math (perfect) |
| Can’t handle 1000+ transactions | Handles millions of rows |
| Results lost between sessions | Persistent — pick up where you left off |
| No cross-document analysis | JOIN across tables from different documents |
5.5.2 Architecture
The scratchpad lives in the ~/.gaia/file_index.db database (separate tables from the file system index) or optionally in a per-session temp database.
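To make the accumulate-then-query idea concrete, here is a minimal sketch using an in-memory SQLite database. The table layout, the file names, and the transactions are made-up sample data; amounts are stored as integer cents (an assumption) so that sums stay exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scratch_transactions ("
    "source TEXT, date TEXT, merchant TEXT, category TEXT, amount_cents INTEGER)"
)

# Rows as an LLM might extract them, one statement at a time (made-up data).
january = [
    ("2025-01.pdf", "2025-01-03", "Grocer", "food", 5425),
    ("2025-01.pdf", "2025-01-09", "StreamCo", "subscriptions", 1599),
]
february = [
    ("2025-02.pdf", "2025-02-04", "Grocer", "food", 6110),
    ("2025-02.pdf", "2025-02-09", "StreamCo", "subscriptions", 1599),
]
for batch in (january, february):
    # Each document is processed separately; only the rows persist.
    conn.executemany("INSERT INTO scratch_transactions VALUES (?, ?, ?, ?, ?)", batch)

# The aggregation is done by SQL, not by the language model.
totals = conn.execute(
    "SELECT category, SUM(amount_cents) FROM scratch_transactions "
    "GROUP BY category ORDER BY category"
).fetchall()
```

Each statement only needs to fit in the context window on its own; the database holds the running total across documents.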
5.5.3 Scratchpad Tools
5.5.4 Scratchpad Service
5.5.5 Multi-Document Processing Pipeline
The scratchpad enables a document processing pipeline pattern:
max_steps=10 may be insufficient for processing 12 documents. The config should be increased for data analysis tasks, or the pipeline should batch multiple document extractions per step.
Recommended approach:
- Batch extraction: process 3-4 documents per LLM call (reduce step count)
- Or add a max_steps override for analysis mode: max_steps=30
- Or implement a process_batch() tool that handles the loop internally
5.5.6 Security Constraints
| Constraint | Implementation |
|---|---|
| SQL injection prevention | Table names sanitized; parameterized queries via DatabaseMixin |
| Query restrictions | query_data() only allows SELECT statements |
| Table namespace | All scratchpad tables prefixed with scratch_ to isolate from system tables |
| Size limits | Max 100 tables, max 1M rows per table, max 100MB total scratchpad size |
| No external data | Scratchpad only stores data extracted from user’s own files |
| Cleanup | gaia fs scratchpad clear CLI command to wipe all scratchpad tables |
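Two of the constraints above, table name sanitization with the scratch_ prefix and the SELECT-only rule for query_data(), can be sketched as follows. This is an illustrative approach, not the DatabaseMixin implementation: scratch_table_name() is a hypothetical helper, and the checks are deliberately conservative (for example, a semicolon inside a string literal or a query starting with WITH is rejected).

```python
import re
import sqlite3

TABLE_PREFIX = "scratch_"
# Identifiers only: letters, digits, underscore; must not start with a digit.
VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,50}")


def scratch_table_name(name):
    """Validate a user-supplied table name and add the scratchpad prefix."""
    if not VALID_NAME.fullmatch(name):
        raise ValueError(f"Invalid table name: {name!r}")
    return name if name.startswith(TABLE_PREFIX) else TABLE_PREFIX + name


def query_data(conn, sql, params=()):
    """Run a single SELECT statement; reject everything else."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement or not statement.upper().startswith("SELECT"):
        raise ValueError("Only single SELECT statements are allowed")
    # Values are always passed as parameters, never formatted into the SQL text.
    return conn.execute(statement, params).fetchall()
```

Table names cannot be bound as SQL parameters, which is why they are validated against a strict pattern before being placed into a statement.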
6. Demo Scenarios
6.1 Demo: Personal Finance Analyzer
“Find my credit card statements, analyze a year of spending, and tell me where my money is going.”
Pipeline:
- Processes 12 real PDFs from the user’s actual PC
- Extracts ~600 transactions without hitting context limits
- SQL gives perfect math (no LLM hallucinated numbers)
- Finds hidden subscriptions automatically
- Actionable recommendations personalized to the user
- PDF table extraction (pdfplumber extract_tables()): add to extractors
- max_steps increase to 15-20 for analysis mode
- Optionally: chart rendering in Electron UI (Recharts)
6.2 Demo: Tax Preparation Assistant
“Find all my tax-relevant documents and help me prepare for filing.”
6.3 Demo: Research Paper Literature Review
“I have a bunch of research papers on transformer architectures. Summarize them and find connections.”
6.4 Demo: Contract & Deadline Tracker
“Find all my contracts and leases, extract key dates and obligations.”
6.5 Demo: “Clean Up My PC”
“My PC is getting slow. Find what’s eating space and help me clean up.”
6.6 Demo: “Smart Project Onboarding”
“I just cloned a new project. Help me understand the codebase.”
6.7 What’s Needed for These Demos
| Capability | Status | Needed For |
|---|---|---|
| File system search (find_files) | Spec’d (Phase 1) | All demos |
| Directory browsing (browse_directory, tree) | Spec’d (Phase 1) | All demos |
| PDF text extraction | Existing (RAG) | Finance, Tax, Contracts |
| PDF table extraction (pdfplumber) | GAP — needs pdfplumber extract_tables() | Finance (critical) |
| DOCX/XLSX reading | Spec’d (Phase 4) | Tax, Research |
| SQLite scratchpad (create_table, insert_data, query_data) | Spec’d above (Phase 2) | Finance, Tax, Research, Contracts |
| Multi-document batch processing | Needs max_steps increase or batch tool | Finance, Tax, Research |
| RAG indexing | Existing | Research, Onboarding |
| Disk usage analysis | Spec’d (Phase 3) | Cleanup demo |
| Duplicate detection | Spec’d (Phase 4) | Cleanup demo |
| Chart rendering (Electron UI) | GAP — needs Recharts in frontend | Finance (nice-to-have) |
| Calendar/reminder integration | GAP — not in scope | Contracts (nice-to-have) |
6.8 Priority Demo Implementation Order
| # | Demo | Impact | Effort | Phase Ready |
|---|---|---|---|---|
| 1 | Personal Finance Analyzer | Highest wow factor | Medium | Phase 2 + table extraction |
| 2 | Clean Up My PC | Most universal appeal | Low | Phase 3 |
| 3 | Contract Deadline Tracker | High practical value | Medium | Phase 2 + table extraction |
| 4 | Tax Preparation Assistant | High seasonal value | Medium | Phase 2 + DOCX/XLSX |
| 5 | Smart Project Onboarding | Developer audience | Low | Phase 1 + existing RAG |
| 6 | Research Literature Review | Academic audience | High | Phase 4 |
6.9 Agent Dashboard UI
The Electron/Web UI must provide full visibility into the agent’s state, the file system index, and the scratchpad database. This transforms the chat from a black box into a transparent, inspectable system.
6.9.1 Dashboard Layout
6.9.2 Dashboard Tab (Agent State Overview)
A dedicated Dashboard tab showing the overall agent configuration and state:
6.9.3 Scratchpad Tab (Data Explorer)
A dedicated Scratchpad tab with a full data explorer for inspecting tables:
- Table list: shows all scratchpad tables with row counts
- Data grid: paginated table view with sortable columns
- SQL query bar: run ad-hoc SELECT queries against scratchpad
- Quick stats: auto-computed SUM/AVG/COUNT for numeric columns
- Export: download table data as CSV or JSON
- Schema view: show column names, types, and sample data
6.9.4 File Index Tab
A dedicated File Index tab for browsing the indexed file system:
6.9.5 Inline Scratchpad Preview in Chat
When the agent uses scratchpad tools during a conversation, the chat area shows inline previews of the data, not just text descriptions:
- Agent tool results include a structured marker (e.g., [TABLE:transactions:5 rows])
- The SSE handler passes structured data alongside the text response
- MessageBubble.tsx detects the marker and renders an interactive DataTable component
- The DataTable component uses the same rendering as the Scratchpad tab
6.9.6 Frontend Dependencies for Dashboard
| Package | Purpose | Size |
|---|---|---|
| recharts | Charts for spending breakdown, trends, disk usage | ~200 KB |
| @tanstack/react-table | Sortable/paginated data tables for scratchpad | ~50 KB |
| react-icons | File type icons for file index browser | ~20 KB |
These packages belong in the frontend package.json, not the Python backend.
6.9.7 API Endpoints for Dashboard
The dashboard needs dedicated API endpoints (added to src/gaia/api/):
7. Tool Registration Plan
7.1 New Mixin: FileSystemToolsMixin
Location: src/gaia/agents/tools/filesystem_tools.py (shared tools directory)
This mixin provides all Layer 1 and Layer 2 tools. Any agent can include it.
7.2 New Mixin: ScratchpadToolsMixin
Location: src/gaia/agents/tools/scratchpad_tools.py (shared tools directory)
7.3 ChatAgent Integration
Neither FileSystemToolsMixin nor ScratchpadToolsMixin defines __init__. They are initialized via register_*_tools() called from the agent’s _register_tools() method, following the same pattern as register_file_search_tools().
7.4 New Backend Services
Location: src/gaia/filesystem/ and src/gaia/scratchpad/
- watcher.py: reuse existing FileWatcher from gaia.utils.file_watcher
- extractors/media.py: deferred (audio/video metadata is niche)
- extractors/archive.py: deferred (ZIP listing is niche)
- chunkers/code_chunker.py: replaced with python_chunker.py (no tree-sitter)
8. Configuration
8.1 ChatAgentConfig Additions
8.2 Feature Flags
The file system features can be fully disabled:
- --no-filesystem-index CLI flag disables the index entirely
- Without the index, tools still work but use direct filesystem access (slower)
- This is useful for privacy-sensitive environments
9. CLI Commands
9.1 gaia fs Subcommand
9.2 CLI Implementation
Add to src/gaia/cli.py following existing patterns (argparse subcommands):
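A minimal sketch of how the gaia fs subcommands named in this spec (scan, status, search, bookmarks, reset) could be wired with nested argparse subparsers. The positional arguments, their defaults, and the placement of --full on scan are assumptions; the existing parser in src/gaia/cli.py is not reproduced here.

```python
import argparse


def build_parser():
    """Build a parser with a nested `fs` command group (illustrative only)."""
    parser = argparse.ArgumentParser(prog="gaia")
    commands = parser.add_subparsers(dest="command", required=True)

    fs = commands.add_parser("fs", help="File system index commands")
    fs_commands = fs.add_subparsers(dest="fs_command", required=True)

    scan = fs_commands.add_parser("scan", help="Scan a directory into the index")
    scan.add_argument("path", nargs="?", default="~")
    # Assumed home for the opt-in content hashing flag mentioned in the decisions log.
    scan.add_argument("--full", action="store_true", help="Also compute content hashes")

    search = fs_commands.add_parser("search", help="Search the index by name")
    search.add_argument("query")

    # Commands that take no arguments in this sketch.
    for name in ("status", "bookmarks", "reset"):
        fs_commands.add_parser(name)
    return parser
```

With this layout, `gaia fs scan /data --full` parses to command="fs", fs_command="scan", path="/data", full=True.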
10. Security & Privacy
10.1 Access Control
| Control | Implementation |
|---|---|
| Path validation | Every tool calls _validate_path() which uses PathValidator.is_path_allowed() |
| Symlink handling | Path.resolve() follows symlinks to real path; on Windows, junction points are not reported by os.path.islink(), so check for them with os.path.isjunction() (Python 3.12+) |
| Sensitive file detection | Three-tier response: BLOCK, SKIP, or WARN (see below) |
| Configurable exclusions | Platform-conditional defaults merged with user config |
| No content in index | SQLite stores metadata only — no file contents |
| Local-only | All indexing happens locally, nothing sent to cloud |
| Index file permissions | Set 0600 on file_index.db (user-only read/write) |
10.2 Sensitive File Handling
| Action | Patterns | Behavior |
|---|---|---|
| BLOCK (never index or read) | *.pem, *.key, *.p12, *.pfx, id_rsa, id_ed25519, *.keystore, .aws/credentials, .ssh/* | Skip entirely during scanning. If user explicitly requests via read_file, return “This file type is blocked for security.” |
| SKIP (don’t index, allow explicit read) | .env, .env.*, .npmrc, .pypirc, credentials*, secrets* | Skip during directory scanning. Allow read_file with a warning: “This file may contain sensitive data.” |
| WARN (index metadata, warn on read) | *password*, *token*, *secret* | Index file metadata (name, size, date). Warn when content is read. |
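The three tiers can be expressed as an ordered pattern check, strictest first. The sketch below uses the patterns from the table; the function name classify_sensitivity, the ALLOW fallback label, and the case-insensitive matching are assumptions.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

BLOCK_NAMES = ["*.pem", "*.key", "*.p12", "*.pfx", "id_rsa", "id_ed25519", "*.keystore"]
BLOCK_PATHS = ["*/.aws/credentials", "*/.ssh/*"]
SKIP_NAMES = [".env", ".env.*", ".npmrc", ".pypirc", "credentials*", "secrets*"]
WARN_NAMES = ["*password*", "*token*", "*secret*"]


def classify_sensitivity(path):
    """Return 'BLOCK', 'SKIP', 'WARN', or 'ALLOW', checking the strictest tier first."""
    pure = PurePath(path)
    name = pure.name.lower()
    # Normalize to a lowercase, slash-separated path with a leading slash so that
    # directory patterns also match relative paths such as ".ssh/config".
    full = "/" + pure.as_posix().lower().lstrip("/")
    if any(fnmatchcase(name, pattern) for pattern in BLOCK_NAMES):
        return "BLOCK"
    if any(fnmatchcase(full, pattern) for pattern in BLOCK_PATHS):
        return "BLOCK"
    if any(fnmatchcase(name, pattern) for pattern in SKIP_NAMES):
        return "SKIP"
    if any(fnmatchcase(name, pattern) for pattern in WARN_NAMES):
        return "WARN"
    return "ALLOW"
```

Ordering matters: .aws/credentials also matches the SKIP pattern credentials*, so the BLOCK tier has to be evaluated before SKIP.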
10.3 Default Exclusions (Platform-Conditional)
10.4 Index Security
The SQLite database at ~/.gaia/file_index.db stores file paths, sizes, and modification dates. While no file content is stored, this metadata reveals the user’s file system structure.
Mitigations:
- Set restrictive file permissions (0600) on database file
- Document the risk in user-facing documentation
- Provide gaia fs reset command to delete the index
- Future consideration: SQLCipher encryption (deferred, adds native dependency)
11. Performance Targets
| Operation | Target | Strategy |
|---|---|---|
| Home directory structure scan | < 5 sec | Metadata-only walk, skip excluded dirs |
| File name search (indexed) | < 100 ms | SQLite FTS5 query |
| File name search (not indexed) | < 10 sec | Fallback to pathlib.Path.rglob() |
| Content search (single dir) | < 5 sec | Python open() + regex per file |
| Directory tree (depth=3) | < 2 sec | Direct filesystem walk |
| File info | < 500 ms | os.stat() call |
| Incremental index update | < 1 sec | Size + mtime comparison only |
| Full re-scan (50K files) | < 60 sec | Background, non-blocking |
| SQLite concurrent read/write | No errors | WAL mode + retry logic |
| Scenario | Max Memory |
|---|---|
| Index with 50K files | < 50 MB (SQLite on disk) |
| Directory scan in progress | < 100 MB |
| File system map in memory | < 5 MB |
12. Implementation Phases
Phase 1: Core Navigator (Week 1-2)
Goal: 6 core tools operational, no index dependency.
- Create src/gaia/filesystem/ package structure
- Implement FileSystemToolsMixin with register_filesystem_tools():
  - browse_directory(): directory listing with metadata
  - tree(): tree visualization
  - file_info(): detailed file/directory info
  - find_files(): unified search (glob-based, no index yet)
  - read_file(): enhanced file reading (text, code, CSV, JSON)
  - bookmark(): in-memory bookmarks (persisted in Phase 2)
- Add _validate_path() with PathValidator integration
- Remove FileSearchToolsMixin from ChatAgent, replace with FileSystemToolsMixin
- Keep FileSearchToolsMixin available for other agents
- Add ChatAgentConfig filesystem fields
- Add unit tests for all 6 tools (mock filesystem)
- Add integration tests with real filesystem
- Manual testing of navigation flow
Phase 2: Persistent Index + Data Scratchpad (Week 2-3)
Goal: SQLite-backed file system memory AND structured data analysis.
File System Index:
- Implement FileSystemIndexService inheriting from DatabaseMixin
- Implement SQLite schema with WAL mode and FTS5
- Implement schema migration system (schema_version table)
- Implement scan_directory(): Phase 1 quick scan (metadata only)
- Implement FTS5 name/path search via query_files()
- Connect find_files() to index for fast lookup (< 100ms)
- Implement bookmark() persistence via index service
- Implement auto_categorize() by extension
- Add integrity check on startup with auto-rebuild
- Add gaia fs CLI commands: scan, status, search, bookmarks, reset
- Unit + integration tests for index service
- Test concurrent read/write (WAL mode)
Data Scratchpad:
- Create src/gaia/scratchpad/ package
- Implement ScratchpadService inheriting from DatabaseMixin
- Implement ScratchpadToolsMixin with register_scratchpad_tools():
  - create_table(): create analysis workspace tables
  - insert_data(): bulk insert extracted data (JSON array input)
  - query_data(): run SELECT queries for analysis
  - list_tables(): show scratchpad contents
  - drop_table(): cleanup after analysis
- Add table name sanitization and SQL injection prevention
- Add size limits (100 tables, 1M rows/table, 100MB total)
- Register ScratchpadToolsMixin in ChatAgent
- Add gaia fs scratchpad clear CLI command
- Unit tests for all 5 scratchpad tools
- Integration test: multi-document extraction pipeline
- Increase max_steps default to 20 for analysis workflows
- End-to-end test: Personal Finance Analyzer demo with sample PDFs
- End-to-end test: Tax Preparation demo with sample documents
Phase 3: Knowledge Base (Week 3-4)
Goal: Smart context, background maintenance, and additional tools.
- Implement FileSystemMap dataclass with to_context_string()
- Implement on-demand map injection (via tool, not always-on)
- Integrate FileWatcher from gaia.utils.file_watcher for real-time updates
- Limit watching to bookmarked/scanned directories only
- Implement disk_usage() tool (uses index data when available)
- Add first-run experience flow (quick scan on first tool use)
- Implement cleanup_stale() for removing deleted file entries
- Implement periodic re-scan (configurable interval, default: weekly)
- Performance benchmarking against targets
- Add gaia fs cleanup and gaia fs tree CLI commands
Phase 4: Enhanced Extraction (Week 4-5)
Goal: Rich document support, smart chunking, and remaining tools.
- Implement content extractors:
  - Office formats (DOCX, PPTX, XLSX): optional dependencies
  - Enhanced PDF (wrapping existing rag/pdf_utils)
  - Image metadata (PIL/Pillow if available)
  - HTML content extraction (beautifulsoup4)
- Implement smart chunkers:
  - Markdown chunker (header/section boundaries)
  - Prose chunker (paragraph boundaries)
  - Python chunker (stdlib ast module)
  - Table chunker (header-preserving)
- Integrate extractors with RAG pipeline
- Implement incremental indexing with metadata change detection
- Add compare_files() and find_duplicates() tools
- Opt-in content hashing for duplicate detection
- End-to-end testing with diverse file types
Phase 5: Polish & Testing (Week 5-6)
Goal: Production-ready quality.
- Performance benchmarking against all targets (time + memory)
- Large file system stress testing (100K+ files)
- Windows/Linux/macOS compatibility testing
- Security audit (path traversal, symlink attacks, sensitive file handling)
- Documentation: user guide (docs/guides/filesystem.mdx)
- Documentation: SDK reference (docs/sdk/sdks/filesystem.mdx)
- Update docs/docs.json navigation
- Update docs/reference/cli.mdx with gaia fs commands
- Error handling and recovery for corrupted index
- MCP exposure consideration (expose tools via MCP for external agents)
13. Dependencies
New Dependencies
| Package | Purpose | Size | Required? | Install Group |
|---|---|---|---|---|
| pdfplumber | PDF table extraction | ~2 MB | Recommended | gaia[filesystem] |
| charset-normalizer | Encoding detection | ~1 MB | Optional | gaia[filesystem] |
| python-docx | DOCX extraction | ~1 MB | Optional | gaia[filesystem] |
| python-pptx | PPTX extraction | ~1 MB | Optional | gaia[filesystem] |
| openpyxl | XLSX extraction | ~3 MB | Optional | gaia[filesystem] |
| beautifulsoup4 | HTML extraction | ~500 KB | Optional | gaia[filesystem] |
- python-magic: Replaced by mimetypes (stdlib). python-magic requires the libmagic DLL on Windows, which is unreliable. Extension-based detection via mimetypes is the DEFAULT.
- chardet: Replaced by charset-normalizer (MIT license, faster, used by requests)
Existing Dependencies (already in GAIA)
| Package | Usage |
|---|---|
| sqlite3 | Index database (stdlib) |
| mimetypes | File type detection (stdlib) |
| pathlib | Path manipulation (stdlib) |
| ast | Python code chunking (stdlib) |
| watchdog | File system monitoring |
| faiss-cpu | Vector search (RAG) |
| sentence-transformers | Embeddings (RAG) |
| PyPDF2 / pdfplumber | PDF extraction |
Extras Group
14. Testing Strategy
14.1 Test Matrix
| Component | Unit Tests | Integration Tests | Notes |
|---|---|---|---|
| FileSystemToolsMixin (6 tools) | Yes (mock filesystem via tmp_path) | Yes (real filesystem) | Test each tool with expected output format |
| FileSystemIndexService | Yes (in-memory SQLite) | Yes (real SQLite file) | Test scan, query, FTS5, incremental, migrations |
| File watcher integration | Yes (mock events) | Yes (real watchdog) | Test create/modify/delete callbacks |
| Content extractors | Yes (fixture files) | No | Test each format with sample files |
| SmartChunker | Yes (fixture content) | No | Test boundary detection accuracy |
| CLI commands (gaia fs) | Yes (subprocess) | Yes (real index) | Test each subcommand |
| ChatAgent integration | No | Yes (mock LLM) | End-to-end with mock LLM choosing tools |
14.2 Test File Locations
14.3 Performance Benchmarks
15. Success Metrics
| Metric | Target |
|---|---|
| Can answer “where is file X?” from index | < 1 second |
| Can summarize “what’s in directory Y?” | Accurate tree + stats |
| Can find files by content | Correct results with context |
| Can find files by metadata (size, date, type) | Correct filtering |
| Remembers file locations across sessions | 100% (via SQLite) |
| Handles home dir with 50K+ files | No OOM, < 60s scan, < 50MB memory |
| Zero data leakage (all local) | Verified by security audit |
| Works on Windows, Linux, macOS | Tested on all three |
| LLM tool selection accuracy | > 90% correct tool choice (6 tools) |
| No tool name confusion | Zero overlap with remaining agent tools |
16. Decisions Log
Decisions made during architecture review (2026-03-09):
| # | Decision | Rationale |
|---|---|---|
| D1 | Use docstrings for tool descriptions, not description= param | GAIA’s @tool decorator reads from __doc__ (line 73 of tools.py) |
| D2 | Inherit FileSystemIndexService from DatabaseMixin | Reuse existing init_db(), query(), insert(), transaction() |
| D3 | Reuse FileWatcher from gaia.utils.file_watcher | Avoid parallel infrastructure; existing watcher is mature |
| D4 | 6 core tools initially (not 11) | Reduce LLM confusion; deferred tools added in Phase 3-4 |
| D5 | Replace FileSearchToolsMixin in ChatAgent | Avoid semantic overlap (find_files vs search_file) |
| D6 | Metadata-based change detection (size + mtime) | Content hashing reads every file = too slow for quick scan |
| D7 | Content hashing is opt-in | Privacy + performance; enabled via --full flag or config |
| D8 | Watch only bookmarked/scanned directories | Full home dir watching exhausts OS watch handles |
| D9 | File system map is on-demand, not always-on | Save ~800 tokens per non-file query; critical for small LLMs |
| D10 | mimetypes (stdlib) over python-magic | python-magic requires libmagic DLL on Windows |
| D11 | charset-normalizer over chardet | MIT license, faster, modern replacement |
| D12 | No accessed_at in schema | Privacy-invasive, often inaccurate, marginal value |
| D13 | WAL mode for SQLite | Readers and the writer do not block each other, which avoids most SQLITE_BUSY errors; retry logic covers the remaining cases |
| D14 | Platform-conditional exclusion patterns | Windows-only paths like $Recycle.Bin don’t exist on Linux |
| D15 | Three-tier sensitive file handling (BLOCK/SKIP/WARN) | Clear, explicit behavior instead of vague “warn” |
| D16 | Schema migration via schema_version table | Graceful upgrades for existing users |
| D17 | Conservative default scan depth (3) | Deeper scanning triggers antivirus alerts, takes too long |
| D18 | No tree-sitter dependency | Use stdlib ast for Python; regex for other languages |
| D19 | Defer Everything/Windows Search API integration | Platform-specific complexity; can accelerate later |
| D20 | Defer project/workspace concept | Good future feature but adds schema + UI complexity |
| D21 | SQLite scratchpad as agent working memory | LLMs bad at math, SQL perfect; enables multi-doc analysis without context limits |
| D22 | Scratchpad shares DB file with file index | Single file_index.db with scratch_ table prefix; simpler than separate databases |
| D23 | max_steps increase to 20 for analysis mode | Processing 12 documents needs more than 10 steps; batch extraction helps too |
| D24 | pdfplumber for table extraction | Critical for finance/tax demos; PyMuPDF does text but not structured tables |
| D25 | Query-only restriction on query_data() tool | Security: mutations only through dedicated insert_data/drop_table tools |
17. References
- Claude Code Tool System — Agentic search architecture
- Why Claude Code Doesn’t Index — Agentic vs. RAG tradeoffs
- How Cursor Indexes Codebases — Merkle tree + embeddings
- Aider Repository Map — Tree-sitter AST graph ranking
- Everything (voidtools) — NTFS MFT indexing
- MCP Filesystem Server — Standard file tools
- OpenAI File Search — Hosted RAG at scale
- Anthropic Agent Skills — Folder-based context
- Windsurf Codemaps — AI-annotated code navigation
Appendix A: Deferred Feature Details
A.1 disk_usage(path, depth, top_n) — Phase 3
A.2 compare_files(path1, path2) — Phase 4
A.3 find_duplicates(directory, method) — Phase 4
A.4 MCP Exposure — Phase 5
Consider exposing file system tools via MCP for external agent access:
- Read-only tools (browse_directory, tree, file_info, find_files, read_file) can be exposed
- Write tools and bookmark management should require explicit opt-in
- Use MCP tool annotations to mark read-only vs. write operations