Source Code: src/gaia/code_index/ and src/gaia/agents/code_index/tools/mixin.py

Looking for the API? See the Code Index SDK Reference for CodeIndexSDK, CodeIndexConfig, and CodeIndexToolsMixin.

- CLI: gaia-code index ... for build, search, status, clear, and chat.
- Tool mixin: CodeIndexToolsMixin is composed into the built-in CodeAgent and is available to any custom agent that opts in.
- Python SDK: CodeIndexSDK for direct programmatic use.
gh is not required — the index covers source files only.
Overview
| Aspect | Details |
|---|---|
| Languages parsed | Python (AST), JavaScript, TypeScript, Go, Rust, Java, C, C++ |
| Embeddings | Local AMD NPU/GPU via Lemonade Server (nomic-embed-text-v2-moe-GGUF by default) |
| Vector store | FAISS IndexFlatL2 |
| Cache | ~/.gaia/code_index/<repo-hash>/ (atomic writes; incremental on file-hash) |
| Privacy | All processing local; sensitive files (.env, .pem, .key, …) auto-excluded |
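To make the vector-store row concrete: a flat L2 index such as FAISS IndexFlatL2 performs exact, brute-force search, comparing the query against every stored vector by squared Euclidean distance. The sketch below reproduces that idea in plain Python on toy 2-D vectors. It is an illustration only, not GAIA's code, and the function names are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]


def squared_l2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(stored: List[Vector], query: Vector, k: int) -> List[Tuple[int, float]]:
    """Score every stored vector against the query and return the k closest.

    Returns (index, squared distance) pairs, nearest first. No approximation
    or clustering is involved, which is what "flat" means here.
    """
    scored = [(i, squared_l2(vec, query)) for i, vec in enumerate(stored)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy stand-ins for chunk embeddings (real embeddings have many more dimensions).
chunks = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
hits = flat_l2_search(chunks, [1.0, 0.5], k=2)
print(hits)
```

Exact search costs one distance computation per stored chunk for every query, which is simple and predictable at the scale of a single repository.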
Setup
Install the [rag] extras, which provide FAISS and numpy.
CLI usage
Build the index
| Flag | Purpose |
|---|---|
| --repo PATH | Repository root (default: cwd) |
| --max-files N | Cap discovery (default 5000) |
| --model M | Lemonade embedding model |
| --base-url URL | Lemonade server URL (default http://localhost:8000/api/v1) |
| --no-lemonade-check | Skip the server reachability check |
| --use-claude / --use-chatgpt | Cloud LLM for index chat (embeddings still local) |
gaia-code index is incremental: unchanged files (matched by SHA-256) reuse their existing embeddings.
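The incremental behavior can be pictured as a digest comparison against a stored manifest. The following is a minimal sketch under stated assumptions: the manifest is a plain dict mapping file path to SHA-256 hex digest, and the helper names are hypothetical. How GAIA actually stores and compares hashes is not shown in this page.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_reembed(paths: List[Path], manifest: Dict[str, str]) -> List[Path]:
    """Return the files that are new or whose content changed.

    `manifest` (hypothetical) maps path -> digest from the previous run and is
    updated in place, so a second call with unchanged files returns nothing.
    """
    changed = []
    for path in paths:
        digest = sha256_of(path)
        if manifest.get(str(path)) != digest:
            changed.append(path)
            manifest[str(path)] = digest
    return changed
```

Only the returned files would need new embeddings; everything else can reuse the vectors already in the cache.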
Search
Inspect and manage
Interactive Q&A
The chat subcommand starts a CodeAgent with the CodeIndexToolsMixin already wired in, so the agent can call index_codebase, search_code_index, get_index_status, and clear_code_index autonomously while answering your questions.
Agent integration
CodeAgent ships with the mixin already composed, so gaia-code chat sessions and any CodeAgent instance can call the four code-index tools:
| Tool | Description |
|---|---|
| index_codebase | Build / refresh the FAISS index |
| search_code_index | Semantic search over indexed chunks |
| get_index_status | Report current index state |
| clear_code_index | Remove the cached index |
Custom agents opt in by listing CodeIndexToolsMixin directly in their class declaration; it’s also registered in KNOWN_TOOLS["code_index"] (src/gaia/agents/registry.py) for dynamic resolution. For SDK-level composition see the SDK reference.
Example interaction
Cache layout
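The overview notes that cache writes are atomic. One common way to get that property is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that pattern as an assumption about the general technique, not GAIA's actual implementation; the function name and the JSON payload are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(target: Path, payload: dict) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial one."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target's directory so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, target)  # atomic replacement of the target path
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

With this pattern, an interrupted run leaves the previous cache file intact rather than a truncated one.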
Privacy
All processing is local. No source code is sent to external services. Sensitive filenames are skipped automatically: .env, .env.*, .htpasswd, SSH private keys (id_rsa, id_ed25519, …), and any *.pem, *.key, *.pfx, *.p12, *.jks, *.keystore.
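A filename filter along these lines can be sketched with glob-style matching on the file's base name. This is a minimal illustration, assuming the rule applies to the file name only; the constant and function names are hypothetical, and the pattern list contains just the entries named above (the elided SSH key names are not guessed).

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

# Only the patterns named in the text above; the real exclusion list may be longer.
SENSITIVE_PATTERNS = (
    ".env", ".env.*", ".htpasswd",
    "id_rsa", "id_ed25519",
    "*.pem", "*.key", "*.pfx", "*.p12", "*.jks", "*.keystore",
)


def is_sensitive(path: str) -> bool:
    """True if the file's base name matches any sensitive pattern (case-sensitive)."""
    name = PurePath(path).name
    return any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS)


for candidate in ("config/.env.local", "certs/server.pem", "src/main.py"):
    print(candidate, is_sensitive(candidate))
```

Matching on the base name means a file such as src/keys.py is still indexed, while certs/server.key is skipped regardless of its directory.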
Testing
The code-index stack has unit-test coverage in tests/unit/test_code_index*.py: 64 tests spanning the SDK (cache layout, incremental re-index, atomic writes), the parsers (Python AST plus the regex backends for JS/TS/Go/Rust/Java/C/C++), and the CodeIndexToolsMixin tool surface.
Run them with pytest, for example: python -m pytest tests/unit/test_code_index*.py
Real-world benchmark
Indexing the gaia repo itself against Lemonade Server’s nomic-embed-text-v2-moe-GGUF embedding model:
| Metric | Value |
|---|---|
| Source files discovered | 973 |
| Semantic chunks indexed | 23,974 |
| Languages parsed | Python, JavaScript, TypeScript, Go, Rust, Java, C/C++ |
| Embedding model | nomic-embed-text-v2-moe-GGUF (via Lemonade Server) |
| Wall-clock (remote, ngrok) | ~51 min |