Import: from gaia.agents.base.memory import MemoryMixin
Import: from gaia.agents.base.memory_store import MemoryStore
Import: from gaia.agents.base.discovery import SystemDiscovery
See also: User Guide · Agent System · Tool Decorator

Architecture

The memory system has three layers:
| Layer | Class | File | Purpose |
| --- | --- | --- | --- |
| Agent integration | MemoryMixin | memory.py | Hooks memory into the Agent lifecycle: embedding pipeline, FAISS index, hybrid search orchestration, Mem0-style LLM extraction, consolidation, reconciliation |
| Data layer | MemoryStore | memory_store.py | Pure SQLite + FTS5 storage with schema v2 (embedding BLOB, superseded_by, consolidated_at), vector data retrieval, temporal filtering, consolidation queries, reconciliation queries. No agent dependencies. |
| Bootstrap | SystemDiscovery | discovery.py | Local system scanner for day-zero onboarding. Returns facts for user review. |
Agent
  ├── MemoryMixin (hooks into prompt, tool exec, post-query)
  │     ├── Embedding pipeline (Lemonade → nomic-embed-text-v2, 768-dim)
  │     ├── FAISS index (IndexFlatIP, cosine similarity)
  │     ├── Cross-encoder reranker (ms-marco-MiniLM-L-6-v2)
  │     └── MemoryStore (SQLite: ~/.gaia/memory.db)
  │           ├── conversations table (+ consolidated_at column)
  │           ├── knowledge table (FTS5 + embedding BLOB + superseded_by)
  │           └── tool_history table
  └── SystemDiscovery (file system, git, apps, browser scans)

MemoryMixin

The primary integration point. Add this mixin to any Agent subclass to give it persistent memory.

Inheritance Order

MemoryMixin must come before Agent in the class declaration. This is required because MemoryMixin overrides process_query and _execute_tool, both of which call super() to reach the Agent base class. If Agent is listed first, both overrides are silently shadowed.
from gaia.agents.base.agent import Agent
from gaia.agents.base.memory import MemoryMixin

# Correct -- MemoryMixin before Agent
class MyAgent(MemoryMixin, Agent):
    def __init__(self, **kwargs):
        self.init_memory()              # Before super().__init__()
        super().__init__(**kwargs)

    def _register_tools(self):
        super()._register_tools()
        self.register_memory_tools()    # Exposes 5 tools to the LLM
If you put Agent before MemoryMixin in the class declaration, tool logging and dynamic context injection will silently fail. Python’s MRO requires the mixin to appear first.
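The ordering rule can be reproduced with a toy hierarchy. The sketch below uses stand-in classes (`BaseAgent` and `ContextMixin` are hypothetical names, not GAIA classes) to show how Python's method resolution order decides whether a mixin override ever runs.

```python
class BaseAgent:
    """Stand-in for an agent base class (hypothetical, not gaia's Agent)."""

    def process_query(self, text):
        return f"agent({text})"


class ContextMixin:
    """Stand-in mixin: augments the input, then delegates via super()."""

    def process_query(self, text):
        return super().process_query(f"ctx+{text}")


class MixinFirst(ContextMixin, BaseAgent):
    """Mixin listed first: its override runs, then reaches BaseAgent."""


class AgentFirst(BaseAgent, ContextMixin):
    """Base listed first: BaseAgent.process_query wins, mixin is skipped."""


mixin_first_result = MixinFirst().process_query("hi")
agent_first_result = AgentFirst().process_query("hi")
print(mixin_first_result)  # agent(ctx+hi)
print(agent_first_result)  # agent(hi)
```

In the second class no exception is raised, which matches the "silently fail" behavior described above: the base method simply never calls into the mixin.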

init_memory()

Initialize the memory subsystem. Call this before super().__init__().
def init_memory(
    self,
    db_path: Optional[Path] = None,
    context: str = "global"
) -> None:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| db_path | Path | ~/.gaia/memory.db | Path to the SQLite database file |
| context | str | "global" | Active context scope (e.g., "work", "personal", "global") |
v2 startup sequence:
  1. Open/create DB, apply schema migrations (v1 → v2: adds embedding BLOB, superseded_by TEXT, consolidated_at TEXT)
  2. Validate Lemonade embedding service connectivity — raises RuntimeError if unreachable
  3. Backfill embeddings for items missing them (up to 100 per startup)
  4. Rebuild FAISS index from stored embeddings
  5. apply_confidence_decay() — 30-day decay
  6. reconcile_memory() — Hindsight-inspired, max 20 pairs
  7. consolidate_old_sessions() — max 5 sessions
  8. prune() — 90-day hard delete
  9. Generate session UUID
Embedding is a hard requirement in v2. If the Lemonade embedding service is unavailable, init_memory() raises RuntimeError("Lemonade embedding service required for memory system"). There is no silent degradation to keyword-only search.

get_memory_system_prompt()

Returns the stable frozen prefix for the system prompt. Always includes proactive usage instructions for the LLM, plus any stored preferences, facts, skills, and error patterns. Nothing time-sensitive. This method is called automatically by Agent._get_mixin_prompts().
def get_memory_system_prompt(self) -> str
The output stays frozen for the entire session so the LLM inference engine can reuse its KV cache across turns. It always returns a non-empty string: even with zero stored memories, the instructions block is included so the LLM knows it has persistent memory tools.
Example output (with stored memories):
=== MEMORY (Persistent Second Brain) ===
You have persistent memory across sessions. USE IT PROACTIVELY:
- When the user states a fact, preference, or commitment → call `remember` immediately
- When the user asks what you know, what was discussed, or about a person/project → call `recall`
- When information changes or is corrected → call `recall` to find the old item, then `update_memory`
- When the user mentions a deadline or reminder → call `remember` with due_at (ISO 8601)
- When the user wants to forget something → call `recall` to find it, then `forget`
- BIAS TOWARD REMEMBERING: if in doubt, store it. It's better to remember too much than too little.
- Every fact, preference, name, project detail, deadline, or observation is worth storing.

Preferences:
  - tone: professional but friendly
  - code_style: black formatter, 88 char lines

Known facts:
  - Project uses React 19 with app router (confidence: 0.82)
  - User's name is Alex, role is tech lead (confidence: 0.95)

Skills:
  - Deploy workflow: test → build → push → verify (confidence: 0.88)
  - Docker compose: always use --build flag on first run (confidence: 0.72)
  - Git bisect: use binary search for regression hunting (confidence: 0.65)

Known errors to avoid:
  - execute_code: "import torch" fails -- torch not installed on this machine
  - pip install: always use --index-url for PyTorch packages
Example output (zero memories stored):
=== MEMORY (Persistent Second Brain) ===
You have persistent memory across sessions. USE IT PROACTIVELY:
[... same instructions ...]

No memories stored yet. Start building your knowledge base by remembering what the user tells you.
Filters applied to the knowledge sections:
  • Includes items from global context + active context
  • Excludes items where sensitive=1
  • Excludes items where superseded_by IS NOT NULL (only current/active items)
  • Sorted by confidence descending
  • Hard limits: max 10 preferences, 5 facts, 3 skills, 5 errors
  • Hard cap on total output: 4000 chars (truncated with ... (memory truncated) if exceeded)
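The filters and limits above can be sketched as two small functions. This is an illustration only, not the MemoryMixin code: the item dict keys, the function names, and the choice to count the truncation marker toward the 4000-character cap are all assumptions.

```python
MAX_CHARS = 4000
TRUNCATION_MARKER = "... (memory truncated)"
# Per-category limits as documented above.
LIMITS = {"preference": 10, "fact": 5, "skill": 3, "error": 5}


def select_for_prompt(items, active_context):
    """Filter items, sort by confidence, and apply the per-category limits."""
    visible = [
        it for it in items
        if it["context"] in ("global", active_context)
        and not it["sensitive"]
        and it["superseded_by"] is None
    ]
    visible.sort(key=lambda it: it["confidence"], reverse=True)
    selected = {category: [] for category in LIMITS}
    for it in visible:
        bucket = selected.get(it["category"])
        if bucket is not None and len(bucket) < LIMITS[it["category"]]:
            bucket.append(it)
    return selected


def cap_output(text, max_chars=MAX_CHARS):
    """Truncate text so the result, marker included, fits in max_chars."""
    if len(text) <= max_chars:
        return text
    return text[: max_chars - len(TRUNCATION_MARKER)] + TRUNCATION_MARKER


def _item(content, context, sensitive=False, superseded_by=None, confidence=0.5):
    return {"category": "fact", "content": content, "context": context,
            "sensitive": sensitive, "superseded_by": superseded_by,
            "confidence": confidence}


demo_items = [
    _item("work fact", "work", confidence=0.9),
    _item("personal fact", "personal"),
    _item("secret fact", "global", sensitive=True),
    _item("old fact", "work", superseded_by="k-123"),
    _item("global fact", "global", confidence=0.6),
]
demo_selected = select_for_prompt(demo_items, "work")
```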

get_memory_dynamic_context()

Returns the per-turn dynamic context that is prepended to the user message each turn. Contains the current time and upcoming/overdue items.
def get_memory_dynamic_context(self) -> str
This is injected into the user message (not the system prompt) so the frozen prefix is preserved for KV-cache reuse. Example output:
[GAIA Memory Context]
Current time: 2026-03-25T10:30:00-0700 (Wednesday)

Upcoming/overdue:
  - [DUE Mar 27] Online course starts next week
  - [OVERDUE Mar 24] Follow up on deployment review
After mentioning a time-sensitive item, call update_memory to set reminded_at so you don't repeat yourself.
Always returns at least the current time. The upcoming/overdue section is included only when time-sensitive items are active. Returns an empty string only if init_memory() has not been called.

register_memory_tools()

Registers the 5 LLM-facing memory tools with the agent’s tool registry. Call this from your agent’s _register_tools() method.
def register_memory_tools(self) -> None
Registers: remember, recall, update_memory, forget, search_past_conversations.

set_memory_context()

Switch the active context mid-session. Affects system prompt filtering and the default context for new remember calls.
def set_memory_context(self, context: str) -> None
agent.set_memory_context("work")      # Switch to work context
agent.set_memory_context("personal")  # Switch to personal context

reset_memory_session()

Start a fresh memory session. Generates a new session ID and applies confidence decay to unused knowledge.
def reset_memory_session(self) -> None
Confidence decay multiplies the confidence of items not accessed in 30+ days by 0.9. This is called once per session start to keep knowledge fresh.
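As a worked example of the decay rule, the sketch below applies the 0.9 factor to a single item. The function name and the `last_used` argument are hypothetical; the real column that tracks last access is not shown in this reference.

```python
from datetime import datetime, timedelta, timezone


def decayed_confidence(confidence, last_used, now,
                       days_threshold=30, decay_factor=0.9):
    """Return the confidence after one decay pass.

    Items used within the threshold keep their confidence; older items
    are multiplied by decay_factor.
    """
    if now - last_used >= timedelta(days=days_threshold):
        return confidence * decay_factor
    return confidence


now = datetime(2026, 3, 25, tzinfo=timezone.utc)
stale = decayed_confidence(0.8, now - timedelta(days=45), now)
fresh = decayed_confidence(0.8, now - timedelta(days=3), now)
print(round(stale, 2), fresh)  # 0.72 0.8
```

Because the factor compounds, an item that stays unused across n decay passes ends at its original confidence times 0.9 to the power n.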

Properties

| Property | Type | Description |
| --- | --- | --- |
| memory_store | MemoryStore | Direct access to the underlying data layer |
| memory_session_id | str | Current session UUID |
| memory_context | str | Current active context (e.g., "work", "global") |

Embedding Pipeline

These methods handle the vector embedding pipeline for hybrid search. All are internal (_-prefixed) — you do not call them directly.
def _get_embedder(self) -> Any
Lazy-initializes a LemonadeProvider for embedding. Cached for the process lifetime. Raises RuntimeError if Lemonade is unreachable.
def _embed_text(self, text: str) -> np.ndarray
Embeds a single text string into a 768-dimensional vector via nomic-embed-text-v2-moe-GGUF. Returns a normalized numpy array suitable for cosine similarity via FAISS IndexFlatIP.
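The reason normalization matters: for unit-length vectors, the inner product that an inner-product index computes equals cosine similarity. A minimal pure-Python check of that identity, using short toy vectors rather than real 768-dimensional embeddings (all names here are illustrative):

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


a = [3.0, 4.0, 0.0]
b = [1.0, 2.0, 2.0]
ip_of_normalized = inner_product(l2_normalize(a), l2_normalize(b))
direct_cosine = cosine(a, b)
```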
def _backfill_embeddings(self, limit: int = 100) -> int
Embeds knowledge items that are missing embeddings (e.g., after a v1 → v2 migration). Called automatically during init_memory() startup. Returns the number of items backfilled.
def _hybrid_search(
    self,
    query: str,
    category: str = None,
    context: str = None,
    entity: str = None,
    include_sensitive: bool = False,
    top_k: int = 5,
    time_from: str = None,
    time_to: str = None,
) -> List[Dict]:
Combines vector similarity (FAISS) and keyword matching (FTS5 BM25) via Reciprocal Rank Fusion (RRF), then reranks with a cross-encoder. The full pipeline:
  1. Embed query via Lemonade (nomic-embed-text-v2, 768-dim)
  2. FAISS cosine search: top-K × 4 candidates (oversample)
  3. FTS5 BM25 search: top-K × 4 candidates (oversample)
  4. Deduplicate by ID, apply RRF weights: 0.6 / (60 + rank_vector) + 0.4 / (60 + rank_bm25)
  5. Cross-encoder reranking (cross-encoder/ms-marco-MiniLM-L-6-v2, ~22MB, CPU) on fused candidates
  6. Return final top-K results
  7. Bump confidence +0.02 and increment use_count on recalled items
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | required | Natural language search query |
| category | str | None | Filter by category |
| context | str | None | Filter by context scope |
| entity | str | None | Filter by entity |
| include_sensitive | bool | False | Include sensitive items |
| top_k | int | 5 | Maximum results returned |
| time_from | str | None | ISO 8601 lower bound on created_at |
| time_to | str | None | ISO 8601 upper bound on created_at |
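Step 4 of the pipeline, the weighted Reciprocal Rank Fusion, can be sketched in a few lines. The function name is hypothetical, and two details are assumptions: ranks are 1-based, and an item found by only one retriever receives only that retriever's term.

```python
def rrf_fuse(vector_ids, bm25_ids, k=60, w_vector=0.6, w_bm25=0.4):
    """Fuse two ranked ID lists into one list ordered by weighted RRF score."""
    scores = {}
    for rank, item_id in enumerate(vector_ids, start=1):
        scores[item_id] = scores.get(item_id, 0.0) + w_vector / (k + rank)
    for rank, item_id in enumerate(bm25_ids, start=1):
        scores[item_id] = scores.get(item_id, 0.0) + w_bm25 / (k + rank)
    return sorted(scores, key=lambda item_id: scores[item_id], reverse=True)


fused = rrf_fuse(["a", "b", "c"], ["b", "d", "a"])
print(fused)  # ['b', 'a', 'c', 'd']
```

Here "b" edges out "a" because it ranks well in both lists, while "c" and "d" trail because each appears in only one.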

Complexity-Aware Recall Depth

def _classify_query_complexity(self, query: str) -> int
Adapts retrieval depth based on query complexity. Returns an adaptive top_k value — no LLM call needed, purely heuristic:
| Complexity | Heuristic signals | top_k |
| --- | --- | --- |
| Simple | < 8 words, single entity, no comparison words | 3 |
| Medium | 8–20 words, or contains “how”, “why”, “explain”, “describe”, “summarize”, or “what happened” | 5 |
| Complex | > 20 words, or contains “compare”, “across”, “all”, “history”, “everything”, “between”, “throughout” | 10 |
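A heuristic like the one in the table could look as follows. This is a sketch under assumptions, not the actual `_classify_query_complexity` body: the precedence (complex signals checked before medium ones), the punctuation stripping, and the omission of the "single entity" signal are all choices made for the example.

```python
COMPLEX_WORDS = {"compare", "across", "all", "history",
                 "everything", "between", "throughout"}
MEDIUM_WORDS = {"how", "why", "explain", "describe", "summarize"}


def classify_query_complexity(query: str) -> int:
    """Map a query to a top_k of 3, 5, or 10 using word-level signals."""
    words = [w.strip(".,?!;:\"'").lower() for w in query.split()]
    words = [w for w in words if w]
    text = " ".join(words)
    if len(words) > 20 or COMPLEX_WORDS.intersection(words):
        return 10
    if (len(words) >= 8 or MEDIUM_WORDS.intersection(words)
            or "what happened" in text):
        return 5
    return 3


simple_k = classify_query_complexity("What is Alex's role?")
medium_k = classify_query_complexity("Why did the deploy fail?")
complex_k = classify_query_complexity("Compare React and Vue")
```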

Mem0-Style LLM Extraction

def _extract_via_llm(
    self,
    user_input: str,
    assistant_response: str,
    existing_items: List[Dict],
) -> List[Dict]:
Sends the conversation turn plus existing memory to the LLM, which returns a JSON array of operations:
| Operation | Description | Required fields |
| --- | --- | --- |
| add | New knowledge not already in memory | op, category, content; optional: entity, domain, confidence (default 0.4) |
| update | Modify existing item (correction, enrichment, supersession) | op, knowledge_id, content; optional: entity, domain |
| delete | Remove item contradicted or invalidated | op, knowledge_id, reason |
| noop | Information already captured; not included in output | |
The extraction fetches top-10 relevant existing items first via _hybrid_search(), so the LLM can see what already exists and decide whether to add, update, delete, or do nothing. This replaces v1’s regex-based heuristic extraction.
Error handling:
  • Invalid JSON → logged error, skip this turn
  • Timeout (3s) → logged warning, skip
  • Individual operation failure → logged, continue with remaining operations
  • No fallback to regex heuristics
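To make the operation format concrete, the sketch below validates a JSON array of operations against the required fields listed in the table. The function name and the decision to drop malformed operations silently are assumptions for illustration; the real pipeline logs these cases.

```python
import json

# Required fields per operation, taken from the table above.
REQUIRED_FIELDS = {
    "add": {"category", "content"},
    "update": {"knowledge_id", "content"},
    "delete": {"knowledge_id", "reason"},
}


def parse_operations(raw: str):
    """Return the well-formed operations from an LLM response string."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []  # invalid JSON: skip this turn
    if not isinstance(data, list):
        return []
    valid = []
    for op in data:
        if not isinstance(op, dict):
            continue
        required = REQUIRED_FIELDS.get(op.get("op"))
        if required is None or not required.issubset(op):
            continue  # unknown op (including noop) or missing fields
        if op["op"] == "add":
            op = {"confidence": 0.4, **op}  # documented default for add
        valid.append(op)
    return valid


raw_response = (
    '[{"op": "add", "category": "fact", "content": "Uses uv"},'
    ' {"op": "delete", "knowledge_id": "k1"},'
    ' {"op": "noop"}]'
)
operations = parse_operations(raw_response)
```

Only the first operation survives here: the delete is missing its reason and the noop carries nothing to apply.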

Consolidation

def consolidate_old_sessions(self, max_sessions: int = 5) -> Dict:
Distills old conversation sessions into durable knowledge before they age out at the 90-day prune boundary. Called automatically during init_memory() startup.
Returns: {"consolidated": int, "extracted_items": int}
Criteria for consolidation:
  • All turns in the session are > 14 days old
  • Session has ≥ 5 turns
  • At least one turn has consolidated_at IS NULL
Process:
  1. Fetch up to 20 turns per session (oldest first)
  2. Call LLM with consolidation prompt → returns summary + extracted knowledge
  3. Store summary as knowledge(category="note", source="consolidation", domain="session:{id[:8]}")
  4. Store each extracted item via store() (normal dedup applies)
  5. Mark all fetched turns with consolidated_at = now
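The three eligibility criteria can be expressed as a single predicate. This is a sketch: the turn dict keys (`timestamp`, `consolidated_at`) and the function name are assumptions, since the conversations table columns are not fully listed in this reference.

```python
from datetime import datetime, timedelta, timezone


def is_eligible(turns, now, older_than_days=14, min_turns=5):
    """True if a session's turns satisfy all three consolidation criteria."""
    cutoff = now - timedelta(days=older_than_days)
    return (
        len(turns) >= min_turns
        and all(t["timestamp"] < cutoff for t in turns)
        and any(t["consolidated_at"] is None for t in turns)
    )


now = datetime(2026, 3, 25, tzinfo=timezone.utc)
old_turns = [
    {"timestamp": now - timedelta(days=20), "consolidated_at": None}
    for _ in range(5)
]
# Same session, but one turn is from yesterday, so it is still "live".
mixed_turns = old_turns[:4] + [
    {"timestamp": now - timedelta(days=1), "consolidated_at": None}
]
```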

Reconciliation

def reconcile_memory(self, max_pairs: int = 20) -> Dict:
Background reconciliation of high-similarity knowledge pairs. Detects and resolves contradictory, reinforcing, or weakening facts that were never co-retrieved during extraction. Called on startup after decay, before consolidation.
Returns: {"pairs_checked": int, "reinforced": int, "contradicted": int, "weakened": int, "neutral": int}
Process:
  1. For each context, compute pairwise embedding similarity among active items
  2. Flag pairs with cosine similarity > 0.85
  3. For each flagged pair, a single LLM call classifies the relationship:
| Relationship | Action |
| --- | --- |
| reinforce | Boost confidence of both items by +0.05 |
| contradict | Supersede the older item (superseded_by = newer_id), boost newer confidence +0.1 |
| weaken | Reduce confidence of the older item by 0.1 |
| neutral | No action (similar words, different topics) |
Rate-limited to max_pairs classifications per startup (~20s on local LLM). Highest-similarity pairs processed first.
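The pair-flagging part of the process (steps 1 and 2, plus the highest-similarity-first ordering) might be sketched as below. The function names and the input shape (a dict of ID to vector) are hypothetical, and the LLM classification step is left out entirely.

```python
import math
from itertools import combinations


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def flag_pairs(embeddings, threshold=0.85, max_pairs=20):
    """Return up to max_pairs (similarity, id_a, id_b) tuples, best first."""
    scored = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        similarity = cosine(vec_a, vec_b)
        if similarity > threshold:
            scored.append((similarity, id_a, id_b))
    scored.sort(reverse=True)
    return scored[:max_pairs]


toy_embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
flagged = flag_pairs(toy_embeddings)
```

Note that comparing every pair is quadratic in the number of items, which is one reason a per-startup cap such as max_pairs is useful.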

Lifecycle Hooks

MemoryMixin hooks into the Agent lifecycle at 3 points. These are automatic; you do not call them directly.
  • Hook 1: process_query() override. Prepends per-turn dynamic context (time + upcoming items) to the user message. Saves the original user input so _after_process_query can store the clean version without the context prefix.
  • Hook 2: _execute_tool() override. Wraps every non-memory tool call to auto-log it to tool_history. If a tool fails, the error is automatically stored as knowledge (category="error") for future avoidance. Memory tools (remember, recall, etc.) are excluded from logging to avoid noise and recursion.
  • Hook 3: _after_process_query() callback. Called after process_query() completes. Stores both conversation turns (user + assistant) in the conversations table and runs Mem0-style LLM extraction (ADD/UPDATE/DELETE/NOOP operations against existing memory). For turns ≥ 20 words, the extraction pipeline fetches top-10 relevant existing items via _hybrid_search(), then asks the LLM to decide what operations to perform, with no regex heuristic fallback.

KV-Cache Frozen Prefix Design

The system prompt is deliberately split into two parts:
| Part | Method | Where injected | Changes between turns? |
| --- | --- | --- | --- |
| Stable prefix | get_memory_system_prompt() | System prompt via _get_mixin_prompts() | No: frozen for KV-cache reuse |
| Dynamic context | get_memory_dynamic_context() | Prepended to user message each turn | Yes: current time, upcoming items |
This design allows LLM inference engines (like Lemonade Server) to cache the attention computations for the system prompt and reuse them across conversation turns. Only the small dynamic section (typically 2-5 lines) changes per turn.
# Simplified flow inside MemoryMixin.process_query():
def process_query(self, user_input, **kwargs):
    self._original_user_input = user_input           # Save clean version
    dynamic = self.get_memory_dynamic_context()       # Time + upcoming
    augmented = f"{dynamic}\n\n{user_input}" if dynamic else user_input
    return super().process_query(augmented, **kwargs)  # System prompt stays frozen

MemoryStore

The pure data layer. Agent-agnostic — no imports from gaia.agents. Thread-safe via threading.Lock. Uses WAL mode for concurrent reads.

Constructor

class MemoryStore:
    def __init__(self, db_path: Path = None):
        """Open or create the database.
        Default path: ~/.gaia/memory.db
        Uses WAL mode. Thread-safe."""

Database Schema (v2)

Three tables in a single SQLite file. Schema version 2 adds vector embedding support and fact lineage tracking.
  • conversations — every conversation turn, persistent across sessions, with FTS5 index. v2 adds consolidated_at TEXT column for tracking which turns have been distilled to knowledge.
  • knowledge — persistent facts, preferences, errors, skills with FTS5 index, confidence scoring, context scoping, entity linking, temporal fields (due_at, reminded_at). v2 adds embedding BLOB (768-dim float32 vector) and superseded_by TEXT (fact lineage — ID of newer item that replaced this one).
  • tool_history — every tool call the agent makes, auto-logged with success/failure, duration, error messages
Schema migrations run automatically in MemoryStore.__init__(). v1 → v2 adds:
ALTER TABLE knowledge ADD COLUMN embedding BLOB;
ALTER TABLE knowledge ADD COLUMN superseded_by TEXT;
ALTER TABLE conversations ADD COLUMN consolidated_at TEXT;

Knowledge Methods

store()

def store(
    self,
    category: str,              # 'fact' | 'preference' | 'error' | 'skill' | 'note' | 'reminder'
    content: str,               # Human-readable description
    domain: str = None,         # Optional grouping (e.g., 'python', 'deployment')
    metadata: dict = None,      # JSON blob for structured data
    confidence: float = 0.5,    # 0.0 to 1.0
    due_at: str = None,         # ISO 8601 for time-sensitive items
    source: str = "tool",       # 'tool' | 'llm_extract' | 'error_auto' | 'user' | 'discovery' | 'consolidation'
    context: str = "global",    # Context scope
    sensitive: bool = False,    # Exclude from system prompt if True
    entity: str = None,         # Entity reference (e.g., 'person:sarah_chen')
) -> str:                       # Returns knowledge_id (UUID)
Deduplication: If a new entry has >80% word overlap (Szymkiewicz-Simpson coefficient) with an existing entry in the same category + context + entity scope, the existing entry is updated with the newer content. The newer fact is assumed to be more current.
Validation: content must be non-empty (raises ValueError otherwise). Content longer than 2000 characters is silently truncated. due_at, if provided, is normalized to timezone-aware ISO 8601.
Embedding: After storage, MemoryMixin immediately embeds the new item via _embed_text() and writes the embedding BLOB back via store_embedding(). The FAISS index is incrementally updated.
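The Szymkiewicz-Simpson (overlap) coefficient divides the size of the intersection by the size of the smaller set. A minimal sketch over word sets follows; the tokenization (lowercase, whitespace split) and the function names are assumptions, not the store's actual implementation.

```python
def overlap_coefficient(a: str, b: str) -> float:
    """Szymkiewicz-Simpson coefficient over the word sets of two strings."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / min(len(words_a), len(words_b))


def is_duplicate(new_content: str, existing_content: str,
                 threshold: float = 0.8) -> bool:
    return overlap_coefficient(new_content, existing_content) > threshold


same = overlap_coefficient("Project uses Python 3.12",
                           "Project uses Python 3.12 with uv")
different = overlap_coefficient("User likes tea", "Project uses Python")
print(same, different)  # 1.0 0.0
```

Because the denominator is the smaller set, a short entry that is fully contained in a longer one scores 1.0, so an enriched restatement of a fact is treated as a duplicate rather than a new item.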

search()

def search(
    self,
    query: str,                     # FTS5 search query
    category: str = None,           # Filter by category
    context: str = None,            # Filter by context
    entity: str = None,             # Filter by entity
    include_sensitive: bool = False, # Include sensitive items
    top_k: int = 5,                 # Max results
    time_from: str = None,          # ISO 8601 lower bound on created_at
    time_to: str = None,            # ISO 8601 upper bound on created_at
) -> List[Dict]:
Pure FTS5 keyword search with BM25 ranking. Uses AND semantics by default; if zero results, falls back to OR. Bumps confidence +0.02 on each recalled item. Filters on superseded_by IS NULL to return only current/active items. The time_from and time_to parameters add temporal filtering on created_at, narrowing results before BM25 ranking.
This is the keyword component of search. For full hybrid search (vector + BM25 + RRF + cross-encoder reranking), use MemoryMixin._hybrid_search(), which calls this method internally as one of its two retrieval signals.

get_by_category()

def get_by_category(
    self,
    category: str,
    context: str = None,
    limit: int = 10,
) -> List[Dict]:
Filters on superseded_by IS NULL to return only current/active items.

get_by_entity()

def get_by_entity(
    self,
    entity: str,          # e.g., 'person:sarah_chen'
    limit: int = 20,
) -> List[Dict]:
Returns all knowledge linked to a specific entity. Filters on superseded_by IS NULL to return only current/active items.

get_upcoming()

def get_upcoming(
    self,
    within_days: int = 7,
    include_overdue: bool = True,
    context: str = None,
    limit: int = 10,
) -> List[Dict]:
Returns time-sensitive items due within N days or overdue. Filters out items that have already been reminded about (unless the due date has passed since the last reminder). Filters on superseded_by IS NULL to return only current/active items.
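One possible reading of the reminder filter is sketched below: an item that was already mentioned stays hidden until it becomes overdue, at which point it surfaces once more if the last reminder predates the due date. The function name and this exact rule are assumptions; the store's SQL may differ.

```python
from datetime import datetime, timedelta, timezone


def should_surface(due_at, reminded_at, now,
                   within_days=7, include_overdue=True):
    """Decide whether a time-sensitive item appears in the upcoming list."""
    overdue = due_at < now
    if overdue and not include_overdue:
        return False
    if due_at > now + timedelta(days=within_days):
        return False  # too far in the future
    if reminded_at is None:
        return True
    # Already reminded: show again only if the due date passed afterwards.
    return overdue and reminded_at < due_at


now = datetime(2026, 3, 25, 10, 30, tzinfo=timezone.utc)
day = timedelta(days=1)
never_reminded = should_surface(now + 2 * day, None, now)
reminded_not_due = should_surface(now + 2 * day, now - day, now)
overdue_since_reminder = should_surface(now - day, now - 3 * day, now)
reminded_after_due = should_surface(now - day, now - timedelta(hours=1), now)
```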

update()

def update(
    self,
    knowledge_id: str,
    content: str = None,
    category: str = None,
    domain: str = None,
    metadata: dict = None,
    context: str = None,
    sensitive: bool = None,
    entity: str = None,
    due_at: str = None,
    reminded_at: str = None,
    superseded_by: str = None,  # ID of newer item that replaces this one
) -> bool:                  # False if ID not found
Only provided fields are changed. Sets updated_at to the current time. When content is updated, the stored embedding is cleared (embedding = NULL) to force re-embedding. The superseded_by parameter is used by the LLM extraction pipeline to mark old items as replaced by newer versions while preserving fact lineage.

delete()

def delete(self, knowledge_id: str) -> bool

apply_confidence_decay()

def apply_confidence_decay(
    self,
    days_threshold: int = 30,
    decay_factor: float = 0.9,
) -> int:                   # Returns number of entries decayed
Multiplies confidence by decay_factor for items not accessed in days_threshold days. Called once per session start via reset_memory_session().

update_confidence()

def update_confidence(self, knowledge_id: str, delta: float) -> None
Adjust confidence by delta, clamped to [0.0, 1.0]. Used internally by reconciliation (+0.05 reinforce, +0.1 contradict newer, -0.1 weaken) and hybrid search (+0.02 per recall for vector-only results).
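The clamping behavior is simple arithmetic. A small sketch, using a hypothetical helper name, shows how the documented deltas behave near the bounds:

```python
def clamp_confidence(current: float, delta: float) -> float:
    """Apply delta and keep the result inside [0.0, 1.0]."""
    return max(0.0, min(1.0, current + delta))


reinforced = clamp_confidence(0.5, 0.05)     # ordinary reinforce boost
capped_high = clamp_confidence(0.95, 0.1)    # contradict boost hits the ceiling
capped_low = clamp_confidence(0.05, -0.1)    # weaken hits the floor
```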

delete_by_source()

def delete_by_source(self, source: str) -> int
Delete all knowledge entries with a given source (e.g., "discovery"). Returns the number of entries deleted. Used by gaia memory bootstrap --reset to clear discovery items.

Conversation Methods

def store_turn(self, session_id: str, role: str, content: str,
               context: str = "global") -> None

def get_history(self, session_id: str = None, context: str = None,
                limit: int = 20) -> List[Dict]

def search_conversations(self, query: str, context: str = None,
                         limit: int = 10) -> List[Dict]

def get_recent_conversations(self, days: int = 7, context: str = None,
                             limit: int = 50) -> List[Dict]

Tool History Methods

def log_tool_call(self, session_id: str, tool_name: str,
                  args: dict, result_summary: str,
                  success: bool, error: str = None,
                  duration_ms: int = None) -> None

def get_tool_errors(self, tool_name: str = None,
                    limit: int = 10) -> List[Dict]

def get_tool_stats(self, tool_name: str) -> Dict
    # Returns: {total_calls, success_rate, avg_duration_ms, last_error}

Dashboard Methods

Aggregate queries for the Memory Dashboard UI:
def get_stats(self) -> Dict
    # Returns counts by category, context, entity, conversations, tools, temporal

def get_all_knowledge(self, category=None, context=None, entity=None,
                      sensitive=None, search=None, sort_by="updated_at",
                      order="desc", offset=0, limit=50,
                      include_superseded=False) -> Dict
    # Returns: {"items": [...], "total": 142, "offset": 0, "limit": 50}
    # When include_superseded=False (default), filters on superseded_by IS NULL

def get_tool_summary(self) -> List[Dict]
    # Per-tool stats: total_calls, success_rate, avg_duration_ms, last_error

def get_activity_timeline(self, days: int = 30) -> List[Dict]
    # Daily activity counts for the last N days

def get_recent_errors(self, limit: int = 20) -> List[Dict]

def prune(self, days: int = 90) -> Dict
    # Delete tool_history and conversations older than N days.
    # Also prunes low-confidence knowledge (confidence < 0.1) last used > N days ago.
    # Returns: {"tool_history_deleted": N, "conversations_deleted": N, "knowledge_deleted": N}
    # Called automatically on agent startup via init_memory().

def rebuild_fts(self) -> None
    # Rebuild all FTS5 indexes from source tables.
    # Use if search results seem wrong or incomplete.
    # Also available via POST /api/memory/rebuild-fts

Embedding & Vector Methods (v2)

def store_embedding(self, knowledge_id: str, embedding: bytes) -> bool
    # Store a float32 embedding BLOB for a knowledge item.
    # Called by MemoryMixin after store() to persist the vector.
    # Returns False if knowledge_id not found.

def get_items_with_embeddings(
    self,
    category: str = None,
    context: str = None,
    entity: str = None,
    include_sensitive: bool = False,
    top_k: int = 100,
    time_from: str = None,          # ISO 8601 lower bound on created_at
    time_to: str = None,            # ISO 8601 upper bound on created_at
) -> List[Dict]
    # Returns knowledge items that have embeddings (embedding IS NOT NULL,
    # superseded_by IS NULL). Includes the embedding BLOB in each dict.
    # Used to build/rebuild the FAISS index and for filtered vector retrieval.

def get_items_without_embeddings(self, limit: int = 100) -> List[Dict]
    # Returns knowledge items missing embeddings (embedding IS NULL).
    # Used by _backfill_embeddings() during startup.
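Since store_embedding() takes bytes, a caller needs some way to serialize a vector as float32. One way to do that with only the standard library is shown below; the helper names and the little-endian byte order are assumptions, as the reference does not state the on-disk layout.

```python
import struct

EMBEDDING_DIM = 768  # dimensionality stated for the embedding model


def to_blob(vector):
    """Pack a sequence of floats as little-endian float32 bytes."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_blob(blob):
    """Unpack float32 bytes back into a list of Python floats."""
    count = len(blob) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", blob))


blob = to_blob([0.0] * EMBEDDING_DIM)
round_trip = from_blob(to_blob([0.5, -1.0, 0.25]))
print(len(blob))  # 3072
```

A 768-dimensional float32 vector therefore occupies 3072 bytes per knowledge item.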

Consolidation Methods (v2)

def get_unconsolidated_sessions(
    self,
    older_than_days: int = 14,
    min_turns: int = 5,
    limit: int = 5,
) -> List[str]
    # Returns session_ids eligible for consolidation.
    # Criteria: all turns > older_than_days old, >= min_turns, at least one
    # turn with consolidated_at IS NULL.

def mark_turns_consolidated(self, turn_ids: List[int]) -> int
    # Sets consolidated_at = now for the given conversation turn IDs.
    # Returns count of turns marked. Turns remain until 90-day prune;
    # consolidated_at prevents re-processing.

Reconciliation Methods (v2)

def get_items_for_reconciliation(
    self,
    context: str = None,
    limit: int = 100,
) -> List[Dict]
    # Returns active knowledge items (superseded_by IS NULL) with embeddings,
    # suitable for pairwise similarity comparison during reconciliation.

def get_sessions(self, limit: int = 20) -> List[Dict]
    # List conversation sessions with turn counts and preview text.
    # Used by the Memory Dashboard.

def get_entities(self, limit: int = 100) -> List[Dict]
    # List all unique entities with knowledge counts and last update time.
    # Returns: [{"entity": "person:sarah_chen", "count": 5, "last_updated": "..."}, ...]

def get_contexts(self, limit: int = 100) -> List[Dict]
    # List all contexts with knowledge counts.
    # Returns: [{"context": "work", "count": 42}, ...]

SystemDiscovery

Local system scanner for day-zero bootstrap. Returns lists of discovered facts for user review — nothing is stored directly.
from gaia.agents.base.discovery import SystemDiscovery

discovery = SystemDiscovery()
results = discovery.scan_all()
# results is a Dict[str, List[Dict]] — source name → list of discovered facts
# Example: {"file_system": [...], "git_repos": [...], "installed_apps": [...]}

# To iterate all findings:
findings = []
for source_name, items in results.items():
    findings.extend(items)

# Each item dict:
# {content, category, context, entity, sensitive, confidence, source, approved}

Methods

| Method | What it reads | What it returns |
| --- | --- | --- |
| scan_file_system(paths) | Folder names + file extensions in project directories | Project names, languages used |
| scan_git_repos(paths) | .git/config files: remotes, branch names | Repo names, languages, remote URLs |
| scan_installed_apps() | Windows registry, Start Menu shortcuts | App inventory |
| scan_browser_bookmarks() | Chrome/Edge/Firefox bookmark files | Categorized sites and interests |
| scan_browser_history(days) | Browser history DBs (URLs only, no page content) | Top domains (all flagged sensitive) |
| scan_email_accounts() | Windows credential store, addresses only | Email addresses (all flagged sensitive) |
Each method returns dicts like:
{
    "content": "Project 'gaia' -- Python/TypeScript, github.com/amd/gaia",
    "category": "fact",
    "context": "work",
    "entity": "project:gaia",
    "sensitive": False,
    "confidence": 0.4,       # Lower than user-stated (inferred)
    "source": "discovery",
    "approved": None,        # Set by user review: True/False
}
Discovery never reads file contents, email content, or browser page content. It reads names, extensions, URLs, and metadata only. All browser history and email items are auto-flagged as sensitive.
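Because nothing is stored until the user reviews it, some code has to pick out the approved findings. A minimal sketch of that selection step follows, operating on a hand-written dict in the shape scan_all() is described as returning; the function name and the sample data are hypothetical.

```python
def approved_findings(results):
    """Flatten scan results and keep only items the user approved."""
    kept = []
    for source_name, items in results.items():
        for item in items:
            # approved is None until reviewed; only an explicit True passes.
            if item.get("approved") is True:
                kept.append(item)
    return kept


sample_results = {
    "git_repos": [
        {"content": "Project 'gaia'", "category": "fact", "approved": True},
        {"content": "Project 'scratch'", "category": "fact", "approved": None},
    ],
    "browser_history": [
        {"content": "Top domain: example.com", "category": "fact",
         "approved": False},
    ],
}
to_store = approved_findings(sample_results)
```

Each kept item shares its field names (category, content, context, entity, sensitive, confidence, source) with the parameters of MemoryStore.store(), so it could be passed along field by field after review.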

Code Examples

Minimal Agent with Memory

from gaia.agents.base.agent import Agent
from gaia.agents.base.memory import MemoryMixin


class RememberBot(MemoryMixin, Agent):
    """A simple agent that remembers everything."""

    def __init__(self):
        self.init_memory(context="global")
        super().__init__(
            name="RememberBot",
            system_prompt="You are a helpful assistant with persistent memory.",
        )

    def _register_tools(self):
        super()._register_tools()
        self.register_memory_tools()


# Usage
agent = RememberBot()
result = agent.process_query("My name is Alex and I prefer concise answers")
# Memory auto-extracts: fact("My name is Alex"), preference("prefer concise answers")
# Next session, system prompt includes these automatically

Switching Contexts

class WorkPersonalAgent(MemoryMixin, Agent):
    def __init__(self, context="work"):
        self.init_memory(context=context)
        super().__init__(name="DualContext")

    def _register_tools(self):
        super()._register_tools()
        self.register_memory_tools()


agent = WorkPersonalAgent(context="work")
agent.process_query("Remember: deploy with kubectl apply -f prod.yaml")
# Stored in 'work' context

agent.set_memory_context("personal")
agent.process_query("Remember: dentist appointment Thursday at 2pm")
# Stored in 'personal' context

agent.set_memory_context("work")
# System prompt now shows work items only (plus global)
# Personal items are invisible until context switches back

Accessing the Store Directly

from gaia.agents.base.memory_store import MemoryStore

store = MemoryStore()  # Uses ~/.gaia/memory.db

# Store a fact
kid = store.store(
    category="fact",
    content="Project uses Python 3.12 with uv",
    context="work",
    entity="project:gaia",
)

# Search
results = store.search("Python version", context="work")

# Get everything about an entity
gaia_facts = store.get_by_entity("project:gaia")

# Get upcoming items
upcoming = store.get_upcoming(within_days=7)

# Dashboard stats
stats = store.get_stats()
print(f"Total memories: {stats['knowledge']['total']}")
print(f"Tool success rate: {stats['tools']['overall_success_rate']:.0%}")

# Cleanup
store.close()

Custom DB Path

from pathlib import Path

class TestAgent(MemoryMixin, Agent):
    def __init__(self):
        self.init_memory(
            db_path=Path("./test_memory.db"),
            context="testing",
        )
        super().__init__(name="TestAgent")

    def _register_tools(self):
        super()._register_tools()
        self.register_memory_tools()

Memory Tools Reference

These 5 tools are registered by register_memory_tools() and exposed to the LLM:

remember

Store a fact, preference, error, skill, note, or reminder. Supports category, domain, due_at, context, sensitive, entity.

recall

Search memory by query (hybrid: vector + BM25 + cross-encoder), category, context, entity, or time range. Returns results with IDs for use with update/forget.

update_memory

Modify an existing entry by ID. Only non-empty fields change. Use reminded_at="now" after mentioning time-sensitive items.

forget

Delete a specific memory entry by ID.

search_past_conversations

Search conversation history by keywords, time range, or both. Returns matching turns with timestamps and session IDs.

These map to CRUD operations: remember = create, recall = read, update_memory = update, forget = delete, plus search_past_conversations for history.

Knowledge Sources

| Source | How created | Default confidence |
| --- | --- | --- |
| tool | LLM called remember() | 0.5 |
| llm_extract | Auto-extracted by LLM from conversation, Mem0-style | 0.4 |
| error_auto | Auto-stored from tool failure | 0.5 |
| user | Manually created via dashboard | 0.8 |
| discovery | System scan during bootstrap | 0.4 |
| consolidation | Distilled from old conversation sessions | 0.5 |

v2 replaces the v1 heuristic source (regex-based) with llm_extract (Mem0-style LLM extraction). The LLM sees both the conversation and existing memory, then decides what operations to perform (ADD/UPDATE/DELETE/NOOP). This produces higher-quality extractions with proper deduplication and contradiction resolution.
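
To illustrate how a list of ADD/UPDATE/DELETE/NOOP decisions might be applied, here is a minimal sketch against an in-memory dict. The `apply_operations` helper and the operation shape (`op`, `id`, `content`) are assumptions made for illustration; the actual format returned by `_extract_via_llm` is not shown in this document.

```python
# Hypothetical sketch: applying Mem0-style operations to an in-memory store.
# The operation dict shape ("op", "id", "content") is an assumption.

def apply_operations(memory, operations):
    """Apply ADD/UPDATE/DELETE/NOOP operations to a {id: content} dict."""
    next_id = max(memory, default=0) + 1
    for op in operations:
        kind = op["op"]
        if kind == "ADD":
            memory[next_id] = op["content"]
            next_id += 1
        elif kind == "UPDATE":
            # Replace a contradicted or outdated fact in place
            memory[op["id"]] = op["content"]
        elif kind == "DELETE":
            memory.pop(op["id"], None)
        elif kind == "NOOP":
            continue  # fact already known; nothing to change
        else:
            raise ValueError(f"unknown operation: {kind}")
    return memory


memory = {1: "Project uses Python 3.11", 2: "Deploys run on Fridays"}
operations = [
    {"op": "UPDATE", "id": 1, "content": "Project uses Python 3.12 with uv"},
    {"op": "DELETE", "id": 2},
    {"op": "ADD", "content": "Dentist appointment Thursday at 2pm"},
    {"op": "NOOP"},
]
result = apply_operations(memory, operations)
```

Because the LLM sees existing memory before choosing an operation, a restated fact can resolve to NOOP and a contradicted one to UPDATE, which is where the deduplication and contradiction handling come from.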

API Reference

MemoryMixin Methods

| Method | Description |
| --- | --- |
| init_memory(db_path, context) | Initialize memory subsystem (v2: includes embedding validation, FAISS rebuild, reconciliation, consolidation) |
| get_memory_system_prompt() | Stable frozen prefix for system prompt (includes Skills section) |
| get_memory_dynamic_context() | Per-turn time + upcoming items |
| register_memory_tools() | Register 5 LLM-facing tools |
| set_memory_context(context) | Switch active context |
| reset_memory_session() | New session ID + confidence decay |
| _get_embedder() | Lazy-init Lemonade embedding provider |
| _embed_text(text) | Embed text to 768-dim vector via nomic-embed-text-v2 |
| _backfill_embeddings(limit) | Embed items missing embeddings |
| _hybrid_search(query, ...) | Vector + BM25 + RRF + cross-encoder search |
| _classify_query_complexity(query) | Returns adaptive top_k: 3, 5, or 10 |
| _extract_via_llm(user_input, assistant_response, existing_items) | Mem0-style extraction: ADD/UPDATE/DELETE/NOOP |
| consolidate_old_sessions(max_sessions) | Distill old sessions to durable knowledge |
| reconcile_memory(max_pairs) | Detect and resolve contradictory/reinforcing facts |
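
The RRF step named in `_hybrid_search` refers to Reciprocal Rank Fusion, which merges the vector and BM25 rankings before reranking. The sketch below shows the general technique only; the `rrf_fuse` helper is hypothetical, and `k=60` is the commonly used constant, not a value confirmed for GAIA.

```python
# Sketch of Reciprocal Rank Fusion (RRF). The constant k=60 is the
# commonly used default, not a value taken from the GAIA source.

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every item it contains
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by ID for determinism
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


vector_hits = [3, 1, 2]  # hypothetical IDs ranked by vector similarity
bm25_hits = [1, 4, 3]    # hypothetical IDs ranked by BM25
fused = rrf_fuse([vector_hits, bm25_hits])
```

RRF uses only rank positions, so cosine similarities and BM25 scores never need to be normalized onto a common scale. Items that appear in both lists rise to the top.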

MemoryStore Methods

| Method | Description |
| --- | --- |
| store(category, content, ...) | Store knowledge with dedup (v2: embedding follows via store_embedding) |
| search(query, category, ...) | FTS5 keyword search with BM25 ranking (v2: adds time_from/time_to, superseded_by IS NULL filter) |
| get_by_category(category, ...) | Filter by category (v2: superseded_by IS NULL filter) |
| get_by_entity(entity, ...) | Get all knowledge about an entity (v2: superseded_by IS NULL filter) |
| get_upcoming(within_days, ...) | Time-sensitive items (v2: superseded_by IS NULL filter) |
| update(knowledge_id, ...) | Update existing entry (v2: adds superseded_by parameter) |
| delete(knowledge_id) | Delete entry |
| apply_confidence_decay(...) | Decay unused knowledge |
| update_confidence(knowledge_id, delta) | Adjust confidence by delta, clamped to [0.0, 1.0] |
| delete_by_source(source) | Delete all knowledge entries with a given source |
| store_embedding(knowledge_id, embedding) | Store float32 embedding BLOB for a knowledge item (v2) |
| get_items_with_embeddings(...) | Get items that have embeddings for FAISS index (v2) |
| get_items_without_embeddings(limit) | Get items missing embeddings for backfill (v2) |
| get_unconsolidated_sessions(...) | Get session IDs eligible for consolidation (v2) |
| mark_turns_consolidated(turn_ids) | Mark conversation turns as consolidated (v2) |
| get_items_for_reconciliation(...) | Get active items with embeddings for pairwise comparison (v2) |
| store_turn(session_id, ...) | Store conversation turn |
| get_history(session_id, ...) | Get turns for a session |
| search_conversations(query, ...) | FTS5 conversation search |
| get_recent_conversations(days, ...) | Time-based conversation retrieval |
| log_tool_call(session_id, ...) | Log a tool execution |
| get_tool_errors(tool_name, ...) | Recent tool errors |
| get_tool_stats(tool_name) | Per-tool success rate and duration |
| get_stats() | Aggregate dashboard statistics |
| get_all_knowledge(...) | Paginated knowledge browser (v2: adds include_superseded parameter) |
| get_entities(limit) | List all unique entities with counts |
| get_contexts(limit) | List all contexts with counts |
| get_tool_summary() | Per-tool stats for dashboard |
| get_tool_history(tool_name, limit) | Recent call history for one tool |
| get_sessions(limit) | List conversation sessions with previews |
| get_activity_timeline(days) | Daily activity counts |
| get_recent_errors(limit) | Recent errors across all tools |
| prune(days) | Delete old history and low-confidence knowledge |
| rebuild_fts() | Rebuild FTS5 indexes if search seems wrong |
| close() | Close the database connection |
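
As an illustration of what storing a float32 embedding BLOB can look like, the sketch below round-trips a small vector through SQLite using only the standard library. The table schema, the `to_blob`/`from_blob` helpers, and the little-endian byte layout are assumptions for this example, not the actual `store_embedding` implementation.

```python
import sqlite3
import struct

# Hypothetical helpers: pack a vector as little-endian float32 bytes and back.
# The byte layout used by MemoryStore is not documented here; this is a sketch.

def to_blob(vector):
    return struct.pack(f"<{len(vector)}f", *vector)


def from_blob(blob):
    count = len(blob) // 4  # 4 bytes per float32 value
    return list(struct.unpack(f"<{count}f", blob))


conn = sqlite3.connect(":memory:")
# Simplified stand-in for the knowledge table (assumed columns)
conn.execute(
    "CREATE TABLE knowledge (id INTEGER PRIMARY KEY, content TEXT, embedding BLOB)"
)
conn.execute(
    "INSERT INTO knowledge (content) VALUES (?)",
    ("Project uses Python 3.12 with uv",),
)
# The embedding is written in a second step, after the row exists
conn.execute(
    "UPDATE knowledge SET embedding = ? WHERE id = ?",
    (to_blob([0.5, -0.25, 1.0]), 1),
)
blob = conn.execute("SELECT embedding FROM knowledge WHERE id = 1").fetchone()[0]
restored = from_blob(blob)
conn.close()
```

At 4 bytes per value, a 768-dim embedding would occupy 3,072 bytes per row under this layout.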

  • User Guide — What agent memory does, CLI commands, dashboard walkthrough
  • Agent System — Base Agent class that MemoryMixin extends
  • Tool Decorator — How the 5 memory tools are registered
  • Agent UI — Desktop interface with Memory Dashboard
  • Agent SDK — Chat SDK for building conversational agents