GAIA Security Model
Date: 2026-04-01
Status: Planning
Milestones: v0.17.2, v0.18.2, v0.21.0, v0.23.0
Related issues: #94, #438, #447, #459, #461, #559
Prerequisites: None (security is cross-cutting)
1. Executive Summary
This document unifies all security concerns scattered across the GAIA roadmap into a single plan. GAIA enforces a defense-in-depth model: localhost-only communication, sandboxed tool execution, confirmation gates for destructive operations, and a complete audit trail. The plan is organized into ten security domains, each mapped to a specific milestone and GitHub issue. Implementation is phased: foundational guardrails ship in v0.18.2, browser and desktop controls in v0.21.0, and autonomous execution safety in v0.23.0.
2. Threat Model
2.1 Attack Surface
| Surface | Threat | Current State | Target State |
|---|---|---|---|
| Network ports | Remote exploitation | MCP bridge on :8765, API on :8080 (localhost only) | Remain localhost-only; firewall rules enforced |
| MCP tool execution | Malicious or unverified tools | No classification; all tools execute freely | Tiered classification: auto-approve, confirm, deny |
| Shell commands | Arbitrary code execution | Whitelist in ShellToolsMixin (ALLOWED_COMMANDS) | Retain whitelist; extend to all agent types |
| RAG cache | Pickle deserialization (RCE) | pickle.load() on cached data guarded by a GAIA_CACHE_V1 magic header and 500 MB size cap (src/gaia/rag/sdk.py) — shipped in v0.17.2 | Migrate to fully safe serialization / HMAC (#447 follow-up) |
| Credentials | Secret leakage via logs/env | Env vars (ATLASSIAN_SITE_URL, GITHUB_TOKEN, etc.) | Encrypted credential vault |
| Browser automation | Navigation to malicious URLs | No URL restrictions on Playwright MCP | URL allowlist + domain restrictions (#459) |
| Desktop control | Unauthorized system modification | Not yet implemented | Opt-in model with screenshot permissions (#461) |
| Messaging adapters | Prompt injection from untrusted input | Not yet implemented | Restricted tool set, input sanitization (#635) |
| Skill marketplace | Malicious third-party skills | No marketplace yet | Code signing + declared permissions |
2.2 Trust Boundaries
3. Security Architecture Overview
3.1 Localhost-Only Communication
All GAIA services bind exclusively to 127.0.0.1. No public ports are opened.
| Service | Default Bind | Port | Configurable |
|---|---|---|---|
| Lemonade Server | 127.0.0.1 | 8000 | Yes (host/port) |
| MCP Bridge | 127.0.0.1 | 8765 | Yes (env: GAIA_MCP_HOST, GAIA_MCP_PORT) |
| API Server | 127.0.0.1 | 8080 | Yes |
| Electron UI | IPC (no network) | N/A | No |
- The MCP bridge (src/gaia/mcp/mcp_bridge.py) and API server (src/gaia/api/) must reject non-loopback bind addresses unless --dangerous-allow-remote is explicitly passed (v0.23.0).
- The Windows installer and setup wizard should configure Windows Firewall to block inbound connections to GAIA ports.
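The loopback rule above could be enforced with a small check at service startup. The following is a minimal sketch, not the GAIA implementation: the helper name validate_bind_host and its allow_remote parameter (standing in for the --dangerous-allow-remote flag) are hypothetical.

```python
import ipaddress


def validate_bind_host(host: str, allow_remote: bool = False) -> str:
    """Return host if it is acceptable to bind, otherwise raise ValueError.

    Hypothetical helper: allow_remote stands in for --dangerous-allow-remote.
    """
    if host == "localhost":
        return host  # conventional loopback name
    try:
        is_loopback = ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-loopback (assumption).
        is_loopback = False
    if not is_loopback and not allow_remote:
        raise ValueError(f"refusing non-loopback bind address: {host}")
    return host
```

With this shape, 127.0.0.1 and ::1 pass by default, while 0.0.0.0 is rejected unless the caller opts in explicitly.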
3.2 Defense in Depth
4. Tool Execution Guardrails
GitHub issues: #438 (v0.18.2), #559 (v0.23.0)
4.1 Tool Classification Tiers
Every tool registered in _TOOL_REGISTRY (see src/gaia/agents/base/tools.py) is assigned a risk tier:
| Tier | Behavior | Examples |
|---|---|---|
| READ | Auto-approve, no confirmation | search_web, read_file, run_shell_command (read-only whitelist), git status |
| WRITE | Confirm-first popup in UI, y/n in CLI | write_file, run_cli_command (unrestricted), browser_click |
| DESTRUCTIVE | Confirm-first with warning banner | delete_file, git reset, shell rm |
| DENIED | Always blocked, cannot be overridden | os.system, eval, exec, subprocess.Popen (direct) |
4.2 MCP Tool Classification
MCP tools from external servers are classified by name. The whitelists below apply to the three pre-configured MCP servers (Playwright, Brave Search, Fetch).
4.3 Confirmation Flow
Desktop UI (Electron):
- Agent requests tool execution via SSE event.
- UI displays a modal: tool name, arguments (truncated), risk tier.
- User clicks “Allow” or “Deny”.
- Result sent back via MCP bridge POST.
- If denied, agent receives {"status": "denied", "reason": "user_rejected"} and must replan without that tool.
CLI:
- Agent prints tool name and arguments to console.
- Prompt: Execute [tool_name]? (y/n/always):
- “always” adds the tool to session-level auto-approve (not persisted).
4.4 Implementation: @tool Decorator Extension
The risk_tier parameter is stored in _TOOL_REGISTRY[tool_name]["risk_tier"].
The _execute_tool method in src/gaia/agents/base/agent.py checks the tier
before execution and gates on confirmation if required.
5. MCP Security
GitHub issue: #94 (v0.18.2)
5.1 Sandboxed Execution
MCP servers run as child processes (subprocess.run in src/gaia/mcp/external_services.py).
The following hardening applies:
| Control | Implementation | Milestone |
|---|---|---|
| Process isolation | Each MCP server runs in its own subprocess | Existing |
| Timeout enforcement | subprocess.run(..., timeout=30) | Existing |
| Resource limits | CPU and memory cgroup limits (Linux); Job Objects (Windows) | v0.18.2 |
| Filesystem sandbox | MCP servers can only access ~/.gaia/mcp/<server>/ | v0.18.2 |
| Network restriction | MCP servers inherit localhost-only binding | Existing |
| Stderr capture | All stderr logged for audit | v0.18.2 |
5.2 Unknown Tool Default
When a user connects a new MCP server via the MCP Settings UI or ~/.gaia/mcp.json:
- GAIA calls tools/list on the server to discover available tools.
- All discovered tools are classified as CONFIRM by default.
- The user can promote individual tools to AUTO-APPROVE via the Settings UI.
- Promoted tools are stored in ~/.gaia/config.json under mcp.trusted_tools.
5.3 npm Package Verification
MCP servers installed via npx must be verified:
- Package name must be scoped (@modelcontextprotocol/server-* or @anthropic/*).
- Package integrity is checked via npm’s --prefer-offline and lockfile hashes.
- Unscoped packages display a warning: “This MCP server is from an unverified publisher. Proceed?”
- A future AMD Verified badge (v0.24.0) will indicate code-signed packages.
6. Audit Trail
Cross-cutting concern; no single GitHub issue.
6.1 What is Logged
Every tool execution produces an audit record:
| Field | Type | Description |
|---|---|---|
| timestamp | ISO 8601 | When the tool was invoked |
| session_id | UUID | Conversation session identifier |
| tool_name | string | Registry name of the tool |
| tool_source | enum | native, mcp:<server>, skill:<name> |
| risk_tier | enum | read, write, destructive, denied |
| arguments | JSON | Tool arguments (secrets redacted) |
| result_status | enum | success, error, denied, timeout |
| result_summary | string | Truncated result (max 500 chars) |
| user_confirmed | bool | Whether user confirmation was required and granted |
| duration_ms | int | Execution wall-clock time |
| input_channel | enum | cli, desktop_ui, api, messaging:<platform> |
6.2 Storage
Audit logs are stored in SQLite using the existing DatabaseMixin pattern
(src/gaia/database/mixin.py).
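As a rough illustration of the storage shape, the sketch below maps the audit fields from section 6.1 onto a SQLite table. It uses the standard sqlite3 module directly rather than DatabaseMixin, whose API is not shown in this document; the table name audit_log, the column types, and the record_tool_call helper are assumptions.

```python
import json
import sqlite3
from datetime import datetime, timezone

AUDIT_COLUMNS = (
    "timestamp", "session_id", "tool_name", "tool_source", "risk_tier",
    "arguments", "result_status", "result_summary", "user_confirmed",
    "duration_ms", "input_channel",
)

# Hypothetical schema derived from the field table in section 6.1.
AUDIT_SCHEMA = """
CREATE TABLE IF NOT EXISTS audit_log (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp      TEXT NOT NULL,
    session_id     TEXT NOT NULL,
    tool_name      TEXT NOT NULL,
    tool_source    TEXT NOT NULL,
    risk_tier      TEXT NOT NULL,
    arguments      TEXT NOT NULL,
    result_status  TEXT NOT NULL,
    result_summary TEXT,
    user_confirmed INTEGER NOT NULL,
    duration_ms    INTEGER NOT NULL,
    input_channel  TEXT NOT NULL
)
"""


def record_tool_call(conn: sqlite3.Connection, record: dict) -> None:
    """Insert one audit record; arguments are expected to be redacted already."""
    row = dict(record)
    row.setdefault("timestamp", datetime.now(timezone.utc).isoformat())
    row["arguments"] = json.dumps(row["arguments"])
    row["result_summary"] = (row.get("result_summary") or "")[:500]  # max 500 chars
    row["user_confirmed"] = int(row["user_confirmed"])
    placeholders = ", ".join(f":{name}" for name in AUDIT_COLUMNS)
    conn.execute(
        f"INSERT INTO audit_log ({', '.join(AUDIT_COLUMNS)}) VALUES ({placeholders})",
        row,
    )
    conn.commit()
```

Column names come from a fixed tuple and values are bound as named parameters, so nothing from the tool call is interpolated into the SQL text.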
6.3 UI Integration
The Electron UI displays the audit trail in a dedicated “Activity” panel:
- Filterable by session, tool, risk tier, date range.
- Exportable as CSV or JSON.
- Destructive/denied actions are highlighted.
6.4 Secret Redaction
Before writing to the audit log, arguments are scrubbed:
- Any value matching a known credential pattern (API keys, tokens, passwords) is replaced with [REDACTED].
- Fields named password, token, secret, api_key, auth, credential are always redacted regardless of value.
- The redaction function lives in src/gaia/agents/base/security.py.
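The two redaction rules above can be sketched as a recursive scrubber. This is not the function in src/gaia/agents/base/security.py; the name redact and the two value patterns are illustrative assumptions, and a real deployment would need a broader, maintained pattern set.

```python
import re
from typing import Any

REDACTED = "[REDACTED]"

# Field names that are always redacted, taken from the list above.
SENSITIVE_FIELD_NAMES = {"password", "token", "secret", "api_key", "auth", "credential"}

# Illustrative value patterns only (assumption, not an exhaustive list).
CREDENTIAL_PATTERNS = [
    re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),        # GitHub-style token prefix
    re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]{16,}"),    # HTTP bearer credential
]


def redact(value: Any, field_name: str = "") -> Any:
    """Return a copy of value with sensitive fields and matching strings scrubbed."""
    if field_name.lower() in SENSITIVE_FIELD_NAMES:
        return REDACTED  # redacted regardless of value
    if isinstance(value, dict):
        return {key: redact(item, str(key)) for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        for pattern in CREDENTIAL_PATTERNS:
            value = pattern.sub(REDACTED, value)
    return value
```

The function returns a new structure instead of mutating its input, so the tool still receives the original arguments while only the logged copy is scrubbed.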
7. Credential Management
Strategy doc requirement; no dedicated GitHub issue yet.
7.1 Current State (Insecure)
Credentials are stored as environment variables or in plaintext config files:
- ATLASSIAN_SITE_URL, ATLASSIAN_EMAIL, ATLASSIAN_API_TOKEN (Jira)
- GITHUB_TOKEN (GitHub MCP)
- BRAVE_API_KEY (Brave Search MCP)
- PERPLEXITY_API_KEY (web search fallback)
- DISCORD_BOT_TOKEN, SLACK_BOT_TOKEN, TELEGRAM_BOT_TOKEN (messaging)
Environment variables are exposed through /proc/<pid>/environ (Linux) and Get-Process
(Windows), and can leak into logs, crash reports, or child process environments.
7.2 Target: Encrypted Credential Vault
Location: ~/.gaia/credentials.db (SQLite with encrypted values).
7.3 Migration Path
- v0.18.2: Introduce CredentialVault class, support both env vars and vault.
- v0.21.0: UI for managing credentials (add/remove/rotate).
- v0.23.0: Deprecate env var credentials with warning; vault is primary.
- v0.24.0: Env var credentials removed from documentation.
7.4 Logging Rules
- Credentials MUST NEVER appear in log output, audit trail, error messages, or crash reports.
- The format_execution_trace function in src/gaia/agents/base/errors.py must scrub tool arguments before formatting.
- MCP server configs with env keys containing TOKEN, KEY, SECRET, PASSWORD, or CREDENTIAL must mask values in all debug output.
8. Browser Security
GitHub issue: #459 (v0.21.0)
8.1 URL Allowlist
Playwright MCP browser operations are restricted to an allowlist of domains:
8.2 Domain Restriction Enforcement
The Playwright MCP tool wrapper intercepts browser_navigate calls:
- Parse the target URL.
- Check domain against allowed_domains (glob matching).
- Check domain against blocked_domains (always takes precedence).
- If domain is not in either list, prompt user for confirmation.
The file://, javascript:, and data: protocols are always blocked.
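The navigation check described above can be sketched with the standard library. The function name check_navigation and its return values are hypothetical, and the domain lists in the usage note are made-up examples rather than GAIA's configured allowlist.

```python
from fnmatch import fnmatch
from typing import Iterable
from urllib.parse import urlparse

BLOCKED_SCHEMES = {"file", "javascript", "data"}


def check_navigation(url: str,
                     allowed_domains: Iterable[str],
                     blocked_domains: Iterable[str]) -> str:
    """Return "allow", "block", or "confirm" for a browser_navigate target."""
    parsed = urlparse(url)
    if parsed.scheme.lower() in BLOCKED_SCHEMES:
        return "block"
    host = (parsed.hostname or "").lower()
    # blocked_domains is evaluated first so that it always takes precedence.
    if any(fnmatch(host, pattern) for pattern in blocked_domains):
        return "block"
    if any(fnmatch(host, pattern) for pattern in allowed_domains):
        return "allow"
    return "confirm"  # in neither list: ask the user
```

For example, with allowed_domains = ["github.com", "*.github.com"] and blocked_domains = ["evil.github.com"], a URL on docs.github.com is allowed, one on evil.github.com is blocked even though it matches the allow glob, and any other host falls through to user confirmation.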
8.3 Content Security
- Pages that attempt to open new windows or popups are blocked.
- JavaScript alert(), confirm(), prompt() dialogs are auto-dismissed.
- Downloaded files are quarantined to ~/.gaia/downloads/ and never auto-executed.
- Cookie and localStorage data is isolated per session (no persistence across agent restarts).
9. Desktop Control Security
GitHub issue: #461 (v0.21.0)
9.1 Opt-In Model
Desktop control (CUA) capabilities are disabled by default. Users must explicitly enable them:
9.2 Permission Tiers
| Permission | Description | Default | Requires Confirmation |
|---|---|---|---|
| screenshot | Capture screen regions | Off | Yes (first time) |
| mouse_move | Move mouse cursor | Off | Yes (per session) |
| mouse_click | Click at coordinates | Off | Yes (per action) |
| keyboard_type | Type text | Off | Yes (per action) |
| keyboard_shortcut | Press key combinations | Off | Yes (per action) |
| window_manage | Resize, move, close windows | Off | Yes (per action) |
9.3 Screenshot Permissions
Screenshots capture potentially sensitive information. Controls:
- Screenshots are stored in ~/.gaia/screenshots/ with session-scoped filenames.
- Screenshots are auto-deleted after session ends (configurable retention).
- Screenshots are never sent to external services (processed locally via VLM).
- The Electron UI displays a persistent indicator when screenshot capture is active.
9.4 Safety Constraints
- Desktop control actions execute with a minimum 500ms delay between actions (prevents runaway automation).
- A global kill switch: pressing Escape three times rapidly cancels all pending desktop control actions.
- Desktop control is not available over messaging adapters (CLI and desktop UI only).
- CUA sessions are time-bounded (default: 5 minutes, configurable).
10. Dangerous Mode
GitHub issue: #559 (v0.23.0)
10.1 Purpose
Dangerous mode is an explicit opt-in that bypasses tool confirmation gates for autonomous execution. It is designed for advanced users running unattended workflows (scheduled tasks, CI pipelines, batch processing).
10.2 Activation
Dangerous mode requires a deliberate multi-step activation. Dangerous mode cannot be:
- Enabled via config file (must be per-session, explicit).
- Activated over messaging adapters (Discord, Slack, Telegram).
- Activated via the API server (only CLI and desktop UI).
- Combined with desktop control permissions (CUA always requires confirmation).
10.3 What Changes in Dangerous Mode
| Behavior | Normal Mode | Dangerous Mode |
|---|---|---|
| READ tools | Auto-approve | Auto-approve (no change) |
| WRITE tools | Confirm-first | Auto-approve |
| DESTRUCTIVE tools | Confirm-first + warning | Auto-approve + warning logged |
| DENIED tools | Always blocked | Still blocked |
| Audit logging | Active | Active (cannot be disabled) |
| Desktop control | Per-action confirmation | Still per-action (no bypass) |
| MCP unknown tools | Confirm | Auto-approve (with audit) |
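The table above reduces to a small decision function. The sketch below is one way to express it; the Decision enum, the gate function, and the "unknown" tier label for unclassified MCP tools are hypothetical names, and desktop control is simplified to always require confirmation.

```python
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    CONFIRM = "confirm"
    BLOCK = "block"


# "unknown" is a hypothetical label for MCP tools that have no tier yet.
KNOWN_TIERS = {"read", "write", "destructive", "denied", "unknown"}


def gate(risk_tier: str, dangerous_mode: bool = False,
         desktop_control: bool = False) -> Decision:
    """Map a tool's tier and the current mode to an approval decision."""
    if risk_tier not in KNOWN_TIERS:
        raise ValueError(f"unrecognized risk tier: {risk_tier}")
    if risk_tier == "denied":
        return Decision.BLOCK          # never overridable
    if desktop_control:
        return Decision.CONFIRM        # dangerous mode does not bypass CUA
    if risk_tier == "read" or dangerous_mode:
        return Decision.AUTO_APPROVE
    return Decision.CONFIRM
```

Checking the denied tier and desktop control before the dangerous_mode flag keeps the two non-bypassable rows of the table from being reachable by any flag combination. Audit logging is deliberately outside this function because it stays active in both modes.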
10.4 Visual Indicators
- CLI: Red banner [DANGEROUS MODE] in prompt.
- Desktop UI: Red border on chat window, persistent warning badge.
- Audit log: All entries during dangerous mode are tagged dangerous_mode: true.
11. Skill/Plugin Security
Prerequisites for #647 (skill marketplace). The SKILL.md format specification, permission model, security tiers, and sandboxing architecture are defined in skill-format.mdx, the canonical reference for all skill/plugin security. Key security properties:
- Skills declare permissions in YAML frontmatter using domain-scoped syntax (filesystem:read, network:write, etc.)
- Three security tiers: AMD Verified, Community Reviewed, Experimental
- Skills run in isolated context with restricted _TOOL_REGISTRY access
- Dedicated working directory per skill: ~/.gaia/skills/<skill_name>/
- Code signing infrastructure planned for v0.24.0 (#462-#465)
12. Messaging Security
Related issues: #635 (messaging adapters), #559 (dangerous mode exclusion)
12.1 Threat: Untrusted Input
Messaging platforms (Discord, Slack, Telegram) introduce untrusted external input from potentially hostile users. This is fundamentally different from the desktop UI where the local user is trusted.
12.2 Restricted Default Tool Set
Messaging adapters operate with a restricted tool set by default:
12.3 Input Sanitization
All messages from external platforms are sanitized before reaching the agent:
- Length limit: Messages exceeding 4,000 characters are truncated.
- Injection filtering: Known prompt injection patterns are detected and blocked (e.g., “ignore previous instructions”, “system prompt:”, role-switching attempts).
- PII filtering: Outbound responses are scanned for potential PII leakage (email addresses, phone numbers, SSNs). Detected PII is replaced with [PII REDACTED] in responses sent back to the messaging platform.
- Rate limiting: Per-user, per-channel, and global rate limits (see messaging-integrations-plan.mdx Section 7 for configuration).
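The length limit and injection filter for inbound messages could look roughly like the sketch below. It uses regex heuristics, which is only one of the detection options still listed as an open question in this document; the pattern list, the SanitizedMessage type, and the sanitize_inbound name are assumptions.

```python
import re
from dataclasses import dataclass

MAX_MESSAGE_CHARS = 4000

# Heuristic patterns for illustration; easy to evade and not exhaustive.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"system\s+prompt\s*:", re.IGNORECASE),
    re.compile(r"\byou\s+are\s+now\b", re.IGNORECASE),  # crude role-switching check
]


@dataclass
class SanitizedMessage:
    text: str        # text that may be passed on to the agent ("" if blocked)
    truncated: bool  # True if the original exceeded the length limit
    blocked: bool    # True if an injection pattern matched


def sanitize_inbound(message: str) -> SanitizedMessage:
    """Truncate an untrusted message, then block it if a known pattern matches."""
    truncated = len(message) > MAX_MESSAGE_CHARS
    text = message[:MAX_MESSAGE_CHARS]
    blocked = any(pattern.search(text) for pattern in INJECTION_PATTERNS)
    return SanitizedMessage(text="" if blocked else text,
                            truncated=truncated, blocked=blocked)
```

Truncating before matching keeps the work per message bounded, and anything beyond the limit never reaches the agent whether or not it would have matched.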
12.4 Identity Isolation
Each messaging platform user maps to an isolated GAIA session:
- Sessions are keyed by (platform, user_id, channel_id).
- No cross-session data leakage.
- Sessions have a configurable TTL (default: 24 hours).
- Session history is stored in ~/.gaia/messaging/sessions.db, separate from the main audit log.
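The keying and TTL rules above can be illustrated with an in-memory store. This is a sketch only: the real history lives in sessions.db, the SessionStore and SessionKey names are hypothetical, and measuring the TTL from the last activity (rather than from session creation) is an assumption.

```python
import time
from typing import Callable, Dict, NamedTuple, Tuple


class SessionKey(NamedTuple):
    platform: str
    user_id: str
    channel_id: str


class SessionStore:
    """In-memory stand-in for sessions.db, for illustration only."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._sessions: Dict[SessionKey, Tuple[float, dict]] = {}

    def get(self, platform: str, user_id: str, channel_id: str) -> dict:
        """Return the session state for this key, starting fresh if expired."""
        key = SessionKey(platform, user_id, channel_id)
        now = self._clock()
        entry = self._sessions.get(key)
        if entry is None or now - entry[0] > self._ttl:
            state: dict = {"history": []}  # new or expired: nothing carried over
        else:
            state = entry[1]
        self._sessions[key] = (now, state)  # refresh last-seen time
        return state
```

Because the full (platform, user_id, channel_id) tuple is the dictionary key, the same user in two channels, or two users in one channel, never share state.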
13. Pickle Deserialization Vulnerability — Mitigated in v0.17.2
GitHub issue: #447, shipped in v0.17.2 via PR #722.
13.1 Historical Vulnerability
The RAG SDK (src/gaia/rag/sdk.py) uses pickle.load() to deserialize cached
document indexes. Pickle deserialization of untrusted data can lead to
arbitrary code execution.
Attack vector: A malicious PDF is indexed, the resulting cache file is tampered with
(or a crafted .pkl file is placed in ~/.gaia/cache/), and the next time the RAG
SDK loads the cache, arbitrary code executes.
13.2 What v0.17.2 shipped
The RAG cache now prepends a fixed magic header (GAIA_CACHE_V1\n) and enforces
a MAX_CACHE_SIZE (500 MB) before any deserialization attempt. See
src/gaia/rag/sdk.py:44. This closes the drive-by-write attack on
~/.gaia/cache/ without forcing a JSON migration. Future hardening — full
JSON serialization or HMAC-authenticated pickle — remains tracked on #447.
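The header and size checks described in 13.2 can be sketched as follows. This is not the code in src/gaia/rag/sdk.py: the write_cache and load_cache names are hypothetical, and treating 500 MB as 500 * 1024 * 1024 bytes is an assumption.

```python
import os
import pickle
from typing import Any

CACHE_MAGIC = b"GAIA_CACHE_V1\n"
MAX_CACHE_SIZE = 500 * 1024 * 1024  # 500 MB, binary megabytes assumed


def write_cache(path: str, data: Any) -> None:
    """Write the magic header followed by the pickled payload."""
    with open(path, "wb") as handle:
        handle.write(CACHE_MAGIC)
        pickle.dump(data, handle)


def load_cache(path: str) -> Any:
    """Check size and header before any deserialization is attempted."""
    if os.path.getsize(path) > MAX_CACHE_SIZE:
        raise ValueError("cache file exceeds MAX_CACHE_SIZE")
    with open(path, "rb") as handle:
        if handle.read(len(CACHE_MAGIC)) != CACHE_MAGIC:
            raise ValueError("missing GAIA_CACHE_V1 header")
        return pickle.load(handle)
```

Both checks run before pickle.load() is reached, so a bare .pkl file dropped into the cache directory is rejected without being deserialized. A fixed header does not authenticate the payload, which is why the HMAC and JSON follow-ups remain open on #447.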
13.3 Future hardening (tracked as follow-up)
- Immediate (v0.17.2): Replace pickle.dump/pickle.load with json for the cache data structure. The cached data (chunks, full_text, metadata) is JSON-serializable.
- If JSON is insufficient (e.g., numpy arrays in embeddings): Use numpy.save/numpy.load with allow_pickle=False for embedding vectors, and JSON for metadata.
- Cache integrity: Add HMAC-SHA256 verification to cache files. The HMAC key is derived from the file content hash, ensuring tampered caches are rejected.
- Migration: Existing .pkl cache files are invalidated on upgrade. Users will see a one-time re-indexing of their documents.
14. Phased Rollout
Phase 1: v0.17.2 — Critical Fixes
| Item | Issue | Priority | Effort |
|---|---|---|---|
| Fix pickle deserialization in RAG cache | #447 | P0 | 1 day |
| Cache integrity verification (HMAC) | #447 | P0 | 0.5 day |
Phase 2: v0.18.2 — Foundation Guardrails
| Item | Issue | Priority | Effort |
|---|---|---|---|
| Tool risk tier classification system | #438 | P0 | 2 days |
| @tool(risk_tier=...) decorator extension | #438 | P0 | 1 day |
| MCP tool classification (auto-approve/confirm/deny) | #94, #438 | P0 | 2 days |
| Confirmation flow (CLI + desktop UI) | #438 | P0 | 3 days |
| Audit trail SQLite schema + logging | — | P1 | 2 days |
| gaia audit CLI commands | — | P1 | 1 day |
| Secret redaction in logs and audit trail | — | P1 | 1 day |
| MCP server resource limits (timeout, filesystem sandbox) | #94 | P1 | 2 days |
| npm package verification for MCP servers | #94 | P2 | 1 day |
| Credential vault (initial, alongside env vars) | — | P2 | 3 days |
Phase 3: v0.21.0 — Browser and Desktop
| Item | Issue | Priority | Effort |
|---|---|---|---|
| Browser URL allowlist | #459 | P0 | 2 days |
| Domain restriction enforcement for Playwright | #459 | P0 | 1 day |
| Protocol blocking (file://, javascript:, data:) | #459 | P0 | 0.5 day |
| Desktop control opt-in model | #461 | P0 | 2 days |
| Screenshot permission system | #461 | P1 | 1 day |
| Desktop control safety constraints (delay, kill switch) | #461 | P1 | 1 day |
| Credential vault UI (add/remove/rotate) | — | P1 | 2 days |
| Content security (popup blocking, download quarantine) | #459 | P2 | 1 day |
Phase 4: v0.23.0 — Autonomous Execution
| Item | Issue | Priority | Effort |
|---|---|---|---|
| Dangerous mode (CLI + desktop UI) | #559 | P0 | 2 days |
| Dangerous mode exclusions (messaging, API, CUA) | #559 | P0 | 1 day |
| Messaging adapter restricted tool set | #635 | P0 | 1 day |
| Input sanitization for messaging | #635 | P1 | 2 days |
| PII filtering for outbound messaging responses | #635 | P1 | 2 days |
| Prompt injection detection | #635 | P1 | 2 days |
| Deprecate env var credentials (vault primary) | — | P2 | 1 day |
| --dangerous-allow-remote bind flag | — | P2 | 0.5 day |
Phase 5: v0.24.0 — Marketplace Security
| Item | Issue | Priority | Effort |
|---|---|---|---|
| SKILL.md manifest specification | — | P0 | 2 days |
| Permission enforcement for skills | — | P0 | 3 days |
| Code signing infrastructure (AMD Verified) | — | P1 | 5 days |
| Skill sandbox (filesystem isolation, registry validation) | — | P1 | 3 days |
| AMD Verified badge in Agent Hub UI | #462-#465 | P2 | 1 day |
15. GitHub Issue Cross-References
| Issue | Title | Milestone | Security Domain |
|---|---|---|---|
| #94 | MCP security vulnerabilities | v0.18.2 | MCP sandboxing, tool classification |
| #438 | Tool execution guardrails | v0.18.2 | Risk tiers, confirmation flow |
| #447 | Pickle deserialization vulnerability in RAG cache | v0.17.2 | Serialization safety |
| #459 | Browser use security policy and URL allowlisting | v0.21.0 | Browser domain restrictions |
| #461 | Desktop control security policy and opt-in model | v0.21.0 | CUA permissions |
| #559 | Dangerous mode — opt-in guardrail bypass | v0.23.0 | Autonomous execution |
| #612 | Agent registry | v0.18.2 | Tool discovery trust |
| #635 | Messaging adapters | v0.23.0 | Input sanitization, restricted tools |
| #647 | Skill marketplace | v0.24.0 | Code signing, sandboxed skills |
16. Existing Security Measures
The following security measures are already implemented in the codebase:
| Measure | Location | Description |
|---|---|---|
| Shell command whitelist | src/gaia/agents/chat/tools/shell_tools.py | ALLOWED_COMMANDS set restricts CLI tool to read-only commands |
| Git command whitelist | src/gaia/agents/chat/tools/shell_tools.py | Only read-only git subcommands (status, log, diff, etc.) |
| Localhost-only MCP bridge | src/gaia/mcp/mcp.json | GAIA_MCP_HOST defaults to localhost |
| Subprocess timeout | src/gaia/mcp/external_services.py | timeout=30 on MCP subprocess calls |
| Tool registry validation | src/gaia/agents/base/agent.py | Rejects unregistered tool names |
| Required argument checking | src/gaia/agents/base/agent.py | Validates tool arguments before execution |
17. Open Questions
- Credential vault key management: Should the vault encryption key be hardware-backed (TPM/fTPM on AMD platforms) or software-derived (DPAPI/Keychain)? TPM provides stronger security but adds platform-specific complexity.
- Prompt injection detection: What detection model should be used? Options range from regex heuristics to a dedicated classifier model. The classifier approach is more robust but adds latency and model dependency.
- Skill sandboxing depth: Should skills run in a separate Python subprocess (strong isolation, high overhead) or in-process with restricted globals (weaker isolation, low overhead)?
- Audit log retention: What is the default retention period? Options: 30 days, 90 days, unlimited. Unlimited is safest for compliance but grows the database.
- Browser allowlist management: Should the default allowlist be permissive (allow all, block known-bad) or restrictive (block all, allow known-good)? The current proposal is restrictive, which is safer but may frustrate users who browse diverse sites.