Source Code: src/gaia/cli.py
Platform Support
Windows 11
Full GUI and CLI support
Linux
Full GUI and CLI support via source installation (Ubuntu/Debian)
Quick Start
- Windows
- Linux
- Follow the Quickstart to install GAIA
- Open PowerShell and run gaia to launch the Agent UI, or gaia --cli for terminal chat
- GAIA automatically starts Lemonade Server when needed; you can also start it manually
Top-Level Flags
Running gaia with no subcommand launches the Agent UI by default:
| Flag | Description |
|---|---|
--ui | Launch the Agent UI (browser-based chat interface) — this is the default |
--ui-port <port> | Port for the Agent UI server (default: 4200) |
--ui-dist <path> | Serve an alternative prebuilt Agent UI bundle (e.g., a PR preview) instead of the one shipped in the installed package |
--cli | Launch interactive CLI chat |
--base-url <url> | Use a remote Lemonade server (e.g. https://host:13305/api/v1). Also accepted as a global option on most subcommands (e.g. gaia chat --base-url ...). |
gaia chat --ui continues to work as an alias.
The Agent UI requires an AMD Ryzen AI Max (Strix Halo) or an AMD Radeon GPU with ≥ 24 GB VRAM. If your device is not supported, a dismissible warning banner will appear in the UI.
Agent step limit
Agents stop after a maximum number of reasoning/tool steps. The default is 50, applied consistently across the CLI, the Agent UI, and background runs.
- Per-invocation: pass --max-steps <n> to any agent command (e.g. gaia browse --max-steps 80). gaia blender uses --steps <n>.
- Fleet-wide: set the GAIA_AGENT_MAX_STEPS environment variable to change the default for every agent at once, without per-command flags.
An invalid GAIA_AGENT_MAX_STEPS value (non-integer or ≤ 0) fails fast with an actionable error rather than silently capping agents.
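The resolution order described above can be sketched in Python. This is a minimal illustration, not GAIA's actual implementation: `resolve_max_steps` is a hypothetical helper, and the precedence (per-invocation flag first, then `GAIA_AGENT_MAX_STEPS`, then the default of 50) is an assumption drawn from the text.

```python
import os

DEFAULT_MAX_STEPS = 50  # default stated in the text above


def resolve_max_steps(cli_value=None, env=None):
    """Return the effective step limit (hypothetical helper, for illustration).

    Assumed precedence: explicit flag value, then the environment
    variable, then the built-in default.
    """
    env = os.environ if env is None else env
    if cli_value is not None:
        return cli_value
    raw = env.get("GAIA_AGENT_MAX_STEPS")
    if raw is None:
        return DEFAULT_MAX_STEPS
    message = f"GAIA_AGENT_MAX_STEPS must be a positive integer, got {raw!r}"
    try:
        value = int(raw)
    except ValueError:
        # Fail fast instead of silently falling back to the default.
        raise ValueError(message) from None
    if value <= 0:
        raise ValueError(message)
    return value
```

Raising on a bad value, rather than clamping it, mirrors the fail-fast behavior the text describes: a typo in the environment variable is reported instead of quietly changing how long agents run.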
Per-tool execution timeout
Each tool call is bounded so that a hung tool (e.g. a stuck connector or network request) surfaces an actionable error instead of leaving the agent, and the UI, stuck indefinitely. The default is 180 seconds per tool.
- Fleet-wide: set GAIA_AGENT_TOOL_TIMEOUT (seconds) to change the default for every tool at once.
- Per-tool: a tool can declare its own limit with @tool(timeout=...), for example image generation, which may need to download a model on first use.
An invalid GAIA_AGENT_TOOL_TIMEOUT value (non-numeric or ≤ 0) fails fast with an actionable error rather than silently removing the guard.
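As a rough sketch of how such a guard might work, the Python below resolves a timeout and bounds a call with it. Everything here is illustrative: `resolve_tool_timeout`, `call_with_timeout`, and `slow_tool` are hypothetical names, the precedence (per-tool value, then `GAIA_AGENT_TOOL_TIMEOUT`, then 180 seconds) is an assumption, and GAIA's own mechanism may differ.

```python
import concurrent.futures
import os
import time

DEFAULT_TOOL_TIMEOUT = 180.0  # seconds, the default stated above


def resolve_tool_timeout(per_tool=None, env=None):
    """Pick the timeout: per-tool value, then env var, then the default."""
    env = os.environ if env is None else env
    if per_tool is not None:
        return float(per_tool)
    raw = env.get("GAIA_AGENT_TOOL_TIMEOUT")
    if raw is None:
        return DEFAULT_TOOL_TIMEOUT
    message = f"GAIA_AGENT_TOOL_TIMEOUT must be a number > 0, got {raw!r}"
    try:
        value = float(raw)
    except ValueError:
        raise ValueError(message) from None
    if not value > 0:  # also rejects NaN
        raise ValueError(message)
    return value


def call_with_timeout(func, timeout, *args, **kwargs):
    """Run func in a worker thread and give up after `timeout` seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(
            f"tool {func.__name__!r} exceeded {timeout:g}s"
        ) from None
    finally:
        # Do not block on the stuck worker; a thread cannot be killed,
        # so it keeps running in the background until it returns.
        pool.shutdown(wait=False)


def slow_tool(seconds):
    """Stand-in for a hung connector or network request."""
    time.sleep(seconds)
    return "done"
```

A thread-based guard only stops the caller from waiting; a production implementation might run tools in a subprocess so a hung tool can actually be terminated.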
Initialization
Init Command
New users start here! The gaia init command is the easiest way to get GAIA running.
| Option | Type | Default | Description |
|---|---|---|---|
--profile, -p | string | chat | Profile to initialize (minimal, sd, chat, code, rag, vlm, npu, all) |
--minimal | flag | false | Shortcut for --profile minimal |
--skip-models | flag | false | Skip model downloads (only install Lemonade) |
--skip-lemonade | flag | false | Skip Lemonade installation check (for CI with pre-installed Lemonade) |
--force-reinstall | flag | false | Force reinstall even if compatible version exists |
--force-models | flag | false | Force re-download models (deletes then re-downloads each model) |
--yes, -y | flag | false | Skip confirmation prompts (non-interactive) |
--verbose | flag | false | Enable verbose output |
--remote | flag | false | Use remote Lemonade Server (skip local install/start, download models via API). Auto-detected when LEMONADE_BASE_URL is set to a non-localhost URL. |
The --profile flag accepts one of the following values (see src/gaia/cli.py for the canonical choices list):
| Profile | Models | Description | Approx Size |
|---|---|---|---|
minimal | Gemma-4-E4B-it-GGUF | Fast setup with Gemma 4 E4B multimodal model | ~3 GB |
sd | SDXL-Turbo, Gemma-4-E4B-it-GGUF | Image generation with multi-modal AI (LLM + SD) | ~10 GB |
chat | Gemma-4-E4B-it-GGUF, nomic-embed-text-v2-moe-GGUF | Interactive chat with RAG and vision support | ~4 GB |
code | Gemma-4-E4B-it-GGUF | Autonomous coding assistant | ~3 GB |
rag | Gemma-4-E4B-it-GGUF, nomic-embed-text-v2-moe-GGUF | Document Q&A with retrieval | ~4 GB |
vlm | Gemma-4-E4B-it-GGUF | Vision pipeline for document and image extraction | ~3 GB |
npu | gemma4-it-e2b-FLM | Ryzen AI NPU acceleration via FLM backend (requires XDNA2 NPU) | ~3 GB |
all | All models | All models for all agents | ~26 GB |
The talk, blender, jira, and docker agents all rely on the chat profile; there is no dedicated profile for each. Run gaia init --profile chat (or all) to prepare their dependencies. gaia llm quick queries use the global default model (Gemma-4-E4B-it-GGUF).
gaia init supports two modes:
- Local: Lemonade runs on the same machine (default)
- Remote: Lemonade runs on another machine; enable with --remote or by setting LEMONADE_BASE_URL
- Local Mode
- Remote Mode
- Checks Lemonade Server - Detects if installed and verifies version compatibility
- Installs/Upgrades Lemonade - Downloads and installs from GitHub releases (Windows/Linux only). Automatically uninstalls old version if version mismatch detected.
- Starts Server - Ensures Lemonade server is running, prompts to start if not
- Downloads Models - Pulls required models for the selected profile
- Verifies Setup - Tests each model with inference to detect corrupted downloads
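The choice between local and remote mode (the --remote flag, or auto-detection from `LEMONADE_BASE_URL`) can be sketched as follows. This is an illustrative guess at the logic, not the code in src/gaia/cli.py: `is_remote_lemonade` is a hypothetical name, and the set of hostnames treated as local is an assumption.

```python
from urllib.parse import urlparse

# Hostnames treated as "this machine" (an assumption for this sketch).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_remote_lemonade(remote_flag=False, base_url=None):
    """Return True when init should skip the local install/start steps."""
    if remote_flag:
        return True
    if not base_url:
        return False
    host = urlparse(base_url).hostname  # lowercased; None if unparseable
    return host is not None and host not in LOCAL_HOSTS
```

For example, `is_remote_lemonade(base_url="https://host:13305/api/v1")` selects remote mode, while a `http://localhost:13305/api/v1` URL keeps the local steps listed above.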
Install Command
Install individual GAIA components.
| Option | Type | Description |
|---|---|---|
--lemonade | flag | Install Lemonade Server |
--yes, -y | flag | Skip confirmation prompts |
If a different version of Lemonade is already installed, you’ll be prompted to uninstall first.
Uninstall Command
Tiered cleanup of GAIA components. By default, the OS-native uninstall (Add or Remove Programs, drag to Trash, apt remove) only removes the app; user data in ~/.gaia/ is preserved. This command lets you escalate cleanup as far as you want.
| Option | Type | Description |
|---|---|---|
--venv | flag | Remove the Python environment (~/.gaia/venv/) |
--purge | flag | Remove venv + chats + documents + config + logs |
--purge-lemonade | flag | Also remove Lemonade Server (requires --purge) |
--purge-models | flag | Also remove downloaded Lemonade models (requires --purge) |
--purge-hf-cache | flag | Also remove ~/.cache/huggingface/ (requires --purge) |
--dry-run | flag | Show what would be removed without deleting anything |
--yes, -y | flag | Skip confirmation prompts |
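The dependency between the tiers (the three extra --purge-* flags require --purge) could be enforced along these lines. The sketch uses only the flags listed in the table; the parser itself, the `gaia uninstall` program name, and the error wording are assumptions made for illustration.

```python
import argparse


def build_uninstall_parser():
    """Build a parser with the flags from the table above (illustrative)."""
    parser = argparse.ArgumentParser(prog="gaia uninstall")
    parser.add_argument("--venv", action="store_true")
    parser.add_argument("--purge", action="store_true")
    parser.add_argument("--purge-lemonade", action="store_true")
    parser.add_argument("--purge-models", action="store_true")
    parser.add_argument("--purge-hf-cache", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    parser.add_argument("--yes", "-y", action="store_true")
    return parser


def parse_uninstall_args(argv):
    """Parse argv and reject --purge-* extras given without --purge."""
    parser = build_uninstall_parser()
    args = parser.parse_args(argv)
    extras = {
        "--purge-lemonade": args.purge_lemonade,
        "--purge-models": args.purge_models,
        "--purge-hf-cache": args.purge_hf_cache,
    }
    used = [flag for flag, enabled in extras.items() if enabled]
    if used and not args.purge:
        # parser.error prints usage to stderr and exits with status 2.
        parser.error(f"{', '.join(used)} requires --purge")
    return args
```

Combining `--dry-run` with the flags you intend to use is a low-risk way to preview what a given tier would remove.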
Kill Command
Stop running GAIA services.
| Option | Type | Description |
|---|---|---|
--lemonade | flag | Kill Lemonade Server and child processes |
--port | integer | Kill process on specific port |
On Windows, --lemonade also kills orphaned llama-server.exe and lemonade-tray.exe processes.
Agent Command
Author, version, validate, and share GAIA agents.
Developer workflow: init, version, test
Scaffold a new agent package, bump its version, and run the quality gates that publishing requires. The scaffold mirrors the canonical hub package layout (hub/agents/python/summarize/).
init scaffolds a package directory named after <name>:
- Python: gaia-agent.yaml, pyproject.toml (with the gaia.agent entry point), a gaia_agent_<id>/ package (__init__.py + agent.py skeleton), tests/test_agent.py, and README.md.
- C++: gaia-agent.yaml, CMakeLists.txt, src/agent.cpp, tests/, and README.md.
| Option | Type | Description |
|---|---|---|
name | string | Agent id/name — lowercase, hyphens allowed (positional, required) |
--language | choice | python (default) or cpp |
--output, -o | string | Parent directory to create the package in (default: current dir) |
--force | flag | Overwrite an existing package directory |
version bumps the SemVer in gaia-agent.yaml and keeps pyproject.toml / __init__.py in sync.
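For readers unfamiliar with SemVer bumps, the arithmetic is sketched below. This shows the general SemVer rule only: `bump` is a hypothetical helper, the `major`/`minor`/`patch` part names are an assumption rather than documented arguments of the command, and pre-release or build suffixes are deliberately not handled.

```python
import re

# Plain MAJOR.MINOR.PATCH only; suffixes like -rc.1 are out of scope here.
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def bump(version, part):
    """Return the next version after bumping `part` (illustrative helper)."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        return f"{major + 1}.0.0"  # lower components reset to zero
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```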
test runs quality gates in two modes:
| Mode | LLM required? | Checks |
|---|---|---|
--lint (default) | No (CI-safe) | Manifest valid + complete, package structure, Python sources parse + imports resolve, black + isort clean (Python); manifest + structure (C++) |
--live | Yes (Lemonade Server) | Agent answers each declared conversation_starter within --timeout without crashing |
Publishing requires --lint to pass; --live is recommended but not enforced.
Examples:
Lifecycle: configure, health, status
Manage an installed agent: set per-agent configuration, verify it loads, and inspect its state. Configuration is persisted under ~/.gaia/agents/<id>/config.json and survives restarts.
configure writes per-agent settings (e.g. a preferred model). --set values are JSON-decoded when possible (--set temperature=0.2 stores a number, --set verbose=true a boolean), otherwise kept as strings. Settings merge into the existing config by default; pass --replace to overwrite it wholesale, or --show to print the current config without changing it.
health verifies that an installed agent actually loads — its registration resolves and its entry point imports. It reports one of healthy, degraded (loads but something optional is off, e.g. a corrupt config), error (a required piece fails to load), or not_installed, and exits non-zero for error / not_installed so scripts can gate on it.
status aggregates installed version, health, config summary, and source for one agent — or every discovered agent when <id> is omitted.
| Option | Applies to | Description |
|---|---|---|
id | all | Agent id (positional; optional for status, required otherwise) |
--model | configure | Preferred model, stored as the model setting |
--set KEY=VALUE | configure | Set an arbitrary setting (repeatable) |
--replace | configure | Replace the whole config instead of merging |
--show | configure | Print the current config and exit |
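The --set decoding and the merge/replace behavior described above can be sketched as follows. `parse_set` and `apply_settings` are hypothetical names used for illustration; the real command may validate keys or values differently.

```python
import json


def parse_set(pairs):
    """Turn ["KEY=VALUE", ...] into a dict, JSON-decoding values when possible."""
    settings = {}
    for pair in pairs:
        key, separator, raw = pair.partition("=")
        if not separator or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        try:
            settings[key] = json.loads(raw)  # 0.2 -> float, true -> bool
        except json.JSONDecodeError:
            settings[key] = raw  # not valid JSON: keep the raw string
    return settings


def apply_settings(existing, updates, replace=False):
    """Merge updates into the existing config, or replace it wholesale."""
    if replace:
        return dict(updates)
    return {**existing, **updates}
```

With this scheme, `temperature=0.2` is stored as a number and `verbose=true` as a boolean, while a model name such as `Gemma-4-E4B-it-GGUF` is not valid JSON and stays a string.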
Distribution: pack, publish, login
Build a distributable wheel from a Python agent package, then dual-publish it to the Agent Hub (R2, the source for the Hub UI) and PyPI (the source for pip install).
pack runs python -m build --wheel against the package’s pyproject.toml, writing gaia_agent_<id>-<version>-py3-none-any.whl to dist/ and printing its SHA-256. Python agents only — native (C++) agents ship a CMake-built binary, not a wheel.
publish packs the wheel and uploads it to both targets. R2 receives a multipart POST (gaia-agent.yaml + wheel) authenticated with a Bearer token; PyPI receives a twine upload authenticated with an API token. Both enforce version immutability — bump the version with gaia agent version before re-publishing.
login stores publisher tokens in your OS keyring. Tokens may also be supplied via the GAIA_HUB_TOKEN and PYPI_TOKEN environment variables (useful in CI), which take precedence over the keyring.
| Option | Applies to | Description |
|---|---|---|
PATH | pack, publish | Agent package directory (positional, default: current dir) |
--output, -o | pack, publish | Output directory for the wheel (default: <package>/dist) |
--hub-url | publish | R2 Worker origin (default: GAIA_HUB_URL or https://hub.amd-gaia.ai) |
--skip-r2 | publish | Publish to PyPI only |
--skip-pypi | publish | Publish to R2 only |
--hub-token / --hub | login | Store the R2 Hub token (pass a value, or --hub to be prompted) |
--pypi-token / --pypi | login | Store the PyPI token (pass a value, or --pypi to be prompted) |
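The token precedence described for login (environment variables win over the OS keyring) might look like this in Python. `resolve_token` and the `keyring_lookup` callable are hypothetical stand-ins; the sketch does not call any real keyring library.

```python
import os


def resolve_token(env_var, keyring_lookup, env=None):
    """Return the token from the environment if set, else from the keyring.

    `keyring_lookup` is a zero-argument callable standing in for whatever
    keyring access the CLI performs (an assumption for this sketch).
    """
    env = os.environ if env is None else env
    value = env.get(env_var)
    if value:
        return value
    return keyring_lookup()
```

In CI this means exporting `GAIA_HUB_TOKEN` and `PYPI_TOKEN` is enough; no keyring needs to exist on the runner.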
The publish tooling (build, twine) ships in the publish extra:
Installing a published agent
The two publish targets back two install paths, and both register the agent the same way, via the gaia.agent entry point, so the registry discovers it automatically:
From PyPI (pip)
From the Agent Hub
The Hub install path places the agent under ~/.gaia/agents/<id>/ (POST /api/agents/install, backed by gaia.hub.installer). Use it when browsing the Hub UI; use pip install for scripted or headless setups.
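Entry-point discovery of this kind can be sketched with the standard library. The `gaia.agent` group name comes from the text above; `discover_agents` is a hypothetical helper, and GAIA's registry may do more (validation, loading, error handling) than this lookup.

```python
from importlib.metadata import entry_points


def discover_agents(group="gaia.agent"):
    """Map entry-point names to their 'module:attr' targets for one group."""
    found = entry_points()
    if hasattr(found, "select"):
        # Python 3.10+ exposes a selectable collection.
        selected = found.select(group=group)
    else:
        # Python 3.8/3.9 return a plain dict keyed by group name.
        selected = found.get(group, [])
    return {ep.name: ep.value for ep in selected}
```

On a machine with no agents installed the result is an empty dict; each installed agent package adds one entry without any central registration step.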
Automated PyPI publishing (CI)
.github/workflows/publish_agents.yml builds every production agent wheel and
publishes it to PyPI on a version tag (v*). Its matrix is derived from the
agents extra in setup.py (via util/list_agent_packages.py), so adding an
agent there is all it takes to start publishing it. The workflow uploads with
pypa/gh-action-pypi-publish
using the PYPI_API_TOKEN secret and skip-existing: true, so an unchanged
agent version is a no-op rather than an error — PyPI enforces version
immutability natively.
Sharing: export, import
Export every custom agent installed under ~/.gaia/agents/ into a single .zip bundle, and import a bundle produced on another machine.
| Option | Type | Description |
|---|---|---|
--output <path> | string | Destination .zip file (default: ~/.gaia/export.zip) |
| Option | Type | Description |
|---|---|---|
path | string | Path to the .zip bundle to import (positional, required) |
--yes, -y | flag | Skip the interactive trust prompt (required in non-interactive/CI contexts) |
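The trust check that --yes bypasses can be sketched as below. `confirm_import` is a hypothetical function; in particular, raising an error when no terminal is available and --yes was not passed is an assumption about how the "required in non-interactive/CI contexts" rule might be enforced.

```python
def confirm_import(agent_ids, assume_yes=False, interactive=True, ask=input):
    """Return True only when the user has agreed to run the bundled code."""
    if assume_yes:
        return True  # --yes: the caller has accepted the risk up front
    if not interactive:
        raise RuntimeError("non-interactive import requires --yes")
    print("Bundle contains agents: " + ", ".join(agent_ids))
    answer = ask("Import and run this third-party code? [y/N] ")
    # Only an explicit y/yes counts; anything else declines.
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" on empty or unexpected input keeps an accidental Enter key from installing code.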
Importing a bundle runs third-party Python code on your machine. gaia agent import shows the agent IDs in the bundle and requires explicit y/yes confirmation (or --yes) before proceeding.
Core Commands
LLM Direct Query
The fastest way to interact with AI models - no server management required.
| Option | Type | Default | Description |
|---|---|---|---|
--model | string | Client default | Specify the model to use |
--max-tokens | integer | 512 | Maximum tokens to generate |
--no-stream | flag | false | Disable streaming response |
Chat Command
Start an interactive conversation or send a single message with conversation history.
- No message: Starts interactive chat session
- Message provided: Sends single message and exits
| Option | Type | Default | Description |
|---|---|---|---|
--query, -q | string | - | Single query to execute |
--model | string | auto-selected by agent | Override the model used by ChatAgent (None means let the agent pick; see ChatAgentConfig.model_id) |
--index, -i | path(s) | - | PDF document(s) to index for RAG |
--watch, -w | path(s) | - | Directories to monitor for new documents |
--chunk-size | integer | 500 | Document chunk size for RAG |
--max-chunks | integer | 3 | Maximum chunks to retrieve for RAG |
--max-indexed-files | integer | 100 | Maximum number of files to keep indexed. Evicts least-recently-accessed files when LRU eviction is enabled; rejects new files when disabled. |
--max-total-chunks | integer | 10000 | Maximum total chunks across all indexed files. Same eviction behavior as --max-indexed-files. |
--allowed-paths | path(s) | - | Restrict RAG/file tools to these directories (security sandbox) |
--stats, --show-stats | flag | false | Show performance statistics |
--stream | flag | false | Enable streaming responses |
--show-prompts | flag | false | Display prompts sent to LLM |
--debug | flag | false | Enable debug output |
--list-tools | flag | false | List available tools and exit |
--ui | flag | false | Launch the Chat Web UI (browser-based interface on port 4200) |
--ui-port | integer | 4200 | Port for the Agent UI server (used with --ui) |
--ui-dist | path | - | Serve an alternative prebuilt Agent UI bundle (e.g., a PR preview) instead of the one shipped in the package |
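The eviction behavior behind --max-indexed-files can be illustrated with a small least-recently-accessed tracker. This is a sketch of the general technique, not GAIA's RAG index: `IndexedFiles` and its methods are hypothetical, and the same idea would apply to the total-chunk limit.

```python
from collections import OrderedDict


class IndexedFiles:
    """Tracks indexed files in least-recently-accessed order (illustrative)."""

    def __init__(self, max_files=100, lru_eviction=True):
        self.max_files = max_files
        self.lru_eviction = lru_eviction
        self._files = OrderedDict()  # path -> chunk count, oldest access first

    def touch(self, path):
        """Record an access so the file becomes the most recently used."""
        self._files.move_to_end(path)

    def add(self, path, chunks):
        """Index a file and return the paths evicted to make room for it."""
        evicted = []
        if path in self._files:
            self._files[path] = chunks
            self._files.move_to_end(path)
            return evicted
        if len(self._files) >= self.max_files and not self.lru_eviction:
            # Eviction disabled: reject the new file instead.
            raise RuntimeError(f"index is full ({self.max_files} files)")
        while self._files and len(self._files) >= self.max_files:
            oldest, _ = self._files.popitem(last=False)
            evicted.append(oldest)
        self._files[path] = chunks
        return evicted

    def paths(self):
        """Return indexed paths, least recently accessed first."""
        return list(self._files)
```

With a limit of two files, adding a third evicts whichever of the first two was accessed least recently; with eviction disabled, the third file is rejected instead.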
| Command | Description |
|---|---|
/clear | Clear conversation history |
/history | Show conversation history |
/system | Show current system prompt configuration |
/model | Show current model information |
/prompt | Show complete formatted prompt sent to LLM |
/stats | Show performance statistics (tokens/sec, latency, token counts) |
/help | Show available commands |
quit, exit, bye | End the chat session |
Specialized Agent Commands
Use focused agent commands when you want a smaller tool surface than the full chat agent.
| Option | Type | Default | Description |
|---|---|---|---|
--query, -q | string | - | Single query to execute |
--model | string | auto-selected by agent | Override the model used by the agent |
--allowed-paths | path(s) | - | Restrict local file paths used by agent tools |
--stats, --show-stats | flag | false | Show performance statistics |
--stream | flag | false | Enable streaming responses |
--show-prompts | flag | false | Display prompts sent to LLM |
--debug | flag | false | Enable debug output |
--list-tools | flag | false | List available tools and exit |
Prompt Command
Send a single prompt to a GAIA agent.
| Option | Type | Default | Description |
|---|---|---|---|
--model | string | auto | Model ID to use (default: auto-selected by each agent) |
--max-tokens | integer | 512 | Maximum tokens to generate |
--stats | flag | false | Show performance statistics |
Specialized Agents
Code Agent
Code Development
AI-powered code generation, analysis, and linting for Python/TypeScript
- Intelligent Language Detection (Python/TypeScript)
- Code Generation (functions, classes, unit tests)
- Autonomous Workflow (planning → implementation → testing → verification)
- Automatic Test Generation
- Iterative Error Correction
- Code Analysis with AST
- Linting & Formatting
Code Index
gaia-code index builds and queries a local FAISS-backed semantic index over a repository. Embeddings run on AMD NPU/GPU through Lemonade Server. Requires the [rag] extras.
Options at the index level: --repo, --max-files, --model, --base-url, --no-lemonade-check, --use-claude, --use-chatgpt.
→ Full Code Index Documentation
Blender Agent
3D Scene Creation
Natural language 3D modeling and scene manipulation
- Natural Language 3D Modeling
- Interactive Planning
- Object Management
- Material Assignment
- MCP Integration
| Option | Type | Default | Description |
|---|---|---|---|
--query | string | – | Custom query to run instead of the built-in examples |
--interactive | flag | false | Continuously input queries in interactive mode |
--example | string | – | Run a specific built-in example (1-6); omit for interactive mode |
--steps | integer | 5 | Maximum number of steps per query |
--output-dir | path | output | Directory to save output files |
--mcp-port | integer | 9876 | Port for the Blender MCP server |
--debug-prompts | flag | false | Show prompts sent to the LLM |
--print-result | flag | false | Print results to the console |
Email Command
Email Triage
Read, organize, and reply to Gmail with all email content processed locally on your machine.
| Option | Type | Default | Description |
|---|---|---|---|
-q, --query | string | – | Single query to send to the agent (non-interactive). |
-i, --interactive | flag | false | REPL loop until /quit. |
-v, --verbose | flag | false | Emit structured logs for every triage decision and tool call. |
--debug | flag | false | Adds full prompt + LLM-response logging (sensitive payloads in logs). |
Run gaia connectors connect google first; you’ll be asked to grant Gmail and Calendar scopes.
Privacy: all email body inference runs locally on Lemonade — the agent rejects any non-local LLM endpoint at startup.
→ Full Email Triage Agent Documentation
SD Command
Image Generation
Generate images using Stable Diffusion on Ryzen AI
| Option | Type | Default | Description |
|---|---|---|---|
prompt | string | - | Text description of the image to generate |
-i, --interactive | flag | false | Run in interactive mode |
--sd-model | string | SDXL-Turbo | Model: SDXL-Turbo (fast, good quality, default), SD-Turbo (faster, lower quality), SDXL-Base-1.0 (photorealistic, slow), SD-1.5 |
--size | string | auto | Image size: 512x512, 768x768, 1024x1024 (auto-selected per model) |
--steps | integer | auto | Inference steps (auto: 4 for Turbo, 20 for Base) |
--cfg-scale | float | auto | CFG scale (auto: 1.0 for Turbo, 7.5 for Base) |
--output-dir | path | .gaia/cache/sd/images | Directory to save images |
--seed | integer | random | Seed for reproducibility |
--no-open | flag | false | Skip prompt to open image in viewer |
Talk Command
Voice Interaction
Speech-to-speech conversation with optional document Q&A
| Option | Type | Default | Description |
|---|---|---|---|
--model | string | auto | Model ID to use (default: auto-selected by each agent) |
--max-tokens | integer | 512 | Maximum tokens to generate |
--no-tts | flag | false | Disable text-to-speech |
--audio-device-index | integer | auto-detect | Audio input device index |
--whisper-model-size | string | base | Whisper model [tiny, base, small, medium, large] |
--silence-threshold | float | 0.5 | Silence threshold in seconds |
--mic-threshold | float | 0.003 | Microphone amplitude threshold for voice detection (lower = more sensitive) |
--stats | flag | false | Show performance statistics |
--index, -i | path | - | PDF document for voice Q&A |
Jira Command
Jira / Atlassian
Natural-language interface for Jira, Confluence, and Compass using your Atlassian credentials (REST API).
| Option | Type | Default | Description |
|---|---|---|---|
command | string | – | Natural-language command to execute (positional). |
--interactive | flag | false | Continuous interactive mode. |
--mcp-host | string | localhost | MCP bridge host. |
--mcp-port | integer | 8765 | MCP bridge port. |
--verbose | flag | false | Verbose output. |
--debug | flag | false | Debug logging. |
Docker Command
Docker
Natural-language interface for Docker containerization.
| Option | Type | Default | Description |
|---|---|---|---|
command | string | – | Natural-language command to execute (positional). |
--directory | path | current dir | Directory to analyze/containerize. |
--verbose | flag | false | Verbose output. |
--debug | flag | false | Debug logging. |
Summarize Command
Summarize meeting transcripts, emails, and PDFs.
| Option | Type | Default | Description |
|---|---|---|---|
-i, --input | path | – | Input file or directory (required unless --list-configs). |
-o, --output | path | auto | Output file/directory (auto-adjusted to format). |
-t, --type | transcript\|email\|pdf\|auto | auto | Input type. |
-f, --format | json\|pdf\|email\|both | json | Output format (both = json + pdf). |
--styles | one or more styles | executive participants action_items | brief, detailed, bullets, executive, participants, action_items, all. |
--max-tokens | integer | 1024 | Maximum tokens for the summary. |
--email-to | string | – | Recipients (comma-separated) for email output format. |
--email-subject | string | auto | Email subject line. |
--email-cc | string | – | CC recipients (comma-separated). |
--config | path | – | Use a predefined configuration file from configs/. |
--list-configs | flag | false | List available configuration templates and exit. |
--combined-prompt | flag | false | Combine styles into a single LLM call (experimental). |
--no-viewer | flag | false | Don’t auto-open the HTML viewer for JSON output. |
--quiet | flag | false | Minimal output. |
--verbose | flag | false | Detailed/debug output. |
Telegram Command
Telegram Adapter
Bridge a Telegram bot to GAIA so you can chat with your agents from Telegram.
| Subcommand | Option | Default | Description |
|---|---|---|---|
start | --token (required) | – | Telegram bot token. |
start | --allowed-users | allow all | Comma-separated user IDs permitted to interact. |
start | --background | false | Run as a daemon (writes PID + health endpoint). |
stop | --force | false | Force stop if graceful shutdown fails. |
status | --health-host | 127.0.0.1 | Health server host. |
status | --health-port | 8765 | Health server port. |
API Server
API Server
OpenAI-compatible REST API for VSCode and IDE integrations
Quick Start
- Start Lemonade with extended context:
- Start GAIA API server:
- Test the server:
Commands
- Start
- Status
- Stop
- --host - Server host (default: localhost)
- --port - Server port (default: 8080)
- --debug - Enable debug logging
- --show-prompts - Log the prompts sent to the LLM for every request (useful for debugging)
- --streaming - Stream tokens to clients via SSE (OpenAI-style)
- --step-through - Pause between agent steps for manual inspection (development aid)
MCP Client
MCP Client
Connect GAIA agents to external MCP servers
Server configuration is stored in ~/.gaia/mcp_servers.json by default, or in a custom config file given with --config.
Commands
Managing MCP servers (add / remove)
gaia mcp add and gaia mcp remove were removed in #977; MCP servers are now configured through the connectors framework. Run gaia connectors --help for the current commands. gaia mcp list (below) still lists configured servers.
gaia mcp list
List all configured MCP servers.
--config PATH - Custom config file path (default: ~/.gaia/mcp_servers.json)
gaia mcp tools
List tools available from a configured MCP server.
<server-name> - Name of the server to query
--config PATH - Custom config file path (default: ~/.gaia/mcp_servers.json)
gaia mcp test-client
Test connection to a configured MCP server.
<server-name> - Name of the server to test
--config PATH - Custom config file path (default: ~/.gaia/mcp_servers.json)
MCP Bridge
MCP Bridge
Expose GAIA agents as MCP servers
Quick Start
Install MCP support:
Commands
| Command | Description |
|---|---|
start | Start the MCP bridge server |
status | Check MCP server status |
stop | Stop background MCP bridge server |
test | Test MCP bridge functionality (optionally against a specific --tool) |
agent | Invoke the MCP orchestrator agent (positional request arg + --domain, --context) |
docker | Start the Docker MCP server (default port 8080) |
gaia mcp start options
| Option | Default | Description |
|---|---|---|
--host | localhost | Bind address |
--port | 8765 | Port for the MCP bridge |
--auth-token | none | If set, require Authorization: Bearer <token> on every request |
--no-streaming | off | Disable SSE streaming; reply synchronously |
--background | off | Detach; use gaia mcp stop to terminate |
--log-file | stdout | Write logs to this path (useful with --background) |
--verbose | off | Verbose logging |
--ctx-size | 32768 | Context window hint passed to Lemonade for loaded models |
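The effect of --auth-token (every request must carry `Authorization: Bearer <token>`) can be sketched as a header check. `is_authorized` is a hypothetical function written for illustration; the bridge's real request handling is not shown in this document.

```python
import hmac


def is_authorized(headers, auth_token=None):
    """Check a request's Authorization header against the configured token.

    `headers` is a plain dict here; real servers usually offer
    case-insensitive header lookup, which this sketch does not emulate.
    """
    if auth_token is None:
        return True  # no token configured: authentication is disabled
    header = headers.get("Authorization", "")
    scheme, _, presented = header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.strip().encode(), auth_token.encode())
```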
gaia mcp test options
| Option | Default | Description |
|---|---|---|
--query | required | Text to send to the server |
--tool | gaia.chat | Which MCP tool to invoke when testing |
gaia mcp agent
Takes a positional request and, optionally, a --domain (e.g. jira, docker) and a free-form --context string to steer the orchestrator.
gaia mcp docker options
--port defaults to 8080.
→ Full MCP Integration Guide
Connectors
Connectors
OAuth providers, MCP-server connectors, and per-agent scope grants
| Subcommand | Description |
|---|---|
list (alias status) | List connectors in the catalog with their status (--json for machine-readable output) |
connect <id> | Authorize an OAuth connector (opens a browser); --scopes to request specific scopes |
configure <id> | Configure a connector — --set KEY=VALUE (repeatable) or --json '<object>' (e.g. MCP API keys, OAuth client creds) |
test <id> | Run a health check for a configured connector |
disconnect <id> | Remove credentials and reset a connector’s state |
grants {list\|grant\|revoke} | Manage per-agent scope grants (credential access) |
activations {list\|activate\|deactivate} | Manage per-agent MCP tool visibility (mcp_server connectors only) |
Model Management
Download Command
Download all models required for GAIA agents with streaming progress.
| Option | Type | Default | Description |
|---|---|---|---|
--agent | string | all | Agent to download models for |
--list | flag | false | List required models without downloading |
--timeout | integer | 1800 | Timeout per model in seconds |
--host | string | localhost | Lemonade server host |
--port | integer | 13305 | Lemonade server port |
--clear-cache | flag | false | Remove cached download progress/metadata before running |
Pull Command
To download individual models, use the Lemonade Server CLI directly:
Evaluation Commands
Evaluation Framework
Systematic testing, benchmarking, and model comparison
- Agent eval benchmark (scenario-based, end-to-end)
- Auto-fixing failures with Claude Code
- Report generation
- Performance-log visualization
Agent Eval
Agent Eval Benchmark
Scenario-based end-to-end testing of the GAIA Agent UI
| Option | Type | Default | Description |
|---|---|---|---|
--scenario | string | all | Run a specific scenario by ID |
--category | string | all | Run all scenarios in a category |
--backend | string | http://localhost:4200 | Agent UI backend URL |
--model | string | claude-sonnet-4-6 | Judge/simulator model |
--budget | string | 2.00 | Maximum USD spend per scenario |
--timeout | integer | 900 | Per-scenario timeout in seconds (auto-scaled for complex scenarios) |
--audit-only | flag | false | Run architecture audit without LLM calls (free) |
--generate-corpus | flag | false | Regenerate corpus documents and validate manifest.json |
--fix | flag | false | Auto-invoke Claude Code to diagnose and repair failures, then re-evaluate |
--max-fix-iterations | integer | 3 | Maximum repair cycles in --fix mode |
--target-pass-rate | float | 0.90 | Stop fix iterations when judged pass rate reaches this threshold |
--compare | string(s) | — | Compare two scorecard files, or one scorecard against saved baseline |
--save-baseline | flag | false | Save this run’s scorecard as eval/results/baseline.json |
--capture-session | string | — | Convert an Agent UI session (by UUID) into a YAML scenario file |
--scenario-dir | string | — | Additional scenario directory to scan (repeatable) |
--corpus-dir | string | — | Additional corpus directory to scan (repeatable) |
--tag | string | — | Run only scenarios with this tag (repeatable, OR logic) |
--output-format | choice | — | Output format: json, markdown, or junit |
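How --fix, --max-fix-iterations, and --target-pass-rate interact can be sketched as a simple loop. `run_fix_loop`, `evaluate`, and `repair` are hypothetical names; the sketch assumes the run is evaluated once up front and re-evaluated after every repair, which the table implies but does not spell out.

```python
def run_fix_loop(evaluate, repair, max_fix_iterations=3, target_pass_rate=0.90):
    """Repair and re-evaluate until the target is met or iterations run out.

    `evaluate` returns the judged pass rate (0.0 to 1.0); `repair` attempts
    one round of fixes. Returns (final pass rate, repair rounds used).
    """
    pass_rate = evaluate()
    iterations = 0
    while pass_rate < target_pass_rate and iterations < max_fix_iterations:
        repair()
        iterations += 1
        pass_rate = evaluate()
    return pass_rate, iterations
```

A run that already meets the threshold performs no repairs at all, and a run that never improves stops after the configured number of cycles rather than looping indefinitely.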
The eval agent requires Claude Code CLI (claude command), an Anthropic API key, and the Agent UI backend running. See the Agent Eval Guide for full setup instructions.
Email Throughput Benchmark
Measure end-to-end email-triage throughput (tokens/sec), time-to-first-token, and pipeline latency for an on-device model. Direct-drives the email agent over the committed synthetic corpus and harvests metrics from the agent’s per-step stats. The committed bar is ≥10 tok/s (snappy-UX stretch ~30 tok/s); the benchmark is non-gating, so a miss is reported, not failed.
| Option | Type | Default | Description |
|---|---|---|---|
--model | str | Gemma-4-E4B-it-GGUF | Lemonade model id to benchmark |
--limit | int | 50 | Target message count steered to the triage call (effective = min(limit, corpus size)) |
--experiments | int | 1 | Repeat count per model; >1 prints a variance report |
--mbox-path | str | stub fixture | MBOX corpus to triage |
--ground-truth | str | fixture’s | Ground-truth JSON for quality scoring |
--backend | str | agent default | Lemonade base URL |
--output-dir | str | — | Write scorecard.json / summary.md / variance.json |
--compare | path | — | Compare throughput against a saved baseline (non-gating) |
--save-baseline | flag | off | Save this run’s scorecard as the throughput baseline |
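The throughput arithmetic and the non-gating verdict can be sketched as follows. The 10 and 30 tok/s thresholds and the `min(limit, corpus size)` rule come from the text above; the function names, the status labels, and the result layout are assumptions made for this example and do not reflect the real scorecard format.

```python
COMMITTED_BAR = 10.0  # tok/s, the committed bar stated above
STRETCH_BAR = 30.0    # tok/s, the approximate snappy-UX stretch goal


def effective_limit(limit, corpus_size):
    """Message count actually triaged: capped by the corpus size."""
    return min(limit, corpus_size)


def throughput_verdict(generated_tokens, elapsed_seconds):
    """Classify a run; a miss is reported in the result, never raised."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    rate = generated_tokens / elapsed_seconds
    if rate >= STRETCH_BAR:
        status = "stretch"
    elif rate >= COMMITTED_BAR:
        status = "pass"
    else:
        status = "miss"
    return {"tokens_per_sec": rate, "status": status, "gating": False}
```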
Runs from a GAIA repo checkout (it drives the synthetic corpus in tests/fixtures/email/) and needs Lemonade serving the target model. Run at most one gaia eval process at a time against a single Lemonade server.
Performance Visualization
Plot llama.cpp server performance metrics from one or more log files. Plots are saved as images; pass --show to also display them interactively.
| Option | Type | Default | Description |
|---|---|---|---|
log_paths | path(s) | - | One or more llama.cpp server log files to visualize |
--show | flag | false | Display plots interactively in addition to saving images |
Memory
Agent Memory Guide
Persistent second brain — remembers facts, preferences, and workflows across sessions
Commands
- status
- bootstrap
Show aggregate memory statistics.
Output includes:
- Knowledge entries by category (fact, preference, error, skill) and context
- Conversation count and session history
- Tool call success rates and error counts
- Upcoming and overdue time-sensitive items
- Database size
Utility Commands
Stats Command
View performance statistics from the most recent model run.
Test Commands
Run various tests for development and troubleshooting.
- TTS Tests
- ASR Tests
Test Types:
- tts-preprocessing - Test TTS text preprocessing
- tts-streaming - Test TTS streaming playback
- tts-audio-file - Test TTS audio file generation
Options:
- --test-text - Text to use for TTS tests
- --output-audio-file - Output file path (default: output.wav)
YouTube Utilities
Download transcripts from YouTube videos.
- --download-transcript - YouTube URL to download transcript from
- --output-path - Output file path (defaults to transcript_.txt)
Cache Command
Inspect or clear GAIA’s on-disk caches (document-Q&A metadata, chat history, context7 library docs, etc.). Caches live under ~/.gaia/cache/ (Windows: %LOCALAPPDATA%/gaia/cache/).
| Action | Description |
|---|---|
status | Print size and location of each cache directory |
clear | Remove cache contents (scoped by flags below) |
Flags for clear:
| Flag | Description |
|---|---|
--all | Remove every cache directory (safe: preserves models under ~/.gaia/models/) |
--context7 | Clear only the Context7 MCP library-docs cache |
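The platform-dependent cache location mentioned above can be resolved along these lines. `cache_root` is a hypothetical helper; falling back to the home-directory path when `LOCALAPPDATA` is unset on Windows is an assumption, not documented behavior.

```python
import os
import sys
from pathlib import Path


def cache_root(platform=None, env=None, home=None):
    """Return the cache directory for the given (or current) platform."""
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if platform.startswith("win") and env.get("LOCALAPPDATA"):
        return Path(env["LOCALAPPDATA"]) / "gaia" / "cache"
    return home / ".gaia" / "cache"
```

The parameters exist only so the function can be exercised without touching the real environment; calling `cache_root()` with no arguments uses the current platform and home directory.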
Kill Command
Terminate processes running on specific ports.
| Option | Type | Description |
|---|---|---|
--port | integer | Port number to kill process on |
--lemonade | flag | Kill Lemonade server (port 13305) |
- Find the process ID (PID) bound to the specified port
- Forcefully terminate that process
- Provide feedback about success or failure
Diagnostics Command
Bundle system info and logs into a tarball for bug reports.
| Option | Type | Default | Description |
|---|---|---|---|
--output | path | ~/.gaia/diagnostics-<YYYYMMDD-HHMMSS>.tgz | Destination path for the tarball |
--no-logs | flag | false | Omit log files (useful for public bug reports) |
- System info snapshot (uname -a, distro, relevant env vars, all TCP listeners via ss -tlnp)
- State files from ~/.gaia/ (config, session data; no chat content)
- Log files: ~/.gaia/gaia.log and ~/.gaia/electron-main.log (omitted with --no-logs)
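A minimal sketch of the bundling step follows, using only the standard library. The default file name pattern and the --no-logs behavior come from the table above; `default_diagnostics_path`, `build_bundle`, the in-memory file mapping, and the rule that a log is any name ending in `.log` are assumptions made to keep the example self-contained.

```python
import io
import tarfile
from datetime import datetime
from pathlib import Path


def default_diagnostics_path(home, now=None):
    """Build ~/.gaia/diagnostics-<YYYYMMDD-HHMMSS>.tgz for the given time."""
    now = datetime.now() if now is None else now
    return Path(home) / ".gaia" / f"diagnostics-{now:%Y%m%d-%H%M%S}.tgz"


def build_bundle(files, include_logs=True):
    """Pack {archive name: bytes} into a gzip-compressed tarball in memory.

    Names ending in ".log" are treated as logs and skipped when
    include_logs is False (the --no-logs case).
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in sorted(files.items()):
            if not include_logs and name.endswith(".log"):
                continue
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

Leaving logs out, as --no-logs does, is the safer choice when the tarball will be attached to a public issue.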
Global Options
All commands support these global options:
| Option | Type | Default | Description |
|---|---|---|---|
--logging-level | string | INFO | Logging verbosity [DEBUG, INFO, WARNING, ERROR, CRITICAL] |
-v, --version | flag | - | Show program’s version and exit |
Troubleshooting
Connection Errors
If you get connection errors, ensure Lemonade server is running:
Model Issues
- Check available system memory (16GB+ recommended)
- Verify model compatibility
- Pre-download models
- Install additional models: see the Features Guide
Audio Issues
- List available devices
- Verify microphone permissions in Windows settings
- Try different audio device indices if the default doesn’t work
Performance
For optimal NPU performance:
- Disable discrete GPUs in Device Manager
- Ensure NPU drivers are up to date
- Monitor system resources during execution
See Also
Code Agent
Python/TypeScript development
Blender Agent
3D scene creation
Voice Interaction
Speech-to-speech conversation
API Server
OpenAI-compatible REST API
MCP Integration
Model Context Protocol
Evaluation Framework
Testing and benchmarking
Agent Memory
Persistent memory across sessions