Source Code: `src/gaia/cli.py`

## Platform Support
- **Windows 11**: Full GUI and CLI support with installer and desktop shortcuts
- **Linux**: Full GUI and CLI support via source installation (Ubuntu/Debian)
## Quick Start
- **Windows**: Follow the Getting Started Guide to install the `gaia` CLI and the `lemonade` LLM server, then double-click the GAIA-CLI desktop icon to launch the command-line shell.
- **Linux**: Follow the Getting Started Guide to install the `gaia` CLI and the `lemonade` LLM server. GAIA automatically starts Lemonade Server when needed, or start it manually, as sketched below.
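A minimal manual start, assuming a standard Lemonade Server install (the launcher name can differ by install method):

```bash
# Start the Lemonade LLM server on its default port (8000)
lemonade-server serve
```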
## Core Commands
### LLM Direct Query
The fastest way to interact with AI models - no server management required.
| Option | Type | Default | Description |
|---|---|---|---|
| `--model` | string | Client default | Specify the model to use |
| `--max-tokens` | integer | 512 | Maximum tokens to generate |
| `--no-stream` | flag | false | Disable streaming response |
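For example (the `gaia llm` subcommand name is an assumption; the flags match the table above):

```bash
# One-shot query; no server management required
gaia llm "Explain NPU offloading in one paragraph" --max-tokens 256 --no-stream
```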
### Chat Command
Start an interactive conversation or send a single message with conversation history.

- No message: starts an interactive chat session
- Message provided: sends a single message and exits
| Option | Type | Default | Description |
|---|---|---|---|
| `--query, -q` | string | - | Single query to execute |
| `--model` | string | Qwen3-Coder-30B-A3B-Instruct-GGUF | Model name to use |
| `--max-steps` | integer | 10 | Maximum conversation steps |
| `--index, -i` | path(s) | - | PDF document(s) to index for RAG |
| `--watch, -w` | path(s) | - | Directories to monitor for new documents |
| `--chunk-size` | integer | 500 | Document chunk size for RAG |
| `--max-chunks` | integer | 3 | Maximum chunks to retrieve for RAG |
| `--stats` | flag | false | Show performance statistics |
| `--streaming` | flag | false | Enable streaming responses |
| `--show-prompts` | flag | false | Display prompts sent to LLM |
| `--debug` | flag | false | Enable debug output |
| `--list-tools` | flag | false | List available tools and exit |
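For example (assuming the subcommand is `gaia chat`; the file paths are illustrative):

```bash
# Interactive chat with RAG over a PDF, watching a folder for new documents
gaia chat --index ./docs/manual.pdf --watch ./docs --stats

# One-shot query against the same document, then exit
gaia chat -q "Summarize the indexed manual" --index ./docs/manual.pdf
```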
Inside an interactive session, the following commands are available:

| Command | Description |
|---|---|
| `/clear` | Clear conversation history |
| `/history` | Show conversation history |
| `/system` | Show current system prompt configuration |
| `/model` | Show current model information |
| `/prompt` | Show complete formatted prompt sent to LLM |
| `/stats` | Show performance statistics (tokens/sec, latency, token counts) |
| `/help` | Show available commands |
| `quit`, `exit`, `bye` | End the chat session |
### Prompt Command
Send a single prompt to a GAIA agent.

| Option | Type | Default | Description |
|---|---|---|---|
| `--model` | string | Qwen2.5-0.5B-Instruct-CPU | Model to use for the agent |
| `--max-tokens` | integer | 512 | Maximum tokens to generate |
| `--stats` | flag | false | Show performance statistics |
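For example (assuming the subcommand is `gaia prompt`):

```bash
# Send one prompt and report performance statistics
gaia prompt "Write a haiku about heterogeneous compute" --max-tokens 128 --stats
```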
## Specialized Agents
### Code Agent
AI-powered code generation, analysis, and linting for Python/TypeScript:
- Intelligent Language Detection (Python/TypeScript)
- Code Generation (functions, classes, unit tests)
- Autonomous Workflow (planning → implementation → testing → verification)
- Automatic Test Generation
- Iterative Error Correction
- Code Analysis with AST
- Linting & Formatting
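A hypothetical invocation (the exact subcommand and flags are not documented in this section; check `gaia --help`):

```bash
# Ask the code agent to plan, implement, and test a function
gaia code "Write a Python function that validates ISO-8601 timestamps, with unit tests"
```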
### Blender Agent
Natural language 3D modeling and scene manipulation:
- Natural Language 3D Modeling
- Interactive Planning
- Object Management
- Material Assignment
- MCP Integration
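A hypothetical invocation (assumes Blender is running with the agent's MCP integration enabled; the subcommand name is a guess):

```bash
# Describe a scene in plain English and let the agent build it
gaia blender "Create a red cube on a gray plane and add a blue metallic sphere next to it"
```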
### Talk Command
Speech-to-speech conversation with optional document Q&A.
| Option | Type | Default | Description |
|---|---|---|---|
| `--model` | string | Qwen2.5-0.5B-Instruct-CPU | Model to use |
| `--max-tokens` | integer | 512 | Maximum tokens to generate |
| `--no-tts` | flag | false | Disable text-to-speech |
| `--audio-device-index` | integer | auto-detect | Audio input device index |
| `--whisper-model-size` | string | base | Whisper model [tiny, base, small, medium, large] |
| `--silence-threshold` | float | 0.5 | Silence threshold in seconds |
| `--stats` | flag | false | Show performance statistics |
| `--index, -i` | path | - | PDF document for voice Q&A |
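For example (assuming the subcommand is `gaia talk`; the PDF path is illustrative):

```bash
# Voice Q&A over a document with a larger Whisper model and a longer silence cutoff
gaia talk --index ./docs/datasheet.pdf --whisper-model-size small --silence-threshold 0.8
```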
## API Server
OpenAI-compatible REST API for VSCode and IDE integrations.
### Quick Start
1. Start Lemonade with extended context.
2. Start the GAIA API server.
3. Test the server.
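A hedged end-to-end sketch; the `gaia api` subcommand and the Lemonade `--ctx-size` flag are assumptions, and the request body is illustrative:

```bash
# 1. Start Lemonade with a larger context window
lemonade-server serve --ctx-size 32768

# 2. Start the GAIA API server in the background
gaia api start --port 8080 --background

# 3. Exercise the OpenAI-compatible endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
```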
### Commands
- `start`: Start the API server
- `status`: Check API server status
- `stop`: Stop the API server

Options:

- `--host`: Server host (default: localhost)
- `--port`: Server port (default: 8080)
- `--background`: Run in background
- `--debug`: Enable debug logging
## MCP Bridge
Model Context Protocol (MCP) support for external integrations.
### Quick Start
Install MCP support, then manage the bridge with the commands below.

### Commands
| Command | Description |
|---|---|
| `start` | Start the MCP bridge server |
| `status` | Check MCP server status |
| `stop` | Stop background MCP bridge server |
| `test` | Test MCP bridge functionality |
| `agent` | Test MCP orchestrator agent |
| `docker` | Start Docker MCP server |
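For example (assuming these are subcommands of `gaia mcp`):

```bash
# Start the bridge, confirm it is up, then shut it down
gaia mcp start
gaia mcp status
gaia mcp stop
```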
## Model Management
### Download Command
Download all models required for GAIA agents, with streaming progress.

| Option | Type | Default | Description |
|---|---|---|---|
| `--agent` | string | all | Agent to download models for |
| `--list` | flag | false | List required models without downloading |
| `--timeout` | integer | 1800 | Timeout per model in seconds |
| `--host` | string | localhost | Lemonade server host |
| `--port` | integer | 8000 | Lemonade server port |
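For example (assuming the subcommand is `gaia download`; the agent name is illustrative):

```bash
# Preview what the chat agent needs, then download it
gaia download --agent chat --list
gaia download --agent chat
```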
### Pull Command
Download/install a specific model from the Lemonade Server registry.

| Option | Type | Description |
|---|---|---|
| `--checkpoint` | string | HuggingFace checkpoint (e.g., unsloth/Model-GGUF:Q4_K_M) |
| `--recipe` | string | Lemonade recipe (e.g., llamacpp, oga-cpu) |
| `--reasoning` | flag | Mark as reasoning model (like DeepSeek) |
| `--vision` | flag | Mark as having vision capabilities |
| `--embedding` | flag | Mark as embedding model |
| `--reranking` | flag | Mark as reranking model |
| `--mmproj` | string | Multimodal projector file for vision models |
| `--timeout` | integer | Timeout in seconds (default: 1200) |
| `--host` | string | Lemonade server host (default: localhost) |
| `--port` | integer | Lemonade server port (default: 8000) |
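For example (assuming `gaia pull` takes the model name as a positional argument; the model and checkpoint are illustrative):

```bash
# Register and download a custom GGUF model via the llamacpp recipe
gaia pull user.My-Model-GGUF \
  --checkpoint unsloth/Model-GGUF:Q4_K_M \
  --recipe llamacpp
```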
Use the `user.` prefix for custom models not in the official registry. Custom models require both `--checkpoint` and `--recipe` parameters.

## Evaluation Commands
### Evaluation Framework
Systematic testing, benchmarking, and model comparison:
- Ground Truth Generation
- Automated Evaluation
- Batch Experimentation
- Performance Analysis
- Transcript Testing
### Visualize Command
Launch an interactive web-based visualizer for comparing evaluation results.

| Option | Type | Default | Description |
|---|---|---|---|
| `--port` | integer | 3000 | Visualizer server port |
| `--experiments-dir` | path | ./output/experiments | Experiments directory |
| `--evaluations-dir` | path | ./output/evaluations | Evaluations directory |
| `--workspace` | path | current directory | Base workspace directory |
| `--no-browser` | flag | false | Don't auto-open browser |
| `--host` | string | localhost | Host address |
- Interactive Comparison (side-by-side)
- Key Metrics Dashboard
- Quality Analysis
- Real-time Updates
- Responsive Design
Node.js must be installed. Dependencies are automatically installed on first run.
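For example (assuming the subcommand is `gaia visualize`):

```bash
# Serve the dashboard on a custom port without auto-opening a browser
gaia visualize --port 3100 --no-browser
```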
## Utility Commands
### Stats Command
View performance statistics from the most recent model run.

### Test Commands
Run various tests for development and troubleshooting:

- TTS Tests
- ASR Tests

Test types:

- `tts-preprocessing`: Test TTS text preprocessing
- `tts-streaming`: Test TTS streaming playback
- `tts-audio-file`: Test TTS audio file generation

Options:

- `--test-text`: Text to use for TTS tests
- `--output-audio-file`: Output file path (default: output.wav)
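Hedged examples (the `gaia test` invocation style is an assumption; the test type may be a flag rather than a positional argument):

```bash
# Test streaming TTS playback with custom text
gaia test tts-streaming --test-text "Hello from GAIA"

# Generate an audio file instead of playing back
gaia test tts-audio-file --test-text "Hello from GAIA" --output-audio-file output.wav
```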
### YouTube Utilities
Download transcripts from YouTube videos.

- `--download-transcript`: YouTube URL to download transcript from
- `--output-path`: Output file path (defaults to transcript_.txt)
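For example (assuming a `gaia youtube` subcommand; VIDEO_ID is a placeholder):

```bash
# Save a video's transcript to a custom path
gaia youtube --download-transcript "https://www.youtube.com/watch?v=VIDEO_ID" --output-path transcript.txt
```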
### Kill Command
Terminate processes running on a specific port. The command will:

- Find the process ID (PID) bound to the specified port
- Forcefully terminate that process
- Provide feedback about success or failure
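For example (assuming the subcommand is `gaia kill`):

```bash
# Free the default Lemonade port
gaia kill --port 8000
```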
## Global Options
All commands support these global options:

| Option | Type | Default | Description |
|---|---|---|---|
| `--logging-level` | string | INFO | Logging verbosity [DEBUG, INFO, WARNING, ERROR, CRITICAL] |
| `-v, --version` | flag | - | Show program's version and exit |
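For example:

```bash
# Run any command with verbose logging
gaia chat --logging-level DEBUG
```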
## Troubleshooting

### Connection Errors
If you get connection errors, ensure the Lemonade server is running:
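One way to check, assuming a standard Lemonade install (the launcher name and health-endpoint path are assumptions):

```bash
# Probe the server's health endpoint
curl http://localhost:8000/api/v1/health

# If nothing is listening, start the server manually
lemonade-server serve
```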
### Model Issues

- Check available system memory (16GB+ recommended)
- Verify model compatibility
- Pre-download models (see the Download Command above)
- Install additional models: see the Features Guide
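For example, using the Download Command documented above:

```bash
# List required models, then fetch them before first use
gaia download --list
gaia download
```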
### Audio Issues

- List available audio devices
- Verify microphone permissions in Windows settings
- Try different audio device indices if the default doesn't work
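A sketch for finding and selecting a device (assumes the sounddevice Python package is available; the device index is illustrative):

```bash
# Enumerate audio devices and note the index of your microphone
python -c "import sounddevice; print(sounddevice.query_devices())"

# Point voice mode at that device
gaia talk --audio-device-index 1
```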
### Performance
For optimal NPU performance:
- Disable discrete GPUs in Device Manager
- Ensure NPU drivers are up to date
- Monitor system resources during execution
## See Also

- Code Agent: Python/TypeScript development
- Blender Agent: 3D scene creation
- Voice Interaction: Speech-to-speech conversation
- API Server: OpenAI-compatible REST API
- MCP Integration: Model Context Protocol
- Evaluation Framework: Testing and benchmarking