Agent Base Class - GAIA SDK

🔧 You are viewing: API Specification - Complete technical reference for the Agent classSee also: Conceptual Guide · Quickstart Tutorial

Component: Agent Base Class
Module: gaia.agents.base.agent
Import: from gaia.agents.base.agent import Agent
Source: src/gaia/agents/base/agent.py

Overview

The Agent class is the foundational base class for all GAIA agents. It provides a complete reasoning loop that connects LLMs with executable tools, manages conversation state, handles error recovery, and orchestrates multi-step planning and execution. What it does:

Manages LLM conversation and reasoning loop
Registers and executes tools
Parses and validates JSON responses from LLMs
Handles multi-step plan execution
Provides error recovery and retry logic
Manages conversation history and state
Supports streaming and non-streaming responses
Integrates with AgentSDK for LLM communication

Why use it:

Foundation for building any AI agent
Handles complex reasoning patterns (planning, execution, error recovery)
Provides tool registration and execution framework
Supports multiple LLM backends (local, Claude, ChatGPT)
Built-in debugging and tracing capabilities

Purpose and Use Cases

When to Use

Building Custom Agents
- Domain-specific assistants (customer service, data analysis, etc.)
- Task automation agents
- Multi-step workflow orchestrators
Tool-Based Applications
- Agents that need to interact with APIs, databases, or file systems
- Applications requiring LLM decision-making with concrete actions
Research and Experimentation
- Testing different prompting strategies
- Evaluating LLM performance on multi-step tasks
- Building agent benchmarks

When NOT to Use

Simple single-turn LLM queries (use AgentSDK directly)
Applications without tools (use LLMClient directly)
Stateless request/response patterns (use quick_chat())

API Specification

Class Definition

from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional

class Agent(ABC):
    """Base class for all GAIA agents."""

    # State constants
    STATE_PLANNING = "PLANNING"
    STATE_EXECUTING_PLAN = "EXECUTING_PLAN"
    STATE_DIRECT_EXECUTION = "DIRECT_EXECUTION"
    STATE_ERROR_RECOVERY = "ERROR_RECOVERY"
    STATE_COMPLETION = "COMPLETION"

Constructor

def __init__(
    self,
    use_claude: bool = False,
    use_chatgpt: bool = False,
    claude_model: str = "claude-sonnet-4-20250514",
    base_url: Optional[str] = None,
    model_id: str = None,
    max_steps: int = 20,
    debug_prompts: bool = False,
    show_prompts: bool = False,
    output_dir: str = None,
    streaming: bool = False,
    show_stats: bool = False,
    silent_mode: bool = False,
    debug: bool = False,
    output_handler = None,
    max_plan_iterations: int = 3,
    max_consecutive_repeats: int = 4,
    min_context_size: int = 32768,
    skip_lemonade: bool = False,
) -> None:
    """
    Initialize the Agent.

    Args:
        use_claude: Use Claude API instead of local LLM
        use_chatgpt: Use ChatGPT/OpenAI API instead of local LLM
        claude_model: Model to use when use_claude=True
        base_url: Local LLM server URL (default: from LEMONADE_BASE_URL or http://localhost:8000/api/v1)
        model_id: Model ID for local LLM (default: Qwen3.5-35B-A3B-GGUF)
        max_steps: Maximum reasoning iterations (default: 20)
        debug_prompts: Include prompts in conversation history (default: False)
        show_prompts: Display prompts sent to LLM (default: False)
        output_dir: Directory for JSON output files (default: current directory)
        streaming: Enable streaming responses (default: False)
        show_stats: Display LLM performance stats (default: False)
        silent_mode: Suppress all console output (default: False)
        debug: Enable debug logging (default: False)
        output_handler: Custom output handler (default: creates AgentConsole or SilentConsole)
        max_plan_iterations: Max plan-execute-replan cycles (default: 3, 0=unlimited)
        max_consecutive_repeats: Max consecutive identical tool calls before stopping (default: 4)
        min_context_size: Minimum context size required for this agent (default: 32768)
        skip_lemonade: Skip Lemonade server initialization (default: False). Use when connecting to a different OpenAI-compatible backend.
    """

Abstract Methods (Must Implement)

@abstractmethod
def _get_system_prompt(self) -> str:
    """
    Return the system prompt for the agent.

    This defines the agent's role, capabilities, and behavior.
    Must be implemented by all subclasses.

    Returns:
        System prompt string
    """
    pass

@abstractmethod
def _create_console(self):
    """
    Create and return the console output handler.

    Returns:
        AgentConsole, SilentConsole, or custom output handler
    """
    pass

@abstractmethod
def _register_tools(self) -> None:
    """
    Register all agent-specific tools.

    Use the @tool decorator to register functions within this method.
    """
    pass

Core Methods

def process_query(
    self,
    user_input: str,
    max_steps: int = None,
    trace: bool = False,
    filename: str = None
) -> Dict[str, Any]:
    """
    Process a user query through the agent reasoning loop.

    Args:
        user_input: User's question or command
        max_steps: Override default max_steps (optional)
        trace: Enable detailed execution tracing (default: False)
        filename: File to write trace output (optional)

    Returns:
        dict: {
            "answer": str,           # Final answer from agent
            "steps": int,            # Number of reasoning steps taken
            "tools_used": List[str], # Tools called during execution
            "success": bool,         # Whether query completed successfully
            "error": str             # Error message if success=False
        }
    """
    pass

def _execute_tool(self, tool_name: str, tool_args: dict) -> Any:
    """
    Execute a registered tool by name (internal — called by the planning loop
    after parsing LLM output). Not part of the public surface.

    For programmatic tool invocation in tests or custom integrations, call
    the tool function directly via the registry:

        from gaia.agents.base.tools import _TOOL_REGISTRY
        result = _TOOL_REGISTRY["my_tool"]["function"](**tool_args)

    Args:
        tool_name: Name of the tool to execute
        tool_args: Dictionary of arguments for the tool

    Returns:
        Tool execution result

    Raises:
        ValueError: If tool_name not registered
    """
    pass

def list_tools(self, verbose: bool = True) -> None:
    """
    Display all registered tools.

    Args:
        verbose: Show full descriptions (default: True)
    """
    pass

Helper Methods

def validate_json_response(self, response_text: str) -> Dict[str, Any]:
    """
    Validate and fix JSON responses from LLM.

    Applies multiple strategies to extract valid JSON:
    1. Parse as-is if valid
    2. Extract from code blocks (```json)
    3. Bracket-matching extraction
    4. Fix common syntax errors

    Args:
        response_text: Raw LLM response

    Returns:
        Parsed JSON dictionary

    Raises:
        ValueError: If JSON cannot be extracted
    """
    pass

def _format_tools_for_prompt(self) -> str:
    """
    Format registered tools for system prompt.

    Returns:
        Formatted string describing all tools
    """
    pass

State Management Properties

# Current execution state
self.execution_state: str  # One of STATE_* constants

# Current plan being executed
self.current_plan: Optional[Dict]

# Current step in plan
self.current_step: int

# Total steps in current plan
self.total_plan_steps: int

# Number of plan iterations
self.plan_iterations: int

# Conversation history
self.conversation_history: List[Dict]

# Error history for learning
self.error_history: List[Dict]

# Last execution result
self.last_result: Any

Implementation Details

Reasoning Loop

The agent implements a sophisticated reasoning loop:

Planning State - LLM creates a multi-step plan
Execution State - Execute plan steps sequentially
Error Recovery State - Handle failures and retry
Completion State - Generate final answer

State Transitions:

PLANNING → EXECUTING_PLAN → COMPLETION
                ↓
         ERROR_RECOVERY → PLANNING (replan)
                ↓
         DIRECT_EXECUTION (simple tools)

Tool Execution

Tools are registered using the @tool decorator in _register_tools():

def _register_tools(self):
    from gaia.agents.base.tools import tool

    @tool
    def my_tool(param: str) -> dict:
        """Tool description."""
        # Implementation
        return {"result": "value"}

JSON Response Format

The LLM must respond with JSON in one of these formats: Thought + Tool:

{
    "thought": "I need to search for information",
    "tool": "search_database",
    "tool_args": {"query": "user data"}
}

Answer:

{
    "thought": "I have all the information needed",
    "answer": "Here is the complete answer..."
}

Plan:

{
    "thought": "This requires multiple steps",
    "plan": [
        {"step": 1, "tool": "search", "args": {...}},
        {"step": 2, "tool": "analyze", "args": {...}}
    ]
}

Error Handling

The agent automatically handles:

Invalid JSON responses (with repair attempts)
Tool execution failures (with retry logic)
Missing required fields (prompts LLM for correction)
Max steps exceeded (graceful termination)

Code Examples

Example 1: Minimal Agent

from gaia.agents.base.agent import Agent
from gaia.agents.base.tools import tool
from gaia.agents.base.console import AgentConsole

class WeatherAgent(Agent):
    """Simple weather information agent."""

    def _get_system_prompt(self) -> str:
        return """You are a weather assistant.
        Use get_weather to fetch current conditions.
        Always include temperature and conditions in your response."""

    def _create_console(self):
        return AgentConsole()

    def _register_tools(self):
        @tool
        def get_weather(city: str) -> dict:
            """Get current weather for a city.

            Args:
                city: Name of the city

            Returns:
                Weather data dictionary
            """
            # Simulate API call
            return {
                "city": city,
                "temperature": 72,
                "conditions": "Sunny",
                "humidity": 45
            }

# Usage
agent = WeatherAgent()
result = agent.process_query("What's the weather in Austin?")
print(result["result"])

Example 2: Database Agent with Multiple Tools

from gaia.agents.base.agent import Agent
from gaia.agents.base.tools import tool
import sqlite3

class CustomerAgent(Agent):
    """Agent for customer relationship management."""

    def __init__(self, db_path: str = "customers.db", **kwargs):
        self.db_path = db_path
        self.conn = sqlite3.connect(db_path)
        super().__init__(**kwargs)

    def _get_system_prompt(self) -> str:
        return """You are a customer service assistant.
        You can search for customers, create new records, and update information.
        Always confirm actions with the user before executing."""

    def _create_console(self):
        from gaia.agents.base.console import AgentConsole
        return AgentConsole()

    def _register_tools(self):
        @tool
        def search_customer(name: str) -> dict:
            """Search for customers by name."""
            cursor = self.conn.execute(
                "SELECT * FROM customers WHERE name LIKE ?",
                (f"%{name}%",)
            )
            customers = [dict(row) for row in cursor.fetchall()]
            return {"customers": customers, "count": len(customers)}

        @tool
        def create_customer(name: str, email: str, phone: str = "") -> dict:
            """Create a new customer record."""
            cursor = self.conn.execute(
                "INSERT INTO customers (name, email, phone) VALUES (?, ?, ?)",
                (name, email, phone)
            )
            self.conn.commit()
            return {"status": "created", "id": cursor.lastrowid}

        @tool
        def update_notes(customer_id: int, notes: str) -> dict:
            """Add notes to a customer record."""
            self.conn.execute(
                "UPDATE customers SET notes = ? WHERE id = ?",
                (notes, customer_id)
            )
            self.conn.commit()
            return {"status": "updated"}

# Usage
agent = CustomerAgent()
result = agent.process_query(
    "Find customer John Smith and add a note that he called today"
)

Example 3: Silent Mode for API

from gaia.agents.base.agent import Agent
from gaia.agents.base.console import SilentConsole

class APIAgent(Agent):
    """Agent for JSON-only API usage."""

    def __init__(self, **kwargs):
        # Enable silent mode
        super().__init__(silent_mode=True, **kwargs)

    def _get_system_prompt(self) -> str:
        return "You are a helpful assistant."

    def _create_console(self):
        return SilentConsole()

    def _register_tools(self):
        # Register tools...
        pass

# Usage in API endpoint
agent = APIAgent()
result = agent.process_query("Process this request")
# No console output, only JSON result
return result

Testing Requirements

Unit Tests

File: tests/agents/base/test_agent.py

import pytest
from gaia.agents.base.agent import Agent
from gaia.agents.base.tools import tool
from gaia.agents.base.console import SilentConsole

class TestAgent(Agent):
    """Test agent implementation."""

    def _get_system_prompt(self):
        return "You are a test assistant."

    def _create_console(self):
        return SilentConsole()

    def _register_tools(self):
        @tool
        def test_tool(param: str) -> dict:
            """Test tool."""
            return {"result": param}

def test_agent_creation():
    """Test agent can be created."""
    agent = TestAgent(silent_mode=True)
    assert agent is not None
    assert agent.system_prompt is not None

def test_tool_registration():
    """Test tools are registered correctly."""
    agent = TestAgent(silent_mode=True)
    agent.list_tools()  # Should not raise

def test_tool_execution():
    """Test tool execution by calling the registered function directly."""
    from gaia.agents.base.tools import _TOOL_REGISTRY
    TestAgent(silent_mode=True)  # registers the tool at instance init
    result = _TOOL_REGISTRY["test_tool"]["function"](param="value")
    assert result["result"] == "value"

def test_json_validation():
    """Test JSON response validation."""
    agent = TestAgent(silent_mode=True)

    # Valid JSON
    json_str = '{"thought": "test", "answer": "response"}'
    result = agent.validate_json_response(json_str)
    assert result["answer"] == "response"

    # JSON in code block
    json_str = '```json\n{"thought": "test", "answer": "response"}\n```'
    result = agent.validate_json_response(json_str)
    assert result["answer"] == "response"

def test_state_management():
    """Test state transitions."""
    agent = TestAgent(silent_mode=True)
    assert agent.execution_state == Agent.STATE_PLANNING

def test_conversation_history():
    """Test conversation history tracking."""
    agent = TestAgent(silent_mode=True)
    assert isinstance(agent.conversation_history, list)
    assert len(agent.conversation_history) == 0

def test_max_steps():
    """Test max steps configuration."""
    agent = TestAgent(max_steps=10, silent_mode=True)
    assert agent.max_steps == 10

Integration Tests

def test_end_to_end_query():
    """Test complete query processing."""
    agent = TestAgent(silent_mode=True)
    result = agent.process_query("Test query")

    assert "answer" in result
    assert "steps" in result
    assert "tools_used" in result
    assert "success" in result

def test_multiple_tools():
    """Test agent using multiple tools."""
    # Implementation depends on specific tools
    pass

def test_error_recovery():
    """Test error recovery behavior."""
    # Implementation testing error handling
    pass

Dependencies

Required Packages

# Standard library
import abc
import datetime
import inspect
import json
import logging
import os
import re
import subprocess
import uuid
from typing import Any, Dict, List, Optional

# GAIA packages
from gaia.agents.base.console import AgentConsole, SilentConsole
from gaia.agents.base.errors import format_execution_trace
from gaia.agents.base.tools import _TOOL_REGISTRY
from gaia.chat.sdk import AgentConfig, AgentSDK

External Dependencies

[project]
dependencies = [
    # Required by AgentSDK
]

Performance Considerations

Memory Management

Conversation history limited by max_history_length in AgentConfig
Error history stored but not automatically pruned
Consider clearing history for long-running agents

Token Usage

Default context window: 4096 tokens
Tool descriptions consume ~100-500 tokens
Monitor token usage with show_stats=True
Use smaller models for simple tasks (Qwen3-0.6B)
Use larger models for complex reasoning (Qwen3.5-35B)

Latency

Streaming mode reduces perceived latency
Tool execution time depends on tool implementation
Local LLM faster than cloud APIs but lower quality
Plan-execute cycles add latency vs direct execution

Migration Notes

From Direct LLM Calls

Before:

from gaia.chat.sdk import quick_chat
response = quick_chat("Do something")

After:

from my_agent import MyAgent
agent = MyAgent()
result = agent.process_query("Do something")

Adding Tools to Existing Agent

# In _register_tools()
@tool
def new_capability(param: str) -> dict:
    """Description of new capability."""
    return {"result": "value"}

Future Enhancements

Parallel tool execution for independent operations
Tool dependency graph for optimal ordering
Automatic tool discovery from modules
Tool result caching
Multi-agent collaboration patterns
Enhanced planning with A* or beam search
Learning from error history
Dynamic system prompt adjustment

Status: ✅ Implemented and tested Last Updated: April 2026 Specification Version: 1.0.0

Core Framework

SDKs

Infrastructure

Code Infrastructure

Tool Mixins

Packaging

Agents & Apps

​Overview

​Purpose and Use Cases

​When to Use

​When NOT to Use

​API Specification

​Class Definition

​Constructor

​Abstract Methods (Must Implement)

​Core Methods

​Helper Methods

​State Management Properties

​Implementation Details

​Reasoning Loop

​Tool Execution

​JSON Response Format

​Error Handling

​Code Examples

​Example 1: Minimal Agent

​Example 2: Database Agent with Multiple Tools

​Example 3: Silent Mode for API

​Testing Requirements

​Unit Tests

​Integration Tests

​Dependencies

​Required Packages

​External Dependencies

​Performance Considerations

​Memory Management

​Token Usage

​Latency

​Migration Notes

​From Direct LLM Calls

​Adding Tools to Existing Agent

​Future Enhancements

Overview

Purpose and Use Cases

When to Use

When NOT to Use

API Specification

Class Definition

Constructor

Abstract Methods (Must Implement)

Core Methods

Helper Methods

State Management Properties

Implementation Details

Reasoning Loop

Tool Execution

JSON Response Format

Error Handling

Code Examples

Example 1: Minimal Agent

Example 2: Database Agent with Multiple Tools

Example 3: Silent Mode for API

Testing Requirements

Unit Tests

Integration Tests

Dependencies

Required Packages

External Dependencies

Performance Considerations

Memory Management

Token Usage

Latency

Migration Notes

From Direct LLM Calls

Adding Tools to Existing Agent

Future Enhancements