Primary API: create_client() factory function
Module: gaia.llm
Imports:
  • from gaia.llm import create_client (preferred)
  • from gaia.llm import LLMClient, NotSupportedError

Overview

The LLM client package provides a unified interface for generating text from multiple LLM backends using a provider pattern. Each provider implements the abstract LLMClient interface, with optional methods raising NotSupportedError when unavailable.
Key Features:
  • Factory-based client creation with create_client()
  • Three providers: Lemonade (local AMD-optimized), OpenAI, Claude
  • Abstract base class for type safety and extensibility
  • Graceful handling of unsupported features via NotSupportedError
  • Streaming and non-streaming generation
  • Backward-compatible use_claude/use_openai flags
Provider Capabilities:
Method                    Lemonade   OpenAI   Claude
generate()                    ✓          ✓        ✓
chat()                        ✓          ✓        ✓
embed()                       ✓          ✓        ✗
vision()                      ✓          ✗        ✓
get_performance_stats()       ✓          ✗        ✗
load_model()                  ✓          ✗        ✗
unload_model()                ✓          ✗        ✗
Methods marked with ✗ raise NotSupportedError when called on that provider.
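
For quick feature-detection without try/except, the matrix above can be mirrored as a plain lookup table. This is an illustrative sketch, not part of the gaia.llm API; the names CAPABILITIES and supports() are hypothetical:

```python
# Illustrative capability matrix mirroring the table above (not part of gaia.llm).
CAPABILITIES = {
    "lemonade": {"generate", "chat", "embed", "vision",
                 "get_performance_stats", "load_model", "unload_model"},
    "openai": {"generate", "chat", "embed"},
    "claude": {"generate", "chat", "vision"},
}

def supports(provider: str, method: str) -> bool:
    """Return True if the given provider implements the given method."""
    return method in CAPABILITIES.get(provider.lower(), set())

print(supports("openai", "embed"))   # True
print(supports("claude", "embed"))   # False
```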

Requirements

Functional Requirements

  1. Factory Pattern
    • create_client() factory function for client creation
    • Explicit provider selection via provider parameter (“lemonade”, “openai”, “claude”)
    • Backward-compatible use_claude/use_openai flags
    • Auto-detection of provider from flags when provider not specified
    • Default to Lemonade provider when no flags set
  2. Abstract Interface
    • LLMClient ABC defines unified interface
    • provider_name property returns provider name
    • Required methods (all providers must implement):
      • generate() - Text completion
      • chat() - Chat completion with message history
    • Optional methods (raise NotSupportedError if not implemented):
      • embed() - Generate embeddings
      • vision() - Vision/image understanding
      • get_performance_stats() - Performance statistics
      • load_model() - Load a model
      • unload_model() - Unload current model
  3. Provider Implementations
    • LemonadeProvider: Full support for all methods, connects to local Lemonade server
    • OpenAIProvider: generate, chat, embed only
    • ClaudeProvider: generate, chat, vision only
    • All providers support streaming and non-streaming modes
  4. Error Handling
    • NotSupportedError raised for unsupported methods
    • Clear error messages indicating provider and unsupported method
    • Connection errors handled by underlying provider implementations
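
The selection rules in item 1 can be condensed into a pure function. This sketch mirrors the documented behavior (explicit provider wins, the two flags are mutually exclusive, Lemonade is the default); it is an illustration, not the actual factory code:

```python
# Sketch of the documented provider-selection rules (illustrative, not factory.py).
def resolve_provider(provider=None, use_claude=False, use_openai=False) -> str:
    if provider is not None:
        return provider.lower()          # explicit provider wins; flags ignored
    if use_claude and use_openai:
        raise ValueError("Cannot specify both use_claude and use_openai")
    if use_claude:
        return "claude"
    if use_openai:
        return "openai"
    return "lemonade"                    # default when no flags are set

print(resolve_provider())                                  # lemonade
print(resolve_provider(use_openai=True))                   # openai
print(resolve_provider(provider="Claude", use_openai=True))  # claude
```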

Non-Functional Requirements

  1. Performance
    • Lazy provider loading via importlib (load only when needed)
    • Minimal overhead from abstraction layer
    • Streaming support across all providers
    • Default temperature of 0.1 for near-deterministic responses
  2. Reliability
    • Type safety through ABC pattern
    • Graceful handling of unsupported features
    • Clear error messages for provider capabilities
    • Provider-specific connection management
  3. Usability
    • Simple factory function interface
    • Backward compatibility with existing code
    • Consistent API across all providers
    • Clear documentation with examples

API Specification

Package Structure

src/gaia/llm/
├── __init__.py              # Package exports
├── base_client.py           # Abstract LLMClient interface
├── factory.py               # create_client() factory function
├── exceptions.py            # NotSupportedError
├── lemonade_client.py       # Low-level REST client for Lemonade
└── providers/
    ├── lemonade.py          # LemonadeProvider
    ├── openai_provider.py   # OpenAIProvider
    └── claude.py            # ClaudeProvider

Package Exports (__init__.py)

from gaia.llm import create_client, LLMClient, NotSupportedError

Factory Function (factory.py)

def create_client(
    provider: Optional[str] = None,
    use_claude: bool = False,
    use_openai: bool = False,
    **kwargs,
) -> LLMClient:
    """
    Create an LLM client, auto-detecting provider from parameters.

    Args:
        provider: Explicit provider name ("lemonade", "openai", or "claude").
                  If not specified, auto-detected from use_claude/use_openai flags.
        use_claude: If True, use Claude provider (ignored if provider is specified)
        use_openai: If True, use OpenAI provider (ignored if provider is specified)
        **kwargs: Provider-specific arguments (base_url, model, api_key, etc.)

    Returns:
        LLMClient instance for the specified or detected provider

    Raises:
        ValueError: If provider is unknown or both use_claude and use_openai are True

    Examples:
        # Default Lemonade provider
        client = create_client()

        # Explicit provider selection
        client = create_client(provider="lemonade", model="Qwen3-0.6B-GGUF")
        client = create_client(provider="openai", api_key="sk-...")
        client = create_client(provider="claude", api_key="sk-ant-...")

        # Backward-compatible flags
        client = create_client(use_claude=True, api_key="sk-ant-...")
        client = create_client(use_openai=True, api_key="sk-...")

    Note:
        Provider defaults to "lemonade" when no flags are set.
        The design maintains backward compatibility while allowing explicit provider selection.
    """

Abstract Base Class (base_client.py)

from abc import ABC, abstractmethod
from typing import Iterator, Union

class LLMClient(ABC):
    """
    Unified LLM client interface.

    Methods raise NotSupportedError if not available for this provider.
    """

    @property
    @abstractmethod
    def provider_name(self) -> str:
        """Return the provider name for error messages."""
        ...

    @abstractmethod
    def generate(
        self,
        prompt: str,
        model: str | None = None,
        stream: bool = False,
        **kwargs,
    ) -> Union[str, Iterator[str]]:
        """
        Generate text completion.

        Args:
            prompt: The user prompt/query to send to the LLM
            model: The model to use (defaults to provider's default model)
            stream: If True, returns a generator that yields chunks of the response
            **kwargs: Additional parameters (temperature, max_tokens, etc.)

        Returns:
            If stream=False: The complete generated text as a string
            If stream=True: A generator yielding chunks of the response

        Example:
            response = client.generate("Write a hello world program")
        """
        ...

    @abstractmethod
    def chat(
        self,
        messages: list[dict],
        model: str | None = None,
        stream: bool = False,
        **kwargs,
    ) -> Union[str, Iterator[str]]:
        """
        Chat completion with message history.

        Args:
            messages: List of message dicts with 'role' and 'content' keys
            model: The model to use (defaults to provider's default model)
            stream: If True, returns a generator that yields chunks of the response
            **kwargs: Additional parameters (temperature, max_tokens, etc.)

        Returns:
            If stream=False: The complete generated text as a string
            If stream=True: A generator yielding chunks of the response

        Example:
            messages = [
                {"role": "user", "content": "Hello"},
                {"role": "assistant", "content": "Hi there!"},
                {"role": "user", "content": "How are you?"}
            ]
            response = client.chat(messages)
        """
        ...

    # Optional methods - default raises NotSupportedError
    def embed(self, texts: list[str], **kwargs) -> list[list[float]]:
        """
        Generate embeddings for texts.

        Args:
            texts: List of text strings to embed
            **kwargs: Additional parameters (e.g., model="text-embedding-3-small" for OpenAI)

        Returns:
            List of embedding vectors (list of floats)

        Raises:
            NotSupportedError: If provider doesn't support embeddings

        Note:
            Supported by: Lemonade, OpenAI (default: "text-embedding-3-small")
            Not supported by: Claude
        """
        raise NotSupportedError(self.provider_name, "embed")

    def vision(self, images: list[bytes], prompt: str, **kwargs) -> str:
        """
        Vision/image understanding.

        Args:
            images: List of image data as bytes
            prompt: Text prompt describing what to analyze
            **kwargs: Additional parameters

        Returns:
            Text response describing the image

        Raises:
            NotSupportedError: If provider doesn't support vision

        Note:
            Supported by: Lemonade, Claude
            Not supported by: OpenAI
        """
        raise NotSupportedError(self.provider_name, "vision")

    def get_performance_stats(self) -> dict:
        """
        Get performance statistics from the last LLM request.

        Returns:
            Dictionary containing performance statistics

        Raises:
            NotSupportedError: If provider doesn't support performance stats

        Note:
            Only supported by: Lemonade
        """
        raise NotSupportedError(self.provider_name, "get_performance_stats")

    def load_model(self, model_name: str, **kwargs) -> None:
        """
        Load a specific model.

        Args:
            model_name: Name of the model to load
            **kwargs: Additional parameters

        Raises:
            NotSupportedError: If provider doesn't support model loading

        Note:
            Only supported by: Lemonade
        """
        raise NotSupportedError(self.provider_name, "load_model")

    def unload_model(self) -> None:
        """
        Unload the current model.

        Raises:
            NotSupportedError: If provider doesn't support model unloading

        Note:
            Only supported by: Lemonade
        """
        raise NotSupportedError(self.provider_name, "unload_model")

NotSupportedError (exceptions.py)

class NotSupportedError(Exception):
    """Raised when a provider doesn't support a method."""

    def __init__(self, provider: str, method: str):
        super().__init__(f"{provider} does not support {method}")
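
Because the exception is self-contained, its message format can be verified in isolation (the class is restated here verbatim so the snippet runs standalone):

```python
class NotSupportedError(Exception):
    """Raised when a provider doesn't support a method."""

    def __init__(self, provider: str, method: str):
        super().__init__(f"{provider} does not support {method}")

err = NotSupportedError("OpenAI", "vision")
print(str(err))  # OpenAI does not support vision
```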

Provider Implementations

LemonadeProvider (providers/lemonade.py)

Full feature support - implements all methods.
class LemonadeProvider(LLMClient):
    """Lemonade provider - local AMD-optimized inference."""

    def __init__(
        self,
        model: Optional[str] = None,
        base_url: Optional[str] = None,
        host: Optional[str] = None,
        port: Optional[int] = None,
        system_prompt: Optional[str] = None,
        **kwargs,
    ):
        """
        Initialize Lemonade provider.

        Args:
            model: Model name (defaults to "Qwen3-0.6B-GGUF")
            base_url: Base URL for Lemonade server (overrides LEMONADE_BASE_URL env var)
            host: Server host (alternative to base_url)
            port: Server port (alternative to base_url)
            system_prompt: Default system prompt for chat
            **kwargs: Additional arguments passed to LemonadeClient

        Environment:
            LEMONADE_BASE_URL: Default base URL (http://localhost:8000/api/v1)
            LEMONADE_MODEL: Default model name if not specified

        Note:
            Default model is "Qwen3-0.6B-GGUF" for CPU-only inference.
            All methods use temperature=0.1 by default for near-deterministic responses.
        """

    # Supports all methods: generate, chat, embed, vision,
    # get_performance_stats, load_model, unload_model

OpenAIProvider (providers/openai_provider.py)

Partial support - generate, chat, embed only.
class OpenAIProvider(LLMClient):
    """OpenAI API provider."""

    def __init__(
        self,
        api_key: Optional[str] = None,
        model: str = "gpt-4o",
        system_prompt: Optional[str] = None,
        **_kwargs,
    ):
        """
        Initialize OpenAI provider.

        Args:
            api_key: OpenAI API key (defaults to OPENAI_API_KEY env var)
            model: Model name (default: "gpt-4o")
            system_prompt: Default system prompt for chat

        Environment:
            OPENAI_API_KEY: API key for OpenAI
        """

    # Supports: generate, chat, embed
    # Raises NotSupportedError: vision, get_performance_stats, load_model, unload_model

ClaudeProvider (providers/claude.py)

Partial support - generate, chat, vision only.
class ClaudeProvider(LLMClient):
    """Claude (Anthropic) provider."""

    def __init__(
        self,
        api_key: Optional[str] = None,
        model: str = "claude-3-5-sonnet-20241022",
        system_prompt: Optional[str] = None,
        **_kwargs,
    ):
        """
        Initialize Claude provider.

        Args:
            api_key: Anthropic API key (defaults to ANTHROPIC_API_KEY env var)
            model: Model name (default: "claude-3-5-sonnet-20241022")
            system_prompt: Default system prompt for chat

        Environment:
            ANTHROPIC_API_KEY: API key for Anthropic Claude

        Raises:
            ImportError: If anthropic package not installed
        """

    # Supports: generate, chat, vision
    # Raises NotSupportedError: embed, get_performance_stats, load_model, unload_model

Implementation Details

Provider Selection Logic

The factory function auto-detects the provider based on parameters:
# From factory.py
def create_client(provider=None, use_claude=False, use_openai=False, **kwargs):
    # Auto-detect provider from flags if not explicitly specified
    if provider is None:
        if use_claude and use_openai:
            raise ValueError("Cannot specify both use_claude and use_openai")
        elif use_claude:
            provider = "claude"
        elif use_openai:
            provider = "openai"
        else:
            provider = "lemonade"  # Default

    # Validate provider
    if provider.lower() not in _PROVIDERS:
        available = ", ".join(_PROVIDERS.keys())
        raise ValueError(f"Unknown provider: {provider}. Available: {available}")

    # Load provider class dynamically...

Lazy Provider Loading

Providers are loaded dynamically using importlib to avoid importing unnecessary dependencies:
_PROVIDERS = {
    "lemonade": "gaia.llm.providers.lemonade.LemonadeProvider",
    "openai": "gaia.llm.providers.openai_provider.OpenAIProvider",
    "claude": "gaia.llm.providers.claude.ClaudeProvider",
}

# Lazy import - only load when needed
provider_lower = provider.lower()
module_path, class_name = _PROVIDERS[provider_lower].rsplit(".", 1)
module = importlib.import_module(module_path)
provider_class = getattr(module, class_name)

return provider_class(**kwargs)

NotSupportedError Pattern

Optional methods raise NotSupportedError by default in the ABC:
# In base_client.py
class LLMClient(ABC):
    def embed(self, texts: list[str], **kwargs) -> list[list[float]]:
        raise NotSupportedError(self.provider_name, "embed")

    def vision(self, images: list[bytes], prompt: str, **kwargs) -> str:
        raise NotSupportedError(self.provider_name, "vision")
    # etc...
Providers override only the methods they support:
# OpenAIProvider overrides embed but not vision
class OpenAIProvider(LLMClient):
    def embed(self, texts: list[str], **kwargs):
        # Implementation for OpenAI embeddings
        response = self._client.embeddings.create(...)
        return [item.embedding for item in response.data]

    # vision() inherited - raises NotSupportedError

Temperature Defaults

All providers default to temperature=0.1 for near-deterministic responses:
# In LemonadeProvider
kwargs.setdefault("temperature", 0.1)
response = self._backend.completions(model=model, prompt=prompt, **kwargs)

Provider-Specific Implementation

LemonadeProvider wraps the low-level LemonadeClient:
class LemonadeProvider(LLMClient):
    def __init__(self, model=None, base_url=None, **kwargs):
        self._backend = LemonadeClient(model=model, base_url=base_url, **kwargs)

    def generate(self, prompt, model=None, stream=False, **kwargs):
        return self._backend.completions(prompt=prompt, stream=stream, **kwargs)
OpenAIProvider uses the OpenAI SDK directly:
class OpenAIProvider(LLMClient):
    def __init__(self, api_key=None, model="gpt-4o", **kwargs):
        import openai
        self._client = openai.OpenAI(api_key=api_key)
        self._model = model
ClaudeProvider uses the Anthropic SDK:
class ClaudeProvider(LLMClient):
    def __init__(self, api_key=None, model="claude-3-5-sonnet-20241022", **kwargs):
        import anthropic
        self._client = anthropic.Anthropic(api_key=api_key)
        self._model = model
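
All three providers reduce streaming to the same shape: iterate the backend's chunk dicts and yield only the text deltas. A minimal sketch with a stubbed chunk list, assuming the OpenAI-style delta format shown elsewhere in this spec; the helper name iter_deltas is hypothetical:

```python
from typing import Iterator

def iter_deltas(chunks) -> Iterator[str]:
    """Yield text content from OpenAI-style streaming chunk dicts.

    Assumes each chunk looks like {"choices": [{"delta": {"content": ...}}]};
    chunks without content (e.g. role-only deltas) are skipped.
    """
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        content = delta.get("content")
        if content:
            yield content

# Stubbed backend output for illustration:
stream = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hi"}}]},
    {"choices": [{"delta": {"content": " there"}}]},
]
print("".join(iter_deltas(stream)))  # Hi there
```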

Testing Requirements

Unit Tests

File: tests/unit/test_llm_client_factory.py
import pytest
from unittest.mock import patch, Mock
from gaia.llm import create_client, LLMClient, NotSupportedError

class TestImports:
    def test_can_import_create_client(self):
        """Verify create_client can be imported."""
        from gaia.llm import create_client
        assert callable(create_client)

    def test_can_import_llm_client_abc(self):
        """Verify LLMClient ABC can be imported."""
        from abc import ABC
        from gaia.llm import LLMClient
        assert issubclass(LLMClient, ABC)

    def test_can_import_not_supported_error(self):
        """Verify NotSupportedError can be imported."""
        from gaia.llm import NotSupportedError
        assert issubclass(NotSupportedError, Exception)

class TestFactory:
    def test_default_creates_lemonade_provider(self):
        """Test factory creates Lemonade by default."""
        with patch("gaia.llm.providers.lemonade.LemonadeClient"):
            client = create_client()
            assert client.provider_name == "Lemonade"

    def test_explicit_provider_selection(self):
        """Test explicit provider parameter."""
        with patch("gaia.llm.providers.lemonade.LemonadeClient"):
            client = create_client(provider="lemonade")
            assert client.provider_name == "Lemonade"

    def test_use_claude_flag(self):
        """Test backward-compatible use_claude flag."""
        with patch("gaia.llm.providers.claude.anthropic"):
            client = create_client(use_claude=True, api_key="test")
            assert client.provider_name == "Claude"

    def test_use_openai_flag(self):
        """Test backward-compatible use_openai flag."""
        with patch("openai.OpenAI"):
            client = create_client(use_openai=True, api_key="test")
            assert client.provider_name == "OpenAI"

    def test_invalid_provider_raises_error(self):
        """Test unknown provider raises ValueError."""
        with pytest.raises(ValueError, match="Unknown provider"):
            create_client(provider="invalid")

    def test_both_flags_raises_error(self):
        """Test both flags raise ValueError."""
        with pytest.raises(ValueError, match="Cannot specify both"):
            create_client(use_claude=True, use_openai=True)

    def test_case_insensitive_provider(self):
        """Test provider names are case-insensitive."""
        with patch("gaia.llm.providers.lemonade.LemonadeClient"):
            client = create_client(provider="LEMONADE")
            assert client.provider_name == "Lemonade"

class TestNotSupportedError:
    def test_error_message_format(self):
        """Test NotSupportedError message format."""
        error = NotSupportedError("TestProvider", "test_method")
        assert "TestProvider" in str(error)
        assert "test_method" in str(error)

    def test_claude_embed_raises_not_supported(self):
        """Test Claude provider raises NotSupportedError for embed."""
        with patch("gaia.llm.providers.claude.anthropic"):
            client = create_client(provider="claude", api_key="test")
        with pytest.raises(NotSupportedError) as exc:
            client.embed(["text"])
        assert "Claude" in str(exc.value)
        assert "embed" in str(exc.value)

    def test_openai_vision_raises_not_supported(self):
        """Test OpenAI provider raises NotSupportedError for vision."""
        with patch("openai.OpenAI"):
            client = create_client(provider="openai", api_key="test")
        with pytest.raises(NotSupportedError) as exc:
            client.vision([b"image"], "describe")
        assert "OpenAI" in str(exc.value)
        assert "vision" in str(exc.value)

class TestProviderMethods:
    def test_lemonade_generate(self):
        """Test LemonadeProvider.generate()."""
        with patch("gaia.llm.providers.lemonade.LemonadeClient") as MockClient:
            mock_backend = Mock()
            mock_backend.completions.return_value = {
                "choices": [{"text": "Hello"}]
            }
            MockClient.return_value = mock_backend

            client = create_client()
            response = client.generate("Test")
            assert response == "Hello"

    def test_lemonade_chat(self):
        """Test LemonadeProvider.chat()."""
        with patch("gaia.llm.providers.lemonade.LemonadeClient") as MockClient:
            mock_backend = Mock()
            mock_backend.chat_completions.return_value = {
                "choices": [{"message": {"content": "Hi"}}]
            }
            MockClient.return_value = mock_backend

            client = create_client()
            response = client.chat([{"role": "user", "content": "Hello"}])
            assert response == "Hi"

    def test_streaming_returns_iterator(self):
        """Test streaming returns an iterator."""
        with patch("gaia.llm.providers.lemonade.LemonadeClient") as MockClient:
            mock_backend = Mock()

            def mock_stream():
                yield {"choices": [{"delta": {"content": "Hi"}}]}
                yield {"choices": [{"delta": {"content": " there"}}]}

            mock_backend.chat_completions.return_value = mock_stream()
            MockClient.return_value = mock_backend

            client = create_client()
            result = client.chat([{"role": "user", "content": "Hello"}], stream=True)
            chunks = list(result)
            assert len(chunks) == 2
            assert "".join(chunks) == "Hi there"

Integration Tests

File: tests/integration/test_llm_providers.py
import pytest

from gaia.llm import create_client

def test_integration_lemonade_generate():
    """Test Lemonade provider with live server."""
    try:
        client = create_client()
        response = client.generate("Say hello")
        assert isinstance(response, str)
        assert len(response) > 0
    except ConnectionError:
        pytest.skip("Lemonade server not running")

def test_integration_lemonade_streaming():
    """Test Lemonade streaming."""
    try:
        client = create_client()
        chunks = list(client.generate("Count to 3", stream=True))
        assert len(chunks) > 0
    except ConnectionError:
        pytest.skip("Lemonade server not running")

def test_integration_lemonade_performance_stats():
    """Test performance stats (Lemonade only)."""
    try:
        client = create_client()
        client.generate("Test")
        stats = client.get_performance_stats()
        assert isinstance(stats, dict)
    except ConnectionError:
        pytest.skip("Lemonade server not running")

Dependencies

Required Packages

# pyproject.toml
[project]
dependencies = [
    "openai>=1.0.0",      # OpenAI Python SDK (used for local + OpenAI)
    "httpx>=0.24.0",      # HTTP client with timeout support
    "requests>=2.31.0",   # For performance stats/control endpoints
    "python-dotenv>=1.0.0",  # Environment variable management
]

[project.optional-dependencies]
claude = ["anthropic>=0.18.0"]  # Claude API support

Import Dependencies

Factory (factory.py):
import importlib
from typing import Optional
from .base_client import LLMClient
Base Client (base_client.py):
from abc import ABC, abstractmethod
from typing import Iterator, Union
from .exceptions import NotSupportedError
LemonadeProvider (providers/lemonade.py):
from typing import Iterator, Optional, Union
from ..base_client import LLMClient
from ..lemonade_client import LemonadeClient, DEFAULT_MODEL_NAME
# DEFAULT_MODEL_NAME = "Qwen3-0.6B-GGUF"
OpenAIProvider (providers/openai_provider.py):
from typing import Iterator, Optional, Union
import openai  # Requires: pip install openai
from ..base_client import LLMClient
ClaudeProvider (providers/claude.py):
from typing import Iterator, Optional, Union
try:
    import anthropic  # Requires: pip install anthropic
except ImportError:
    anthropic = None
from ..base_client import LLMClient

Usage Examples

Example 1: Basic Usage with Factory

from gaia.llm import create_client

# Default Lemonade provider
client = create_client()

# Non-streaming generation
response = client.generate("Write a hello world program in Python")
print(response)

# Get performance stats (Lemonade only)
stats = client.get_performance_stats()
print(f"Speed: {stats.get('tokens_per_second', 'N/A')} tokens/sec")

Example 2: Explicit Provider Selection

from gaia.llm import create_client

# Lemonade (local)
lemonade = create_client(provider="lemonade", model="Qwen3-0.6B-GGUF")

# OpenAI
openai_client = create_client(provider="openai", api_key="sk-...")

# Claude
claude = create_client(provider="claude", api_key="sk-ant-...")

# Backward-compatible flags
legacy_claude = create_client(use_claude=True, api_key="sk-ant-...")

Example 3: Handling Unsupported Features

from gaia.llm import create_client, NotSupportedError

# Create OpenAI client
client = create_client(provider="openai", api_key="sk-...")

# This works (OpenAI supports embed)
embeddings = client.embed(["Hello world", "How are you?"])

# This raises NotSupportedError (OpenAI doesn't support vision)
try:
    result = client.vision([image_bytes], "Describe this image")
except NotSupportedError as e:
    print(f"Feature not available: {e}")
    # Output: "OpenAI does not support vision"

Example 4: Chat with Message History

from gaia.llm import create_client

client = create_client()

# Build conversation history
messages = [
    {"role": "user", "content": "What's 2+2?"},
    {"role": "assistant", "content": "2+2 equals 4."},
    {"role": "user", "content": "What about 3+3?"}
]

# Use chat() method
response = client.chat(messages)
print(response)  # "3+3 equals 6."

Example 5: Streaming Responses

from gaia.llm import create_client

client = create_client()

# Streaming with generate()
print("AI: ", end="", flush=True)
for chunk in client.generate("Tell me a short story", stream=True):
    print(chunk, end="", flush=True)
print()

# Streaming with chat()
for chunk in client.chat([{"role": "user", "content": "Hello"}], stream=True):
    print(chunk, end="", flush=True)

Example 6: Embeddings (Lemonade and OpenAI only)

from gaia.llm import create_client

# With Lemonade
lemonade = create_client()
embeddings = lemonade.embed(["Hello world", "How are you?"])
print(f"Embedding dimensions: {len(embeddings[0])}")

# With OpenAI
openai_client = create_client(provider="openai", api_key="sk-...")
embeddings = openai_client.embed(["Text to embed"])

Example 7: Vision (Lemonade and Claude only)

from gaia.llm import create_client

# With Claude
claude = create_client(provider="claude", api_key="sk-ant-...")
with open("image.jpg", "rb") as f:
    image_data = f.read()
description = claude.vision([image_data], "Describe what you see")
print(description)

Example 8: Model Management (Lemonade only)

from gaia.llm import create_client

client = create_client()

# Load a specific model
client.load_model("Qwen3-0.6B-GGUF")

# Generate
response = client.generate("Hello")

# Get performance stats
stats = client.get_performance_stats()
print(f"Speed: {stats.get('tokens_per_second', 'N/A')} tokens/sec")

# Unload model
client.unload_model()

Example 9: Remote Lemonade Server

from gaia.llm import create_client

# Connect to remote server
client = create_client(base_url="http://192.168.1.100:8000")

response = client.generate("Hello from remote server")
print(response)

Example 10: System Prompts

from gaia.llm import create_client

# Set default system prompt
client = create_client(
    system_prompt="You are a helpful coding assistant."
)

# System prompt automatically prepended to chat messages
response = client.chat([
    {"role": "user", "content": "Write a binary search function"}
])
print(response)

Third-Party LLM Integration

GAIA supports third-party LLM service providers through its OpenAI-compatible API interface. Any service implementing the OpenAI API specification can be used with GAIA.

Required API Endpoints

Your LLM service must implement at least one of these OpenAI-compatible endpoints:

Completions Endpoint

Default: POST /v1/completions
Used for pre-formatted prompts

Chat Completions Endpoint

POST /v1/chat/completions
Used for structured conversations with message history

Completions Request Example

{
  "model": "your-model-name",
  "prompt": "Your prompt text here",
  "stream": false,
  "temperature": 0.1,
  "max_tokens": 2048
}

Chat Completions Request Example

{
  "model": "your-model-name",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "temperature": 0.1
}

Configuration

Linux
export LEMONADE_BASE_URL="http://your-llm-service:8080"
Windows (PowerShell)
$env:LEMONADE_BASE_URL="http://your-llm-service:8080"
Windows (CMD)
set LEMONADE_BASE_URL=http://your-llm-service:8080
URL Normalization: LemonadeClient automatically appends /api/v1 if not present:
  • http://localhost:8080 → http://localhost:8080/api/v1
  • If your service uses /v1 instead, provide the full path: http://localhost:8080/v1
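
The normalization rule can be sketched as a small helper. This is an illustration of the documented behavior under stated assumptions, not the actual LemonadeClient code; normalize_base_url is a hypothetical name:

```python
def normalize_base_url(url: str) -> str:
    """Append /api/v1 unless the URL already ends in a versioned API path.

    Assumption: a trailing /v1 or /api/v1 is treated as "already versioned"
    and left alone, matching the documented normalization rule above.
    """
    url = url.rstrip("/")
    if url.endswith("/api/v1") or url.endswith("/v1"):
        return url
    return url + "/api/v1"

print(normalize_base_url("http://localhost:8080"))     # http://localhost:8080/api/v1
print(normalize_base_url("http://localhost:8080/v1"))  # http://localhost:8080/v1
```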

Example Integration

from gaia.llm import create_client

# Connect to your third-party LLM service
client = create_client(base_url="http://your-service:8080/v1")

# Test connection
response = client.generate("Hello, are you working?")
print(response)

Compatibility Checklist

  • OpenAI-compatible endpoints (/v1/completions or /v1/chat/completions)
  • JSON request/response format matching OpenAI specification
  • HTTP POST method for generation requests
  • Non-streaming responses (complete response as JSON)
  • ⚠️ Streaming responses (Server-Sent Events format)
  • ⚠️ Error handling (proper HTTP status codes: 200, 400, 404, 500)
  • ⚠️ Model listing (GET /v1/models endpoint)
  • ⚠️ Token counting (usage statistics in responses)
Items marked ⚠️ are optional.
The following features are specific to Lemonade provider and raise NotSupportedError with third-party services:
  • get_performance_stats() - Performance statistics
  • load_model() - Model loading
  • unload_model() - Model unloading
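
Before wiring a third-party service in, it can help to sanity-check that its non-streaming responses parse into the expected shape. A dependency-free sketch (the helper name is hypothetical):

```python
def looks_like_openai_chat_response(resp: dict) -> bool:
    """Check that a parsed JSON body matches the minimal OpenAI chat shape:
    {"choices": [{"message": {"content": <str>}}]}."""
    try:
        choices = resp["choices"]
        return isinstance(choices, list) and isinstance(
            choices[0]["message"]["content"], str
        )
    except (KeyError, IndexError, TypeError):
        return False

ok = {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
bad = {"error": {"message": "model not loaded"}}
print(looks_like_openai_chat_response(ok))   # True
print(looks_like_openai_chat_response(bad))  # False
```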

Troubleshooting

Problem: ConnectionError: LLM Server Connection Error
Solutions:
  1. Verify service is running:
    curl http://your-service:port/v1/models
    
  2. Check firewall settings
  3. Ensure correct base URL format
  4. Test with explicit base URL:
    client = create_client(base_url="http://localhost:8080/v1")
    
Problem: 404 endpoint not found
Solutions:
  1. Check if service uses /v1/completions (OpenAI standard)
  2. Verify API path structure: /v1 vs /api/v1
  3. Consult service documentation for correct endpoint paths
  4. Use chat method explicitly if needed:
    client.chat([{"role": "user", "content": "Test"}])
    
Problem: Model errors or “model not loaded”
Solutions:
  1. Specify model explicitly:
    client.generate("Test", model="your-model-name")
    
  2. List available models (if service supports):
    curl http://your-service:port/v1/models
    
  3. Ensure model is loaded in your service before connecting
Problem: Streaming responses not working
Solutions:
  1. Verify service supports Server-Sent Events (SSE)
  2. Check Content-Type headers: text/event-stream
  3. Test non-streaming first:
    response = client.generate("Test", stream=False)
    
  4. Enable debug logging:
    import logging
    logging.basicConfig(level=logging.DEBUG)
    

Documentation Updates Required

SDK.md

Add to LLM Section:
### LLM Client

**Import:** `from gaia.llm import create_client, LLMClient, NotSupportedError`

**Purpose:** Provider-based LLM client with factory pattern for local and cloud backends.

**Features:**
- Factory-based client creation with `create_client()`
- Three providers: Lemonade (local), OpenAI, Claude
- Abstract base class for type safety
- `NotSupportedError` for unsupported features
- Streaming and non-streaming generation
- Backward-compatible flags

**Quick Start:**
```python
from gaia.llm import create_client

# Local LLM (default)
client = create_client()
response = client.generate("Hello world")

# Streaming
for chunk in client.generate("Tell me a story", stream=True):
    print(chunk, end="")

# Claude API
claude = create_client(provider="claude", api_key="sk-ant-...")
response = claude.generate("Explain Python decorators")

# Backward-compatible
client = create_client(use_claude=True)
```

LLMClient Technical Specification