TestingMixin

Source Code: src/gaia/agents/code/tools/testing.py

Component: TestingMixin
Module: gaia.agents.code.tools.testing
Import: from gaia.agents.code.tools.testing import TestingMixin

Overview

TestingMixin provides tools for executing Python code and running tests, with timeout management and output capture. It runs Python scripts and pytest test suites in separate subprocesses and reports failures as structured results.

Key Features:

Execute Python files as subprocesses
Run pytest test suites
Capture stdout and stderr
Timeout management
Environment variable injection
Working directory control
Test result parsing

Tool Specifications

1. execute_python_file

Executes a Python file as a subprocess, with control over its arguments, timeout, working directory, and environment variables.

Parameters:

file_path (str, required): Path to Python file
args (List[str] | str, optional): CLI arguments
timeout (int, optional): Timeout in seconds (default: 60)
working_directory (str, optional): Working directory
env_vars (Dict[str, str], optional): Environment variables

Returns:

{
    "status": "success" | "error",
    "file_path": str,
    "command": str,
    "stdout": str,
    "stderr": str,
    "return_code": int,
    "has_errors": bool,
    "duration_seconds": float,
    "timeout": int,
    "cwd": str,
    "output_truncated": bool,

    # On timeout
    "timed_out": bool
}

Example:

result = execute_python_file(
    file_path="/path/to/script.py",
    args=["--input", "data.txt"],
    timeout=120,
    working_directory="/path/to/project",
    env_vars={"DEBUG": "1"}
)

if result["has_errors"]:
    print(f"Script failed with code {result['return_code']}")
    print(result["stderr"])
else:
    print("Success!")
    print(result["stdout"])
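
To make the return fields above concrete, here is a minimal sketch of how such a result could be assembled with the standard library. It is not the actual TestingMixin implementation: run_python_file is a hypothetical name, only the non-timeout path is shown, and treating has_errors as "non-zero return code" is an assumption.

```python
import subprocess
import sys
import time
from pathlib import Path
from typing import Any, Dict, List, Optional


def run_python_file(
    file_path: str,
    args: Optional[List[str]] = None,
    timeout: int = 60,
    working_directory: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for execute_python_file (non-timeout path only)."""
    path = Path(file_path).resolve()
    # Assumption: default to the script's own directory when no cwd is given.
    cwd = Path(working_directory).resolve() if working_directory else path.parent
    cmd = [sys.executable, str(path), *(args or [])]

    start = time.monotonic()
    completed = subprocess.run(
        cmd, cwd=str(cwd), capture_output=True, text=True, timeout=timeout
    )
    duration = time.monotonic() - start

    return {
        "status": "success",  # the command ran; return_code reports the script's own outcome
        "file_path": str(path),
        "command": " ".join(cmd),
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "return_code": completed.returncode,
        "has_errors": completed.returncode != 0,  # assumption, see lead-in
        "duration_seconds": round(duration, 3),
        "timeout": timeout,
        "cwd": str(cwd),
        "output_truncated": False,  # truncation is omitted in this sketch
    }
```

Note that "status" and "return_code" answer different questions in this sketch: the first says whether the subprocess could be run at all, the second whether the script itself exited cleanly.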

2. run_tests

Runs the pytest test suite for a project.

Parameters:

project_path (str, optional): Project directory (default: ".")
pytest_args (List[str] | str, optional): Pytest arguments
timeout (int, optional): Timeout in seconds (default: 120)
env_vars (Dict[str, str], optional): Environment variables

Returns:

{
    "status": "success" | "error",
    "project_path": str,
    "command": str,
    "stdout": str,
    "stderr": str,
    "return_code": int,
    "tests_passed": bool,
    "failure_summary": str,  # If failed
    "duration_seconds": float,
    "timeout": int,
    "output_truncated": bool,

    # On timeout
    "timed_out": bool
}

Example:

result = run_tests(
    project_path="/path/to/project",
    pytest_args=["-v", "tests/test_calculator.py"],
    timeout=300
)

if result["tests_passed"]:
    print("All tests passed!")
else:
    print(f"Tests failed: {result['failure_summary']}")
    print(result["stdout"])
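
When tests_passed is false, return_code can help tell real test failures from other problems. The sketch below lists the exit codes documented by pytest. It assumes that run_tests passes pytest's exit code through unchanged and that the command is built as "python -m pytest"; neither is confirmed by this page, and both helper names are hypothetical.

```python
import sys
from typing import List, Optional

# Exit codes as documented by pytest.
PYTEST_EXIT_MEANINGS = {
    0: "all collected tests passed",
    1: "some tests failed",
    2: "test run interrupted by the user",
    3: "internal pytest error",
    4: "pytest command-line usage error",
    5: "no tests were collected",
}


def build_pytest_command(pytest_args: Optional[List[str]] = None) -> List[str]:
    """Hypothetical helper: invoke pytest through the current interpreter."""
    return [sys.executable, "-m", "pytest", *(pytest_args or [])]


def describe_exit_code(return_code: int) -> str:
    """Translate a pytest exit code into a short human-readable message."""
    return PYTEST_EXIT_MEANINGS.get(
        return_code, f"unexpected exit code {return_code}"
    )
```

For example, an exit code of 5 means pytest found no tests, which usually points to a wrong project_path or file pattern rather than a failing test.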

Usage Examples

Example 1: Execute Python Script

from gaia import CodeAgent

agent = CodeAgent()

# Run a data processing script
result = agent.execute_python_file(
    file_path="scripts/process_data.py",
    args=["--input", "data/raw.csv", "--output", "data/processed.csv"],
    timeout=600,
    working_directory="/path/to/project"
)

if result["status"] == "success":
    if result["return_code"] == 0:
        print("Processing completed successfully")
        print(result["stdout"])
    else:
        print(f"Script failed with exit code {result['return_code']}")
        print("Error output:")
        print(result["stderr"])
else:
    if result.get("timed_out"):
        print(f"Script timed out after {result['timeout']} seconds")
    else:
        print(f"Error: {result['error']}")

Example 2: Run Full Test Suite

# Run all tests with verbose output
result = agent.run_tests(
    project_path="/path/to/project",
    pytest_args=["-v", "--tb=short"],
    timeout=300
)

print(f"Tests completed in {result['duration_seconds']:.2f}s")

if result["tests_passed"]:
    print("✓ All tests passed!")
else:
    print("✗ Tests failed")
    print(result["failure_summary"])

Example 3: Run Specific Test File

# Run specific test file with coverage
result = agent.run_tests(
    project_path="/path/to/project",
    pytest_args=["tests/test_calculator.py", "--cov=src", "--cov-report=term"],
    timeout=60
)

if result["tests_passed"]:
    # Parse coverage from output
    print("Tests passed with coverage:")
    print(result["stdout"])
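
The "Parse coverage from output" step above is left open. One possible approach, sketched here, reads the TOTAL row of the terminal coverage report from stdout. It assumes the report ends with a "TOTAL ... NN%" line, as the pytest-cov terminal report typically does; parse_total_coverage is a hypothetical helper, not part of TestingMixin.

```python
import re
from typing import Optional


def parse_total_coverage(stdout: str) -> Optional[float]:
    """Return the percentage on the TOTAL row of a coverage report, if present."""
    # Match a line that starts with TOTAL and ends with a percentage.
    match = re.search(r"^TOTAL\s+.*?(\d+(?:\.\d+)?)%\s*$", stdout, re.MULTILINE)
    if match is None:
        return None
    return float(match.group(1))


sample_report = """\
Name                 Stmts   Miss  Cover
----------------------------------------
src/calculator.py       20      2    90%
----------------------------------------
TOTAL                   20      2    90%
"""

coverage = parse_total_coverage(sample_report)
```

Returning None when no TOTAL row is found lets the caller distinguish "coverage not reported" from "0% coverage".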

Example 4: Environment Variables

# Run tests with custom environment
result = agent.run_tests(
    project_path="/path/to/project",
    pytest_args=["-v"],
    env_vars={
        "DATABASE_URL": "sqlite:///test.db",
        "DEBUG": "1",
        "TEST_MODE": "integration"
    }
)

Example 5: Handle Timeouts

result = agent.execute_python_file(
    file_path="scripts/long_process.py",
    timeout=30
)

if result.get("timed_out"):
    print(f"Process timed out after {result['timeout']}s")
    print("Partial output:")
    print(result["stdout"])
    print("\nConsider:")
    print("1. Increasing timeout")
    print("2. Optimizing the script")
    print("3. Running in background mode")

Output Handling

Truncation

Each output stream is truncated to 10,000 characters so that very large outputs do not bloat the result:

MAX_OUTPUT = 10_000  # characters

if len(stdout) > MAX_OUTPUT:
    stdout = stdout[:MAX_OUTPUT] + "\n...output truncated (stdout)..."
    truncated = True

if len(stderr) > MAX_OUTPUT:
    stderr = stderr[:MAX_OUTPUT] + "\n...output truncated (stderr)..."
    truncated = True
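
The same truncation rule can be written as a small self-contained function, which makes it easy to check in isolation. This is a sketch: truncate_output is a hypothetical helper name, and only the limit and marker text are taken from the snippet above.

```python
from typing import Tuple

MAX_OUTPUT = 10_000  # characters, as in the snippet above


def truncate_output(text: str, stream: str, limit: int = MAX_OUTPUT) -> Tuple[str, bool]:
    """Return (possibly truncated text, whether truncation happened)."""
    if len(text) <= limit:
        return text, False
    marker = f"\n...output truncated ({stream})..."
    return text[:limit] + marker, True


stdout, stdout_truncated = truncate_output("x" * 10_050, "stdout")
stderr, stderr_truncated = truncate_output("warning: low disk space", "stderr")

# A single flag covers both streams, matching "output_truncated" in the result.
output_truncated = stdout_truncated or stderr_truncated
```

Because the marker is appended after slicing, a truncated stream is slightly longer than the limit itself.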

Failure Summary Parsing

For pytest runs, the failure count is extracted from stdout:

import re

summary_match = re.search(r"(\d+)\s+failed", stdout)
if summary_match:
    num_failed = summary_match.group(1)
    failure_summary = f"{num_failed} test(s) failed - check stdout for details"
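
As a self-contained illustration, the regular expression above can be applied to a typical pytest summary line. The function name summarize_failures and the sample line are illustrative only; the pattern and message format are the ones shown above.

```python
import re
from typing import Optional


def summarize_failures(stdout: str) -> Optional[str]:
    """Build a one-line failure summary, or None if no failure count is found."""
    match = re.search(r"(\d+)\s+failed", stdout)
    if match is None:
        return None
    num_failed = match.group(1)
    return f"{num_failed} test(s) failed - check stdout for details"


sample = "=========== 2 failed, 5 passed in 0.31s ==========="
summary = summarize_failures(sample)
```

Since re.search returns the first match anywhere in the text, output printed by the tests themselves that happens to contain "N failed" would be picked up before pytest's final summary line.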

Environment Configuration

PYTHONPATH Management

The project directory is automatically prepended to PYTHONPATH, ahead of any existing entries:

env = os.environ.copy()
if env_vars:
    env.update({key: str(value) for key, value in env_vars.items()})

existing_pythonpath = env.get("PYTHONPATH")
project_pythonpath = str(project_dir)

if existing_pythonpath:
    env["PYTHONPATH"] = f"{project_pythonpath}{os.pathsep}{existing_pythonpath}"
else:
    env["PYTHONPATH"] = project_pythonpath
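
Wrapped in a function, the logic above can be exercised without touching the real process environment. This is a sketch: build_env and its base_env parameter are hypothetical additions for testability, not part of the documented API.

```python
import os
from pathlib import Path
from typing import Any, Dict, Mapping, Optional


def build_env(
    project_dir: Path,
    env_vars: Optional[Mapping[str, Any]] = None,
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Copy the environment, apply overrides, and prepend project_dir to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    if env_vars:
        # Environment values must be strings, so coerce ints, bools, and so on.
        env.update({key: str(value) for key, value in env_vars.items()})

    project_pythonpath = str(project_dir)
    existing_pythonpath = env.get("PYTHONPATH")
    if existing_pythonpath:
        env["PYTHONPATH"] = f"{project_pythonpath}{os.pathsep}{existing_pythonpath}"
    else:
        env["PYTHONPATH"] = project_pythonpath
    return env


env = build_env(Path("project"), {"DEBUG": 1}, base_env={"PYTHONPATH": "libs"})
```

Because env_vars is applied before the PYTHONPATH step, a caller-supplied PYTHONPATH is kept and the project directory is placed in front of it.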

Testing Requirements

File: tests/agents/code/test_testing.py

import pytest
from gaia.agents.code.tools.testing import TestingMixin

def test_execute_python_file(tmp_path):
    """Test Python file execution."""
    # Create test script
    script = tmp_path / "test_script.py"
    script.write_text("print('Hello World')\nprint('Success')")

    mixin = TestingMixin()
    result = mixin.execute_python_file(
        file_path=str(script),
        timeout=10
    )

    assert result["status"] == "success"
    assert result["return_code"] == 0
    assert "Hello World" in result["stdout"]
    assert not result["has_errors"]

def test_execute_with_args(tmp_path):
    """Test execution with CLI arguments."""
    script = tmp_path / "args_script.py"
    script.write_text("""
import sys
print(f"Args: {sys.argv[1:]}")
""")

    result = TestingMixin().execute_python_file(
        str(script),
        args=["--input", "test.txt", "--output", "out.txt"]
    )

    assert "--input" in result["stdout"]
    assert "test.txt" in result["stdout"]

def test_timeout_handling(tmp_path):
    """Test timeout behavior."""
    script = tmp_path / "slow_script.py"
    script.write_text("""
import time
time.sleep(10)
print("Done")
""")

    result = TestingMixin().execute_python_file(str(script), timeout=2)

    assert result["status"] == "error"
    assert result.get("timed_out")
    assert result["timeout"] == 2

def test_run_tests(tmp_path):
    """Test pytest execution."""
    # Create test file
    test_file = tmp_path / "test_example.py"
    test_file.write_text("""
def test_pass():
    assert True

def test_also_pass():
    assert 1 + 1 == 2
""")

    result = TestingMixin().run_tests(str(tmp_path), pytest_args=["-v"])

    assert result["status"] == "success"
    assert result["tests_passed"]
    assert result["return_code"] == 0

def test_run_tests_with_failure(tmp_path):
    """Test pytest with failures."""
    test_file = tmp_path / "test_fail.py"
    test_file.write_text("""
def test_will_fail():
    assert False, "This test is designed to fail"
""")

    result = TestingMixin().run_tests(str(tmp_path))

    assert result["status"] == "success"  # Command ran
    assert not result["tests_passed"]  # But tests failed
    assert result["return_code"] != 0
    assert result["failure_summary"]

Error Handling

File Not Found

if not path.exists():
    return {
        "status": "error",
        "error": f"File not found: {file_path}",
        "has_errors": True
    }

Invalid Arguments

if not isinstance(args, (str, list)):
    return {
        "status": "error",
        "error": "args must be a list of strings or a string",
        "has_errors": True
    }
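
Since args may be either a string or a list, it has to be normalized into a list before the command is built. The sketch below shows one way to do that with shlex, which appears in the module's dependencies. Whether the real implementation splits strings this way is an assumption, and normalize_args is a hypothetical helper that raises instead of returning an error dictionary.

```python
import shlex
from typing import List, Union


def normalize_args(args: Union[str, List[str], None]) -> List[str]:
    """Turn a CLI argument string or list into a list of strings."""
    if args is None:
        return []
    if isinstance(args, str):
        # shlex.split honors shell-style quoting, so quoted values stay intact.
        return shlex.split(args)
    if isinstance(args, list):
        return [str(arg) for arg in args]
    raise TypeError("args must be a list of strings or a string")


parsed = normalize_args('--input "my file.txt" -v')
```

Splitting with shlex rather than str.split keeps an argument such as a file name with spaces as a single element.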

Timeout Handling

try:
    result = subprocess.run(
        cmd,
        timeout=timeout,
        ...
    )
except subprocess.TimeoutExpired as exc:
    return {
        "status": "error",
        "error": f"Execution timed out after {timeout} seconds",
        "stdout": decode_output(exc.stdout),
        "stderr": decode_output(exc.stderr),
        "timed_out": True,
        "timeout": timeout
    }
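
A self-contained version of this pattern is sketched below. The decode_output helper here is a guess at what the function referenced above does: in CPython, the output attached to TimeoutExpired can be bytes even when text=True was requested, and None when nothing was captured, so both cases are handled. run_with_timeout is a hypothetical wrapper name.

```python
import subprocess
import sys
from typing import Any, Dict, List, Union


def decode_output(data: Union[bytes, str, None]) -> str:
    """Normalize captured output to str (assumed behavior of the real helper)."""
    if data is None:
        return ""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


def run_with_timeout(cmd: List[str], timeout: int) -> Dict[str, Any]:
    """Run cmd, returning partial output and timed_out=True if it overruns."""
    try:
        completed = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        return {
            "status": "error",
            "error": f"Execution timed out after {timeout} seconds",
            "stdout": decode_output(exc.stdout),
            "stderr": decode_output(exc.stderr),
            "timed_out": True,
            "timeout": timeout,
        }
    return {
        "status": "success",
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "return_code": completed.returncode,
        "timeout": timeout,
    }


# A child process that sleeps longer than the allowed time.
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], timeout=1)
```

subprocess.run kills the child process before raising TimeoutExpired, so no separate cleanup step is needed in this sketch.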

Performance Characteristics

Script Execution: Depends on script complexity
Test Suite: Depends on test count and complexity
Timeout Precision: ±0.1 seconds
Output Truncation: 10,000 characters per stream
Environment Setup: ~10ms overhead

Dependencies

import os
import shlex
import subprocess
import sys
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional
