SD Agent
Status: Planning
Priority: Medium
Vote with 👍 on GitHub Issue #272
Overview
An AI agent that helps users generate better Stable Diffusion images through intelligent optimization of both prompts and generation parameters. The agent analyzes user intent, enhances text descriptions, and recommends optimal settings (model, size, steps, cfg_scale) for high-quality results.
Triple Optimization Approach:
- Prompt Enhancement: Transform simple text (“a cat”) into detailed, effective SD prompts
- Parameter Optimization: Recommend model selection, dimensions, inference steps, and guidance scale
- VLM-Powered Iteration: Analyze generated images with Vision LLM, score quality across categories, and automatically iterate until quality threshold is met
Key Features:
- LLM-powered prompt analysis and enhancement (AMD NPU-accelerated)
- Intelligent parameter recommendation (model, size, steps, cfg_scale)
- VLM-powered image evaluation (composition, lighting, prompt adherence, style, technical quality)
- Autonomous iteration loop - generate → evaluate → refine → regenerate until quality threshold met
- Template library with proven patterns
- A/B testing and strategy comparison
- Terminal image display for immediate visual feedback
- SQLite database for generation history
- Agent-powered search and filtering (natural language queries)
- Web gallery UI with task-based interface for browsing, annotating, and rating
- Version control and reproducibility

View the interactive mockup for a full preview of the gallery UI.
System Architecture
High-Level Overview
Data Flow
Generation Workflow (with VLM Iteration Loop):
1. User Request → CLI/Task Interface submits generation task
2. Prompt Enhancement → Agent sends original prompt to LLM
3. Parameter Optimization → Agent recommends SD parameters based on prompt + user preferences
4. Image Generation → Agent calls Lemonade SD endpoint
5. VLM Evaluation → Image Evaluator analyzes output using Qwen3-VL-4B
   - Scores across categories: composition, lighting, prompt adherence, style consistency, technical quality
   - Returns overall score (1-10) + category breakdown + improvement suggestions
6. Iteration Decision → If score < threshold (default 7/10) AND iterations < max (loop sketched below):
   - VLM feedback refines prompt and/or parameters
   - Return to step 4 (regenerate with improvements)
7. Storage → Agent saves final image + all iterations to SQLite + file system
8. Display → Final image shown in terminal or Gallery UI with quality report
9. Learning → Successful patterns stored for future preference learning
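The loop in steps 4-6 can be summarized in a short sketch. This is a minimal illustration assuming hypothetical agent helpers (`generate_image`, `evaluate_image`, `refine`, `save_generation`); the real tool interfaces are defined later in this spec and may differ.

```python
# Minimal sketch of the generate → evaluate → refine loop (hypothetical helper
# names; not the final SDAgent API).
def generate_with_iteration(agent, prompt, params, threshold=7.0, max_iterations=3):
    history = []
    for attempt in range(1, max_iterations + 1):
        image_path = agent.generate_image(prompt, **params)       # step 4
        evaluation = agent.evaluate_image(image_path, prompt)     # step 5 (VLM)
        history.append({"prompt": prompt, "params": dict(params),
                        "image": image_path, "evaluation": evaluation})
        if evaluation["overall"] >= threshold:                    # step 6
            break
        # Feed VLM suggestions back into prompt/parameter refinement, then retry.
        prompt, params = agent.refine(prompt, params, evaluation["suggestions"])
    agent.save_generation(history)                                # step 7
    return history[-1]
```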
Search Workflow (natural language → SQL, sketched below):
- User Query (natural language) → “show me all cyberpunk cities”
- LLM Translation → Agent converts to SQL query
- Database Query → Execute against SQLite
- Results → Return matching generations with images
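A rough sketch of that search path, assuming a hypothetical `llm.complete()` client and the `generations` table described under Storage; the schema string and guardrail are illustrative only.

```python
import sqlite3

SEARCH_SYSTEM_PROMPT = (
    "Translate the user's request into one SQLite SELECT statement against "
    "generations(id, prompt, model, size, rating, created_at, image_path). "
    "Return only the SQL."
)

def search_generations(llm, db_path, user_query):
    sql = llm.complete(system=SEARCH_SYSTEM_PROMPT, user=user_query).strip()
    if not sql.lower().startswith("select"):   # basic guardrail: read-only queries
        raise ValueError(f"Refusing non-SELECT query: {sql}")
    with sqlite3.connect(db_path) as conn:
        conn.row_factory = sqlite3.Row
        return [dict(row) for row in conn.execute(sql)]
```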
VLM Evaluation Workflow (sketch below):
- Image Input → Generated image passed to VLM (Qwen3-VL-4B)
- Multi-Category Scoring → VLM evaluates:
- Composition (1-10): Rule of thirds, balance, focal point
- Lighting (1-10): Consistency, mood, shadows/highlights
- Prompt Adherence (1-10): How well image matches the prompt
- Style Consistency (1-10): Coherent artistic style throughout
- Technical Quality (1-10): Sharpness, artifacts, resolution
- Overall Score → Weighted average of categories
- Improvement Suggestions → VLM provides specific feedback for refinement
- Iteration Trigger → If below threshold, suggestions feed back into enhancement loop
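A sketch of how the category scores could be combined. The equal weights and the JSON response format requested from Qwen3-VL-4B are placeholder assumptions, not decisions made in this spec.

```python
import json

CATEGORIES = ["composition", "lighting", "prompt_adherence",
              "style_consistency", "technical_quality"]
WEIGHTS = {category: 0.2 for category in CATEGORIES}  # equal weights as a starting point

def evaluate_image(vlm, image_path, prompt):
    instruction = (
        "Score this image 1-10 for each category "
        f"({', '.join(CATEGORIES)}) against the prompt below, then list concrete "
        f"improvement suggestions. Respond as JSON.\n\nPrompt: {prompt}"
    )
    raw = vlm.chat(images=[image_path], text=instruction)  # hypothetical VLM client call
    result = json.loads(raw)
    overall = sum(WEIGHTS[c] * result["scores"][c] for c in CATEGORIES)
    return {"scores": result["scores"],
            "overall": round(overall, 1),
            "suggestions": result.get("suggestions", [])}
```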
Preference Learning Workflow (sketch below):
- User Rates Images → 1-5 stars stored in database
- Pattern Analysis → Agent analyzes high-rated generations
- Preference Learning → Identify preferred styles, models, parameters
- Future Enhancements → Bias recommendations toward learned preferences
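A rough sketch of the pattern-analysis step, assuming the `generations` table stores `model`, `size`, `cfg_scale`, and `rating` columns (column names are illustrative).

```python
import sqlite3
from collections import Counter

def learn_preferences(db_path, min_rating=4):
    """Mine the highest-rated generations for parameters to bias future recommendations."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT model, size, cfg_scale FROM generations WHERE rating >= ?",
            (min_rating,),
        ).fetchall()
    if not rows:
        return {}
    models, sizes, cfg_scales = zip(*rows)
    return {
        "preferred_model": Counter(models).most_common(1)[0][0],
        "preferred_size": Counter(sizes).most_common(1)[0][0],
        "avg_cfg_scale": sum(cfg_scales) / len(cfg_scales),
    }
```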
Technical Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Optimization Engine | Qwen3-4B-Instruct-2507-FLM (AMD NPU) | Fast, efficient prompt enhancement optimized for NPU |
| Image Evaluator | Qwen3-VL-4B (AMD NPU) | VLM-powered quality scoring and iteration feedback |
| Evaluation Categories | Composition, lighting, prompt adherence, style, technical | Comprehensive quality assessment across key dimensions |
| Iteration Strategy | Auto-iterate until score ≥ 7/10 or max 3 iterations | Balance quality improvement with generation time |
| Focus Domain | Stable Diffusion only (via Lemonade Server) | Deep specialization, measurable quality improvement |
| Optimization Scope | Prompts + parameters + VLM feedback loop | Holistic optimization with automated quality assurance |
| SD Backend | Lemonade Server /api/v1/images/generations | AMD NPU/GPU optimized, local inference, privacy |
| Supported Models | SD-Turbo, SDXL-Turbo | Fast inference on AMD hardware |
| Parameters Optimized | model, size, steps, cfg_scale, seed | All tunable SD generation parameters |
| Template Library | SD-specific patterns (photorealistic, anime, etc.) | Codify proven prompt+parameter combinations |
| Storage | SQLite database + file system | DatabaseMixin, queryable history, fast retrieval |
| Database Schema | generations, templates, prompt_versions, evaluations | Structured storage for all generation + evaluation data |
| Image Files | .gaia/cache/sd/images/ | Internal cache for generated images (not user-facing) |
| Output Formats | PNG (default), JPEG | PNG for quality, JPEG for smaller file size |
| Image Download | Explicit download to ~/Downloads | Gallery download button exports to user’s Downloads folder |
| Gallery UI | Task-based web interface (Electron or browser) | Submit tasks → view results, browse, annotate, rate, download |
| Terminal Display | rich + term-image | CLI image preview |
| Metadata Format | Database rows + JSON export | Queryable + portable |
Architecture
Component Structure
Database Schema
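The schema is not finalized; the sketch below is one possible starting point covering the tables named in the Technical Decisions table (column choices and the database path are assumptions).

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS generations (
    id INTEGER PRIMARY KEY,
    prompt TEXT NOT NULL,
    enhanced_prompt TEXT,
    model TEXT, size TEXT, steps INTEGER, cfg_scale REAL, seed INTEGER,
    image_path TEXT, rating INTEGER, favorite INTEGER DEFAULT 0,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS evaluations (
    id INTEGER PRIMARY KEY,
    generation_id INTEGER REFERENCES generations(id),
    iteration INTEGER, overall_score REAL, category_scores TEXT, suggestions TEXT
);
CREATE TABLE IF NOT EXISTS templates (
    id INTEGER PRIMARY KEY, name TEXT UNIQUE, category TEXT,
    prompt_template TEXT, default_params TEXT
);
CREATE TABLE IF NOT EXISTS prompt_versions (
    id INTEGER PRIMARY KEY,
    generation_id INTEGER REFERENCES generations(id),
    version INTEGER, prompt TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""

def init_db(path=".gaia/cache/sd/sd_agent.db"):  # path is an assumption
    with sqlite3.connect(path) as conn:
        conn.executescript(SCHEMA)
```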
Class Hierarchy
User Experience
Mode 1: Prompt Enhancement Only
Use Case: Optimize prompts before generating
Mode 2: Full Generation Pipeline
Use Case: Generate with automatic quality iteration
Mode 3: Task Queue
Use Case: Batch processing with autonomous execution
Mode 4: Strategy Comparison
Use Case: A/B test different styles with VLM scoring
Mode 5: Download Images
Use Case: Export images from gallery to Downloads folder
Core Features
1. LLM-Powered Prompt Enhancement Engine
Problem: Users struggle to write effective prompts that produce desired results from AI systems (SD, LLMs, etc.). Prompt engineering is a specialized skill.
Solution: Use GAIA’s AMD NPU-accelerated LLM to analyze user intent and generate optimized prompts using domain-specific best practices.
Core Capabilities:
A) Intent Analysis
B) Domain-Specific Enhancement
| Domain | Enhancement Focus | Example |
|---|---|---|
| Stable Diffusion | Style, lighting, composition, quality keywords | “mountain” → “serene mountain landscape, golden hour…” |
| LLM Prompts | Clarity, context, examples, constraints | “write code” → “Write Python code that… Use type hints…” |
| Code Generation | Specificity, patterns, requirements, constraints | “make API” → “Create FastAPI endpoint with Pydantic models…” |
C) Enhancement Modes
- `--auto` (default): Fully automated enhancement
- `--suggest`: Show suggestions, let user pick
- `--interactive`: Collaborative refinement
- `--no-enhance`: Use prompt as-is
- `--style <style>`: Force specific artistic style
2. Prompt Quality Scoring & Analysis
Goal: Provide objective feedback on prompt quality before generation.
Scoring Criteria (for Stable Diffusion; the weighted score is sketched after the table):
| Criterion | Weight | Checks |
|---|---|---|
| Clarity | 20% | Clear subject, unambiguous intent |
| Style | 20% | Artistic style specified (photorealistic, anime, etc.) |
| Details | 20% | Specific details (colors, textures, objects) |
| Technical | 15% | Lighting, composition, camera angle |
| Quality | 15% | Quality keywords (4k, detailed, high quality) |
| Length | 10% | Optimal token count (10-50 tokens) |
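The table translates directly into a weighted score. A minimal sketch, assuming criterion scores on the 1-10 scale come from the LLM analyzer:

```python
# Weights taken directly from the scoring table above.
WEIGHTS = {
    "clarity": 0.20, "style": 0.20, "details": 0.20,
    "technical": 0.15, "quality": 0.15, "length": 0.10,
}

def prompt_quality_score(criterion_scores: dict[str, float]) -> float:
    """Weighted 1-10 score, e.g. all 8s → 8.0."""
    return round(sum(weight * criterion_scores[name]
                     for name, weight in WEIGHTS.items()), 1)
```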
3. Terminal Image Display & Verification
Goal: Provide immediate visual feedback on prompt quality by generating test images in-terminal.
Use Case: After enhancing a prompt, quickly verify it produces desired results without leaving the CLI.
Terminal Support:
| Terminal | Protocol | Library | Windows | Linux | macOS |
|---|---|---|---|---|---|
| Windows Terminal | Sixel | term-image | ✅ | - | - |
| iTerm2 | Inline Images | imgcat | - | - | ✅ |
| Kitty | Graphics Protocol | term-image | ✅ | ✅ | ✅ |
| Standard terminals | ASCII art | term-image | ✅ | ✅ | ✅ |
Fallback strategy (sketched below):
- Try native terminal image protocol
- Fall back to Unicode block art (better than ASCII)
- Finally, open image in default viewer
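A minimal sketch of that fallback chain. term-image already negotiates the best available protocol (Sixel, Kitty, iTerm2, or Unicode blocks), so the sketch collapses the first two steps and falls back to the system viewer via Pillow; error handling is simplified.

```python
def show_image(path: str) -> None:
    try:
        from term_image.image import from_file
        from_file(path).draw()      # term-image picks the best protocol or block art
    except Exception:
        from PIL import Image
        Image.open(path).show()     # last resort: open in the default image viewer
```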
4. Prompt Template Library
Goal: Codify and reuse successful prompt patterns.
Template Structure (example record below):
| Category | Templates | Use Cases |
|---|---|---|
| Photography | portrait, landscape, macro, street, architectural | Photorealistic scene composition |
| Artistic | oil-painting, watercolor, anime, comic-book, sketch | Artistic style applications |
| Mood | dramatic, serene, energetic, mysterious, playful | Emotional atmosphere |
| Genre | cyberpunk, fantasy, sci-fi, horror, vintage | Genre-specific aesthetics |
| Technical | product-shot, technical-diagram, infographic | Professional/commercial use |
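For illustration, one possible template record in the Genre category. Field names and parameter values are assumptions; the real `TemplateLibrary` schema is defined in Phase 3.

```python
CYBERPUNK_TEMPLATE = {
    "name": "cyberpunk-city",
    "category": "Genre",
    "prompt_template": ("{subject}, cyberpunk city, neon lights, rain-slicked streets, "
                        "cinematic lighting, highly detailed, 4k"),
    "default_params": {"model": "SDXL-Turbo", "size": "1024x1024",
                       "steps": 6, "cfg_scale": 1.5},
}

def apply_template(template: dict, subject: str) -> tuple[str, dict]:
    """Fill the template and return (prompt, recommended parameters)."""
    return (template["prompt_template"].format(subject=subject),
            dict(template["default_params"]))
```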
5. Iterative Refinement Workflow
Goal: Progressively improve prompts through multiple enhancement cycles.
Workflow:
6. Prompt Comparison & A/B Testing
Goal: Empirically determine which prompt strategies work best.
7. Prompt Version Control & History
Goal: Track prompt evolution and enable rollback.
Version Structure:
8. Image Storage & Verification Cache
Cache Structure:
- Reproducible generations (same seed = same image)
- Easy prompt history tracking
- Performance benchmarking data
- Version control friendly (JSON diff)
9. Prompt Testing with Visual Variations
Goal: Test prompt robustness by generating multiple images with different seeds.
Use Case: A good prompt should produce consistently good results across different seeds.
Agent Implementation
PromptAgent Class
System Prompt
CLI Integration
Command Structure
Command Handlers
Lemonade Client Extension
Add Image Generation Method
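A hedged sketch of what this extension could look like. The request payload mirrors the parameters listed under Technical Decisions, but the exact `/api/v1/images/generations` schema, response format, model identifiers, and default port should be confirmed against the Lemonade Server documentation.

```python
import base64
from pathlib import Path

import requests


class LemonadeClient:  # sketch only; the real client already exists in GAIA
    base_url = "http://localhost:8000"  # assumed default

    def generate_image(self, prompt: str, *, model: str = "sdxl-turbo",
                       size: str = "1024x1024", steps: int = 4,
                       cfg_scale: float = 1.0, seed: int | None = None) -> Path:
        payload = {"model": model, "prompt": prompt, "size": size,
                   "steps": steps, "cfg_scale": cfg_scale}
        if seed is not None:
            payload["seed"] = seed
        resp = requests.post(f"{self.base_url}/api/v1/images/generations",
                             json=payload, timeout=300)
        resp.raise_for_status()
        image_b64 = resp.json()["data"][0]["b64_json"]  # assumes an OpenAI-style response
        out_dir = Path(".gaia/cache/sd/images")
        out_dir.mkdir(parents=True, exist_ok=True)
        out_path = out_dir / f"gen_{abs(hash((prompt, model, seed)))}.png"
        out_path.write_bytes(base64.b64decode(image_b64))
        return out_path
```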
Dependencies
New Libraries Required
Optional Dependencies
- Pillow: For image manipulation, format conversion
- term-image: Cross-platform terminal image rendering (supports Sixel, iTerm2, Kitty)
- Both are lightweight and well-maintained
Testing Strategy
Unit Tests
Integration Tests
CLI Tests
Documentation Requirements
User Guide
Create `docs/guides/image.mdx`:
- Getting started with image generation
- Prompt writing best practices
- Model selection guide
- Parameter tuning tips
- Examples gallery
SDK Reference
Create `docs/sdk/agents/image-agent.mdx`:
- `ImageAgent` API reference
- Tool specifications
- Code examples for programmatic use
CLI Reference
Update `docs/reference/cli.mdx`:
- Add `gaia image` command documentation
- All flags and options
- Usage examples
Implementation Plan
Phased Approach: Terminal CLI → Web Gallery UI
Phase 1: Core Optimization Engine (Week 1)
Goal: SD prompt enhancement + parameter optimization working in CLI
- Create `SDAgent` class skeleton (Agent + DatabaseMixin)
- Initialize SQLite database with schema (generations, templates, prompt_versions, evaluations, task_queue)
- Implement `PromptEnhancer` class with LLM backend (AMD NPU)
- Implement `PromptAnalyzer` class for quality scoring (1-10)
- Implement `ParamOptimizer` class for SD parameter recommendations
- Add `analyze_prompt` tool
- Add `enhance_prompt` tool
- Add `optimize_parameters` tool (model, size, steps, cfg_scale)
- Add basic CLI commands (`gaia sd analyze`, `gaia sd enhance`)
- Unit tests for enhancer, analyzer, and optimizer
Deliverable: `gaia sd enhance "a mountain"` produces an enhanced prompt + recommended parameters with a quality score
Phase 2: Image Generation + VLM Evaluation (Week 2)
Goal: Full generation pipeline with VLM-powered quality iteration
- Extend `LemonadeClient` with `generate_image()` method
- Create `ImageGenerator` wrapper class
- Implement `ImageEvaluator` class (VLM-powered using Qwen3-VL-4B)
  - Score across 5 categories: composition, lighting, prompt adherence, style, technical
  - Return overall score + improvement suggestions
- Implement `IterationController` (generate → evaluate → refine loop)
  - Configurable quality threshold (default 7/10)
  - Max iterations limit (default 3)
  - Tracks all iterations in database
- Implement `generate_image` tool (full pipeline: enhance → optimize → generate → evaluate → iterate)
- Add `evaluate_image` and `iterate_until_quality` tools
- Save all generations + evaluations to database
- Implement image file storage (`.gaia/cache/sd/images/`)
- Create `TerminalDisplay` class for in-terminal image preview
  - Sixel support (Windows Terminal)
  - iTerm2/Kitty support
  - Fallback to external viewer
- Add `gaia sd generate` command (with `--quality-threshold` and `--max-iterations` flags)
- Add `gaia sd history` command (list recent generations from DB)
- Integration tests with Lemonade Server (LLM + VLM + SD)
Deliverable: `gaia sd generate "mountain"` enhances, generates, evaluates with VLM, iterates if needed, and displays the final result with a quality report
Phase 3: Templates & Search (Week 3)
Goal: Template library and natural language search
- Implement `TemplateLibrary` class (DB-backed)
- Build starter template set (10+ templates) with prompt+parameter combos
  - Photography styles (portrait, landscape, macro)
  - Artistic styles (photorealistic, anime, oil-painting, watercolor)
  - Genre templates (cyberpunk, fantasy, sci-fi, horror)
- Add template tools: `list_templates`, `use_template`, `save_as_template`
- Implement natural language search tools:
  - `search_generations` (e.g., “show me all cyberpunk images”)
  - `filter_by_params` (e.g., “find images generated with SDXL-Turbo”)
  - `get_favorites`, `get_top_rated`
- Add CLI commands: `gaia sd templates`, `gaia sd use`, `gaia sd search`
- LLM-powered query translation (natural language → SQL)
- Unit tests
Phase 4: Gallery UI with Task Interface (Week 4)
Goal: Standalone web UI for task-based image creation and gallery management
UI Components:
- Gallery Server (Flask/FastAPI)
- REST API for CRUD operations on generations
- WebSocket for real-time generation progress updates
- Static file serving for images
- Task Submission Interface
- Natural language input: “a cyberpunk city at night, neon lights”
- Optional parameter locks: hardwire model, size, steps, seed (override agent recommendations)
- Submit task → agent processes autonomously → returns result
- Live progress indicator (enhancing → generating → evaluating → iterating)
- Quality score display with category breakdown
- Iteration history (show all attempts if multiple iterations)
- Gallery View
- Grid/list view of all generations
- Filter controls (model, size, date range, rating)
- Natural language search box
- Sort by date, rating, favorites
- Image Detail View
- Full-size image display
- Prompt and parameters display
- Rating system (1-5 stars)
- Notes/annotations text area
- Tags editor
- Favorite toggle
- Actions: regenerate, refine, save as template
- Task Queue System
- Implement `TaskQueue` class with SQLite persistence
- Submit multiple tasks to queue (natural language + optional parameter locks)
- Agent processes tasks sequentially (or parallel if resources allow)
- Queue status display (pending, in-progress, completed, failed)
- Priority ordering (urgent tasks jump queue)
- Cancel/pause/resume individual tasks
- Batch submission (“generate 5 variations of this prompt”)
- WebSocket notifications when tasks complete
- Reference-Based Generation
- Agent can retrieve top-rated images
- Use high-rated prompts/parameters as inspiration
- “Generate something similar to my favorite landscapes”
- Template Browser
- Browse available templates
- Preview example images
- Quick-apply to new generation
Technical Stack:
- Backend: FastAPI + SQLite (via DatabaseMixin)
- Task Queue: In-memory queue with SQLite persistence for recovery
- Frontend: React/Vue + Tailwind CSS
- Communication: REST API + WebSockets for live updates
- Packaging: Electron wrapper for desktop app
Deliverable: Gallery UI at http://localhost:5000 with task-based image creation, queue management, and a searchable gallery
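As a rough sketch of the gallery backend under the stack above (routes, response shapes, and the database path are assumptions for illustration only):

```python
import sqlite3
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles

DB_PATH = ".gaia/cache/sd/sd_agent.db"  # assumed location
app = FastAPI(title="SD Agent Gallery")
app.mount("/images", StaticFiles(directory=".gaia/cache/sd/images"), name="images")

@app.get("/api/generations")
def list_generations(limit: int = 50):
    # Return the most recent generations for the gallery grid view.
    with sqlite3.connect(DB_PATH) as conn:
        conn.row_factory = sqlite3.Row
        rows = conn.execute(
            "SELECT id, prompt, model, rating, image_path, created_at "
            "FROM generations ORDER BY created_at DESC LIMIT ?", (limit,))
        return [dict(row) for row in rows]

@app.post("/api/generations/{gen_id}/rating")
def rate_generation(gen_id: int, rating: int):
    # Persist a 1-5 star rating for preference learning.
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("UPDATE generations SET rating = ? WHERE id = ?", (rating, gen_id))
    return {"id": gen_id, "rating": rating}
```

Running it with `uvicorn gallery:app --port 5000` (module name assumed) would serve on the address above.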
Phase 5: Advanced Features & Polish (Week 5)
Goal: Production-ready with full feature set
CLI Enhancements:
- Interactive mode (`gaia sd` with no args → task submission interface)
- Comparison mode (`gaia sd compare "dragon" --strategies photorealistic,anime`)
- Batch generation (`gaia sd batch "prompt" --count 10 --vary-params`)
- Queue management (`gaia sd queue status`, `gaia sd queue cancel <id>`)
- Export tools (JSON, CSV, ZIP with images)
- Keyboard shortcuts
- Bulk operations (tag multiple, export selection, delete)
- Advanced filters (tag combinations, parameter ranges)
- Gallery statistics (total images, by model, avg rating)
- Settings page (default parameters, UI preferences)
- Performance optimization (query caching, lazy loading)
- Error handling and user-friendly messages
- Loading states and progress indicators
- Image thumbnails for faster gallery loading
- Database optimization (indexes, cleanup old entries)
- User guide (`docs/guides/sd-agent.mdx`)
- SDK reference (`docs/sdk/agents/sd-agent.mdx`)
- Update CLI reference (`docs/reference/cli.mdx`)
- Gallery UI guide
- Prompt engineering best practices
- Example gallery showcase
- Full test coverage (unit, integration, E2E)
- Performance benchmarks
- UI testing (Playwright/Cypress)
Future Enhancements
Advanced Prompt Engineering Features
- Multi-Domain Expansion
- LLM Prompts: Chain-of-thought, few-shot, role-playing optimization
- Code Generation: Language-specific patterns, framework templates
- Vision Models: VLM-specific prompt engineering (Qwen2-VL, etc.)
- Audio Models: TTS/ASR prompt optimization (Whisper, Kokoro)
- Collaborative Prompt Engineering
- Team Templates: Shared prompt libraries across organization
- Version Control Integration: Git-style branching for prompts
- Feedback Loop: Track which prompts perform best over time
- Prompt Marketplace: Share/discover templates from community
- Advanced Analysis
- Semantic Similarity: Find similar successful prompts
- Performance Tracking: Which styles/keywords correlate with quality
- Automated A/B Testing: Run overnight experiments
- CLIP Score Integration: Objective image-prompt alignment scoring
- MCP Integration
- Expose prompt enhancement as MCP tool
- Integration with VSCode, Claude Desktop, etc.
- Real-time prompt suggestions in external editors
Image Generation Enhancements
- Advanced SD Features
- Negative Prompts: Specify what NOT to include
- Prompt Weights: Control emphasis on different elements
- ControlNet Support: Pose, depth, edge guidance
- LoRA Integration: Custom model fine-tuning
- Image-to-Image: Style transfer, variations from reference
- Inpainting/Outpainting: Edit specific regions
- Multi-Model Support
- Support for multiple SD checkpoints
- FLUX, Midjourney-style prompts
- Cross-model prompt translation
- Model recommendation based on use case
- Batch Operations
- Grid Search: Systematically test parameter combinations
- Style Exploration: Generate matrix of style variations
- Parameter Optimization: Find best steps/cfg_scale for prompt
- Scheduled Generation: Queue overnight batch jobs
Integration & Collaboration
- Agent Ecosystem Integration
- PromptAgent + BlenderAgent: Generate texture prompts for 3D scenes
- PromptAgent + CodeAgent: Generate documentation with illustrations
- PromptAgent + ChatAgent: Enhance chat responses with visuals
- PromptAgent + JiraAgent: Create visual mockups from issue descriptions
- UI/UX Enhancements
- Web UI: Browser-based prompt engineering workspace
- Electron App: Desktop app with drag-drop, galleries
- Mobile Companion: Review generations, rate prompts on mobile
- Browser Extension: Enhance prompts for SD web UIs (AUTOMATIC1111, ComfyUI)
- Export & Publishing
- Prompt Cards: Beautiful shareable images of prompt+result
- Portfolio Export: Generate HTML galleries
- API Access: Programmatic prompt enhancement API
- Webhook Integration: Notify on completion, feed to other systems
Research & Experimental Features
- LLM-as-Judge
- Use LLM to rate generated images
- Automated quality assessment
- Suggest prompt improvements based on output
- Reinforcement Learning
- Learn from user preferences over time
- Personalized prompt enhancement
- Adapt to individual artistic style
- Cross-Modal Prompt Engineering
- Text → Image → Text (caption generated images)
- Video prompts (SD animation)
- 3D prompt engineering (for 3D generative models)
- Educational Features
- Prompt Engineering Tutor: Interactive lessons
- Challenge Mode: Daily prompt challenges
- Skill Progression: Track improvement over time
Success Metrics
Performance Targets
| Metric | Target | Rationale |
|---|---|---|
| Prompt Analysis | < 500ms | Near-instant feedback |
| Prompt Enhancement | < 1s | LLM inference on AMD NPU |
| Image Generation (512x512) | < 3s | AMD NPU acceleration |
| Image Generation (1024x1024) | < 8s | SDXL on NPU |
| Terminal Display | < 500ms | Instant visual feedback |
| Template Application | < 100ms | Cache lookup |
| A/B Comparison (4 strategies) | < 5s analysis + image time | Parallel enhancement |
Quality Metrics
| Metric | Target | Measurement Method |
|---|---|---|
| Enhancement Accuracy | 90%+ user satisfaction | User ratings on enhanced prompts |
| Score Correlation | 85%+ correlation with actual quality | Compare scores vs. user ratings |
| Token Efficiency | Average enhanced prompt 15-40 tokens | CLIP token limit optimization |
| Quality Improvement | +6 points average (3 → 9/10) | Before/after scoring |
| CLIP Score Improvement | +15% on generated images | Automated CLIP scoring |
| Template Success Rate | 95%+ successful applications | Error-free template application |
User Experience Goals
| Goal | Target | Measurement |
|---|---|---|
| Zero-config | Works immediately after install | No setup steps required |
| Fast Workflow | Analyze → enhance → verify in < 5s | End-to-end timing |
| No Context Switching | Everything in terminal | No external apps needed |
| Discoverability | 80%+ find features without docs | CLI help clarity |
| Reproducibility | Same prompt + seed = same result | Deterministic generation |
| Learning Curve | First success in < 2 minutes | Time to first enhanced prompt |
Adoption Metrics
| Metric | 1 Month | 3 Months | 6 Months |
|---|---|---|---|
| Active Users | 100+ | 500+ | 1000+ |
| Prompts Enhanced | 1000+ | 10k+ | 50k+ |
| Templates Created | 50+ | 200+ | 500+ |
| Average Quality Improvement | +5 points | +6 points | +7 points |
| User Satisfaction | 80%+ | 85%+ | 90%+ |
Open Questions
Technical Decisions Needed
- Prompt Enhancement Philosophy:
- How aggressive should auto-enhancement be?
- Always show before/after diff, or hide unless `--show-enhancement`?
- Should enhancement preserve exact user phrasing or fully rewrite?
- Support for multiple enhancement styles (conservative, creative, etc.)?
- Scoring Algorithm:
- Use LLM-as-judge or rule-based scoring?
- How to weigh different criteria (clarity vs. detail vs. style)?
- Should scores be domain-specific or universal?
- Calibrate scores against human ratings?
- Template System Design:
- JSON vs. Jinja2 templates vs. custom DSL?
- How much flexibility vs. simplicity?
- Support for nested/composed templates?
- Template versioning and updates?
- Cache Management:
- Max cache size before cleanup warnings?
- LRU eviction vs. user-controlled deletion?
- Compress old metadata or keep full history?
- Export/import cache across machines?
- SD Integration Scope:
- Support only Lemonade Server or also AUTOMATIC1111, ComfyUI?
- Implement image-to-image or MVP text-to-image only?
- Support for custom checkpoints/LoRAs?
- AMD Hardware Optimization:
- Auto-detect NPU and adjust LLM model selection?
- Warn if running on CPU-only?
- Benchmark mode to showcase AMD performance?
Product Decisions
- Target Audience Priority:
- Beginners (teach prompt engineering) or experts (power tools)?
- Both? How to balance?
- Domain Expansion:
- Launch with SD-only or include LLM/code from start?
- Which domain after SD? (LLM, code, VLM, audio?)
- UI Strategy:
- CLI-only MVP or invest in web UI early?
- Electron app before or after web UI?
- Terminal UI (TUI) with rich interactive widgets?
- Community Features:
- Public template marketplace?
- Share prompts anonymously or with attribution?
- Rating/review system for templates?
- Moderation approach?
- Integration Priority:
- Which GAIA agent integration first?
- BlenderAgent (texture prompts)
- ChatAgent (illustrated responses)
- CodeAgent (UI mockups)
- External integration (VSCode, Claude Desktop, etc.)?
- Monetization/Sustainability:
- Open source all features or premium tier?
- Cloud service for prompt analysis (privacy concerns)?
- Commercial template packs?
- Documentation Approach:
- Auto-generate templates from successful prompts?
- Interactive tutorials vs. static docs?
- Video content priority?
References
External Documentation
Prompt Engineering:
- OpenAI Prompt Engineering Guide
- Anthropic Prompt Library
- Learn Prompting - Comprehensive prompt engineering course
Internal GAIA Specs
Core Framework:
Related Agents:
- ChatAgent - Conversation patterns
- BlenderAgent - 3D content generation
- Routing Agent - Agent selection logic
Academic & Research
- CLIP: Learning Transferable Visual Models
- Prompt Engineering for Large Language Models: A Survey
- Visual Instruction Tuning
Similar Implementations
Within GAIA:
- BlenderAgent: Domain-specific tool enhancement (3D scene generation)
- SummarizerAgent: Multi-file processing, caching patterns
- ChatAgent: Conversational refinement, RAG integration
- RoutingAgent: Agent selection based on analysis
External Tools:
- Midjourney: `/imagine` command with prompt engineering
- DALL-E: Prompt suggestions and variations
- ChatGPT: System prompts and role optimization
Approval Checklist
Planning & Scope
- Problem statement clear (prompt engineering accessibility)
- User experience defined (multiple workflows)
- Primary use case identified (Stable Diffusion)
- Secondary use cases documented (LLM, code, future)
- User personas considered (beginner to expert)
Technical Design
- Architecture designed (PromptAgent + components)
- Technical decisions documented
- Integration points identified (LLM, SD, cache)
- Dependencies listed (Pillow, term-image)
- AMD NPU optimization strategy defined
Implementation
- Implementation plan with 5 phases
- Clear milestones and deliverables
- Testing strategy defined (unit, integration, CLI)
- Performance targets established
- Error handling approach outlined
Documentation & Quality
- Documentation requirements listed
- Success metrics established
- Quality benchmarks defined
- User adoption goals set
- Open questions documented
Approval & Next Steps
- AMD stakeholder review
- Product team review
- Engineering team capacity confirmed
- Timeline approved
- Ready for Phase 1 implementation
Document Version: 1.0
Last Updated: 2026-01-26
Author: Claude Sonnet 4.5 (with kalin)
Status: Awaiting approval
Next Steps:
- Review with AMD team
- Finalize open questions
- Confirm resource allocation
- Begin Phase 1: Core Prompt Analysis & Enhancement