Privacy-First AI: This agent runs entirely on your AI PC with Ryzen AI. All processing happens locally—document content never leaves your machine.
If you have hundreds of PDF documents spread across your system—manuals, reports, specifications—finding specific information is tedious. You need to remember which file contains what, open each one, search manually, and piece together information from multiple sources. This agent automates that process entirely on your AI PC:
Finds relevant documents on your drive
Indexes them with vector embeddings (NPU-accelerated on Ryzen AI)
Searches semantically using cosine similarity
Returns specific information with source citations
Runs completely locally—no cloud, no data leaving your machine
What you're building: A chat agent that combines:
Agent reasoning - LLM-based tool selection and orchestration
RAG (Retrieval-Augmented Generation) - Vector search over document chunks
File discovery - Automated search across common directories
File monitoring - Watches folders and re-indexes on changes
Session persistence - Saves indexed state across restarts
Local execution - Runs entirely on your AI PC using Ryzen AI NPU/iGPU acceleration
Get a working agent running to understand the basic flow.
1. Install dependencies
uv pip install -e ".[rag]"
2. Start Lemonade Server
# Start local LLM server with AMD NPU/iGPU acceleration
lemonade-server serve
Lemonade Server provides AMD-optimized inference for AI PCs with Ryzen AI. Models run on your NPU or iGPU for fast, private processing.
3. Create your first agent
Create my_chat_agent.py:
import json

from gaia.agents.chat.agent import ChatAgent, ChatAgentConfig

# Create agent with a document
config = ChatAgentConfig(
    rag_documents=["./manual.pdf"]  # Your document here
)
agent = ChatAgent(config)

# Ask a question
result = agent.process_query("What does the manual say about installation?")
print(json.dumps(result, indent=2))
4. Run it
python my_chat_agent.py
What happens:
PDF text extraction (PyMuPDF)
Chunking into 500-token segments
Embedding generation (nomic-embed running on NPU/iGPU)
FAISS index creation
Query processing via vector search
LLM generates answer using retrieved chunks (Ryzen AI acceleration)
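These steps are driven by the same RAG SDK used in the build-up below. A minimal sketch of running the pipeline directly, assuming only the RAGSDK and RAGConfig calls that appear in step2_with_rag.py later in this guide:

from gaia.rag.sdk import RAGSDK, RAGConfig

rag = RAGSDK(RAGConfig(
    chunk_size=500,     # tokens per chunk
    chunk_overlap=100,  # tokens shared between adjacent chunks
    max_chunks=5,       # chunks retrieved per query
))

rag.index_document("./manual.pdf")          # extract, chunk, embed, store in FAISS
response = rag.query("installation steps")  # embed query, rank chunks by cosine similarity
print(response.text)                        # used as the answer text in the agents below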
If you don’t have a PDF, the agent operates in general conversation mode using the LLM’s training data.
no_documents.py
agent = ChatAgent()  # No documents specified
result = agent.process_query("What is Python?")
# No RAG retrieval, uses general knowledge
Start with a minimal agent implementation to understand the core structure.
step1_basic.py
from gaia.agents.base.agent import Agent
from gaia.agents.base.console import AgentConsole

class SimpleChatAgent(Agent):
    """Minimal chat agent with no tools."""

    def _get_system_prompt(self) -> str:
        return "You are a helpful AI assistant."

    def _create_console(self):
        return AgentConsole()

    def _register_tools(self):
        # No tools registered yet
        pass

# Use it
agent = SimpleChatAgent()
result = agent.process_query("Hello! How are you?")
print(result)
Run it:
python step1_basic.py
Expected output:
Agent: Hello! I'm doing well, thank you for asking. How can I help you today?
What you have:
✓ Agent with reasoning loop
✓ System prompt definition
✓ Console output
✗ No tools (can only chat)
✗ No document access
Under the Hood: Execution Flow
Initialization:
SimpleChatAgent()
  → Agent.__init__()
  → Initialize LLM client
  → Load system prompt
  → Create console
  → Register tools (none in this case)
Query processing:
process_query("Hello! How are you?") → Construct messages: [system_prompt, user_message] → Send to LLM (Lemonade Server) → LLM generates response → Display via AgentConsole → Return result dict
Limitations:
No tools = no ability to execute actions
Cannot access external data sources
Relies solely on LLM training data
This basic agent cannot retrieve information from documents or perform actions. It’s limited to conversation using the LLM’s pre-trained knowledge.
Add RAG capability to enable document search and retrieval.
step2_with_rag.py
from gaia.agents.base.agent import Agent
from gaia.agents.base.console import AgentConsole
from gaia.agents.base.tools import tool
from gaia.rag.sdk import RAGSDK, RAGConfig

class DocQAAgent(Agent):
    """Agent with document Q&A capability."""

    def __init__(self, documents=None, **kwargs):
        # Initialize RAG SDK first
        rag_config = RAGConfig(
            chunk_size=500,
            max_chunks=5,
            chunk_overlap=100
        )
        self.rag = RAGSDK(rag_config)
        self.indexed_files = set()

        # Index documents
        if documents:
            for doc in documents:
                self.rag.index_document(doc)
                self.indexed_files.add(doc)
                print(f"✓ Indexed: {doc}")

        super().__init__(**kwargs)

    def _get_system_prompt(self) -> str:
        indexed = "\n".join(f"- {doc}" for doc in self.indexed_files)
        return f"""You are a document Q&A assistant.

Currently indexed:
{indexed}

Use query_documents to search for information."""

    def _create_console(self):
        return AgentConsole()

    def _register_tools(self):
        @tool
        def query_documents(query: str) -> dict:
            """Search indexed documents."""
            if not self.rag.indexed_files:
                return {"error": "No documents indexed"}
            response = self.rag.query(query)
            return {
                "chunks": response.chunks,
                "scores": response.chunk_scores,
                "sources": response.source_files,
                "answer": response.text
            }

# Use it
agent = DocQAAgent(documents=["./manual.pdf"])
response = agent.process_query("What does the manual say about installation?")
print(response)
Usage:

# Create agent with document
agent = DocQAAgent(documents=["./manual.pdf"])

# Query it
result = agent.process_query("What are the system requirements?")
print(result)
Example output:

{
  "answer": "According to the manual, system requirements are: Python 3.10+, 8GB RAM, 50GB disk space...",
  "sources": ["manual.pdf"],
  "steps": 2,
  "tools_used": ["query_documents"]
}
Under the Hood: Indexing and Retrieval
Indexing phase (__init__):
DocQAAgent(documents=["manual.pdf"])
  → RAGSDK instance created
  → For each PDF:
      → PyMuPDF extracts text
      → Text split into 500-token chunks (100 overlap)
      → nomic-embed generates embeddings (384 dimensions, runs on NPU/iGPU)
      → Embeddings stored in FAISS index (in-memory)
  → Ready for queries
On AI PCs with Ryzen AI, the embedding model runs on the NPU for efficient processing.
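The chunking step is a sliding window with overlap, so text spanning a chunk boundary appears in both neighboring chunks. For intuition only, a simplified word-based version (the SDK counts tokens, not words):

def chunk_text(text, chunk_size=500, overlap=100):
    """Simplified sliding-window chunking for illustration."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + chunk_size]
        if piece:
            chunks.append(" ".join(piece))
        if start + chunk_size >= len(words):
            break
    return chunks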
Query phase (process_query):
process_query("What are the system requirements?") → Agent decides to use query_documents tool → Tool execution: → Generate query embedding (nomic-embed on NPU) → Compute cosine similarity vs all chunk embeddings → Sort by similarity score → Return top 5 chunks → Agent receives chunks → LLM synthesizes answer using chunks as context (Ryzen AI acceleration) → Return result
Key benefit: Grounding LLM responses in actual document text reduces hallucination for domain-specific content.
What you have: Document retrieval via vector search. The agent can answer questions using your specific documents rather than general knowledge.
Add file discovery to avoid hardcoding document paths.
from gaia.agents.base.agent import Agent
from gaia.agents.base.console import AgentConsole
from gaia.agents.base.tools import tool
from gaia.rag.sdk import RAGSDK, RAGConfig
from fnmatch import fnmatch
from pathlib import Path

class SmartDocAgent(Agent):
    """Agent with smart document discovery."""

    def __init__(self, **kwargs):
        self.rag = RAGSDK(RAGConfig())
        self.indexed_files = set()
        super().__init__(**kwargs)

    def _get_system_prompt(self) -> str:
        indexed_docs = "\n".join(f"- {Path(f).name}" for f in self.indexed_files)
        return f"""You are an intelligent document assistant.

Indexed documents:
{indexed_docs or "None yet"}

**Smart Discovery Workflow:**
When user asks about something (e.g., "oil & gas manual"):
1. Use search_files to find it
2. Index it automatically
3. Then query to answer their question

This creates a more natural user experience where file paths don't need to be specified."""

    def _create_console(self):
        return AgentConsole()

    def _register_tools(self):
        @tool
        def search_files(pattern: str) -> dict:
            """Find files matching a pattern (searches common locations)."""
            # Search common directories
            search_paths = [
                Path.home() / "Documents",
                Path.home() / "Downloads",
                Path.home() / "Desktop",
                Path.cwd(),
            ]

            # Support "*" wildcards (agent sends patterns with them) and fall back to
            # substring matching by wrapping non-wildcard patterns with "*".
            normalized_pattern = pattern.lower()
            if "*" not in normalized_pattern and "?" not in normalized_pattern:
                normalized_pattern = f"*{normalized_pattern}*"

            found_files = []
            for search_path in search_paths:
                if search_path.exists():
                    # Find PDF files matching pattern
                    for pdf in search_path.rglob("*.pdf"):
                        if fnmatch(pdf.name.lower(), normalized_pattern):
                            found_files.append(str(pdf))

            return {
                "files": found_files,
                "count": len(found_files),
                "message": f"Found {len(found_files)} file(s)"
            }

        @tool
        def index_document(file_path: str) -> dict:
            """Index a document for searching."""
            if not Path(file_path).exists():
                return {"error": f"File not found: {file_path}"}

            result = self.rag.index_document(file_path)
            if result.get("success"):
                self.indexed_files.add(file_path)
                return {
                    "status": "success",
                    "chunks": result.get("num_chunks"),
                    "file": Path(file_path).name
                }
            return {"error": result.get("error")}

        @tool
        def query_documents(query: str) -> dict:
            """Search all indexed documents."""
            if not self.indexed_files:
                return {"error": "No documents indexed"}

            response = self.rag.query(query)
            return {
                "answer": response.text,
                "sources": [Path(f).name for f in response.source_files],
                "scores": response.chunk_scores
            }

# Use it
agent = SmartDocAgent()

# User can now ask naturally!
result = agent.process_query("Find and search the user manual for installation steps")

# Agent will:
# 1. Call search_files("user manual")
# 2. Call index_document(found_file)
# 3. Call query_documents("installation steps")
# 4. Return the answer!
Under the Hood: Smart Discovery Pattern
Example query: "Find the oil manual and tell me about safety"

LLM orchestration (automatic):
// Step 1: Locate file
{
  "thought": "Need to find oil manual first",
  "tool": "search_files",
  "tool_args": {"pattern": "oil manual"}
}
// Returns: ["C:/Docs/Oil-Gas-Manual.pdf"]

// Step 2: Index document
{
  "thought": "Found file, index it for searching",
  "tool": "index_document",
  "tool_args": {"file_path": "C:/Docs/Oil-Gas-Manual.pdf"}
}
// Returns: {"status": "success", "chunks": 150}

// Step 3: Query for safety info
{
  "thought": "Document indexed, search for safety",
  "tool": "query_documents",
  "tool_args": {"query": "safety"}
}
// Returns: Relevant chunks about safety protocols

// Step 4: Synthesize answer
{
  "answer": "According to the Oil & Gas Manual, safety protocols include..."
}
Implementation note: You define Python functions. The LLM decides when to call them based on tool schemas and user intent.
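For each registered tool, the LLM receives a schema derived from the function's signature and docstring. Roughly this shape (illustrative only; not GAIA's exact wire format):

# Hypothetical rendering of the search_files tool schema presented to the LLM.
search_files_schema = {
    "name": "search_files",
    "description": "Find files matching a pattern (searches common locations).",
    "parameters": {
        "type": "object",
        "properties": {"pattern": {"type": "string"}},
        "required": ["pattern"],
    },
}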
What you have: File discovery without hardcoded paths. The agent can locate, index, and query documents based on pattern matching.