NPU Acceleration

GAIA supports running agents on your AMD Ryzen AI NPU via the FastFlowLM (FLM) backend. NPU inference is power-efficient and frees your GPU for other tasks.

Requirements

Hardware: AMD Ryzen AI 300/400/Max series processor with XDNA2 NPU
- Strix Point, Strix Halo, Kraken Point, or Gorgon Point
- Ryzen AI 7000/8000/200-series (XDNA1) is not supported
Software: Lemonade Server v10.10.0+
Driver: Latest AMD NPU driver (firmware v1.1.0.0+)

Ryzen AI 7000/8000/200-series chips have XDNA1 NPUs, which FastFlowLM does not support. On those systems, use --device gpu instead.

Linux

NPU inference works the same way on Linux: run gaia init --profile npu, and Lemonade Server installs the FLM backend automatically. If Lemonade detects that your system needs additional setup (for example, the amdxdna kernel driver on older kernels), it points you to the Lemonade Linux FLM guide, which has distro-specific steps. Follow them, then re-run gaia init --profile npu.
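
Before re-running the init command, it can help to confirm whether the amdxdna kernel module is loaded. The following is a minimal sketch and not part of GAIA or Lemonade: the helper name amdxdna_loaded is hypothetical, and it assumes the driver shows up under the name amdxdna in /proc/modules, where Linux lists loaded modules. A driver compiled directly into the kernel would not appear there, so treat a negative result as a hint rather than proof.

```python
from pathlib import Path


def amdxdna_loaded(proc_modules_text: str) -> bool:
    """Return True if a module named 'amdxdna' appears in /proc/modules text.

    Each line of /proc/modules begins with the module name, followed by
    whitespace-separated fields (size, use count, and so on).
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "amdxdna":
            return True
    return False


if __name__ == "__main__":
    modules_file = Path("/proc/modules")
    if not modules_file.exists():
        print("/proc/modules not found (is this a Linux system?)")
    elif amdxdna_loaded(modules_file.read_text()):
        print("amdxdna module is loaded")
    else:
        print("amdxdna module is not loaded; see the Lemonade Linux FLM guide")
```

The parsing is kept separate from the file read so the check can be exercised with sample text on any platform.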

Setup

Initialize GAIA with the NPU profile:

gaia init --profile npu

This will:

- Verify NPU hardware is detected
- Install the FLM backend (flm:npu)
- Download the NPU-optimized model (gemma4-it-e2b-FLM)
- Run a verification test

Usage

CLI

# Chat on NPU
gaia chat --device npu

# Chat on GPU (default)
gaia chat

# Chat on CPU
gaia chat --device cpu

# Single prompt on NPU
gaia prompt "Hello" --device npu

Agent UI

- Launch with gaia chat --ui
- In the Agent Hub, each agent card shows a device dropdown
- Select NPU from the dropdown (only visible if NPU hardware is detected)
- A green ✓ badge indicates the agent has been verified on that device

Evaluation

Compare agent quality across devices:

# Run eval on GPU (baseline)
gaia eval agent --device gpu

# Run eval on NPU
gaia eval agent --device npu

# Compare results
gaia eval agent --compare eval/results/run1/scorecard.json eval/results/run2/scorecard.json

Device Selection

Device	Model	Backend	Best For
GPU (default)	Gemma-4-E4B-it-GGUF	llamacpp:vulkan	General use, best throughput
CPU	Gemma-4-E4B-it-GGUF	llamacpp:cpu	No GPU available (slower)
NPU	gemma4-it-e2b-FLM	flm:npu	Power efficiency, background tasks
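
For scripts that need these pairings, the table can be expressed as a small lookup. This is an illustrative sketch only: DeviceProfile, DEVICE_PROFILES, and profile_for are hypothetical names, not GAIA APIs, and the values are copied from the table above.

```python
from typing import NamedTuple


class DeviceProfile(NamedTuple):
    """Model and backend used for one device (hypothetical helper type)."""
    model: str
    backend: str


# Values copied from the Device Selection table; the dict itself is illustrative.
DEVICE_PROFILES = {
    "gpu": DeviceProfile("Gemma-4-E4B-it-GGUF", "llamacpp:vulkan"),
    "cpu": DeviceProfile("Gemma-4-E4B-it-GGUF", "llamacpp:cpu"),
    "npu": DeviceProfile("gemma4-it-e2b-FLM", "flm:npu"),
}


def profile_for(device: str = "gpu") -> DeviceProfile:
    """Look up the model and backend for a device name (GPU is the default)."""
    try:
        return DEVICE_PROFILES[device.lower()]
    except KeyError:
        raise ValueError(f"unknown device: {device!r}") from None


if __name__ == "__main__":
    for name in ("gpu", "cpu", "npu"):
        profile = profile_for(name)
        print(f"{name}: model={profile.model} backend={profile.backend}")
```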

Fallback Behavior

- Explicit --device npu: fails immediately if the NPU is not available
- Default (no --device): uses the GPU; falls back to the CPU with a warning if no GPU is detected
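
The two rules above can be sketched as a small decision function. This is an assumption-laden illustration rather than GAIA's actual implementation: resolve_device and DeviceUnavailableError are hypothetical names, and the behavior of an explicit gpu or cpu request on missing hardware is not described here, so the sketch passes those through unchanged.

```python
import warnings
from typing import Optional, Set


class DeviceUnavailableError(RuntimeError):
    """Raised when an explicitly requested NPU is missing (hypothetical)."""


def resolve_device(requested: Optional[str], available: Set[str]) -> str:
    """Pick a device following the documented fallback rules.

    requested: value of --device, or None when the flag is omitted.
    available: names of the devices detected on this machine.
    """
    if requested == "npu":
        # An explicit NPU request never falls back silently.
        if "npu" not in available:
            raise DeviceUnavailableError("NPU requested but not available")
        return "npu"
    if requested is None:
        if "gpu" in available:
            return "gpu"
        warnings.warn("No GPU detected; falling back to CPU")
        return "cpu"
    # Explicit gpu/cpu: failure behavior is not documented, so pass through.
    return requested


if __name__ == "__main__":
    print(resolve_device(None, {"gpu", "npu"}))
    print(resolve_device("npu", {"npu"}))
```

Raising on an explicit NPU request, instead of quietly switching devices, keeps evaluation runs honest: a result labeled NPU is guaranteed to have run on the NPU.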

Email agent: no --device flag, auto-detected instead

The email triage agent doesn't take a --device flag. When it isn't given an explicit model, it probes the Lemonade Server it is configured against and picks gemma4-it-e2b-FLM automatically if an NPU is available there and the model is already downloaded. Otherwise it falls back to Gemma-4-E4B-it-GGUF, the same GPU/CPU default listed above. GET /v1/email/init reports which model was resolved. See the email agent contract for the full resolution order.
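
As a rough illustration of that selection, the sketch below encodes only the cases described in this section. The function name and its parameters are hypothetical, and the real resolution order lives in the email agent contract, which may include steps not shown here.

```python
from typing import Optional

NPU_MODEL = "gemma4-it-e2b-FLM"
DEFAULT_MODEL = "Gemma-4-E4B-it-GGUF"


def resolve_email_model(
    explicit_model: Optional[str],
    npu_available: bool,
    npu_model_downloaded: bool,
) -> str:
    """Choose the email agent's model (simplified, hypothetical sketch)."""
    if explicit_model:
        # An explicitly configured model always wins.
        return explicit_model
    if npu_available and npu_model_downloaded:
        # Both conditions must hold on the configured Lemonade Server.
        return NPU_MODEL
    return DEFAULT_MODEL


if __name__ == "__main__":
    print(resolve_email_model(None, npu_available=True, npu_model_downloaded=True))
    print(resolve_email_model(None, npu_available=True, npu_model_downloaded=False))
```

Note that an available NPU alone is not enough: if the FLM model has not been downloaded yet, the sketch selects the GPU/CPU default, matching the description above.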

Troubleshooting

NPU not detected

❌ No AMD NPU detected. The 'npu' profile requires AMD NPU hardware

Verify your hardware:

- Check that your processor supports XDNA2 (Ryzen AI 300/400/Max series)
- Ensure the NPU driver is installed: check Device Manager (Windows) or run lspci | grep NPU (Linux)
- Run lemonade backends to see which backends are available

FLM backend installation fails

❌ Failed to install backend 'flm:npu'

Install manually:

lemonade backends install flm:npu

Model not available

If gemma4-it-e2b-FLM is not found, pull it manually:

lemonade pull gemma4-it-e2b-FLM --recipe flm
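
If both manual steps are needed, they can be chained in a small wrapper. This is a sketch under stated assumptions: run_recovery and RECOVERY_COMMANDS are hypothetical names, the two commands are exactly the ones shown in this section, and the script assumes the lemonade executable is on PATH. It defaults to a dry run that only returns the commands.

```python
import shutil
import subprocess
from typing import List

# The two manual recovery commands from this troubleshooting section.
RECOVERY_COMMANDS = [
    ["lemonade", "backends", "install", "flm:npu"],
    ["lemonade", "pull", "gemma4-it-e2b-FLM", "--recipe", "flm"],
]


def run_recovery(dry_run: bool = True) -> List[str]:
    """Return the commands as strings; also execute them unless dry_run."""
    rendered = [" ".join(cmd) for cmd in RECOVERY_COMMANDS]
    if dry_run:
        return rendered
    if shutil.which("lemonade") is None:
        raise FileNotFoundError("lemonade executable not found on PATH")
    for cmd in RECOVERY_COMMANDS:
        # check=True stops at the first failing step.
        subprocess.run(cmd, check=True)
    return rendered


if __name__ == "__main__":
    for line in run_recovery(dry_run=True):
        print(line)
```

Installing the backend before pulling the model mirrors the order of the troubleshooting entries; call run_recovery(dry_run=False) to actually execute the commands.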
