Hardware matrix

hal0 targets AMD APUs and GPUs plus the AMD XDNA NPU, with a CPU fallback that is always available. This page lists what runs on each accelerator and the order in which hal0 prefers them.

| Target | Backend / runtime | What runs there |
| --- | --- | --- |
| AMD GPU (ROCm) | llama.cpp ROCm | chat / embed / rerank LLM inference (highest tps) |
| AMD GPU (Vulkan) | llama.cpp Vulkan | chat / embed / rerank LLM inference (broad-compat) |
| AMD XDNA NPU | FastFlowLM (FLM) | chat + speech-to-text + embeddings (one process) |
| CPU | llama.cpp / ONNX | fallback LLM inference; Kokoro TTS |

The reference platform is AMD Strix Halo: an 8060S-class iGPU, an XDNA NPU, and a unified-memory pool the GPU shares with the host via GTT. hal0’s profile flags are bench-tuned for that target, but the GPU/Vulkan and CPU paths run on any modern Linux AMD system.

Figure: Hardware monitoring dashboard for GPU, NPU, and system memory.

hal0 probes the host and advertises the backends it can actually run, in this order:

  1. NPU — only when an XDNA NPU is present and the FLM toolbox image is already pulled locally.
  2. GPU (Vulkan) — whenever a GPU is detected (every modern Linux GPU has Mesa Vulkan).
  3. GPU (ROCm) — only when the detected GPU is an AMD GPU with compute support (compute_capable).
  4. CPU — always reachable; the guaranteed fallback.

| Backend id | Label | Provider | Multiplex | Advertised when |
| --- | --- | --- | --- | --- |
| npu | NPU | flm | yes | XDNA present + FLM image pulled |
| gpu-vulkan | GPU (Vulkan) | llama-server | no | any GPU detected |
| gpu-rocm | GPU (ROCm) | llama-server | no | AMD GPU with ROCm compute support |
| cpu | CPU | llama-server | no | always |

The NPU is multiplex: a single flm serve process answers chat, STT, and embeddings at once, so one NPU slot can back three capabilities.
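
The probe order and advertisement rules above can be sketched in a few lines of Python. This is an illustration only, not hal0's actual code: `HostFacts`, `Backend`, and `advertised_backends` are hypothetical names, and the probe results are passed in as plain booleans rather than detected from the host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostFacts:
    """Hypothetical probe results; field names are illustrative, not hal0's API."""
    has_xdna_npu: bool = False
    flm_image_pulled: bool = False
    has_gpu: bool = False
    gpu_is_amd: bool = False
    gpu_compute_capable: bool = False


@dataclass(frozen=True)
class Backend:
    id: str
    label: str
    provider: str
    multiplex: bool  # True when one process serves several capabilities


def advertised_backends(host: HostFacts) -> list[Backend]:
    """Return the backends a host can run, in the documented order."""
    backends: list[Backend] = []
    # 1. NPU: needs both the hardware and a locally pulled FLM image.
    if host.has_xdna_npu and host.flm_image_pulled:
        backends.append(Backend("npu", "NPU", "flm", True))
    # 2. GPU (Vulkan): advertised for any detected GPU.
    if host.has_gpu:
        backends.append(Backend("gpu-vulkan", "GPU (Vulkan)", "llama-server", False))
    # 3. GPU (ROCm): AMD GPUs with compute support only.
    if host.has_gpu and host.gpu_is_amd and host.gpu_compute_capable:
        backends.append(Backend("gpu-rocm", "GPU (ROCm)", "llama-server", False))
    # 4. CPU: the guaranteed fallback, always present.
    backends.append(Backend("cpu", "CPU", "llama-server", False))
    return backends
```

With every field set to `True`, the function returns all four backends in the order listed; with the defaults (nothing detected) it returns only `cpu`. The `multiplex` flag mirrors the table column: it is set only for `npu`, where a single `flm serve` process covers chat, STT, and embeddings.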

Figure: The Inference Engine slot inventory (here 4 iGPU · 3 FLM), with the shared iGPU GTT carve-out above it.

The NPU path runs FastFlowLM in a container, but the host must provide the XDNA driver and firmware:

  • the amdxdna kernel driver (in-tree on kernel ≥ 6.14, or via the amdxdna-dkms package on kernel ≥ 6.10), and
  • NPU firmware ≥ 1.1.0.0.
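
As a rough sketch, the two host requirements reduce to version comparisons. The helper names below (`parse_version`, `npu_host_ready`) are hypothetical and not part of hal0. The caller supplies the kernel release string, the firmware version string, and whether `amdxdna-dkms` is installed, because the source does not say how hal0 reads them.

```python
import re


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the leading dotted integers, e.g. '6.14.2-arch1' -> (6, 14, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(p) for p in match.group(0).split(".")) if match else ()


def npu_host_ready(kernel: str, firmware: str, has_dkms: bool) -> bool:
    """Check the documented driver and firmware minimums for the NPU path."""
    major_minor = parse_version(kernel)[:2]
    if major_minor >= (6, 14):
        driver_ok = True        # amdxdna is in-tree from 6.14
    elif major_minor >= (6, 10):
        driver_ok = has_dkms    # older kernels need the amdxdna-dkms package
    else:
        driver_ok = False       # below 6.10 neither route is documented
    fw = parse_version(firmware)
    if not fw:
        return False            # an unreadable firmware version fails the check
    fw = fw + (0,) * (4 - len(fw))  # pad '1.1' to (1, 1, 0, 0) before comparing
    return driver_ok and fw >= (1, 1, 0, 0)
```

On Linux, `platform.release()` from the standard library returns a kernel release string suitable for the `kernel` argument. The sketch assumes an in-tree driver on kernel 6.14 or later is actually enabled in the running kernel, which a real check would need to verify separately.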