hardware

Hardware compatibility.

hal0 runs a hardware probe at install time, writes the result to /etc/hal0/hardware.json, and picks the right provider per slot automatically. Vulkan-backed llama.cpp is the universal baseline — it’s what every install gets out of the box. ROCm, CUDA, and the AMD XDNA NPU (FLM) layer on top where the hardware supports them.

  • baseline

    llama.cpp · Vulkan

  • opt-in

    llama.cpp · ROCm / CUDA

  • npu

    FLM · AMD XDNA

  • stt

    Moonshine · CPU / Vulkan

  • tts

    Kokoro · CPU / Vulkan

  • os

    Linux + systemd
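
How the probe result turns into a per-slot provider pick, as a minimal sketch. The field names in hardware.json below are assumptions for illustration, not hal0's actual schema, and the FLM-on-primary mapping is likewise an assumption.

    import json

    # Illustrative selection logic -- field names are assumptions, not hal0's schema.
    def pick_provider(hw: dict, slot: str) -> str:
        if slot in ("stt", "tts"):
            # Moonshine / Kokoro run on CPU, or Vulkan when a GPU is present
            return "vulkan" if hw.get("gpu") else "cpu"
        if slot == "primary" and hw.get("npu") == "xdna":
            return "flm"          # AMD XDNA NPU via FLM (assumed primary-slot pairing)
        if hw.get("rocm"):
            return "rocm"         # opt-in accelerated llama.cpp build
        if hw.get("cuda"):
            return "cuda"
        return "vulkan"           # universal baseline -- every install has it

    with open("/etc/hal0/hardware.json") as f:
        hw = json.load(f)
    print(pick_provider(hw, "primary"))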

tiers

Three tiers, one platform.

First-class is the box we design against. Supported is the path every other modern AMD/NVIDIA host takes. Fallback is the CI smoke target — useful for sanity checks, not the headline experience.

first-class llama.cpp · Vulkan + FLM (NPU)

Strix Halo / Ryzen AI Max+

The reference deployment. APU + XDNA NPU + a single unified memory pool — the iGPU carveout is BIOS-tunable up to ~96 GB on 128 GB SKUs, and the slot lifecycle and FLM provider were written against this hardware first.

Verified on Ryzen AI Max iGPU + Vulkan: Qwen 0.5B at 217–413 tok/s; Phi-3 Mini Q4 at ~71 tok/s with ~280 ms round-trip; concurrent primary + embed at ~258 tok/s with <200 ms dispatch.

  • Ryzen AI Max+ 395 (128 GB)
  • Ryzen AI Max 385 / 390 (64 GB)
Strix Halo deep dive →

supported llama.cpp · ROCm / CUDA

AMD discrete · NVIDIA

Same slot lifecycle, same dispatcher, same API surface — what changes is dedicated VRAM in place of the unified pool. AMD discrete uses the hal0-toolbox-rocm image (pending publish); NVIDIA uses the CUDA-backed llama.cpp build. Trade vs. Strix Halo: tighter context budgets at the same model size, no headroom for STT/TTS alongside a 30B primary.
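
Rough arithmetic behind that claim, under ballpark assumptions (≈4.5 bits per weight for Q4, KV-cache and runtime overheads glossed over):

    # Back-of-envelope fit check: 24 GB discrete card vs. ~96 GB unified pool.
    # Numbers are illustrative assumptions, not measured hal0 figures.
    weights_30b_q4 = 30e9 * 4.5 / 8 / 1e9    # ~17 GB of weights at ~4.5 bits/weight
    print(f"24 GB card headroom: ~{24 - weights_30b_q4:.0f} GB")   # ~7 GB for KV cache
    print(f"96 GB UMA headroom:  ~{96 - weights_30b_q4:.0f} GB")   # room for STT/TTS too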

  • RTX 5090
  • RTX 4090 / 4080
  • RX 7900 XTX / XT
Provider details →

fallback llama.cpp · Vulkan-CPU

CPU-only x86_64

What CI runs on, using Qwen 0.5B (tests/slots/test_integration.py). Usable for tiny models and dev smoke; expect a few tok/s on chat — fine for occasional Q&A, not a streaming experience. stt and tts work in theory but need at least an iGPU for usable latency.

  • ≥32 GB RAM
  • no GPU required
Provider details →

matrix

Compatibility matrix.

Status is the honest answer, not aspirational. Anything not on the v1 CI matrix is marked accordingly.

hal0 v1 hardware support

Hardware | Vendor | Unified / VRAM | Support | Notes
Ryzen AI Max+ 395 (Strix Halo, 128 GB) | AMD | ~96–110 GB UMA | first-class | iGPU + XDNA NPU + unified memory. Reference deployment.
Ryzen AI Max 385 / 390 (Strix Halo, 64 GB) | AMD | ~48 GB UMA | first-class | Same providers as 128 GB; tighter ceiling on large models.
AMD discrete (RX 7900 XTX / XT) | AMD | 16–24 GB | supported | Vulkan works today; ROCm toolbox image pending publish.
NVIDIA RTX 50 / 40 / 30 series | NVIDIA | 10–32 GB | supported | CUDA-backed llama.cpp build. In v1 target list.
CPU-only (x86_64, ≥32 GB RAM) | any | system RAM | supported | Vulkan-CPU path; CI smoke tier. Small models only.
Apple Silicon (M-series) | Apple | UMA | planned | Linux + systemd required for v1. Not in scope.
Intel Arc / Xe | Intel | 8–16 GB | experimental | Vulkan path may work; not on the CI matrix.
Raspberry Pi / ARM SBC | various | system RAM | planned | aarch64 builds not part of v1.

first-class: Reference target. Daily-driven and benchmarked.
supported: In v1 scope, on the CI matrix or close to it.
experimental: May work via the universal Vulkan path; not vetted.
planned: Out of scope for v1; on the roadmap.

the honest answer

What about Apple Silicon, Intel Arc, Raspberry Pi?

Apple Silicon (M1 / M2 / M3 / M4)

Not in v1 scope. hal0 hard-requires Linux + systemd (installer/install.sh:86) — slot lifecycle is built on systemd template units, atomic env file writes, journalctl tailing. macOS doesn’t have systemd, and porting the lifecycle to launchd isn’t on the v1 punch list. If you want hal0 on Apple Silicon today, run it inside an Asahi Linux install or a Linux VM.
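
For a feel of what "built on systemd" means here, a sketch of those three primitives in isolation. The unit name hal0-slot@.service and the env-file path are hypothetical, made up for illustration:

    import os, subprocess, tempfile

    def write_env_atomic(path: str, env: dict) -> None:
        # Atomic env-file write: temp file in the same directory, then rename.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{k}={v}\n" for k, v in env.items())
        os.replace(tmp, path)     # os.replace is atomic on POSIX

    # Hypothetical names -- hal0's real unit template and paths may differ.
    write_env_atomic("/etc/hal0/slots/primary.env",
                     {"MODEL": "phi-3-mini-q4", "PROVIDER": "vulkan"})
    subprocess.run(["systemctl", "start", "hal0-slot@primary.service"], check=True)
    subprocess.run(["journalctl", "-fu", "hal0-slot@primary.service"])   # tail slot logs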

Intel Arc / Xe

Experimental via the universal Vulkan path. Mesa’s ANV driver gives llama.cpp-Vulkan something to talk to on Linux, so the primary and embed slots should light up. But Intel isn’t on the v1 CI matrix and we don’t have a daily-driven Arc box — treat it as "patches welcome, no promises." If you run it, tell us how it goes.

Raspberry Pi / aarch64 / ARM SBCs

Not in v1. The installer is x86_64-only today — aarch64 builds aren’t on the release matrix. llama.cpp itself runs on a Pi 5; bringing hal0 along would mean cross-building the toolbox images for arm64 and validating the lifecycle there. It’s on the wish list, not the roadmap.

WSL2 / Docker Desktop / cloud VMs

WSL2 doesn’t expose systemd by default, and rootless Docker can’t hold the slot units. A real Linux VM (KVM/QEMU, Proxmox, Hyper-V Gen2) works fine; a privileged LXC works fine too — see the hal0-test Strix Halo LXC recipe in the docs.
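
One quick preflight check, mirroring systemd's own sd_booted() convention (hal0's actual installer check may differ):

    import os

    # systemd creates /run/systemd/system/ only when it is the running init.
    if os.path.isdir("/run/systemd/system"):
        print("systemd is running -- slot units have somewhere to live")
    else:
        print("no systemd -- use a real Linux VM or a privileged LXC")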

ready to try it?

One line. Most modern Linux boxes.

The installer probes hardware on first run and picks providers automatically — Vulkan baseline, ROCm/CUDA where applicable, FLM where the NPU is present.