hardware

Hardware compatibility.

hal0 runs a hardware probe at install time, writes the result to /etc/hal0/hardware.json, and picks the right provider per slot automatically. Vulkan-backed llama.cpp is the universal baseline — it’s what every install gets out of the box. ROCm, CUDA, and the AMD XDNA NPU (FLM) layer on top where the hardware supports them.

  • baseline

    llama.cpp · Vulkan

  • opt-in

    llama.cpp · ROCm / CUDA

  • npu

    FLM · AMD XDNA

  • stt

    Moonshine · CPU / Vulkan

  • tts

    Kokoro · CPU / Vulkan

  • os

    Linux + systemd
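
How the probe result turns into a per-slot provider pick, as a minimal sketch. The field names in hardware.json below are assumptions for illustration, not hal0's actual schema, and the FLM-on-primary mapping is likewise an assumption.

    import json

    # Illustrative selection logic -- field names are assumptions, not hal0's schema.
    def pick_provider(hw: dict, slot: str) -> str:
        if slot in ("stt", "tts"):
            # Moonshine / Kokoro run on CPU, or Vulkan when a GPU is present
            return "vulkan" if hw.get("gpu") else "cpu"
        if slot == "primary" and hw.get("npu") == "xdna":
            return "flm"          # AMD XDNA NPU via FLM (assumed primary-slot pairing)
        if hw.get("rocm"):
            return "rocm"         # opt-in accelerated llama.cpp build
        if hw.get("cuda"):
            return "cuda"
        return "vulkan"           # universal baseline -- every install has it

    with open("/etc/hal0/hardware.json") as f:
        hw = json.load(f)
    print(pick_provider(hw, "primary"))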

tiers

Three tiers, one platform.

First-class is the box we design against. Supported is the path every other modern AMD/NVIDIA host takes. Fallback is the CI smoke target — useful for sanity checks, not the headline experience.

first-class llama.cpp · Vulkan + FLM (NPU)

Strix Halo / Ryzen AI Max+

The reference deployment. APU + XDNA NPU + a single unified memory pool — the iGPU carveout is BIOS-tunable up to ~96 GB on 128 GB SKUs, and the slot lifecycle and FLM provider were written against this hardware first.

Verified on Ryzen AI Max iGPU + Vulkan: Qwen 0.5B at 217–413 tok/s; Phi-3 Mini Q4 at ~71 tok/s with ~280 ms round-trip; concurrent primary + embed at ~258 tok/s with <200 ms dispatch.

  • Ryzen AI Max+ 395 (128 GB)
  • Ryzen AI Max 385 / 390 (64 GB)
Strix Halo deep dive →

supported llama.cpp · ROCm / CUDA

AMD discrete · NVIDIA

Same slot lifecycle, same dispatcher, same API surface — what changes is dedicated VRAM in place of the unified pool. AMD discrete uses the hal0-toolbox-rocm image (pending publish); NVIDIA uses the CUDA-backed llama.cpp build. Trade vs. Strix Halo: tighter context budgets at the same model size, no headroom for STT/TTS alongside a 30B primary.
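
Rough arithmetic behind that claim, under ballpark assumptions (≈4.5 bits per weight for Q4, KV-cache and runtime overheads glossed over):

    # Back-of-envelope fit check: 24 GB discrete card vs. ~96 GB unified pool.
    # Numbers are illustrative assumptions, not measured hal0 figures.
    weights_30b_q4 = 30e9 * 4.5 / 8 / 1e9    # ~17 GB of weights at ~4.5 bits/weight
    print(f"24 GB card headroom: ~{24 - weights_30b_q4:.0f} GB")   # ~7 GB for KV cache
    print(f"96 GB UMA headroom:  ~{96 - weights_30b_q4:.0f} GB")   # room for STT/TTS too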

  • RTX 5090
  • RTX 4090 / 4080
  • RX 7900 XTX / XT
Provider details →

fallback llama.cpp · Vulkan-CPU

CPU-only x86_64

What CI runs on, using Qwen 0.5B (tests/slots/test_integration.py). Usable for tiny models and dev smoke; expect a few tok/s on chat — fine for occasional Q&A, not a streaming experience. stt and tts work in theory but need at least an iGPU for usable latency.

  • ≥32 GB RAM
  • no GPU required
Provider details →

matrix

Compatibility matrix.

Status is the honest answer, not aspirational. Anything not on the v1 CI matrix is marked accordingly.

hal0 v1 hardware support

Hardware | Vendor | Unified / VRAM | Support | Notes
Ryzen AI Max+ 395 (Strix Halo, 128 GB) | AMD | ~96–110 GB UMA | first-class | iGPU + XDNA NPU + unified memory. Reference deployment.
Ryzen AI Max 385 / 390 (Strix Halo, 64 GB) | AMD | ~48 GB UMA | first-class | Same providers as 128 GB; tighter ceiling on large models.
AMD discrete (RX 7900 XTX / XT) | AMD | 16–24 GB | supported | Vulkan works today; ROCm toolbox image pending publish.
NVIDIA RTX 50 / 40 / 30 series | NVIDIA | 10–32 GB | supported | CUDA-backed llama.cpp build. In v1 target list.
CPU-only (x86_64, ≥32 GB RAM) | any | system RAM | supported | Vulkan-CPU path; CI smoke tier. Small models only.
Apple Silicon (M-series) | Apple | UMA | planned | Linux + systemd required for v1. Not in scope.
Intel Arc / Xe | Intel | 8–16 GB | experimental | Vulkan path may work; not on the CI matrix.
Raspberry Pi / ARM SBC | various | system RAM | planned | aarch64 builds not part of v1.

first-class: Reference target. Daily-driven and benchmarked.
supported: In v1 scope, on the CI matrix or close to it.
experimental: May work via the universal Vulkan path; not vetted.
planned: Out of scope for v1; on the roadmap.

the honest answer

What about Apple Silicon, Intel Arc, Raspberry Pi?

Apple Silicon (M1 / M2 / M3 / M4)

Not in v1 scope. hal0 hard-requires Linux + systemd (installer/install.sh:86) — slot lifecycle is built on systemd template units, atomic env file writes, journalctl tailing. macOS doesn’t have systemd, and porting the lifecycle to launchd isn’t on the v1 punch list. If you want hal0 on Apple Silicon today, run it inside an Asahi Linux install or a Linux VM.
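
For a feel of what "built on systemd" means here, a sketch of those three primitives in isolation. The unit name hal0-slot@.service and the env-file path are hypothetical, made up for illustration:

    import os, subprocess, tempfile

    def write_env_atomic(path: str, env: dict) -> None:
        # Atomic env-file write: temp file in the same directory, then rename.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{k}={v}\n" for k, v in env.items())
        os.replace(tmp, path)     # os.replace is atomic on POSIX

    # Hypothetical names -- hal0's real unit template and paths may differ.
    write_env_atomic("/etc/hal0/slots/primary.env",
                     {"MODEL": "phi-3-mini-q4", "PROVIDER": "vulkan"})
    subprocess.run(["systemctl", "start", "hal0-slot@primary.service"], check=True)
    subprocess.run(["journalctl", "-fu", "hal0-slot@primary.service"])   # tail slot logs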

Intel Arc / Xe

Experimental via the universal Vulkan path. Mesa’s ANV driver gives llama.cpp-Vulkan something to talk to on Linux, so the primary and embed slots should light up. But Intel isn’t on the v1 CI matrix and we don’t have a daily-driven Arc box — treat it as "patches welcome, no promises." If you run it, tell us how it goes.

Raspberry Pi / aarch64 / ARM SBCs

Not in v1. The installer is x86_64-only today — aarch64 builds aren’t on the release matrix. llama.cpp itself runs on a Pi 5; bringing hal0 along would mean cross-building the toolbox images for arm64 and validating the lifecycle there. It’s on the wish list, not the roadmap.

WSL2 / Docker Desktop / cloud VMs

WSL2 doesn’t expose systemd by default, and rootless Docker can’t hold the slot units. A real Linux VM (KVM/QEMU, Proxmox, Hyper-V Gen2) works fine; a privileged LXC works fine too — see the hal0-test Strix Halo LXC recipe in the docs.
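
One quick preflight check, mirroring systemd's own sd_booted() convention (hal0's actual installer check may differ):

    import os

    # systemd creates /run/systemd/system/ only when it is the running init.
    if os.path.isdir("/run/systemd/system"):
        print("systemd is running -- slot units have somewhere to live")
    else:
        print("no systemd -- use a real Linux VM or a privileged LXC")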

ready to try it?

One line. Most modern Linux boxes.

The installer probes hardware on first run and picks providers automatically — Vulkan baseline, ROCm/CUDA where applicable, FLM where the NPU is present.