Devices, providers & profiles

A hal0 slot is described by three orthogonal facts: a device (the hardware preference), a provider (the inference runtime label), and a profile (a reusable container template — image plus bench-tuned flags). This page is the authoritative reference for the valid values of each and how they connect.

device carries hardware intent only. It is validated at config-load time — a typo raises a ValidationError with the field path.

| device | Hardware | Default provider |
| --- | --- | --- |
| gpu-rocm | AMD GPU via ROCm | llama-server |
| gpu-vulkan | Any GPU via Vulkan | llama-server |
| cpu | CPU-only fallback | llama-server |
| npu | AMD XDNA NPU via FastFlowLM | flm |
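
The load-time check on device can be pictured with a small sketch. This is not hal0's implementation: the `ValidationError` class, the `parse_device` helper, and the `slot.device` field path are hypothetical stand-ins, and only the device values and default providers come from the table above.

```python
from enum import Enum


class Device(str, Enum):
    """The four device values listed in the table above."""
    GPU_ROCM = "gpu-rocm"
    GPU_VULKAN = "gpu-vulkan"
    CPU = "cpu"
    NPU = "npu"


# Default provider per device, copied from the table above.
DEFAULT_PROVIDER = {
    Device.GPU_ROCM: "llama-server",
    Device.GPU_VULKAN: "llama-server",
    Device.CPU: "llama-server",
    Device.NPU: "flm",
}


class ValidationError(ValueError):
    """Hypothetical stand-in for the error raised at config load."""

    def __init__(self, field_path: str, message: str) -> None:
        super().__init__(f"{field_path}: {message}")
        self.field_path = field_path


def parse_device(raw: str, field_path: str = "slot.device") -> Device:
    """Return the matching Device, or raise with the offending field path."""
    try:
        return Device(raw)
    except ValueError:
        allowed = ", ".join(d.value for d in Device)
        raise ValidationError(
            field_path, f"unknown device {raw!r}; expected one of: {allowed}"
        ) from None
```

With this sketch, `parse_device("gpu-rcom")` fails immediately and names the field, instead of letting the typo surface later when the slot starts.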

gpu-rocm is the package-level default device constant, but the hardware recommender actively steers Strix Halo (unified memory) installs to gpu-vulkan for broader compatibility. The recommender may also select cpu or npu on hosts without ROCm support.

provider is the runtime label round-tripped on each slot. Every slot runs as a podman container regardless of provider; the field exists for UI labels and backwards compatibility.

| provider | Runtime | Typical slot types |
| --- | --- | --- |
| llama-server | llama.cpp server (ROCm / Vulkan / CPU) | chat, embed, rerank |
| flm | FastFlowLM on the NPU | chat / stt / embed (one process, trio) |
| kokoro | Kokoro-82M text-to-speech (ONNX) | tts |
| comfyui | ComfyUI image generation | image |

Profiles live in /etc/hal0/profiles.toml. When that file is absent, hal0 serves the built-in seed catalog below, so GET /api/profiles is always populated on a fresh install. A profile supplies the container image, the bench-tuned flag bundle, and an optional MTP toggle; the slot supplies the model, context size, and port.

| Profile | device_class | Image | MTP | Intent |
| --- | --- | --- | --- | --- |
| rocm | gpu | ghcr.io/hal0ai/amd-strix-halo-toolboxes:rocm-7.2.4-rocmfp4-server | off | MoE agents · q8 KV |
| rocm-dense | gpu | ghcr.io/hal0ai/amd-strix-halo-toolboxes:rocm-7.2.4-rocmfp4-server | on | Dense + MTP · q4 KV |
| rocm-moe | gpu | ghcr.io/hal0ai/amd-strix-halo-toolboxes:rocm-7.2.4-rocmfp4-server | on | MoE + MTP · q4 KV |
| vulkan | gpu | ghcr.io/hal0ai/amd-strix-halo-toolboxes:vulkan-radv-server | off | Vulkan std · fallback |
| flm | npu | ghcr.io/hal0ai/hal0-toolbox-flm:0.9.43 | off | FLM NPU inference |
| tts | cpu | ghcr.io/hal0ai/hal0-toolbox-kokoro:v1 | off | TTS · Kokoro |
| comfyui | img | docker.io/kyuz0/amd-strix-halo-comfyui@sha256:0066678ae9043f69… | off | Image generation |

Figure: Profiles tab in the hal0 dashboard. Seed profile cards display device class, quantisation, and bench throughput (tokens/sec or RTF).

A slot's mtp field (true / false / unset-to-inherit) overrides the profile default. When the effective MTP setting is on (the slot sets mtp = true, or the slot leaves mtp unset and the profile has mtp = true), the multi-token-prediction draft-speculation flag bundle is appended after the profile flags at resolve time.
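
The override rule can be written out as a short sketch. The `Profile` and `Slot` classes and the contents of `MTP_FLAGS` are hypothetical: the real flag bundle is not listed on this page, so a placeholder string stands in for it.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder only; the actual MTP flag bundle is not documented here.
MTP_FLAGS = ["--example-mtp-flag"]


@dataclass
class Profile:
    flags: list[str]
    mtp: bool = False


@dataclass
class Slot:
    mtp: Optional[bool] = None  # None means "inherit from the profile"


def effective_mtp(profile: Profile, slot: Slot) -> bool:
    """An explicit slot value wins; otherwise use the profile default."""
    return profile.mtp if slot.mtp is None else slot.mtp


def resolve_flags(profile: Profile, slot: Slot) -> list[str]:
    """Profile flags first, then the MTP bundle when MTP is effectively on."""
    flags = list(profile.flags)
    if effective_mtp(profile, slot):
        flags.extend(MTP_FLAGS)
    return flags
```

Modelling the slot field as `Optional[bool]` keeps "unset" distinct from an explicit `false`, which is what lets a slot switch MTP off on a profile that enables it.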

The create-slot device picker and legacy-slot migration use this map to choose a starting profile per device class:

| device | Default profile |
| --- | --- |
| gpu-rocm | rocm |
| gpu-vulkan | vulkan |
| cpu | tts |
| npu | flm |
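
As a rough illustration of how such a map might be applied during legacy-slot migration, the sketch below keeps an explicit profile and otherwise falls back to the device default. The `starting_profile` helper and the dictionary shape of a slot are assumptions; only the mapping itself comes from the table above.

```python
# Device -> default profile, copied from the table above.
DEFAULT_PROFILE = {
    "gpu-rocm": "rocm",
    "gpu-vulkan": "vulkan",
    "cpu": "tts",
    "npu": "flm",
}


def starting_profile(slot: dict) -> str:
    """Keep an explicit profile; otherwise use the default for the slot's device."""
    return slot.get("profile") or DEFAULT_PROFILE[slot["device"]]
```

A legacy slot such as `{"device": "npu"}` would start on the flm profile, while a slot that already names a profile is left unchanged.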

Toolbox images are pinned per hal0 release in manifest.json (by short name → canonical ref plus a sha256 digest). They are public on ghcr.io/hal0ai/; the installer pulls them anonymously. An empty digest means the image is unpublished for that release and the runtime pulls by tag with a warning.
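
The pinning rule described above can be sketched as a small resolver. The entry keys `ref` and `digest`, the `sha256:<hex>` digest format, and the `pull_ref` helper are assumptions about manifest.json, not its documented schema.

```python
import warnings


def pull_ref(entry: dict) -> str:
    """Return the reference to pull for one manifest entry (assumed shape)."""
    ref = entry["ref"]
    digest = entry.get("digest", "")
    if not digest:
        # Unpublished for this release: fall back to the tag and warn.
        warnings.warn(f"no digest pinned for {ref}; pulling by tag", stacklevel=2)
        return ref
    # Strip a trailing tag (a ':' in the last path segment) and pin by digest.
    last_segment = ref.rsplit("/", 1)[-1]
    repo = ref.rsplit(":", 1)[0] if ":" in last_segment else ref
    return f"{repo}@{digest}"
```

Pulling by `repo@digest` means the container runtime verifies the exact image content, whereas a tag can be moved to a different image after the release is cut.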

| Short name | Image ref (current release) | Notes |
| --- | --- | --- |
| vulkan | ghcr.io/hal0ai/amd-strix-halo-toolboxes:vulkan-radv-server | llama.cpp Vulkan backend |
| rocm | ghcr.io/hal0ai/amd-strix-halo-toolboxes:rocm-7.2.4-rocmfp4-server | llama.cpp ROCm backend |
| flm | ghcr.io/hal0ai/hal0-toolbox-flm:0.9.43 | FastFlowLM on AMD XDNA2 NPU |
| kokoro | ghcr.io/hal0ai/hal0-toolbox-kokoro:v1 | Kokoro-82M TTS (CPU ONNX) |
| comfyui | docker.io/kyuz0/amd-strix-halo-comfyui@sha256:0066678ae9043f69a1c8c7699e70626ceffd35c1a8ca03227a05640ad0241ed2 | ComfyUI image generation |

  • Hardware matrix: what runs on each device and the backend availability order.
  • Config schema: the full slot TOML, profile, and hal0.toml field reference.
  • Paths & files: where profiles.toml, slot TOMLs, and manifest.json live.