Devices, providers & profiles

A hal0 slot is described by three orthogonal facts: a device (the hardware preference), a provider (the inference runtime label), and a profile (a reusable container template — image plus bench-tuned flags). This page is the authoritative reference for the valid values of each and how they connect.

Devices

device carries hardware intent only. It is validated at config-load time — a typo raises a ValidationError with the field path.

`device`	Hardware	Default provider
`gpu-rocm`	AMD GPU via ROCm	`llama-server`
`gpu-vulkan`	Any GPU via Vulkan	`llama-server`
`cpu`	CPU-only fallback	`llama-server`
`npu`	AMD XDNA NPU via FastFlowLM	`flm`

gpu-rocm is the package-level default device constant, but the hardware recommender actively steers Strix Halo (unified memory) installs to gpu-vulkan for broader compatibility. The recommender may also select cpu or npu on hosts without ROCm support.
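The validation and default-provider lookup described above can be sketched in a few lines. This is a minimal illustration, not hal0's implementation: the names `Device`, `DEFAULT_PROVIDER`, and `parse_device`, and the `slot.device` field path, are hypothetical, and a plain `ValueError` stands in for the `ValidationError` that hal0 raises.

```python
from enum import Enum


class Device(str, Enum):
    """The four valid `device` values from the table above."""
    GPU_ROCM = "gpu-rocm"
    GPU_VULKAN = "gpu-vulkan"
    CPU = "cpu"
    NPU = "npu"


# Default provider per device, copied from the table above.
DEFAULT_PROVIDER = {
    Device.GPU_ROCM: "llama-server",
    Device.GPU_VULKAN: "llama-server",
    Device.CPU: "llama-server",
    Device.NPU: "flm",
}


def parse_device(value: str, field_path: str = "slot.device") -> Device:
    """Return the Device for `value`, or raise ValueError naming the field path.

    ValueError is a stand-in; hal0 itself reports a ValidationError.
    """
    try:
        return Device(value)
    except ValueError:
        allowed = ", ".join(d.value for d in Device)
        raise ValueError(
            f"{field_path}: invalid device {value!r}; expected one of: {allowed}"
        ) from None
```

With this sketch, `parse_device("gpu-rcom")` fails with a message that starts with the field path, which mirrors the behaviour a typo triggers at config-load time.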

Providers

provider is the runtime label round-tripped on each slot. Every slot runs as a podman container regardless of provider; the field exists for UI labels and backwards compatibility.

`provider`	Runtime	Typical slot types
`llama-server`	llama.cpp server (ROCm / Vulkan / CPU)	chat, embed, rerank
`flm`	FastFlowLM on the NPU	chat / stt / embed (one process, trio)
`kokoro`	Kokoro-82M text-to-speech (ONNX)	tts
`comfyui`	ComfyUI image generation	image

Seed profiles

Profiles live in /etc/hal0/profiles.toml. When that file is absent, hal0 serves the built-in seed catalog below, so GET /api/profiles is always populated on a fresh install. A profile supplies the container image, the bench-tuned flag bundle, and an optional MTP toggle; the slot supplies the model, context size, and port.

Profile	`device_class`	Image	MTP	Intent
`rocm`	gpu	`ghcr.io/hal0ai/amd-strix-halo-toolboxes:rocm-7.2.4-rocmfp4-server`	off	MoE agents · q8 KV
`rocm-dense`	gpu	`ghcr.io/hal0ai/amd-strix-halo-toolboxes:rocm-7.2.4-rocmfp4-server`	on	Dense + MTP · q4 KV
`rocm-moe`	gpu	`ghcr.io/hal0ai/amd-strix-halo-toolboxes:rocm-7.2.4-rocmfp4-server`	on	MoE + MTP · q4 KV
`vulkan`	gpu	`ghcr.io/hal0ai/amd-strix-halo-toolboxes:vulkan-radv-server`	off	Vulkan std · fallback
`flm`	npu	`ghcr.io/hal0ai/hal0-toolbox-flm:0.9.43`	off	FLM NPU inference
`tts`	cpu	`ghcr.io/hal0ai/hal0-toolbox-kokoro:v1`	off	TTS · Kokoro
`comfyui`	img	`docker.io/kyuz0/amd-strix-halo-comfyui@sha256:0066678ae9043f69…`	off	Image generation

Figure: Profiles tab in the hal0 dashboard. Seed profile cards display device class, intent, quantisation, and bench throughput (tokens/sec or RTF).

MTP flag bundle

The multi-token-prediction (MTP) draft-speculation flag bundle is appended after the profile flags at resolve time whenever the slot's effective MTP setting is on. The effective setting comes from the slot's mtp field when it is set to true or false; when the field is unset, the slot inherits the profile's mtp default.
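The three-state override can be illustrated with a short sketch. The function names and the single entry in `MTP_FLAG_BUNDLE` are placeholders: this page does not list the real flags, so the value below is deliberately fake.

```python
from typing import Optional

# Placeholder only: the real MTP draft-speculation flags are not listed on this page.
MTP_FLAG_BUNDLE = ["--example-mtp-flag"]


def effective_mtp(profile_mtp: bool, slot_mtp: Optional[bool]) -> bool:
    """Slot value wins when set; None means inherit the profile default."""
    return profile_mtp if slot_mtp is None else slot_mtp


def resolve_flags(
    profile_flags: list[str],
    profile_mtp: bool,
    slot_mtp: Optional[bool] = None,
) -> list[str]:
    """Return the profile flags, with the MTP bundle appended when MTP is effectively on."""
    flags = list(profile_flags)  # copy so the profile's own list is never mutated
    if effective_mtp(profile_mtp, slot_mtp):
        flags.extend(MTP_FLAG_BUNDLE)
    return flags
```

Under these assumptions, a slot with `mtp = false` on an MTP-enabled profile resolves to the bare profile flags, while an unset slot field on the same profile gets the bundle appended at the end.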

Device-to-default-profile map

The create-slot device picker and legacy-slot migration use this map to choose a starting profile per device class:

`device`	Default profile
`gpu-rocm`	`rocm`
`gpu-vulkan`	`vulkan`
`cpu`	`tts`
`npu`	`flm`
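As a rough sketch, the map above reduces to a dictionary lookup. The helper `starting_profile` is hypothetical, and the rule that a slot with an explicit profile keeps it during migration is an assumption, not something this page states.

```python
from typing import Optional

# Device-to-default-profile map, copied from the table above.
DEFAULT_PROFILE = {
    "gpu-rocm": "rocm",
    "gpu-vulkan": "vulkan",
    "cpu": "tts",
    "npu": "flm",
}


def starting_profile(device: str, current_profile: Optional[str] = None) -> str:
    """Keep an explicit profile if one is set; otherwise use the device default."""
    if current_profile:
        return current_profile
    return DEFAULT_PROFILE[device]  # raises KeyError for an unknown device
```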

Toolbox images

Toolbox images are pinned per hal0 release in manifest.json, which maps each short name to a canonical ref plus a sha256 digest. Images under ghcr.io/hal0ai/ are public, and the installer pulls them anonymously. An empty digest means the image is unpublished for that release; in that case the runtime pulls by tag and logs a warning.

Short name	Image ref (current release)	Notes
`vulkan`	`ghcr.io/hal0ai/amd-strix-halo-toolboxes:vulkan-radv-server`	llama.cpp Vulkan backend
`rocm`	`ghcr.io/hal0ai/amd-strix-halo-toolboxes:rocm-7.2.4-rocmfp4-server`	llama.cpp ROCm backend
`flm`	`ghcr.io/hal0ai/hal0-toolbox-flm:0.9.43`	FastFlowLM on AMD XDNA2 NPU
`kokoro`	`ghcr.io/hal0ai/hal0-toolbox-kokoro:v1`	Kokoro-82M TTS (CPU ONNX)
`comfyui`	`docker.io/kyuz0/amd-strix-halo-comfyui@sha256:0066678ae9043f69a1c8c7699e70626ceffd35c1a8ca03227a05640ad0241ed2`	ComfyUI image generation
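The pinning rule can be sketched as follows. The manifest shape is assumed, not documented here: each short name is taken to map to a dict with hypothetical `ref` and `digest` keys, and the digest is taken to be a string of the form `sha256:<hex>`. That the runtime pulls by digest when one is present is inferred from the empty-digest rule rather than stated outright.

```python
import warnings


def pull_ref(short_name: str, manifest: dict) -> str:
    """Return the reference to pull: digest-pinned when possible, else the tag."""
    entry = manifest[short_name]
    ref = entry["ref"]
    digest = entry.get("digest", "")

    if "@" in ref:
        return ref  # the ref is already pinned by digest

    if not digest:
        warnings.warn(f"{short_name}: no digest pinned, pulling by tag: {ref}")
        return ref

    # Strip the tag, but only if the colon is in the last path segment
    # (a colon earlier in the ref would be a registry port).
    last_segment = ref.rsplit("/", 1)[-1]
    repo = ref.rsplit(":", 1)[0] if ":" in last_segment else ref
    return f"{repo}@{digest}"
```

For example, an entry with ref `ghcr.io/hal0ai/hal0-toolbox-flm:0.9.43` and a non-empty digest resolves to `ghcr.io/hal0ai/hal0-toolbox-flm@sha256:...`, while the same entry with an empty digest resolves to the tagged ref and emits a warning.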