vLLM-Omni MetaX

vllm-omni-metax is the MetaX adapter plugin for vllm-omni. It reuses the existing vllm-metax hardware backend and adds the glue required for Omni's multi-stage multimodal runtime.

In practice, the responsibilities are split as follows:

| Component | Responsibility |
| --- | --- |
| `vllm-metax` | MetaX hardware platform, kernels, runtime integration |
| `vllm-omni` | Multi-stage multimodal inference pipeline |
| `vllm-omni-metax` | Omni platform plugin that bridges the two |

Why this project exists

vllm-omni ships with GPU-oriented execution paths, but MetaX enablement should remain the responsibility of the MetaX backend rather than being maintained as a source fork of vllm-omni. This repository keeps that boundary clear:

It does not implement a new hardware backend.
It detects whether a usable MetaX runtime is present.
It registers an out-of-tree Omni platform plugin only when activation is appropriate.
It applies a focused runtime patch only where upstream CUDA-only assumptions block MetaX execution in vllm-omni 0.20.0.
It mirrors Omni's stage device selection to the MetaX runtime environment.
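
The activation behavior listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: the import name `vllm_metax`, the platform class path, and both environment variable names are hypothetical placeholders, and checking that a package is importable is only a weak stand-in for detecting a usable MetaX runtime.

```python
import importlib.util
from typing import MutableMapping, Optional

# All names below are hypothetical and used only for illustration.
BACKEND_MODULE = "vllm_metax"  # assumed import name of the vllm-metax backend
PLATFORM_CLASS = "vllm_omni_metax.platform.ExamplePlatform"  # placeholder path
SOURCE_DEVICE_VAR = "CUDA_VISIBLE_DEVICES"  # assumed to carry a stage's device list
TARGET_DEVICE_VAR = "EXAMPLE_METAX_VISIBLE_DEVICES"  # placeholder, not a real variable


def backend_available(module_name: str = BACKEND_MODULE) -> bool:
    """Return True if the top-level backend package can be found."""
    return importlib.util.find_spec(module_name) is not None


def register(module_name: str = BACKEND_MODULE) -> Optional[str]:
    """Return the platform class path, or None to leave the host untouched."""
    if not backend_available(module_name):
        return None
    return PLATFORM_CLASS


def mirror_stage_devices(
    env: MutableMapping[str, str],
    source: str = SOURCE_DEVICE_VAR,
    target: str = TARGET_DEVICE_VAR,
) -> None:
    """Copy the stage's device selection to the variable the backend reads."""
    value = env.get(source)
    if value is not None:
        env[target] = value
```

Returning `None` when the backend is missing keeps the plugin inert on machines without MetaX hardware, which matches the "only when activation is appropriate" behavior described above.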

Recommended workflow

If you are deploying on MetaX hardware, the recommended order is:

1. Prepare a working vllm-metax environment.
2. Install vllm-omni.
3. Install vllm-omni-metax.
4. Start the usual vllm-omni workflow and let this plugin activate automatically.
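
Before starting the workflow, it can help to confirm that all three packages are visible to the same Python environment. The check below is a small sketch that assumes the installed distribution names match the project names used in this document (`vllm-metax`, `vllm-omni`, `vllm-omni-metax`); adjust them if your installation uses different names.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Assumption: distribution names match the project names in this document.
REQUIRED = ("vllm-metax", "vllm-omni", "vllm-omni-metax")


def installed_versions(names: Iterable[str] = REQUIRED) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

A missing entry points to the step in the list above that still needs to be completed.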

Design goals

Reuse the existing MetaX platform implementation from vllm-metax.
Keep compatibility fixes small, explicit, and runtime-scoped instead of carrying a source fork of vllm-omni.
Keep operational behavior predictable for users already familiar with vllm-metax.
Make debugging explicit with environment-controlled enable and disable switches.
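
As an illustration of the last two goals, the sketch below shows one way an environment switch and a runtime-scoped patch could fit together. The variable name `EXAMPLE_OMNI_METAX_PLUGIN` and the helper functions are hypothetical and are not the plugin's documented interface. The tri-state design (force on, force off, or fall back to auto-detection) is also an assumption.

```python
import os
from typing import Any, Mapping, Optional

SWITCH_VAR = "EXAMPLE_OMNI_METAX_PLUGIN"  # hypothetical variable name

_TRUE = {"1", "true", "on", "yes"}
_FALSE = {"0", "false", "off", "no"}


def read_switch(
    env: Mapping[str, str] = os.environ, name: str = SWITCH_VAR
) -> Optional[bool]:
    """Return True/False when the switch is set, or None when it is unset."""
    raw = env.get(name)
    if raw is None:
        return None
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{name} must be one of {sorted(_TRUE | _FALSE)}, got {raw!r}")


def should_activate(
    runtime_detected: bool, env: Mapping[str, str] = os.environ
) -> bool:
    """An explicit switch wins; otherwise follow runtime detection."""
    forced = read_switch(env)
    return runtime_detected if forced is None else forced


def apply_patch(target: Any, attribute: str, replacement: Any, active: bool) -> bool:
    """Replace one attribute at runtime, only if active and the attribute exists."""
    if not active or not hasattr(target, attribute):
        return False
    setattr(target, attribute, replacement)
    return True
```

Rejecting unrecognized switch values with an error, rather than silently treating them as "off", keeps misconfiguration visible during debugging.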