vLLM-Omni MetaX

vllm-omni-metax is the MetaX adapter plugin for
vllm-omni. It reuses the
existing vllm-metax hardware
backend and adds the glue required for Omni's multi-stage multimodal runtime.
In practice, the responsibilities are split as follows:
| Component | Responsibility |
|---|---|
| vllm-metax | MetaX hardware platform, kernels, runtime integration |
| vllm-omni | Multi-stage multimodal inference pipeline |
| vllm-omni-metax | Omni platform plugin that bridges the two |
Why this project exists
vllm-omni ships with GPU-oriented execution paths, but MetaX enablement should
stay owned by the MetaX backend instead of being maintained as an in-tree fork.
This repository keeps that boundary clear:
- It does not reimplement a new hardware backend.
- It detects whether a usable MetaX runtime is present.
- It registers an out-of-tree Omni platform plugin only when activation is appropriate.
- It applies a focused runtime patch only where upstream CUDA-only assumptions
  block MetaX execution in vllm-omni 0.20.0.
- It mirrors Omni's stage device selection to the MetaX runtime environment.
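
The detection and conditional registration described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the environment variable `EXAMPLE_OMNI_METAX_ENABLE`, the class path, and the `runtime_available` probe are all hypothetical names. It borrows the vLLM platform-plugin convention in which a registration function returns a platform class path, or `None` to stay inactive; whether this plugin follows that convention exactly is an assumption.

```python
from typing import Callable, Mapping, Optional

# Hypothetical names for illustration only; not taken from vllm-omni-metax.
ENABLE_VAR = "EXAMPLE_OMNI_METAX_ENABLE"
PLATFORM_CLASS = "example_pkg.platform.ExampleMetaXOmniPlatform"


def parse_switch(value: Optional[str]) -> Optional[bool]:
    """Return True/False for an explicit setting, None when unset or unrecognized."""
    if value is None:
        return None
    text = value.strip().lower()
    if text in {"1", "true", "yes", "on"}:
        return True
    if text in {"0", "false", "no", "off"}:
        return False
    return None


def register(
    env: Mapping[str, str],
    runtime_available: Callable[[], bool],
) -> Optional[str]:
    """Return the platform class path when activation is appropriate, else None."""
    forced = parse_switch(env.get(ENABLE_VAR))
    if forced is False:
        return None            # explicit disable always wins
    if forced is True:
        return PLATFORM_CLASS  # explicit enable skips detection (assumed debugging aid)
    # No explicit switch: activate only if a usable runtime is detected.
    return PLATFORM_CLASS if runtime_available() else None
```

In this sketch, calling `register(os.environ, probe)` with a real probe function would activate the platform only on machines where the probe succeeds, while the switch lets a user force either outcome when debugging.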
Recommended workflow
If you are deploying on MetaX hardware, the recommended order is:
- Prepare a working vllm-metax environment.
- Install vllm-omni.
- Install vllm-omni-metax.
- Start the usual vllm-omni workflow and let this plugin activate automatically.
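
Before starting the workflow, it can help to confirm that the three packages from the steps above are present. The following is a small sketch using only the Python standard library. It assumes the installed distribution names match the project names used in this document, which may not hold for every release or packaging channel.

```python
from importlib import metadata
from typing import Callable, Iterable, Optional

# Assumption: distribution names match the project names used in this document.
REQUIRED_IN_ORDER = ["vllm-metax", "vllm-omni", "vllm-omni-metax"]


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def first_missing(
    names: Iterable[str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Optional[str]:
    """Return the first name, in the given order, that is not installed."""
    for name in names:
        if lookup(name) is None:
            return name
    return None


if __name__ == "__main__":
    missing = first_missing(REQUIRED_IN_ORDER)
    if missing is None:
        print("All three packages are installed.")
    else:
        print(f"Install {missing} before continuing.")
```

Because the list is checked in the recommended order, the first name reported is also the next step to complete.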
Design goals
- Reuse the existing MetaX platform implementation from vllm-metax.
- Keep compatibility fixes small, explicit, and runtime-scoped instead of
  carrying a source fork of vllm-omni.
- Keep operational behavior predictable for users already familiar with vllm-metax.
- Make debugging explicit with environment-controlled enable and disable switches.
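
As an illustration of what a small, runtime-scoped compatibility fix can look like in general, the sketch below replaces a single attribute on a target object at runtime and refuses to apply twice. It is not the patch this plugin ships: the marker name, the helper, and the stand-in target in the usage note are all hypothetical.

```python
from typing import Any, Callable

# Hypothetical marker attribute used to make the patch idempotent.
_PATCH_MARKER = "_example_runtime_patched"


def apply_patch_once(
    target: Any,
    attr: str,
    make_replacement: Callable[[Callable[..., Any]], Callable[..., Any]],
) -> bool:
    """Swap target.attr for a replacement built from the original callable.

    Returns True if the patch was applied, False if it was already in place.
    """
    original = getattr(target, attr)
    if getattr(original, _PATCH_MARKER, False):
        return False  # already patched; leave it alone
    replacement = make_replacement(original)
    setattr(replacement, _PATCH_MARKER, True)
    setattr(target, attr, replacement)
    return True
```

For example, given a stand-in object such as `types.SimpleNamespace(pick_device=lambda: "cuda")`, calling `apply_patch_once(obj, "pick_device", lambda orig: (lambda: "example-device"))` changes only that one function and leaves everything else on the object untouched, which is the property that makes this kind of fix easier to audit than a source fork.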