I currently use a Gateway API implementation with the inference extension to perform functionality similar to the vLLM router. I would like to use vllm-stack, but with my current router implementation. Will you consider integrating with the inference extension as a supported router implementation?
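For context, a setup like this typically routes traffic through an HTTPRoute whose backendRef points at the inference extension's pool resource rather than a plain Service. The sketch below is illustrative only; the group/kind used for the inference extension backend and the resource names (`inference-gateway`, `vllm-pool`) are assumptions, not taken from this thread.

```yaml
# Illustrative sketch of the kind of setup described above (names are assumptions).
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
spec:
  parentRefs:
    - name: inference-gateway                    # assumed Gateway name
  rules:
    - backendRefs:
        - group: inference.networking.x-k8s.io   # inference extension API group (assumed)
          kind: InferencePool                    # pool of vLLM model-server pods (assumed kind)
          name: vllm-pool
```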
danehans changed the title from "Pluggable router Implementations" to "Pluggable Router Implementations" on Jan 29, 2025
Thanks for asking!
We don't have an immediate plan to integrate with the inference extension API, but we will absolutely consider it as this project grows.
An alternative is to disable the router and connect the Gateway API directly to the vLLM service. (Note that the Helm chart currently does not support disabling the router via values.yaml, so feel free to create an issue for that if you'd like, and we can work on it.)
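For reference, bypassing the router could look roughly like the following: the gateway's HTTPRoute sends traffic straight to the vLLM serving engine Service created by the chart. The Service name (`vllm-engine-service`) and port (8000, vLLM's default) are assumptions; check the names your release actually creates.

```yaml
# Rough sketch: route gateway traffic directly to the vLLM engine Service,
# skipping the stack's router. Service name and port are assumptions.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: vllm-direct
spec:
  parentRefs:
    - name: my-gateway               # your existing Gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /v1               # OpenAI-compatible API prefix served by vLLM
      backendRefs:
        - name: vllm-engine-service  # assumed name of the chart's engine Service
          port: 8000                 # vLLM's default serving port
```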
@ApostaC thanks for the feedback. I created #66 to track disabling the router via Helm values. I would like to keep this issue open so production-stack can consider supporting the Gateway API with the inference extension.
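A values.yaml toggle for this might look something like the sketch below; the key name is purely hypothetical until #66 is resolved.

```yaml
# Hypothetical values.yaml snippet for #66 -- the actual key name is not yet decided.
routerSpec:
  enableRouter: false   # would skip rendering the router Deployment/Service
```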