vllm
vLLM plugin for Stained Glass Output Protection.
The plugin consists of several major components, each in its own module. See each module for more details.
The vLLM plugin system allows patching the components in any vLLM process running the `vllm.LLMEngine` (i.e. all of the runner processes), but does not currently support patching the main vLLM process that runs the OpenAI-compatible RESTful API server. An alternative entrypoint is therefore required for launching a vLLM OpenAI-compatible RESTful API server with Stained Glass Output Protection enabled.
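To see which plugins a given environment exposes to vLLM, the installed entry points can be listed with the standard library. This is a minimal sketch, not part of this package: it assumes the plugin is registered under vLLM's general plugin entry point group, `vllm.general_plugins`, which this page does not confirm. The helper name `list_plugin_entry_points` is hypothetical.

```python
from __future__ import annotations

from importlib import metadata


def list_plugin_entry_points(group: str = "vllm.general_plugins") -> list[str]:
    """Return 'name = value' strings for every entry point registered in `group`.

    The default group name is an assumption about how the plugin is registered.
    """
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: EntryPoints object with a select() method.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: a dict keyed by group name.
        selected = eps.get(group, [])
    return sorted(f"{ep.name} = {ep.value}" for ep in selected)


if __name__ == "__main__":
    for line in list_plugin_entry_points():
        print(line)
```

An empty result only means that nothing is registered under that group in the current environment. It says nothing about which entrypoint the server was launched with.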
Warning
vLLM will not run properly if the `stainedglass_output_protection` package is installed (i.e. the plugin is installed), but vLLM is not launched via the alternative entrypoint. Likewise, if the plugin is not installed, but the alternative entrypoint is invoked, vLLM will also not run properly.
Warning
Under almost no circumstances should you need to import this package directly. If `stainedglass_output_protection` is installed, and you launch vLLM via the alternative entrypoint, this module will be automatically applied.
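The two failure modes in the first warning can be caught before launch with a small pre-flight check. The sketch below is illustrative only: the function names are hypothetical, and whether the alternative entrypoint is in use must be supplied by the caller, since this page does not describe a way to detect it. `importlib.util.find_spec` locates a top-level package without importing it, which is consistent with the advice not to import this package directly.

```python
from __future__ import annotations

import importlib.util

PLUGIN_PACKAGE = "stainedglass_output_protection"


def plugin_installed(package: str = PLUGIN_PACKAGE) -> bool:
    """Report whether `package` is importable, without importing it."""
    return importlib.util.find_spec(package) is not None


def launch_problem(installed: bool, via_alternative_entrypoint: bool) -> str | None:
    """Return a description of the mismatch, or None if the combination is consistent."""
    if installed and not via_alternative_entrypoint:
        return "plugin is installed but vLLM is not launched via the alternative entrypoint"
    if via_alternative_entrypoint and not installed:
        return "alternative entrypoint is invoked but the plugin is not installed"
    return None


if __name__ == "__main__":
    # The caller states how it intends to launch vLLM (hypothetical flag).
    problem = launch_problem(plugin_installed(), via_alternative_entrypoint=True)
    print(problem or "plugin installation and entrypoint are consistent")
```

Both consistent combinations (plugin plus alternative entrypoint, or neither) return `None`.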
Modules:
Name | Description |
---|---|
entrypoint | Alternative entrypoint for launching a vLLM OpenAI-compatible RESTful API server with Stained Glass Output Protection enabled. |
middleware | Middleware for the vLLM OpenAI-compatible RESTful API server that reads a user-provided public key from the request headers, registers |
registry | User key registry for Stained Glass Output Protection in vLLM, shared across all vLLM processes. |
request_output | Patched vLLM |
server_keys | Utilities for managing ephemeral server keys in a FastAPI application. |