middleware

Middleware for the vLLM OpenAI-compatible RESTful API server. It reads a user-provided public key from the request headers, registers it with a local key registry for the duration of the request, adds the server's public key to the response headers, and clears the client key from the registry after the response is sent.

Warning

Under almost no circumstances should you need to import this module directly. If stainedglass_output_protection is installed, and you launch vLLM via the alternative entrypoint, this module will be automatically applied.

Classes:

Name Description
UserPublicKeyMiddleware

Middleware that reads a user-provided public key from a completions endpoint POST request and uses it to generate a shared secret.

Functions:

Name Description
inject_middleware

Add the UserPublicKeyMiddleware to a function that builds a vLLM OpenAI-compatible FastAPI application.

UserPublicKeyMiddleware

Bases: BaseHTTPMiddleware

Middleware that reads a user-provided public key from a completions endpoint POST request and uses it to generate a shared secret.

Methods:

Name Description
dispatch

Intercept the user-provided public key from headers, generate the shared secret, then register that shared key.

dispatch async

dispatch(
    request: Request, call_next: RequestResponseEndpoint
) -> fastapi.Response

Intercept the user-provided public key from headers, generate the shared secret, then register that shared key.

Note

The user must provide the public key in the x-client-public-key header. It must be an X25519 public key, base64 encoded.

Note

The response will have an x-server-public-key header containing the public key that the Inference Server used to encrypt the response. The client can combine this public key with its own private key to derive the shared secret and decrypt the text in the response. This public key is an X25519 public key, base64 encoded.
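As a rough illustration of the header format described in the notes above, the sketch below encodes and validates a base64 header value using only the standard library. The helper names are hypothetical and not part of this module. It assumes the standard base64 alphabet, which the notes do not specify, and relies on the fact that a raw X25519 public key is 32 bytes long.

```python
import base64
import binascii

X25519_KEY_LENGTH = 32  # a raw X25519 public key is always 32 bytes


def encode_public_key_header(raw_key: bytes) -> str:
    """Encode raw public key bytes as a base64 header value (hypothetical helper)."""
    return base64.b64encode(raw_key).decode("ascii")


def decode_public_key_header(header_value: str) -> bytes:
    """Decode a base64 header value and check its length (hypothetical helper)."""
    try:
        # Assumption: standard base64 alphabet, not the URL-safe variant.
        raw = base64.b64decode(header_value, validate=True)
    except binascii.Error as exc:
        raise ValueError("public key header is not valid base64") from exc
    if len(raw) != X25519_KEY_LENGTH:
        raise ValueError(f"expected {X25519_KEY_LENGTH} bytes, got {len(raw)}")
    return raw
```

A client could run a check like this on the x-server-public-key value before attempting to derive the shared secret, so that a malformed header fails early with a clear error.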

Parameters:

Name Type Description Default

request

Request

A FastAPI request.

required

call_next

RequestResponseEndpoint

Function returning a coroutine which calls the endpoint.

required

Returns:

Type Description
fastapi.Response

A Completions Response with an additional x-server-public-key header.

Raises:

Type Description
ValueError

If the request does not include an x-client-public-key header.
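The request lifecycle described for dispatch can be sketched without any web framework. Everything below is a simplified stand-in and not the module's implementation: KEY_REGISTRY, SERVER_PUBLIC_KEY, and dispatch_sketch are hypothetical names, headers are plain dictionaries, and the shared-secret derivation is omitted. The point is the shape of the flow: register the client key, call the endpoint, attach the server key, and always clear the registry entry.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Hypothetical stand-ins: the real module uses its own key registry and
# Starlette/FastAPI request and response objects.
KEY_REGISTRY: Dict[str, str] = {}
SERVER_PUBLIC_KEY = "c2VydmVyLXB1YmxpYy1rZXk="  # placeholder, not a real key

Headers = Dict[str, str]
Handler = Callable[[str], Awaitable[Headers]]


async def dispatch_sketch(
    request_id: str, request_headers: Headers, call_next: Handler
) -> Headers:
    """Register the client key, call the endpoint, then clean up."""
    client_key = request_headers.get("x-client-public-key")
    if client_key is None:
        raise ValueError("missing x-client-public-key header")

    # Make the key available for the duration of this request only.
    KEY_REGISTRY[request_id] = client_key
    try:
        response_headers = await call_next(request_id)
        response_headers["x-server-public-key"] = SERVER_PUBLIC_KEY
        return response_headers
    finally:
        # Runs on success and on error, so no key outlives its request.
        KEY_REGISTRY.pop(request_id, None)
```

The try/finally block is the design choice worth noting: it ensures the client key is removed even if the endpoint raises. This sketch clears the key as soon as the handler returns, which simplifies the documented behavior of clearing it after the response is sent.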

inject_middleware

inject_middleware(
    build_app_func: Callable[[Namespace], FastAPI],
) -> Callable[[argparse.Namespace], fastapi.FastAPI]

Add the UserPublicKeyMiddleware to a function that builds a vLLM OpenAI-compatible FastAPI application.

Parameters:

Name Type Description Default

build_app_func

Callable[[Namespace], FastAPI]

Function that builds the vLLM OpenAI-compatible FastAPI application.

required

Returns:

Type Description
Callable[[argparse.Namespace], fastapi.FastAPI]

A new function with the same signature as build_app_func that also adds the UserPublicKeyMiddleware after using build_app_func to build the app.
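The wrapping pattern described here can be illustrated with a small self-contained sketch. StubApp and StubMiddleware are hypothetical stand-ins for fastapi.FastAPI and UserPublicKeyMiddleware so the example runs without vLLM or FastAPI installed, and inject_middleware_sketch is not the module's actual function. The only real API assumed is that a FastAPI application exposes an add_middleware method.

```python
import argparse
import functools
from typing import Any, Callable, List


class StubApp:
    """Minimal stand-in for fastapi.FastAPI, which exposes add_middleware()."""

    def __init__(self) -> None:
        self.middleware: List[Any] = []

    def add_middleware(self, middleware_class: Any) -> None:
        self.middleware.append(middleware_class)


class StubMiddleware:
    """Stand-in for UserPublicKeyMiddleware."""


def inject_middleware_sketch(
    build_app_func: Callable[[argparse.Namespace], StubApp],
) -> Callable[[argparse.Namespace], StubApp]:
    """Return a builder that adds the middleware after building the app."""

    @functools.wraps(build_app_func)
    def wrapper(args: argparse.Namespace) -> StubApp:
        app = build_app_func(args)           # build the app as before
        app.add_middleware(StubMiddleware)   # then attach the middleware
        return app

    return wrapper


def build_app(args: argparse.Namespace) -> StubApp:
    return StubApp()


wrapped_build_app = inject_middleware_sketch(build_app)
app = wrapped_build_app(argparse.Namespace())
```

Because the wrapper keeps the original signature, it can replace the original builder wherever that builder is called, which matches the description of the returned function above.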