request_output
Patched vLLM RequestOutput class with encrypted text fields.
Warning

You should almost never need to import this module directly. If stainedglass_output_protection is installed and you launch vLLM via the alternative entrypoint, this module is applied automatically.
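For orientation only, the sketch below shows one way such a patch could be wired up: a plain class swap in vllm.outputs. The import path and the swap itself are assumptions for illustration; the real entrypoint's mechanism may differ.

```python
# Hypothetical sketch only: the real entrypoint may apply the patch differently.
import vllm.outputs

from stainedglass_output_protection.request_output import (  # hypothetical import path
    EncryptedRequestOutput,
)

# Swap in the encrypted subclass wherever vllm.outputs.RequestOutput is looked up.
vllm.outputs.RequestOutput = EncryptedRequestOutput
```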
Classes:

Name | Description |
---|---|
EncryptedRequestOutput | RequestOutput with user-facing text encrypted upon instantiation. |
EncryptedRequestOutput ¶

Bases: RequestOutput

RequestOutput with user-facing text encrypted upon instantiation.
Methods:

Name | Description |
---|---|
__init__ | Encrypt the string fields of a RequestOutput before initializing it. |
Attributes:

Name | Type | Description |
---|---|---|
key_registry | DictProxy[str, bytes | None] | Reference to the class's shared key registry. |
key_registry property ¶

Reference to the class's shared key registry.

Opening a connection is expensive (on the order of 170 milliseconds), but maintaining one is cheap. This class maintains a single connection to the shared registry as a class variable (i.e. all EncryptedRequestOutputs in a process share one connection) instead of re-establishing a connection every time one is needed.
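As a rough illustration of that caching pattern (not the library's actual implementation), the sketch below assumes a multiprocessing BaseManager server exposing the registry under a hypothetical get_key_registry name and a made-up address/authkey:

```python
from multiprocessing.managers import BaseManager, DictProxy


class _KeyRegistryManager(BaseManager):
    """Client-side manager for the shared key registry (hypothetical setup)."""


# The registry server is assumed to register a callable under this name;
# the client only needs a matching registration to obtain the proxy.
_KeyRegistryManager.register("get_key_registry")


class _RegistryClient:
    _key_registry: "DictProxy[str, bytes | None] | None" = None

    @property
    def key_registry(self) -> "DictProxy[str, bytes | None]":
        cls = type(self)
        if cls._key_registry is None:
            # Connecting costs on the order of 170 ms, so do it once per process
            # and cache the proxy as a class variable that every instance reuses.
            manager = _KeyRegistryManager(
                address=("127.0.0.1", 50000),  # hypothetical address and authkey
                authkey=b"example",
            )
            manager.connect()
            cls._key_registry = manager.get_key_registry()
        return cls._key_registry
```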
__init__ ¶

__init__(
    request_id: str,
    prompt: str | None,
    prompt_token_ids: list[int] | None,
    prompt_logprobs: PromptLogprobs | None,
    outputs: list[CompletionOutput],
    finished: bool,
    metrics: RequestMetrics | None = None,
    lora_request: LoRARequest | None = None,
    encoder_prompt: str | None = None,
    encoder_prompt_token_ids: list[int] | None = None,
    num_cached_tokens: int | None = None,
    *,
    multi_modal_placeholders: MultiModalPlaceholderDict | None = None,
    kv_transfer_params: dict[str, Any] | None = None,
    **kwargs: Any,
) -> None
Encrypt the string fields of a RequestOutput before initializing it.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
request_id | str | The unique ID of the request. | required |
prompt | str | None | The prompt string of the request. For encoder/decoder models, this is the decoder input prompt. | required |
prompt_token_ids | list[int] | None | The token IDs of the prompt. For encoder/decoder models, this is the decoder input prompt token ids. | required |
prompt_logprobs | PromptLogprobs | None | The log probabilities to return per prompt token. | required |
outputs | list[CompletionOutput] | The output sequences of the request. | required |
finished | bool | Whether the whole request is finished. | required |
metrics | RequestMetrics | None | Metrics associated with the request. | None |
lora_request | LoRARequest | None | The LoRA request that was used to generate the output. | None |
encoder_prompt | str | None | The encoder prompt string of the request. None if decoder-only. | None |
encoder_prompt_token_ids | list[int] | None | The token IDs of the encoder prompt. None if decoder-only. | None |
num_cached_tokens | int | None | The number of tokens with prefix cache hit. | None |
multi_modal_placeholders | MultiModalPlaceholderDict | None | Placeholder data for multimodal requests. | None |
kv_transfer_params | dict[str, Any] | None | The params for remote K/V transfer. | None |
**kwargs | Any | Keyword arguments for … | required |
Raises:

Type | Description |
---|---|
NotImplementedError | If … |
NotImplementedError | If … |
NotImplementedError | If … |
NotImplementedError | If … |
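To make the encrypt-then-initialize flow above concrete, here is a minimal sketch. The _encrypt helper, the module-level registry stand-in, and the choice to encrypt only the prompt are assumptions for illustration; the real cipher, key lookup, and full set of encrypted fields belong to stainedglass_output_protection and are not shown.

```python
import base64
from typing import Any

from vllm.outputs import RequestOutput

# Stand-in for the shared key registry proxy (request_id -> key bytes).
_KEY_REGISTRY: dict[str, bytes | None] = {}


def _encrypt(text: str | None, key: bytes | None) -> str | None:
    """Hypothetical stand-in for the real encryption routine (not real crypto)."""
    if text is None or key is None:
        return text
    xored = bytes(b ^ key[i % len(key)] for i, b in enumerate(text.encode()))
    return base64.b64encode(xored).decode()


class EncryptedRequestOutputSketch(RequestOutput):
    def __init__(
        self, request_id: str, prompt: str | None, *args: Any, **kwargs: Any
    ) -> None:
        key = _KEY_REGISTRY.get(request_id)  # per-request key; may be None
        # Encrypt the user-facing prompt text, then let RequestOutput store it.
        # The real class also covers the other string fields (e.g. output text).
        super().__init__(request_id, _encrypt(prompt, key), *args, **kwargs)
```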