Stained Glass Transform Proxy
The Stained Glass Transform Proxy is a forward proxy service that accepts chat completion requests conforming to the OpenAI API specification, transforms the chat message contents using Stained Glass Transform (SGT), and returns the responses that a custom vllm inference service generates for the transformed (obfuscated) chat input.
Note
Deploying the Stained Glass Transform Proxy requires Protopia's custom vllm docker image. Please get in touch with us for access to this image. You can read more about configuring the vllm container deployment here.
Note
Protopia's custom vllm docker image is created with --enable-chunked-prefill set to False to enable embedding-based generation.
sequenceDiagram
autonumber
participant Client
participant SGP as Stained Glass Transform Proxy
Note right of SGP: Validate request,<br/> Transform message prompts<br/> and return/stream response from LLM API.
participant LLM as LLM API
Client->>SGP: OpenAI API Spec Request
SGP->>LLM: Inference Request
LLM-->>SGP: Inference Response
SGP-->>Client: OpenAI API Spec Response
Deployment
- Procure the docker image tag for stainedglass_proxy.
- For kubernetes (k8s), update the kubernetes deployment config with the stainedglass_proxy image tag.
- Set the following environment variables for your container deployment (for k8s, this needs to be done via the manifest file):
  - SGP_INFERENCE_SERVICE_HOST: Host URL for the inference service.
  - SGP_SGT_PATH: SGT model file path. For the typical docker container, SGP_SGT_PATH should be set to sgt_model.pt.
  - SGP_DEVICE: Device type to match that of the SGT file. Can be either "cpu" or "cuda".
  - SGP_API_USERNAME: Username for the LLM API.
  - SGP_API_PASSWORD: Password for the LLM API.
- Deploy the container.
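Before deploying, it can help to confirm that the environment variables listed above are present and that SGP_DEVICE holds one of its two documented values. The following is a minimal sketch of such a check. The helper check_sgp_env is hypothetical and not part of the proxy, and it assumes that all five variables are required.

```python
import os

# Variable names come from the deployment steps above.
# Assumption: every one of them must be set; the helper itself is hypothetical.
REQUIRED_VARS = (
    "SGP_INFERENCE_SERVICE_HOST",
    "SGP_SGT_PATH",
    "SGP_DEVICE",
    "SGP_API_USERNAME",
    "SGP_API_PASSWORD",
)
ALLOWED_DEVICES = {"cpu", "cuda"}


def check_sgp_env(env):
    """Return a list of problems found in `env`; an empty list means none were found."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    device = env.get("SGP_DEVICE")
    if device and device not in ALLOWED_DEVICES:
        problems.append(f"SGP_DEVICE must be 'cpu' or 'cuda', got {device!r}")
    return problems
```

Calling check_sgp_env(os.environ) inside the container, or on a dictionary built from the manifest's env section, lists anything that still needs attention.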
Usage
The service listens on port 8600. You can access the API endpoint documentation either as a JSON export via the /openapi.json endpoint or as a web page via the /docs endpoint in your browser.
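The JSON export can also be read programmatically. The sketch below downloads /openapi.json and lists the HTTP method and path of each documented endpoint. It assumes the proxy is reachable at http://127.0.0.1:8600 and that the export is a standard OpenAPI document with a top-level paths object; the function names are illustrative.

```python
import json
import urllib.request

# Assumption: the proxy runs locally on its default port.
BASE_URL = "http://127.0.0.1:8600"

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_endpoints(spec):
    """Return sorted (METHOD, path) pairs from a parsed OpenAPI document."""
    return sorted(
        (method.upper(), path)
        for path, operations in spec.get("paths", {}).items()
        for method in operations
        if method in HTTP_METHODS  # skip non-operation keys such as "parameters"
    )


def fetch_spec(base_url=BASE_URL):
    """Download and parse the proxy's OpenAPI export."""
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)
```

Against a running proxy, list_endpoints(fetch_spec()) returns the documented routes as pairs of method and path.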
Chat Completions Example
curl
Request
curl --location 'http://127.0.0.1:8600/v1/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "mistral-7b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "Write me a poem."
      }
    ],
    "max_tokens": 3000,
    "temperature": 1.7,
    "seed": 123456
  }'
Response
{
  "id": "9325c716-deb8-46b5-bfca-1a904a7af0c7",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "In the quiet of the twilight, where the sun's last rays reside,\n\nA symphony of colors paint the sky, a breathtaking, wondrous tide.\n\nThe day has ended, and the night awakes, in hues of pink and gold,\n\nA gentle breeze caresses the earth, a whisper soft and bold.\n\nThe stars begin to twinkle, like diamonds strewn across the black,\n\nA canvas painted by the hand of God, a masterpiece to track.\n\nThe moon ascends her throne, a beacon in the night,\n\nA silver glow that bathes the world in soft, ethereal light.\n\nThe crickets sing their lullabies, a chorus of the night,\n\nA melody that soothes the soul, a balm for all that's right.\n\nIn the quiet of the twilight, where the world is still and calm,\n\nA moment of peace, a time to dream, a moment to reclaim.\n\nSo let us bask in the beauty of this twilight hour,\n\nA reminder of the magic that lies within the power,\n\nOf the simple, yet profound, enchantment of the night.",
        "tool_calls": null
      },
      "logprobs": null
    }
  ],
  "created": 1724875449,
  "model": "mistral-7b-instruct",
  "service_tier": null,
  "system_fingerprint": null,
  "object": "chat.completion",
  "usage": null
}
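The same request can be sent from Python using only the standard library. The sketch below mirrors the curl example above and extracts the assistant's text from a response shaped like the one shown. The function names build_request, first_reply, and chat are illustrative and not part of any Stained Glass client; the URL and model name are copied from the curl example.

```python
import json
import urllib.request

# Assumption: same local address as the curl example.
PROXY_URL = "http://127.0.0.1:8600/v1/chat/completions"


def build_request(prompt, model="mistral-7b-instruct", url=PROXY_URL):
    """Build a POST request with the same body as the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 3000,
        "temperature": 1.7,
        "seed": 123456,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_reply(completion):
    """Return the assistant message text from a chat.completion response body."""
    return completion["choices"][0]["message"]["content"]


def chat(prompt):
    """Send the prompt to the proxy and return the first choice's text."""
    with urllib.request.urlopen(build_request(prompt), timeout=120) as response:
        return first_reply(json.load(response))
```

With the proxy running, chat("Write me a poem.") returns the content string of the first choice. The client sends plain text, and the transformation happens inside the proxy before the request reaches the inference service.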
Next Steps
- API Reference
  View detailed descriptions of the endpoints and models supported by Stained Glass Transform Proxy.
- Logging Configuration
  Read about Stained Glass Transform Proxy's logging configuration.
- Tutorials
  Learn how to create and deploy a Stained Glass Transform.