AWS Marketplace with SageMaker Endpoint¶
Using Stained Glass Transform with a SageMaker Endpoint requires subscribing to two marketplace offerings:
- Stained Glass Transform Proxy is available as an AWS Marketplace offering.
meta-llama/Llama-3.1-8B-Instruct
with Stained Glass support is available as an AWS Marketplace for SageMaker offering.
The AWS Marketplace offering includes a Helm chart which deploys Stained Glass Transform Proxy to your Elastic Kubernetes Service (EKS) cluster. A pre-trained Stained Glass Transform is included in the Stained Glass Transform Proxy container image. See the offering page for more details.
Both components are required. The SGT Proxy will run within EKS within your VPC, but the SageMaker Endpoint will be managed by SageMaker.
Llama-3.1-8B-Instruct with Stained Glass Support SageMaker Endpoint¶
Subscribe to "meta-llama/Llama-3.1-8B-Instruct with Stained Glass support"¶
- Sign in to AWS Marketplace.
- Navigate to the Marketplace product page.
- Click the "Continue to Subscribe" button.
- Click the "Accept offer" button.
- Click the "Continue to Configuration" button.
- Select your launch method. (Using CloudFormation is often the easiest option, but any of the provided options are supported).
- Follow the Configuration instructions for your chosen launch method. These instructions are provided further down on the page.
- Note the Endpoint name used to deploy.
a. If using CloudFormation, it's included in the deploy instructions under "Execute inference".
b. If using SageMaker Console or the AWS CLI, you specify this yourself.
The Endpoint name will need to be set in Stained Glass Transform Proxy later.
Stained Glass Transform Proxy¶
Subscribe to SGT Proxy on AWS Marketplace¶
For more details in launching a Helm Fulfillment Container Product from AWS Marketplace, see the AWS Marketplace documentation.
- Sign in to AWS Marketplace.
- Navigate to the Marketplace product page.
- Click the "Continue to Subscribe" button.
- Review the pricing and terms, and click the "Create Contract" button.
- Click the "Continue to Configuration" button.
- In the "Fulfillment option" dropdown, select "Helm chart".
- Select the most recent version of the product.
- Click the "Continue to Launch" button.
- Follow the Deploy on Amazon Elastic Kubernetes Service (EKS) instructions below.
Deploy on Amazon Elastic Kubernetes Service (EKS)¶
Requirements¶
IAM Role for AWS License Manager¶
This guide assumes that you have an AWS Account with necessary permissions to subscribe to offerings in the AWS Marketplace, execute a helm chart on your EKS cluster, and that your EKS cluster has the necessary permissions to interact with AWS License Manager.
The following IAM policy is an example of the permissions required by the EKS cluster to interact with AWS License Manager:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": [
"license-manager:ListLicenses",
"license-manager:GetLicense",
"license-manager:CheckoutLicense",
"license-manager:CheckInLicense",
"license-manager:ExtendLicenseConsumption",
"kms:GetPublicKey"
],
"Resource": "*"
}
]
}
Warning
This product uses AWS License Manager to manage licensing, and an ongoing connection to AWS License Manager is required for the product to function. The product will not function if the connection to AWS License Manager is lost.
Warning
This product, when used with a SageMaker Endpoint as its upstream inference server, requires permissions to the SageMaker endpoint. Make sure your IAM policy allows that. It must be able to InvokeModel
and InvokeModelWithStreamingResponse
.
Amazon Elastic Kubernetes Service (EKS) Cluster¶
Prior to following this guide, you should have an EKS cluster set up and configured. If you do not have an EKS cluster, you can follow the Amazon EKS Getting Started Guide.
Deploy Helm Chart¶
This guide assumes that you have completed the steps in the Subscribe on AWS Marketplace section, and are currently on the "Launch this software" page in the AWS Marketplace.
- Under the "Launch target" dropdown, select "Amazon managed Kubernetes".
- Under "Launch method", select "Launch on existing cluster".
- Follow the Launch instructions given to create the AWS IAM role and Kubernetes service account.
- Follow the Launch instructions given to pull the helm chart and deploy it to your EKS cluster using your local CLI.
Warning
By default the helm chart will also launch the Stained Glass Inference Server (powered by vLLM) into your EKS cluster. Since you are using SageMaker as your upstream inference server, this is not necessary. Make sure to add --set llmApi.enabled=False
to any helm install commands. This will disable launching a self-hosted inference server. You should also set --set sgProxy.config.sagemakerEndpointName=<your endpoint name>
, so SGT Proxy will communicate with your SageMaker endpoint For example,
helm install stained-glass-engine ./stained-glass-engine --set llmApi.enabled=False --set sgProxy.config.sagemakerEndpointName=<your endpoint name>
./stained-glass-engine
to match the launch instructions from amazon.
Inference using Stained Glass Proxy¶
Connecting to Stained Glass Proxy¶
The Helm Chart deployment supports an ingress controller, if desired, which can be enabled via .Values.sgProxy.ingress.enabled
.
You can test your connection using its built-in Swagger UI at the /docs
endpoint.
Interacting with the Stained Glass Proxy API¶
Once you can connect to the Stained Glass Proxy service, you can interact with its REST API to perform inference (see the API Reference for more details). The REST API is OpenAI-compatible, so you can use tools such as OpenAI's client or LangChain to interact with the service. See Tutorials for examples of how to use the service.