Enable CI/CD of multi-Region Amazon SageMaker endpoints

Amazon SageMaker and SageMaker inference endpoints let you train and deploy your AI and machine learning (ML) workloads. With inference endpoints, you can deploy your models for real-time or batch inference. The endpoints support various types of ML models hosted using AWS Deep Learning Containers or your own containers with custom AI/ML algorithms. When you launch SageMaker inference endpoints with multiple instances, SageMaker distributes the instances across multiple Availability Zones (in a single Region) for high availability.

In some cases, however, to ensure lowest possible latency for customers in diverse geographical areas, you may require deploying inference endpoints in multiple Regions. Multi-Regional deployment of SageMaker endpoints and other related application and infrastructure components can also be part of a disaster recovery strategy for your mission-critical workloads aimed at mitigating the risk of a Regional failure.

SageMaker Projects implements a set of pre-built MLOps templates that can help manage endpoint deployments. In this post, we show how you can extend an MLOps SageMaker Projects pipeline to enable multi-Regional deployment of your AI/ML inference endpoints.

Solution overview

SageMaker Projects deploys both training and deployment MLOps pipelines; you can use these to train a model and deploy it using an inference endpoint. To reduce the complexity and cost of a multi-Region solution, we assume that you train the model in a single Region and deploy inference endpoints in two or more Regions.

This post presents a solution that slightly modifies a SageMaker project template to support multi-Region deployment. To better illustrate the changes, the following figure displays both a standard MLOps pipeline created automatically by SageMaker (Steps 1-5) as well as changes required to extend it to a secondary Region (Steps 6-11).

The SageMaker Projects template automatically deploys a boilerplate MLOps solution, which includes the following components:

  1. Amazon EventBridge monitors AWS CodeCommit repositories for changes and starts a run of AWS CodePipeline if a code commit is detected.
  2. If there is a code change, AWS CodeBuild orchestrates the model training using SageMaker training jobs.
  3. After the training job is complete, the SageMaker model registry registers and catalogs the trained model.
  4. To prepare for the deployment stage, CodeBuild extends the default AWS CloudFormation template configuration files with parameters of an approved model from the model registry.
  5. Finally, CodePipeline runs the CloudFormation templates to deploy the approved model to the staging and production inference endpoints.

The following additional steps modify the MLOps Projects template to enable the AI/ML model deployment in the secondary Region:

  6. A replica of the Amazon Simple Storage Service (Amazon S3) bucket that stores model artifacts in the primary Region is required in the secondary Region.
  7. The CodePipeline template is extended with more stages to run a cross-Region deployment of the approved model.
  8. As part of the cross-Region deployment process, the CodePipeline template uses a new CloudFormation template to deploy the inference endpoint in a secondary Region. The CloudFormation template deploys the model from the model artifacts in the S3 replica bucket created in Step 6.

  9-11. Optionally, create resources in Amazon Route 53, Amazon API Gateway, and AWS Lambda to route application traffic to inference endpoints in the secondary Region.

Prerequisites

Create a SageMaker project in your primary Region (us-east-2 in this post). Complete the steps in Building, automating, managing, and scaling ML workflows using Amazon SageMaker Pipelines until the section Modifying the sample code for a custom use case.

Update your pipeline in CodePipeline

In this section, we discuss how to add manual CodePipeline approval and cross-Region model deployment stages to your existing pipeline created for you by SageMaker.

  1. On the CodePipeline console in your primary Region, find and select the pipeline containing your project name and ending with deploy. This pipeline has already been created for you by SageMaker Projects. You modify this pipeline to add AI/ML endpoint deployment stages for the secondary Region.
  2. Choose Edit.
  3. Choose Add stage.
  4. For Stage name, enter SecondaryRegionDeployment.
  5. Choose Add stage.
  6. In the SecondaryRegionDeployment stage, choose Add action group. In this action group, you add a manual approval step for model deployment in the secondary Region.
  7. For Action name, enter ManualApprovaltoDeploytoSecondaryRegion.
  8. For Action provider, choose Manual approval.
  9. Leave all other settings at their defaults and choose Done.
  10. In the SecondaryRegionDeployment stage, choose Add action group (after ManualApprovaltoDeploytoSecondaryRegion). In this action group, you add a cross-Region AWS CloudFormation deployment step. You specify the names of build artifacts that you create later in this post.
  11. For Action name, enter DeploytoSecondaryRegion.
  12. For Action provider, choose AWS CloudFormation.
  13. For Region, enter your secondary Region name (for example, us-west-2).
  14. For Input artifacts, enter BuildArtifact.
  15. For ActionMode, enter CreateorUpdateStack.
  16. For StackName, enter DeploytoSecondaryRegion.
  17. Under Template, for Artifact Name, choose BuildArtifact.
  18. Under Template, for File Name, enter template-export-secondary-region.yml.
  19. Turn Use Configuration File on.
  20. Under Template, for Artifact Name, choose BuildArtifact.
  21. Under Template, for File Name, enter secondary-region-config-export.json.
  22. Under Capabilities, choose CAPABILITY_NAMED_IAM.
  23. For Role, choose AmazonSageMakerServiceCatalogProductsUseRole created by SageMaker Projects.
  24. Choose Done.
  25. Choose Save.
  26. If a Save pipeline changes dialog appears, choose Save again.
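
For readers who prefer to review the new stage as data rather than console clicks, the following is a minimal sketch of how the same stage could look in the CodePipeline pipeline structure format. It only builds a Python dictionary and does not call AWS. The function name secondary_region_stage and the role ARN passed to it are illustrative assumptions, and CREATE_UPDATE is the API value that corresponds to the console's create-or-update action mode.

```python
import json


def secondary_region_stage(region, role_arn):
    """Build a stage declaration that mirrors the console steps above.

    region and role_arn are supplied by the caller; the values used in the
    example below are placeholders, not values from a real account.
    """
    return {
        "name": "SecondaryRegionDeployment",
        "actions": [
            {
                # Manual approval gate, runs first.
                "name": "ManualApprovaltoDeploytoSecondaryRegion",
                "actionTypeId": {
                    "category": "Approval",
                    "owner": "AWS",
                    "provider": "Manual",
                    "version": "1",
                },
                "runOrder": 1,
            },
            {
                # Cross-Region CloudFormation deployment, runs after approval.
                "name": "DeploytoSecondaryRegion",
                "actionTypeId": {
                    "category": "Deploy",
                    "owner": "AWS",
                    "provider": "CloudFormation",
                    "version": "1",
                },
                "region": region,
                "inputArtifacts": [{"name": "BuildArtifact"}],
                "configuration": {
                    "ActionMode": "CREATE_UPDATE",
                    "StackName": "DeploytoSecondaryRegion",
                    "TemplatePath": "BuildArtifact::template-export-secondary-region.yml",
                    "TemplateConfiguration": "BuildArtifact::secondary-region-config-export.json",
                    "Capabilities": "CAPABILITY_NAMED_IAM",
                    "RoleArn": role_arn,
                },
                "runOrder": 2,
            },
        ],
    }


# Placeholder account ID and role path, for illustration only.
example = secondary_region_stage(
    "us-west-2", "arn:aws:iam::111122223333:role/ExampleRole"
)
print(json.dumps(example, indent=2))
```

Comparing this structure with the output of the pipeline's exported definition can be a quick way to confirm that the console edits were saved as intended.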

Modify the IAM role

We need to add permissions to the AWS Identity and Access Management (IAM) role AmazonSageMakerServiceCatalogProductsUseRole, created by AWS Service Catalog, to enable CodePipeline and S3 bucket access for cross-Region deployment.

  1. On the IAM console, choose Roles in the navigation pane.
  2. Search for and choose AmazonSageMakerServiceCatalogProductsUseRole.
  3. Choose the IAM policy under Policy name: AmazonSageMakerServiceCatalogProductsUseRole-XXXXXXXXX.
  4. Choose Edit policy and then JSON.
  5. Modify the AWS CloudFormation permissions to allow CodePipeline to sync the S3 bucket in the secondary Region. You can replace the existing IAM policy with the updated one from the following GitHub repo (see lines 16-18, 198, 213).
  6. Choose Review policy.
  7. Choose Save changes.

Add the deployment template for the secondary Region

To spin up an inference endpoint in the secondary Region, the SecondaryRegionDeployment stage needs a CloudFormation template (endpoint-config-template-secondary-region.yml) and a configuration file (secondary-region-config.json).

The CloudFormation template is configured entirely through parameters; you can further modify it to fit your needs. Similarly, you can use the config file to define the parameters for the endpoint launch configuration, such as the instance type and instance count:

{
  "Parameters": {
    "StageName": "secondary-prod",
    "EndpointInstanceCount": "1",
    "EndpointInstanceType": "ml.m5.large",
    "SamplingPercentage": "100",
    "EnableDataCapture": "true"
  }
}

To add these files to your project, download them from the provided links and upload them to Amazon SageMaker Studio in the primary Region. In Studio, choose File Browser and then the folder containing your project name and ending with modeldeploy.

Upload these files to the deployment repository’s root folder by choosing the upload icon. Make sure the files are located in the root folder as shown in the following screenshot.

Screenshot of config files
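
Before uploading, it can help to sanity-check the configuration file locally. The following is a minimal sketch, assuming the file must contain the five parameters shown in the example above and that every value is written as a string, as it is there. The function name check_config and the required-parameter set are our own illustration and are not part of the project template.

```python
import json

# Parameter names taken from the example configuration file in this post.
REQUIRED_PARAMETERS = {
    "StageName",
    "EndpointInstanceCount",
    "EndpointInstanceType",
    "SamplingPercentage",
    "EnableDataCapture",
}


def check_config(text):
    """Return (missing parameter names, names whose values are not strings)."""
    config = json.loads(text)  # raises ValueError on malformed JSON
    params = config.get("Parameters", {})
    missing = sorted(REQUIRED_PARAMETERS - set(params))
    non_strings = sorted(k for k, v in params.items() if not isinstance(v, str))
    return missing, non_strings


sample = """
{
  "Parameters": {
    "StageName": "secondary-prod",
    "EndpointInstanceCount": "1",
    "EndpointInstanceType": "ml.m5.large",
    "SamplingPercentage": "100",
    "EnableDataCapture": "true"
  }
}
"""
print(check_config(sample))
```

Because json.loads rejects malformed input, this check also catches an unbalanced brace before the file reaches the pipeline.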

Modify the build Python file

Next, we adjust the deployment build.py file so that it supports SageMaker endpoint deployment in the secondary Region. The updated file does the following:

  • Retrieve the location of model artifacts and Amazon Elastic Container Registry (Amazon ECR) URI for the model image in the secondary Region
  • Prepare a parameter file that is used to pass the model-specific arguments to the CloudFormation template that deploys the model in the secondary Region

You can download the updated build.py file and replace the existing one in your folder. In Studio, choose File Browser and then the folder containing your project name and ending with modeldeploy. Locate the build.py file and replace it with the one you downloaded.

The CloudFormation template uses the model artifacts stored in an S3 bucket and the Amazon ECR image path to deploy the inference endpoint in the secondary Region. This is different from the deployment from the model registry in the primary Region, because you don’t need to have a model registry in the secondary Region.

Screenshot of primary and secondary environment parameters
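
To make the two tasks above concrete, the following is a minimal sketch of the kind of logic involved. It is not the contents of the downloadable build.py. It assumes the replica bucket is named after the primary bucket with -replica appended, as set up later in this post. The parameter names ModelDataUrl and ImageUri, both function names, and the sample values are hypothetical. In a real build step, the image URI for the secondary Region could be looked up with sagemaker.image_uris.retrieve; here it is passed in as a plain string so the sketch has no external dependencies.

```python
import json
from urllib.parse import urlparse


def to_replica_url(model_data_url, replica_suffix="-replica"):
    """Rewrite s3://bucket/key to s3://bucket-replica/key."""
    parsed = urlparse(model_data_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {model_data_url}")
    return f"s3://{parsed.netloc}{replica_suffix}{parsed.path}"


def secondary_region_params(base_params, model_data_url, image_uri):
    """Merge model-specific values into the endpoint launch parameters."""
    params = dict(base_params)  # copy so the caller's dict is untouched
    params["ModelDataUrl"] = to_replica_url(model_data_url)  # hypothetical name
    params["ImageUri"] = image_uri  # hypothetical name
    return {"Parameters": params}


# Placeholder values, for illustration only.
config = secondary_region_params(
    {"StageName": "secondary-prod", "EndpointInstanceType": "ml.m5.large"},
    "s3://sagemaker-project-example/abalone/model.tar.gz",
    "123456789012.dkr.ecr.us-west-2.amazonaws.com/example-image:latest",
)
print(json.dumps(config, indent=2))
```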

Modify the buildspec file

buildspec.yml contains instructions run by CodeBuild. We modify this file to do the following:

  • Install the SageMaker Python library needed to support the code run
  • Pass through the --secondary-region and model-specific parameters to build.py
  • Add the S3 bucket content sync from the primary to secondary Regions
  • Export the secondary Region CloudFormation template and associated parameter file as artifacts of the CodeBuild step

Open the buildspec.yml file from the model deploy folder and make the highlighted modifications as shown in the following screenshot.

Screenshot of build yaml file

Alternatively, you can download the following buildspec.yml file to replace the default file.
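
The bucket sync step can be pictured as a single AWS CLI call. The following is a minimal sketch that only assembles such a command as a Python list so it can be inspected. It assumes the replica bucket is the primary bucket name with -replica appended, and it is not copied from the downloadable buildspec.yml. The function name and the sample bucket name are illustrative.

```python
import shlex


def s3_sync_command(artifact_bucket, primary_region, secondary_region):
    """Assemble an `aws s3 sync` call from the primary bucket to its replica."""
    return [
        "aws", "s3", "sync",
        f"s3://{artifact_bucket}",
        f"s3://{artifact_bucket}-replica",
        "--source-region", primary_region,   # Region of the source bucket
        "--region", secondary_region,        # Region of the destination bucket
    ]


# Placeholder bucket name, for illustration only.
command = s3_sync_command("sagemaker-project-example", "us-east-2", "us-west-2")
print(shlex.join(command))
```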

Add CodeBuild environment variables

In this step, you add configuration parameters required for CodeBuild to create the model deployment configuration files in the secondary Region.

  1. On the CodeBuild console in the primary Region, find the project containing your project name and ending with deploy. This project has already been created for you by SageMaker Projects.

Screenshot of code pipeline

  2. Choose the project and on the Edit menu, choose Environment.

Screenshot of configurations

  3. In the Advanced configuration section, deselect Allow AWS CodeBuild to modify this service role so it can be used with this build project.
  4. Add the following environment variables, defining the names of the additional CloudFormation templates, secondary Region, and model-specific parameters:
    1. EXPORT_TEMPLATE_NAME_SECONDARY_REGION - For Value, enter template-export-secondary-region.yml and for Type, choose Plaintext.
    2. EXPORT_TEMPLATE_SECONDARY_REGION_CONFIG - For Value, enter secondary-region-config-export.json and for Type, choose Plaintext.
    3. AWS_SECONDARY_REGION - For Value, enter us-west-2 and for Type, choose Plaintext.
    4. FRAMEWORK - For Value, enter xgboost (replace with your framework) and for Type, choose Plaintext.
    5. MODEL_VERSION - For Value, enter 1.0-1 (replace with your model version) and for Type, choose Plaintext.
  5. Copy the value of ARTIFACT_BUCKET into Notepad or another text editor. You need this value in the next step.
  6. Choose Update environment.

For FRAMEWORK and MODEL_VERSION, use the values you specified for model training. For example, to find these values for the Abalone model used in the MLOps boilerplate deployment, open Studio and on the File Browser menu, open the folder with your project name and ending with modelbuild. Navigate to pipelines/abalone and open the pipeline.py file. Search for sagemaker.image_uris.retrieve and copy the relevant values.

Screenshot of ML framework
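
A build script that depends on these variables can fail early with a clear message when one of them is missing. The following is a minimal sketch of such a check, using the variable names listed above. The function name read_settings and the decision to treat empty values as missing are our own assumptions.

```python
import os

# Environment variable names defined in the CodeBuild project above.
REQUIRED_VARS = (
    "EXPORT_TEMPLATE_NAME_SECONDARY_REGION",
    "EXPORT_TEMPLATE_SECONDARY_REGION_CONFIG",
    "AWS_SECONDARY_REGION",
    "FRAMEWORK",
    "MODEL_VERSION",
)


def read_settings(env=None):
    """Return the required settings, or raise KeyError naming what is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Example using the values from this post instead of the real environment.
example_env = {
    "EXPORT_TEMPLATE_NAME_SECONDARY_REGION": "template-export-secondary-region.yml",
    "EXPORT_TEMPLATE_SECONDARY_REGION_CONFIG": "secondary-region-config-export.json",
    "AWS_SECONDARY_REGION": "us-west-2",
    "FRAMEWORK": "xgboost",
    "MODEL_VERSION": "1.0-1",
}
print(read_settings(example_env))
```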

Create an S3 replica bucket in the secondary Region

We need to create an S3 bucket to hold the model artifacts in the secondary Region. SageMaker uses this bucket to get the latest version of the model to spin up an inference endpoint. You only need to do this one time. CodeBuild automatically syncs the content of the bucket in the primary Region to the replica bucket with each pipeline run.

  1. On the Amazon S3 console, choose Create bucket.
  2. For Bucket name, enter the value of ARTIFACT_BUCKET copied in the previous step and append -replica to the end (for example, sagemaker-project-X-XXXXXXXX-replica).
  3. For AWS Region, enter your secondary Region (us-west-2).
  4. Leave all other values at their default and choose Create bucket.
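
If you prefer to script this one-time step, the following is a minimal sketch using the boto3 S3 client. The helper names are our own, and the sample bucket name is a placeholder. The helpers that build the bucket name and request arguments run without AWS access; only create_replica_bucket calls the service, and it imports boto3 lazily for that reason.

```python
def replica_bucket_name(artifact_bucket):
    """Append -replica and check the S3 bucket name length limit (63)."""
    name = f"{artifact_bucket}-replica"
    if len(name) > 63:
        raise ValueError(f"bucket name too long ({len(name)} characters): {name}")
    return name


def create_bucket_kwargs(artifact_bucket, region):
    """Build the arguments for the S3 create_bucket call."""
    kwargs = {"Bucket": replica_bucket_name(artifact_bucket)}
    # us-east-1 is the default location and must not be passed as a constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_replica_bucket(artifact_bucket, region):
    """Create the replica bucket; requires boto3 and AWS credentials."""
    import boto3  # imported here so the helpers above work without boto3

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**create_bucket_kwargs(artifact_bucket, region))


# Placeholder bucket name, for illustration only.
print(create_bucket_kwargs("sagemaker-project-example", "us-west-2"))
```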

Approve a model for deployment

The deployment stage of the pipeline requires an approved model to start. This is required for the deployment in the primary Region.

  1. In Studio (primary Region), choose SageMaker resources in the navigation pane.
  2. For Select the resource to view, choose Model registry.
  3. Choose the model group name starting with your project name.
  4. In the right pane, check the model version, stage, and status.
  5. If the status shows pending, choose the model version and then choose Update status.
  6. Change status to Approved, then choose Update status.
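
The same approval can be done programmatically. The following is a minimal sketch that approves the most recent pending model version in a model group. It assumes a boto3 SageMaker client created in the primary Region (for example, boto3.client("sagemaker", region_name="us-east-2")) is passed in. The function name and the choice to approve only the newest pending version are our own.

```python
def approve_latest_pending(sm_client, group_name):
    """Approve the newest model version that is pending manual approval.

    sm_client is expected to be a boto3 SageMaker client.
    Returns the approved model package ARN, or None if nothing was pending.
    """
    response = sm_client.list_model_packages(
        ModelPackageGroupName=group_name,
        ModelApprovalStatus="PendingManualApproval",
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    packages = response["ModelPackageSummaryList"]
    if not packages:
        return None
    arn = packages[0]["ModelPackageArn"]
    sm_client.update_model_package(
        ModelPackageArn=arn,
        ModelApprovalStatus="Approved",
    )
    return arn
```

Passing the client as an argument keeps the function easy to exercise with a stand-in object before pointing it at a real account.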

Deploy and verify the changes

All the changes required for multi-Region deployment of your SageMaker inference endpoint are now complete and you can start the deployment process.

  1. In Studio, save all the files you edited, choose Git, and choose the repository containing your project name and ending with deploy.
  2. Choose the plus sign to make changes.
  3. Under Changed, add build.py and buildspec.yml.
  4. Under Untracked, add endpoint-config-template-secondary-region.yml and secondary-region-config.json.
  5. Enter a comment in the Summary field and choose Commit.
  6. Push the changes to the repository by choosing Push.

Pushing these changes to the CodeCommit repository triggers a new pipeline run, because an EventBridge event monitors for pushed commits. After a few moments, you can monitor the run by navigating to the pipeline on the CodePipeline console.

Make sure to provide manual approval for deployment to production and the secondary Region.

You can verify that the secondary Region endpoint is created on the SageMaker console by choosing Dashboard in the navigation pane and confirming the endpoint status in Recent activity.

Screenshot of sage maker dashboard
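
The endpoint status can also be checked from code. The following is a minimal sketch, assuming a boto3 SageMaker client created for the secondary Region (for example, boto3.client("sagemaker", region_name="us-west-2")) is passed in. The function name is our own, and the endpoint name is whatever your CloudFormation stack created; this post does not state it.

```python
def endpoint_is_in_service(sm_client, endpoint_name):
    """Return True when the endpoint reports the InService status.

    sm_client is expected to be a boto3 SageMaker client for the Region
    where the endpoint was deployed.
    """
    response = sm_client.describe_endpoint(EndpointName=endpoint_name)
    return response["EndpointStatus"] == "InService"
```

An endpoint that is still being provisioned reports a different status, such as Creating, so a False result shortly after the pipeline run does not necessarily indicate a failure.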

Add API Gateway and Route 53 (Optional)

You can optionally follow the instructions in Call an Amazon SageMaker model endpoint using Amazon API Gateway and AWS Lambda to expose the SageMaker inference endpoint in the secondary Region as an API using API Gateway and Lambda.

Clean up

To delete the SageMaker project, see Delete an MLOps Project using Amazon SageMaker Studio. To ensure the secondary inference endpoint is destroyed, go to the AWS CloudFormation console and delete the related stacks in your primary and secondary Regions; this destroys the SageMaker inference endpoints.

Conclusion

In this post, we showed how an MLOps specialist can modify a preconfigured MLOps template for their own multi-Region deployment use case, such as deploying workloads in multiple geographies or as part of implementing a multi-Regional disaster recovery strategy. With this deployment approach, you don’t need to configure services in the secondary Region and can reuse the CodePipeline and CodeBuild setups in the primary Region for cross-Regional deployment. Additionally, you can save on costs by continuing the training of your models in the primary Region while utilizing SageMaker inference in multiple Regions to scale your AI/ML deployment globally.

Please let us know your feedback in the comments section.


About the authors

Mehran Najafi, PhD, is a Senior Solutions Architect for AWS focused on AI/ML and SaaS solutions at Scale.

Steven Alyekhin is a Senior Solutions Architect for AWS focused on MLOps at Scale.
