Use Amazon SageMaker pipeline sharing to view or manage pipelines across AWS accounts

On August 9, 2022, we announced the general availability of cross-account sharing of Amazon SageMaker Pipelines entities. You can now use cross-account support for Amazon SageMaker Pipelines to share pipeline entities across AWS accounts and access shared pipelines directly through Amazon SageMaker API calls.

Customers are increasingly adopting multi-account architectures for deploying and managing machine learning (ML) workflows with SageMaker Pipelines. This involves building workflows in development or experimentation (dev) accounts, deploying and testing them in a testing or pre-production (test) account, and finally promoting them to production (prod) accounts to integrate with other business processes. You can benefit from cross-account sharing of SageMaker pipelines in the following use cases:

  • When data scientists build ML workflows in a dev account, those workflows are then deployed by an ML engineer as a SageMaker pipeline into a dedicated test account. To further monitor those workflows, data scientists now require cross-account read-only permission to the deployed pipeline in the test account.
  • ML engineers, ML admins, and compliance teams, who manage deployment and operations of those ML workflows from a shared services account, also require visibility into the deployed pipeline in the test account. They might also require additional permissions for starting, stopping, and retrying those ML workflows.

In this post, we present an example multi-account architecture for developing and deploying ML workflows with SageMaker Pipelines.

Solution overview

A multi-account strategy helps you achieve data, project, and team isolation while supporting software development lifecycle steps. Cross-account pipeline sharing supports a multi-account strategy, removing the overhead of logging in and out of multiple accounts and improving ML testing and deployment workflows by sharing resources directly across multiple accounts.

In this example, we have a data science team that uses a dedicated dev account for the initial development of the SageMaker pipeline. This pipeline is then handed over to an ML engineer, who creates a continuous integration and continuous delivery (CI/CD) pipeline in their shared services account to deploy this pipeline into a test account. To still be able to monitor and control the deployed pipeline from their respective dev and shared services accounts, resource shares are set up with AWS Resource Access Manager (AWS RAM) in the test and dev accounts. With this setup, the ML engineer and the data scientist can now monitor and control the pipelines in the dev and test accounts from their respective accounts, as shown in the following figure.

[Figure: multi-account architecture with dev, test, and shared services accounts]

In the workflow, the data scientist and ML engineer perform the following steps:

  1. The data scientist (DS) builds a model pipeline in the dev account.
  2. The ML engineer (MLE) productionizes the model pipeline and creates a pipeline (for this post, we call it sagemaker-pipeline).
  3. sagemaker-pipeline code is committed to an AWS CodeCommit repository in the shared services account.
  4. The data scientist creates an AWS RAM resource share for sagemaker-pipeline and shares it with the shared services account, which accepts the resource share.
  5. From the shared services account, ML engineers are now able to describe, monitor, and administer the pipeline runs in the dev account using SageMaker API calls.
  6. A CI/CD pipeline triggered in the shared services account builds and deploys the code to the test account using AWS CodePipeline.
  7. The CI/CD pipeline creates and runs sagemaker-pipeline in the test account.
  8. After running sagemaker-pipeline in the test account, the CI/CD pipeline creates a resource share for sagemaker-pipeline in the test account.
  9. A resource share from the test sagemaker-pipeline with read-only permissions is created with the dev account, which accepts the resource share.
  10. The data scientist is now able to describe and monitor the test pipeline run status using SageMaker API calls from the dev account.
  11. A resource share from the test sagemaker-pipeline with extended permissions is created with the shared services account, which accepts the resource share.
  12. The ML engineer is now able to describe, monitor, and administer the test pipeline run using SageMaker API calls from the shared services account.

In the following sections, we go into more detail and provide a demonstration on how to set up cross-account sharing for SageMaker pipelines.

How to create and share SageMaker pipelines across accounts

In this section, we walk through the necessary steps to create and share pipelines across accounts using AWS RAM and the SageMaker API.

Set up the environment

First, we need to set up a multi-account environment to demonstrate cross-account sharing of SageMaker pipelines:

  1. Set up two AWS accounts (dev and test). You can set this up as member accounts of an organization or as independent accounts.
  2. If you’re setting up your accounts as members of an organization, you can enable resource sharing with your organization. With this setting, when you share resources in your organization, AWS RAM doesn’t send invitations to principals. Principals in your organization gain access to shared resources without exchanging invitations.
  3. In the test account, launch Amazon SageMaker Studio and run the notebook train-register-deploy-pipeline-model. This creates an example pipeline in your test account. To simplify the demonstration, we use SageMaker Studio in the test account to launch the pipeline. For real-life projects, you should use Studio only in the dev account and launch the SageMaker pipeline in the test account using your CI/CD tooling.

Follow the instructions in the next section to share this pipeline with the dev account.

Set up a pipeline resource share

To share your pipeline with the dev account, complete the following steps:

  1. On the AWS RAM console, choose Create resource share.
  2. For Select resource type, choose SageMaker Pipelines.
  3. Select the pipeline you created in the previous step.
  4. Choose Next.
  5. For Permissions, choose your associated permissions.
  6. Choose Next.
    Next, you decide how you want to grant access to principals.
  7. If you need to share the pipeline only within your organization accounts, select Allow sharing only within your organization; otherwise, select Allow sharing with anyone.
  8. For Principals, choose your principal type (you can use an AWS account, organization, or organizational unit, based on your sharing requirement). For this post, we share with anyone at the AWS account level.
  9. Select your principal ID.
  10. Choose Next.
  11. On the Review and create page, verify your information is correct and choose Create resource share.
  12. Navigate to your destination account (for this post, your dev account).
  13. On the AWS RAM console, under Shared with me in the navigation pane, choose Resource shares.
  14. Choose your resource share and choose Accept resource share.
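The console steps above could also be scripted, for example from CI/CD tooling. The following is a minimal sketch that only assembles the request for the AWS RAM CreateResourceShare call; it is not taken from the original walkthrough. The helper name, the share name, the account IDs, and the `arn:aws:ram::aws:permission/` prefix for AWS managed permissions are assumptions made for illustration.

```python
# Sketch: build the arguments for an AWS RAM resource share that shares one
# SageMaker pipeline with another account. Helper name, share name, and
# account IDs are hypothetical; the permission names come from this post.

# Assumed ARN prefix for AWS managed RAM permissions.
RAM_PERMISSION_PREFIX = "arn:aws:ram::aws:permission/"


def build_resource_share_request(share_name, pipeline_arn, principal_account_id,
                                 allow_execution=False, allow_external=True):
    """Return keyword arguments for the RAM CreateResourceShare call."""
    permission = (
        "AWSRAMPermissionSageMakerPipelineAllowExecution"
        if allow_execution
        else "AWSRAMDefaultPermissionSageMakerPipeline"
    )
    return {
        "name": share_name,
        "resourceArns": [pipeline_arn],
        "principals": [principal_account_id],
        "permissionArns": [RAM_PERMISSION_PREFIX + permission],
        # False corresponds to sharing only within your organization.
        "allowExternalPrincipals": allow_external,
    }


request = build_resource_share_request(
    share_name="sagemaker-pipeline-share",
    pipeline_arn="arn:aws:sagemaker:us-east-1:111122223333:pipeline/serial-inference-pipeline",
    principal_account_id="444455556666",
)
print(request["permissionArns"][0])

# With boto3 installed and credentials for the account that owns the pipeline,
# the request could be sent like this (not executed here):
#   import boto3
#   boto3.client("ram").create_resource_share(**request)
```

Keeping the request construction separate from the API call makes it easy to review which permission is being granted before anything is shared.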

Resource sharing permissions

When creating your resource share, you can choose from one of two supported permission policies to associate with the SageMaker pipeline resource type. Both policies grant access to any selected pipeline and all of its runs.

The AWSRAMDefaultPermissionSageMakerPipeline policy allows the following read-only actions:

"sagemaker:DescribePipeline"
"sagemaker:DescribePipelineDefinitionForExecution"
"sagemaker:DescribePipelineExecution"
"sagemaker:ListPipelineExecutions"
"sagemaker:ListPipelineExecutionSteps"
"sagemaker:ListPipelineParametersForExecution"
"sagemaker:Search"

The AWSRAMPermissionSageMakerPipelineAllowExecution policy includes all of the read-only permissions from the default policy, and also allows shared accounts to start, stop, and retry pipeline runs.

The extended pipeline run permission policy allows the following actions:

"sagemaker:DescribePipeline"
"sagemaker:DescribePipelineDefinitionForExecution"
"sagemaker:DescribePipelineExecution"
"sagemaker:ListPipelineExecutions"
"sagemaker:ListPipelineExecutionSteps"
"sagemaker:ListPipelineParametersForExecution"
"sagemaker:StartPipelineExecution"
"sagemaker:StopPipelineExecution"
"sagemaker:RetryPipelineExecution"
"sagemaker:Search"
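To make the difference between the two policies concrete, the action lists above can be captured as plain data. The sketch below is illustrative only: the helper names are hypothetical, and the check runs locally against the lists in this post rather than querying AWS.

```python
# Sketch: the two permission policies from this post as sets of actions,
# with a helper that picks the least-privileged policy covering a need.

READ_ONLY_ACTIONS = frozenset({
    "sagemaker:DescribePipeline",
    "sagemaker:DescribePipelineDefinitionForExecution",
    "sagemaker:DescribePipelineExecution",
    "sagemaker:ListPipelineExecutions",
    "sagemaker:ListPipelineExecutionSteps",
    "sagemaker:ListPipelineParametersForExecution",
    "sagemaker:Search",
})

EXECUTION_ACTIONS = frozenset({
    "sagemaker:StartPipelineExecution",
    "sagemaker:StopPipelineExecution",
    "sagemaker:RetryPipelineExecution",
})

# Ordered from least to most permissive.
PERMISSION_ACTIONS = {
    "AWSRAMDefaultPermissionSageMakerPipeline": READ_ONLY_ACTIONS,
    "AWSRAMPermissionSageMakerPipelineAllowExecution": READ_ONLY_ACTIONS | EXECUTION_ACTIONS,
}


def is_allowed(permission_name, action):
    """Return True if the named policy includes the action."""
    return action in PERMISSION_ACTIONS[permission_name]


def smallest_permission_for(actions):
    """Return the least-permissive policy covering all actions, or None."""
    needed = set(actions)
    for name, allowed in PERMISSION_ACTIONS.items():
        if needed <= allowed:
            return name
    return None


print(smallest_permission_for(["sagemaker:DescribePipeline"]))
print(smallest_permission_for(["sagemaker:StartPipelineExecution"]))
```

In the example architecture, the data scientist's monitoring needs map to the default policy, while the ML engineer's start, stop, and retry needs require the extended one.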

Access shared pipeline entities through direct API calls

In this section, we walk through how you can use various SageMaker Pipeline API calls to gain visibility into pipelines running in remote accounts that have been shared with you. For testing the APIs against the pipeline running in the test account from the dev account, log in to the dev account and use AWS CloudShell.

For the cross-account SageMaker Pipeline API calls, you always need to use your pipeline ARN as the pipeline identification. That also includes the commands requiring the pipeline name, where you need to use your pipeline ARN as the pipeline name.

To get your pipeline ARN, in your test account, navigate to your pipeline details in Studio via SageMaker Resources.

Choose Pipelines on your resources list.

Choose your pipeline and go to your pipeline Settings tab. You can find the pipeline ARN with your Metadata information. For this example, your ARN is defined as "arn:aws:sagemaker:us-east-1:<account-id>:pipeline/serial-inference-pipeline".
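Because cross-account calls require the full ARN wherever a pipeline name is expected, it can help to validate the value before using it. The following sketch splits a pipeline ARN of the form shown above into its parts; the function and type names are hypothetical, and the account ID is a placeholder.

```python
# Sketch: parse and validate a SageMaker pipeline ARN before using it as the
# pipeline name in a cross-account call. Names here are illustrative only.
from typing import NamedTuple


class PipelineArn(NamedTuple):
    partition: str
    region: str
    account_id: str
    pipeline_name: str


def parse_pipeline_arn(arn):
    """Split 'arn:<partition>:sagemaker:<region>:<account>:pipeline/<name>'."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"not a SageMaker ARN: {arn!r}")
    resource_type, _, name = parts[5].partition("/")
    # A '/' in the name means this is an execution ARN, not a pipeline ARN.
    if resource_type != "pipeline" or not name or "/" in name:
        raise ValueError(f"not a pipeline ARN: {arn!r}")
    return PipelineArn(parts[1], parts[3], parts[4], name)


parsed = parse_pipeline_arn(
    "arn:aws:sagemaker:us-east-1:111122223333:pipeline/serial-inference-pipeline"
)
print(parsed.account_id, parsed.pipeline_name)
```

A bare pipeline name such as serial-inference-pipeline fails this check, which catches the most likely mistake before a cross-account request is sent.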

ListPipelineExecutions

This API call lists the runs of your pipeline. Run the following command from CloudShell, or with the AWS Command Line Interface (AWS CLI) configured with the appropriate AWS Identity and Access Management (IAM) role, replacing $SHARED_PIPELINE_ARN with your pipeline ARN:

aws sagemaker list-pipeline-executions --pipeline-name $SHARED_PIPELINE_ARN

The response lists all the runs of your pipeline with their PipelineExecutionArn, StartTime, PipelineExecutionStatus, and PipelineExecutionDisplayName:

{
  "PipelineExecutionSummaries": [
    {
      "PipelineExecutionArn": "arn:aws:sagemaker:<region>:<account_id>:pipeline/<pipeline_name>/execution/<execution_id>",
      "StartTime": "2022-08-10T11:32:05.543000+00:00",
      "PipelineExecutionStatus": "Executing",
      "PipelineExecutionDisplayName": "execution-321"
    },
    {
      "PipelineExecutionArn": "arn:aws:sagemaker:<region>:<account_id>:pipeline/<pipeline_name>/execution/<execution_id>",
      "StartTime": "2022-08-10T11:28:03.680000+00:00",
      "PipelineExecutionStatus": "Stopped",
      "PipelineExecutionDisplayName": "test"
    },
    {
      "PipelineExecutionArn": "arn:aws:sagemaker:<region>:<account_id>:pipeline/<pipeline_name>/execution/<execution_id>",
      "StartTime": "2022-08-10T11:03:47.406000+00:00",
      "PipelineExecutionStatus": "Succeeded",
      "PipelineExecutionDisplayName": "execution-123"
    }
  ]
}
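A response in this shape is straightforward to post-process. As a rough sketch, the snippet below picks the most recent run, optionally filtered by status, from a dictionary shaped like the output above. The sample data mirrors that output with shortened placeholder ARNs, and the helper names are hypothetical.

```python
# Sketch: find the most recent pipeline run in a ListPipelineExecutions
# response. Works on the parsed JSON printed by the AWS CLI (string
# timestamps) or on a response whose StartTime values are datetime objects.
from datetime import datetime


def _as_datetime(value):
    """Accept either a datetime or an ISO 8601 string."""
    return value if isinstance(value, datetime) else datetime.fromisoformat(value)


def latest_execution(response, status=None):
    """Return the newest execution summary, optionally filtered by status."""
    summaries = response.get("PipelineExecutionSummaries", [])
    if status is not None:
        summaries = [s for s in summaries if s["PipelineExecutionStatus"] == status]
    if not summaries:
        return None
    return max(summaries, key=lambda s: _as_datetime(s["StartTime"]))


# Sample data mirroring the response above (ARNs are placeholders).
sample_response = {
    "PipelineExecutionSummaries": [
        {"PipelineExecutionArn": "<execution-arn-1>",
         "StartTime": "2022-08-10T11:32:05.543000+00:00",
         "PipelineExecutionStatus": "Executing",
         "PipelineExecutionDisplayName": "execution-321"},
        {"PipelineExecutionArn": "<execution-arn-2>",
         "StartTime": "2022-08-10T11:28:03.680000+00:00",
         "PipelineExecutionStatus": "Stopped",
         "PipelineExecutionDisplayName": "test"},
        {"PipelineExecutionArn": "<execution-arn-3>",
         "StartTime": "2022-08-10T11:03:47.406000+00:00",
         "PipelineExecutionStatus": "Succeeded",
         "PipelineExecutionDisplayName": "execution-123"},
    ]
}

print(latest_execution(sample_response)["PipelineExecutionDisplayName"])
print(latest_execution(sample_response, status="Succeeded")["PipelineExecutionDisplayName"])
```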

DescribePipeline

This API call describes the detail of your pipeline. Run the following command, replacing $SHARED_PIPELINE_ARN with your pipeline ARN:

aws sagemaker describe-pipeline --pipeline-name $SHARED_PIPELINE_ARN

The response provides the metadata of your pipeline, as well as information about its creation and modification:

Output (truncated):
{
"PipelineArn": "arn:aws:sagemaker:<region>:<account-id>:pipeline/<pipeline_name>",
"PipelineName": "serial-inference-pipeline",
"PipelineDisplayName": "serial-inference-pipeline",
"PipelineDefinition": "{"Version": "2020-12-01", "Metadata": {}, "Parameters": [{"Name": "TrainingInstanceType", "Type": "String", "DefaultValue": "ml.m5.xlarge"}, {"Name": "ProcessingInstanceType", "Type": "String", "DefaultValue": "ml.m5.xlarge"}, {"Name": "ProcessingInstanceCount", "Type": "Integer", "DefaultValue": 1}, {"Name": "InputData", "Type":

..

"PipelineStatus": "Active",
"CreationTime": "2022-08-08T21:33:39.159000+00:00",
"LastModifiedTime": "2022-08-08T21:48:14.274000+00:00",
"CreatedBy": {},
"LastModifiedBy": {}
}

DescribePipelineExecution

This API call describes the detail of your pipeline run. Run the following command, replacing $PIPELINE_EXECUTION_ARN with the ARN of your pipeline run:

aws sagemaker describe-pipeline-execution \
  --pipeline-execution-arn $PIPELINE_EXECUTION_ARN

The response provides details on your pipeline run, including the PipelineExecutionStatus, ExperimentName, and TrialName:

{
  "PipelineArn": "arn:aws:sagemaker:<region>:<account_id>:pipeline/<pipeline_name>",
  "PipelineExecutionArn": "arn:aws:sagemaker:<region>:<account_id>:pipeline/<pipeline_name>/execution/<execution_id>",
  "PipelineExecutionDisplayName": "execution-123",
  "PipelineExecutionStatus": "Succeeded",
  "PipelineExperimentConfig": {
    "ExperimentName": "<pipeline_name>",
    "TrialName": "<execution_id>"
  },
  "CreationTime": "2022-08-10T11:03:47.406000+00:00",
  "LastModifiedTime": "2022-08-10T11:15:01.102000+00:00",
  "CreatedBy": {},
  "LastModifiedBy": {}
}

StartPipelineExecution

This API call starts a pipeline run. Run the following command, replacing $SHARED_PIPELINE_ARN with your pipeline ARN and $CLIENT_REQUEST_TOKEN with a unique, case-sensitive identifier that you generate for this run. The identifier should be between 32 and 128 characters long. For instance, you can generate a string using the AWS CLI kms generate-random command.

aws sagemaker start-pipeline-execution \
  --pipeline-name $SHARED_PIPELINE_ARN \
  --client-request-token $CLIENT_REQUEST_TOKEN

As a response, this API call returns the PipelineExecutionArn of the started run:

{
  "PipelineExecutionArn": "arn:aws:sagemaker:<region>:<account_id>:pipeline/<pipeline_name>/execution/<execution_id>"
}
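If you start runs from a script rather than the CLI, the client request token can also be generated locally. The sketch below uses a random UUID, which is a different technique from the kms generate-random command mentioned above; its 36-character string form falls within the 32 to 128 character range. The helper names and display name are hypothetical.

```python
# Sketch: build a StartPipelineExecution request with a locally generated
# client request token (a random UUID, 36 characters long).
import uuid


def new_client_request_token():
    """Return a unique, case-sensitive token of valid length."""
    return str(uuid.uuid4())


def build_start_request(pipeline_arn, display_name=None):
    """Return keyword arguments for StartPipelineExecution."""
    request = {
        # For cross-account calls, the ARN goes in the name field.
        "PipelineName": pipeline_arn,
        "ClientRequestToken": new_client_request_token(),
    }
    if display_name:
        request["PipelineExecutionDisplayName"] = display_name
    return request


start_request = build_start_request(
    "arn:aws:sagemaker:us-east-1:111122223333:pipeline/serial-inference-pipeline",
    display_name="cross-account-run",
)
print(len(start_request["ClientRequestToken"]))

# With boto3 installed and the extended permission shared with your account,
# the request could be sent like this (not executed here):
#   import boto3
#   boto3.client("sagemaker").start_pipeline_execution(**start_request)
```

Generate the token once and reuse it when retrying the same request, so that a retry is not treated as a request for a second run.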

StopPipelineExecution

This API call stops a pipeline run. Run the following command, replacing $PIPELINE_EXECUTION_ARN with the pipeline run ARN of your running pipeline and $CLIENT_REQUEST_TOKEN with a unique, case-sensitive identifier that you generate for this run. The identifier should be between 32 and 128 characters long. For instance, you can generate a string using the AWS CLI kms generate-random command.

aws sagemaker stop-pipeline-execution \
  --pipeline-execution-arn $PIPELINE_EXECUTION_ARN \
  --client-request-token $CLIENT_REQUEST_TOKEN

As a response, this API call returns the PipelineExecutionArn of the stopped pipeline:

{
  "PipelineExecutionArn": "arn:aws:sagemaker:<region>:<account_id>:pipeline/<pipeline_name>/execution/<execution_id>"
}

Conclusion

Cross-account sharing of SageMaker pipelines allows you to securely share pipeline entities across AWS accounts and access shared pipelines through direct API calls, without having to log in and out of multiple accounts.

In this post, we dove into the functionality to show how you can share pipelines across accounts and access them via SageMaker API calls.

As a next step, you can use this feature for your next ML project.

Resources

To get started with SageMaker Pipelines and sharing pipelines across accounts, refer to the following resources:


About the authors

Ram Vittal is an ML Specialist Solutions Architect at AWS. He has over 20 years of experience architecting and building distributed, hybrid, and cloud applications. He is passionate about building secure and scalable AI/ML and big data solutions to help enterprise customers with their cloud adoption and optimization journey to improve their business outcomes. In his spare time, he enjoys tennis, photography, and action movies.

Maira Ladeira Tanke is an ML Specialist Solutions Architect at AWS. With a background in data science, she has 9 years of experience architecting and building ML applications with customers across industries. As a technical lead, she helps customers accelerate their achievement of business value through emerging technologies and innovative solutions. In her free time, Maira enjoys traveling and spending time with her family someplace warm.

Gabriel Zylka is a Professional Services Consultant at AWS. He works closely with customers to accelerate their cloud adoption journey. Specialized in the MLOps domain, he focuses on productionizing machine learning workloads by automating end-to-end machine learning lifecycles and helping achieve desired business outcomes. In his spare time, he enjoys traveling and hiking in the Bavarian Alps.
