Run notebooks as batch jobs in Amazon SageMaker Studio Lab

Recently, Amazon SageMaker Studio launched an easy way to run notebooks as batch jobs that can run on a recurring schedule. Amazon SageMaker Studio Lab also supports this feature, enabling you to run notebooks that you develop in SageMaker Studio Lab in your AWS account. This lets you quickly scale your machine learning (ML) experiments with bigger datasets and more powerful instances, without having to learn anything new or change a line of code.

In this post, we walk you through the one-time prerequisite of connecting your Studio Lab environment to an AWS account. After that, we walk you through the steps to run a notebook as a batch job from Studio Lab.

Solution overview

Studio Lab incorporates the same extension as Studio, which is based on the Jupyter open-source extension for scheduled notebooks. This extension has additional AWS-specific parameters, like the compute type. In Studio Lab, a scheduled notebook is first copied to an Amazon Simple Storage Service (Amazon S3) bucket in your AWS account, then run at the scheduled time with the selected compute type. When the job is complete, the output is written to an S3 bucket, and the AWS compute is completely halted, preventing ongoing costs.

Prerequisites

To use Studio Lab notebook jobs, you need administrative access to the AWS account you’re going to connect with (or assistance from someone with this access). In the rest of this post, we assume that you’re the AWS administrator. If that’s not the case, ask your administrator or account owner to review these steps with you.

Create a SageMaker execution role

We need to ensure that the AWS account has an AWS Identity and Access Management (IAM) SageMaker execution role. This role is used by SageMaker resources within the account, and provides access from SageMaker to other resources in the AWS account. In our case, our notebook jobs run with these permissions. If SageMaker has been used previously in this account, then a role may already exist, but it may not have all the permissions required. So let’s go ahead and make a new one.

The following steps only need to be done once, regardless of how many SageMaker Studio Lab environments will access this AWS account.

  1. On the IAM console, choose Roles in the navigation pane.
  2. Choose Create role.
  3. For Trusted entity type, select AWS service.
  4. For Use cases for other AWS services, choose SageMaker.
  5. Select SageMaker – Execution.
  6. Choose Next.
  7. Review the permissions, then choose Next.
  8. For Role name, enter a name (for this post, we use sagemaker-execution-role-notebook-jobs).
  9. Choose Create role.
  10. Make a note of the role ARN.

The role ARN will be in the format of arn:aws:iam::[account-number]:role/service-role/[role-name] and is required in the Studio Lab setup.
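Because the ARN is pasted by hand into Studio Lab later, it can help to sanity-check its shape first. The following is a minimal sketch, not an official validator: the function name, the 12-digit account assumption, and the placeholder account number 123456789012 are illustrative choices, and the pattern only covers the service-role format quoted above.

```python
import re

# Pattern for the role ARN format quoted in this post:
#   arn:aws:iam::[account-number]:role/service-role/[role-name]
# Assumption: a 12-digit account number and the standard "aws" partition.
ROLE_ARN_PATTERN = re.compile(
    r"^arn:aws:iam::(?P<account>\d{12}):role/service-role/(?P<name>[\w+=,.@-]+)$"
)


def parse_role_arn(arn: str) -> dict:
    """Return the account number and role name, or raise ValueError."""
    match = ROLE_ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"Not a service-role ARN in the expected format: {arn!r}")
    return {"account": match.group("account"), "name": match.group("name")}


# Hypothetical example value; 123456789012 is a placeholder account number.
example = (
    "arn:aws:iam::123456789012:role/service-role/"
    "sagemaker-execution-role-notebook-jobs"
)
print(parse_role_arn(example))
```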

Create an IAM user

For a Studio Lab environment to access AWS, we need to create an IAM user within AWS and grant it necessary permissions. We then need to create a set of access keys for that user and provide them to the Studio Lab environment.

This step should be repeated for each SageMaker Studio Lab environment that will access this AWS account.

Note that administrators and AWS account owners should ensure that to the greatest extent possible, well-architected security practices are followed. For example, user permissions should always be scoped down, and access keys should be rotated regularly to minimize the impact of credential compromise.
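As one way to act on the rotation advice, an administrator could periodically flag access keys that have been active for too long. The sketch below is an illustration only: the helper name and the 90-day threshold are assumptions, not AWS recommendations from this post. It expects key metadata in the shape returned by the Boto3 IAM call list_access_keys (the AccessKeyMetadata list), and is written as a pure function so that it can be checked without AWS access.

```python
from datetime import datetime, timedelta, timezone


def stale_access_keys(key_metadata, max_age_days=90, now=None):
    """Return the IDs of active access keys older than max_age_days.

    key_metadata is a list of dicts with "AccessKeyId", "Status", and a
    timezone-aware "CreateDate". The 90-day default is an illustrative
    threshold chosen for this sketch.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in key_metadata
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]


# Usage with administrator credentials ("my-studio-lab-user" is a placeholder):
#   import boto3
#   iam = boto3.client("iam")
#   metadata = iam.list_access_keys(UserName="my-studio-lab-user")["AccessKeyMetadata"]
#   print(stale_access_keys(metadata))
```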

In this post, we show how to use the AmazonSageMakerFullAccess managed policy. This policy provides broad access to Amazon SageMaker that may go beyond what is required. For details about AmazonSageMakerFullAccess, see the AWS documentation for that managed policy.

Although Studio Lab employs enterprise security, it should be noted that Studio Lab user credentials don’t form part of your AWS account, and therefore, for example, are not subject to your AWS password or MFA policies.

To scope down permissions as much as possible, we create a user profile specifically for this access.

  1. On the IAM console, choose Users in the navigation pane.
  2. Choose Add users.
  3. For User name, enter a name. It’s good practice to use a name that is linked to the individual person who will be using this account; this helps when reviewing audit logs.
  4. For Select AWS access type, select Access key – Programmatic access.
  5. Choose Next: Permissions.
  6. Choose Attach existing policies directly.
  7. Search for and select AmazonSageMakerFullAccess.
  8. Search for and select AmazonEventBridgeFullAccess.
  9. Choose Next: Tags.
  10. Choose Next: Review.
  11. Confirm your policies, then choose Create user.
    The final page of the user creation process should show you the user’s access keys. Leave this tab open, because we can’t navigate back here and we need these details.
  12. Open a new browser tab in Studio Lab.
  13. On the File menu, choose New Launcher, then choose Terminal.
  14. At the command line, enter the following code:
    aws configure

  15. Enter the following details when prompted:
    1. Enter the values from the IAM console page for your access key ID and secret access key.
    2. For Default region name, enter us-west-2.
    3. Leave Default output format as text.
      (studiolab) studio-lab-user@default:~$ aws configure 
      AWS Access Key ID []: 01234567890
      AWS Secret Access Key []: ABCDEFG1234567890ABCDEFG
      Default region name []: us-west-2
      Default output format [text]: 
      
      (studiolab) studio-lab-user@default:~$

Congratulations, your Studio Lab environment should now be configured to access the AWS account. To test the connection, issue the following command:

aws sts get-caller-identity

This command should return details about the IAM user you configured.
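The same check can be made from inside a notebook with the AWS SDK for Python (Boto3), which reads the credentials stored by aws configure. The following is a minimal sketch: describe_identity is a hypothetical helper name, and the STS client is passed in as a parameter so the formatting logic can be exercised without calling AWS.

```python
def describe_identity(sts_client) -> str:
    """Summarize the caller identity reported by AWS STS.

    sts_client is expected to behave like boto3.client("sts"), whose
    get_caller_identity() response includes "Account" and "Arn".
    """
    identity = sts_client.get_caller_identity()
    return f"account={identity['Account']} arn={identity['Arn']}"


# Usage in a Studio Lab notebook (requires boto3 and configured credentials):
#   import boto3
#   print(describe_identity(boto3.client("sts")))
```

If the ARN in the output does not name the IAM user you created, rerun aws configure and check the access key values.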

Create a notebook job

Notebook jobs are created using Jupyter notebooks within Studio Lab. If your notebook runs in Studio Lab, then it can run as a notebook job (with more resources and access to AWS services). However, there are a couple of things to watch for.

If you have installed packages to get your notebook working, add commands to install these packages in a cell at the top of your notebook. Lines that start with a % symbol are IPython magic commands; %pip runs pip for the environment of the running kernel, and %%capture suppresses the cell’s output. In the following example, the first cell uses pip to install PyTorch libraries:

%%capture
%pip install torch
%pip install torchvision

Our notebook will generate a trained PyTorch model. With our regular code, we save the model to the file system in Studio Lab.

When we run this as a notebook job, we need to save the model somewhere we can access it afterwards. The easiest way to do this is to save the model in Amazon S3. We created an S3 bucket to save our models, and use another command line cell to copy the object into the bucket.

We use the AWS Command Line Interface (AWS CLI) here to copy the object. We could also use the AWS SDK for Python (Boto3) if we wanted more sophisticated or automated control of the file name. For now, we make sure to change the file name each time we run the notebook so that the models don’t get overwritten.
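For readers who prefer the Boto3 route, one way to avoid overwriting earlier models is to put a UTC timestamp in the object key. This is a sketch under assumptions rather than the code used in this post: the function names, the models/ prefix, and the bucket and file names in the usage comment are placeholders. The upload itself relies on the S3 client’s upload_file(Filename, Bucket, Key) method.

```python
from datetime import datetime, timezone
from pathlib import Path


def timestamped_key(local_path, prefix="models", now=None):
    """Build an S3 key such as models/model-20230102T030405Z.pt."""
    now = now or datetime.now(timezone.utc)
    path = Path(local_path)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix}/{path.stem}-{stamp}{path.suffix}"


def upload_model(s3_client, local_path, bucket, prefix="models"):
    """Upload local_path under a timestamped key and return that key.

    s3_client is expected to behave like boto3.client("s3").
    """
    key = timestamped_key(local_path, prefix)
    s3_client.upload_file(str(local_path), bucket, key)
    return key


# Usage in the notebook (bucket and file names are placeholders):
#   import boto3
#   upload_model(boto3.client("s3"), "model.pt", "my-model-bucket")
```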

Now we are ready to create the notebook job.

  1. Choose (right-click) the notebook name, then choose Create Notebook Job.
    If this menu option is missing, you may need to refresh your Studio Lab environment. To do this, open Terminal from the launcher and run the following code:
    conda deactivate && conda env remove --name studiolab

  2. Next, restart your JupyterLab instance by choosing Amazon SageMaker Studio Lab from the top menu, then choose Restart JupyterLab. Alternatively, go to the project page, and shut down and restart the runtime.
  3. On the Create Job page, for Compute type, choose the compute type that suits your job.

    For more information on the different types of compute capacity, including the cost, see Amazon SageMaker Pricing (choose On-Demand Pricing and navigate to the Training tab). You may also need to check the quota availability of the compute type in your AWS account. For more information about service quotas, see AWS service quotas. For this example, we’ve selected an ml.p3.2xlarge instance, which offers 8 vCPU, 61 GB of memory, and a Tesla V100 GPU.

    If there are no warnings on this page, you should be ready to go. If there are warnings, check that the correct role ARN is specified in Additional options. This role should match the ARN of the SageMaker execution role we created earlier. The ARN is in the format arn:aws:iam::[account-number]:role/service-role/[role-name].

    There are other options available within Additional options; for example, you can select a particular image and kernel that may already have the configuration you need, without having to install additional libraries.

  4. If you want to run this notebook on a schedule, select Run on a schedule and specify how often you want the job to run. We want this notebook to run once, so we select Run now.
  5. Choose Create.

Notebook jobs list

The Notebook Jobs page lists all the jobs currently running and those that ran in the past. You can find this list from the Launcher (choose File, New Launcher), then choose Notebook Jobs in the Other section.

When the notebook job is complete, you will see the status change to Completed (use the Reload option if required). You can then choose the download icon to access the output files.

When the files have downloaded, you can review the notebook along with the code output and output log. In our case, because we added code to time the run of the training cell, we can see how long the training job took—16 minutes and 21 seconds, which is much faster than if the code had run inside of Studio Lab (1 hour, 38 minutes, 55 seconds). In fact, the whole notebook ran in 1,231 seconds (just over 20 minutes) at a cost of under $1.30 (USD).
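To put the timings quoted above in perspective, the short sketch below works out the training speedup from the two durations, and shows how a run cost could be estimated from billed seconds. The estimate_cost helper and its hourly_rate_usd parameter are hypothetical; the actual rate for your instance type and Region comes from the Amazon SageMaker pricing page and is not hard-coded here.

```python
from datetime import timedelta

# Training-cell durations reported in this post.
studio_lab_run = timedelta(hours=1, minutes=38, seconds=55)
notebook_job_run = timedelta(minutes=16, seconds=21)

# Dividing one timedelta by another yields a plain float ratio.
speedup = studio_lab_run / notebook_job_run
print(f"Training ran roughly {speedup:.0f}x faster as a notebook job")


def estimate_cost(billed_seconds, hourly_rate_usd):
    """Estimate cost in USD from billed seconds and an hourly on-demand rate.

    hourly_rate_usd must be supplied by the caller from the pricing page.
    """
    return billed_seconds / 3600 * hourly_rate_usd
```

For the whole-notebook run, billed_seconds would be 1231, the figure reported above.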

We can now increase the number of epochs and adjust the hyperparameters to improve the loss value of the model, and submit another notebook job.

Conclusion

In this post, we showed how to use Studio Lab notebook jobs to scale out the code we developed in Studio Lab and run it with more resources in an AWS account.

By adding AWS credentials to our Studio Lab environment, not only can we access notebook jobs, but we can also access other resources from an AWS account right from within our Studio Lab notebooks. Take a look at the AWS SDK for Python.

This extra capability of Studio Lab lifts the limits of the kinds and sizes of projects you can achieve. Let us know what you build with this new capability!


About the authors

Mike Chambers is a Developer Advocate for AI and ML at AWS. He has spent the last 7 years helping builders to learn cloud, security and ML. Originally from the UK, Mike is a passionate tea drinker and Lego builder.

Michele Monclova is a principal product manager at AWS on the SageMaker team. She is a native New Yorker and Silicon Valley veteran. She is passionate about innovations that improve our quality of life.
