Getting Started with AWS Sagemaker for Image Classification using TensorFlow
Applicable Products
- Firefly-DL
Application Note Description
Amazon Sagemaker is a fully managed service for building, training, and deploying machine learning models, including models that can be deployed on Teledyne FLIR Firefly-DL cameras. Sagemaker removes the heavy lifting from each step of the machine learning process to make it easier to develop high-quality models. This application note provides the following:
- Links to Amazon Web Services (AWS) resources on how to set up an AWS account and request resources (e.g., EC2).
- Information on Sagemaker and how to set up a notebook instance.
- Information on S3 storage and how to create a storage bucket.
- A working code example that uses a Jupyter notebook to train an image classification model.
AWS Account Setup
To create an account:
1. Navigate to the Amazon AWS website: https://aws.amazon.com
2. If you don't already have one, create an AWS account. You can find some instructions on the account setup process here.
3. Sign in to your account console using the Sign In to the Console button.
AWS Services
All new AWS accounts are allocated several cloud compute resources by default. These resources include images, instances, volumes, and snapshots. When you create your AWS account, AWS sets default quotas (also referred to as limits) on these resources on a per-region basis. You can find a full list of the resources available to you by navigating to the user profile menu and selecting My Service Quotas. More information regarding AWS EC2 resources can be found here.
To request AWS services:
1. Under AWS services, search for Amazon Elastic Compute Cloud (Amazon EC2).
This brings you to a list of EC2 cloud compute services available in your region. You can also find your account specific applied quota for each service.
2. Search for "running dedicated P2 host service", which gives you access to an accelerated compute instance (a dedicated GPU machine which can speed up your training job). You can also find more information about other types of instances here.
3. By default, the applied quota for a dedicated P2 host service is zero. Submit a request to AWS support to increase your dedicated P2 host quota. You need a minimum of one dedicated P2 host instance to complete the training job given in the example code. You can find more information on how to request a limit increase here.
Note: the resource limit increase may take several days before it is applied.
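The applied quota can also be checked from a script rather than the console. The following is a minimal sketch, assuming the boto3 package is installed and AWS credentials are configured. It uses the Service Quotas API call list_service_quotas; the helper name find_ec2_quotas and the keyword "P2" are illustrative assumptions, and the exact quota name in your account and region may differ.

```python
def find_ec2_quotas(client, keyword):
    """Return (name, value) pairs for EC2 quotas whose name contains keyword.

    `client` is expected to behave like boto3.client("service-quotas").
    The match is case-insensitive.
    """
    matches = []
    kwargs = {"ServiceCode": "ec2"}
    while True:
        page = client.list_service_quotas(**kwargs)
        for quota in page.get("Quotas", []):
            if keyword.lower() in quota["QuotaName"].lower():
                matches.append((quota["QuotaName"], quota["Value"]))
        # Results are paginated; follow NextToken until it is absent.
        token = page.get("NextToken")
        if not token:
            return matches
        kwargs["NextToken"] = token


def main():
    # Assumption: boto3 is installed and credentials/region are configured.
    import boto3

    client = boto3.client("service-quotas")
    for name, value in find_ec2_quotas(client, "P2"):
        print(f"{name}: {value}")
```

Calling main() prints each matching quota with its applied value for the configured region. A value of zero for the dedicated P2 host quota means a limit increase still needs to be requested before the training job can run.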
AWS Identity and Access Management (IAM)
The IAM service controls who is authenticated (signed in) and authorized (has permissions) to use resources. When you first create an AWS account, you begin with a single sign-in identity that has complete access to all AWS services and resources in the account. However, if you're not the root user (for example, in an organization's AWS account), the admin user may have limited which services you can access. Check with your AWS administrator for more details. To complete the notebook example, you may need to attach specific IAM policies to the role used by your notebook instance.
1. In the search bar, enter "roles" and select the IAM service.
2. Select Roles.
3. A list of current roles is displayed in the main window. Select the role that was created or selected to run the notebook instance.
4. Attach the following policies to your role.
- AmazonSageMakerFullAccess
- AmazonEC2ContainerRegistryFullAccess
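The same two policies can be attached from a script instead of the console. The following is a minimal sketch, assuming boto3 is installed and that your user is permitted to modify the role. It relies on the IAM API call attach_role_policy and the standard ARN form for AWS managed policies; the helper names and the role name in the usage comment are hypothetical.

```python
# AWS managed policies named in the steps above.
POLICY_NAMES = [
    "AmazonSageMakerFullAccess",
    "AmazonEC2ContainerRegistryFullAccess",
]


def managed_policy_arn(name):
    """Build the ARN of an AWS managed policy from its name."""
    return f"arn:aws:iam::aws:policy/{name}"


def attach_policies(iam_client, role_name, policy_names=POLICY_NAMES):
    """Attach each managed policy to the role and return the ARNs used.

    `iam_client` is expected to behave like boto3.client("iam").
    """
    arns = [managed_policy_arn(name) for name in policy_names]
    for arn in arns:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    return arns


# Usage (hypothetical role name; requires boto3 and credentials):
#   import boto3
#   attach_policies(boto3.client("iam"), "MySagemakerNotebookRole")
```

Attaching a policy that is already attached to the role should not cause a problem, so the sketch can be re-run safely.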
AWS Sagemaker
Amazon Sagemaker is a fully managed machine learning service. It provides an integrated Jupyter notebook instance for ease of access to your data sources for exploration and analysis. It also provides common machine learning algorithms that are optimized to run efficiently against extremely large data in a distributed environment.
To set up your Sagemaker notebook:
1. From the AWS Management Console, under Find Services, launch Amazon Sagemaker.
2. On the Notebook instances tab, click Create notebook instance.
3. Enter a name for your notebook instance.
4. Under Permissions and encryption, create a new IAM role and select Any S3 bucket (or enter your Specific S3 bucket) to allow the notebook instance access to your S3 bucket. Keep the other default settings.
5. Under Git repositories, select the Clone a public Git repository to this notebook instance only option and enter the following repository address: https://github.com/FLIR/amazon-sagemaker-firefly-dl
6. Click Create notebook instance to initialize and create a new instance. It might take several minutes for the instance to initialize.
7. Click Open JupyterLab to start your Jupyter notebook.
You should have the following files and folders uploaded to your notebook instance.
You can now launch the Jupyter notebook example (example_classification_tensorflow_Sagemaker.ipynb) inside your newly created instance, and run through the Python code for training a model.
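For reference, the console steps above can be approximated in a script. The following is a minimal sketch, assuming boto3 is installed and an IAM role with the required policies already exists. It uses the SageMaker API call create_notebook_instance with its DefaultCodeRepository parameter to clone the repository. The helper names, the instance type ml.t2.medium, and the role ARN in the usage comment are assumptions for illustration, not values taken from this application note.

```python
REPO_URL = "https://github.com/FLIR/amazon-sagemaker-firefly-dl"


def notebook_instance_request(name, role_arn, instance_type="ml.t2.medium"):
    """Build the keyword arguments for create_notebook_instance.

    instance_type is an assumed default; choose one available to your account.
    """
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,
        "RoleArn": role_arn,
        # Public Git repository cloned into the notebook instance.
        "DefaultCodeRepository": REPO_URL,
    }


def create_notebook_instance(sm_client, name, role_arn):
    """Create the notebook instance and return its ARN.

    `sm_client` is expected to behave like boto3.client("sagemaker").
    """
    request = notebook_instance_request(name, role_arn)
    response = sm_client.create_notebook_instance(**request)
    return response["NotebookInstanceArn"]


# Usage (hypothetical name and role ARN; requires boto3 and credentials):
#   import boto3
#   create_notebook_instance(
#       boto3.client("sagemaker"),
#       "firefly-dl-example",
#       "arn:aws:iam::123456789012:role/ExampleRole",
#   )
```

As with the console, the instance takes several minutes to initialize after the call returns.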
Jupyter Notebook Example for Sagemaker
What is Jupyter Notebook?
Jupyter Notebook is an open source web application that you can use to create and share documents that contain live code, equations, visualizations, and text.
How does it work?
The notebook consists of a sequence of cells. Each cell is either a Markdown cell (text description) or a code cell (Python code). You can click a code cell to highlight it, then click Run (or press Ctrl+Enter) to run the Python code in the cell. While the cell is running, it displays "In [*]" in the left margin. Once it has finished, the * changes to a number such as "In [1]", where the number represents the order in which the cell was run relative to the other cells. The output for each cell is printed below the cell.
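The cell model described above is visible in the notebook file itself, which is stored as JSON. The following is a minimal sketch using only the Python standard library. The notebook content is a simplified, made-up example (not the Firefly-DL notebook), and summarize_cells is a hypothetical helper. Each code cell carries an execution_count, which is the number shown in "In [n]", along with its saved outputs.

```python
import json

# A simplified notebook document: one Markdown cell, one code cell that
# has been run, and one code cell that has not been run yet.
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Train a classifier"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "source": ["x = 2 + 3\n", "print(x)"],
         "outputs": [{"output_type": "stream", "name": "stdout",
                      "text": ["5\n"]}]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "source": ["print(x * 2)"], "outputs": []},
    ],
})


def summarize_cells(raw):
    """Return the margin label for each cell in a notebook JSON string."""
    notebook = json.loads(raw)
    labels = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            count = cell.get("execution_count")
            labels.append(f"In [{count}]" if count is not None else "In [ ]")
        else:
            labels.append(cell["cell_type"])
    return labels


print(summarize_cells(notebook_json))
# → ['markdown', 'In [1]', 'In [ ]']
```

Because outputs are saved in the file alongside the code, they can be displayed without the code being run again.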
Additional Information
If you restart the notebook kernel (or stop the notebook instance), the output from each previously run code cell is still displayed. However, the values held in memory are lost, so you need to re-run the code cells to restore them.