How to accomplish Machine Learning Operations (MLOps) in SageMaker

As data science practices mature, the need to develop complex ML models and deploy them efficiently keeps growing. If a data scientist cannot productionize the ML models they build, huge opportunities are lost in terms of the cost of decision-making.

MLOps was introduced as a best practice to overcome these problems. This blog walks you through how to achieve MLOps with AWS SageMaker.

MLOps in SageMaker can be achieved through the console as well as through the SageMaker API; we make the API calls from a Jupyter notebook. This blog explains both modes. A link to the GitHub repository for implementing MLOps for starters is below:

https://github.com/Ideas2IT/datascience-lab

Alright! Now let’s get started.

MLOps has several distinct phases in its process.

Now let’s look into what each step in the MLOps process entails:

1. Model Building Phase:
In this phase, data scientists build ML models for the business problem at hand, with varying hyperparameters. Typically, a data scientist creates the models on a cloud or on-premises compute instance using Jupyter Notebook or other machine-learning tools. Algorithms can be written with the cloud platform’s built-in ML libraries or with language-level libraries such as scikit-learn in Python.

2. Model Selection Phase:
Having built several ML models, the data scientist selects the best one according to the chosen model evaluation metric. This is the model that will be deployed.

3. Model Deployment Phase:
Having identified the best model, we deploy it to automate the ML flow. Deployment usually means creating a Docker image of the model when external ML libraries (scikit-learn, TensorFlow) are used, or reusing the cloud-native algorithm’s prebuilt image when they are not, and then creating an endpoint on a cloud platform such as AWS or Azure.

For this write-up, we have adapted the sample Jupyter notebook that AWS shares with the developer community (https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_applying_machine_learning/video_game_sales/video-game-sales-xgboost.ipynb), modifying its code to fit this blog’s objective. We have retained the dataset used in the notebook.

We will use the Video Game Sales dataset hosted on Kaggle – https://www.kaggle.com/rush4ratio/video-game-sales-with-ratings

MLOps using Jupyter Notebook:

The MLOps process inside SageMaker is outlined briefly below, with code snippets for your understanding:

1. Log in to AWS with your credentials, open the SageMaker environment, and create a Jupyter notebook instance.

2. Create an S3 bucket with a relevant name and upload into it the CSV file downloaded from the Kaggle page mentioned above.
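Since the original screenshots are not reproduced here, a minimal sketch of the staging step follows. The bucket name and object key are hypothetical placeholders, and the actual upload call is commented out because it requires AWS credentials:

```python
bucket = "my-mlops-demo-bucket"                  # placeholder bucket name
key = "video-game-sales/video_games_sales.csv"   # placeholder object key

# The S3 URI we will hand to SageMaker later on
s3_uri = f"s3://{bucket}/{key}"
print(s3_uri)

# On a SageMaker notebook instance the upload itself would be:
# import boto3
# boto3.client("s3").upload_file("video_games_sales.csv", bucket, key)
```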

3. Open the Jupyter notebook and load the required packages.
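The package-loading cell can be sketched as below. The exact imports in the original notebook may differ; the AWS-specific imports are guarded so the snippet also runs outside a SageMaker environment:

```python
# Data-handling libraries
import os

import numpy as np
import pandas as pd

# AWS-specific libraries, preinstalled on SageMaker notebook instances.
# Guarded so the cell still runs on a local machine without them.
try:
    import boto3
    import sagemaker
    from sagemaker import get_execution_role
except ImportError:  # running outside SageMaker
    boto3 = sagemaker = get_execution_role = None
```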

4. Load the dataset and check its shape.

There are 16,719 records and 16 variables in the dataset on which we will run an ML model.

The problem statement is to find out whether a particular game will become a hit. For demonstration purposes, we label video games with more than 2 million units in global sales as hits, and those that sold fewer than 2 million units as non-hits.

The outcome is now a binary target: hit or not a hit.
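The labeling rule above can be sketched as follows; the Global_Sales column (in millions of units) comes from the Kaggle dataset:

```python
import pandas as pd

def label_hits(df, threshold=2.0):
    """Add a binary target: 1 if Global_Sales (millions of units)
    exceeds the threshold, else 0."""
    out = df.copy()
    out["y"] = (out["Global_Sales"] > threshold).astype(int)
    return out
```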

5. Pre-process the target variable and create dummy variables for the categorical features before the data is fed into the model.

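A sketch of the encoding step using pandas get_dummies; the sample column names in the test are illustrative:

```python
import pandas as pd

def encode_categoricals(df, target="y"):
    """One-hot encode the categorical feature columns, keeping the
    target column unchanged and in front (as XGBoost expects)."""
    features = df.drop(columns=[target])
    encoded = pd.get_dummies(features)  # expands object/category columns
    return pd.concat([df[target], encoded], axis=1)
```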
6. Split the dataset into 70% training and 20% testing splits, with the remaining 10% held out for validation.
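The split can be sketched as below; the seed value is arbitrary:

```python
import pandas as pd

def split_dataset(df, seed=1729):
    """Shuffle, then split 70% train / 20% test / 10% validation."""
    shuffled = df.sample(frac=1, random_state=seed).reset_index(drop=True)
    n = len(shuffled)
    train = shuffled.iloc[: int(0.7 * n)]
    test = shuffled.iloc[int(0.7 * n): int(0.9 * n)]
    validation = shuffled.iloc[int(0.9 * n):]
    return train, test, validation
```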

7. Build XGBoost models using SageMaker’s built-in algorithm, varying the maximum-depth hyperparameter over the values 3, 4, and 5. We create these models with different hyperparameters for the model selection phase.

One training job is created per maximum-depth value: first with max depth 3, then the same again with max depths 4 and 5.
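The training step can be sketched as follows. Only the hyperparameter dictionaries run locally; the Estimator calls are commented out because they assume a SageMaker execution role, the built-in XGBoost container image, and data already staged in S3. The hyperparameter values other than max_depth are illustrative:

```python
def make_hyperparameters(max_depth):
    """Hyperparameters for one built-in XGBoost training job.
    SageMaker expects all values as strings."""
    return {
        "max_depth": str(max_depth),
        "objective": "binary:logistic",  # hit vs. not-a-hit
        "eval_metric": "auc",
        "num_round": "100",
    }

# One configuration per training job
hyperparameter_sets = [make_hyperparameters(d) for d in (3, 4, 5)]

# On a SageMaker notebook instance, each set would feed an Estimator:
# from sagemaker.estimator import Estimator
# est = Estimator(xgboost_image_uri, role, instance_count=1,
#                 instance_type="ml.m4.xlarge")
# est.set_hyperparameters(**make_hyperparameters(3))
# est.fit({"train": train_s3_uri, "validation": validation_s3_uri})
```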

8. This step identifies the best model in terms of the model evaluation metric – the AUC value. We make use of SageMaker’s Search API in the Jupyter notebook.

The Search API lets us query the completed training jobs together with their evaluation metrics and load the results into a pandas dataframe.
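A sketch of the Search API request. The filter value "sample" is the hypothetical training-job name prefix, and the boto3 call is commented out since it needs AWS credentials:

```python
# Request parameters for SageMaker's Search API: find training jobs
# whose name contains "sample", sorted by validation AUC.
search_params = {
    "Resource": "TrainingJob",
    "SearchExpression": {
        "Filters": [
            {"Name": "TrainingJobName", "Operator": "Contains", "Value": "sample"}
        ]
    },
    "SortBy": "Metrics.validation:auc",
    "SortOrder": "Descending",
}

# On a SageMaker notebook instance:
# import boto3
# results = boto3.client("sagemaker").search(**search_params)
```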

Now let’s look at which variant of the XGBoost model should be deployed:

From the resulting dataframe, we see that the XGBoost model with max depth = 5 has the highest validation AUC and training AUC. Hence, this is the model we will deploy in SageMaker.
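The selection logic itself is simple; here it is sketched against mocked search results (the job names and AUC values are illustrative, not actual results):

```python
def best_by_validation_auc(jobs):
    """Pick the training job with the highest validation AUC."""
    return max(jobs, key=lambda j: j["validation:auc"])

# Mocked, illustrative search results
jobs = [
    {"TrainingJobName": "sample-xgb-depth-3", "validation:auc": 0.82},
    {"TrainingJobName": "sample-xgb-depth-4", "validation:auc": 0.85},
    {"TrainingJobName": "sample-xgb-depth-5", "validation:auc": 0.88},
]
best = best_by_validation_auc(jobs)
```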

9. Deploy the selected model in SageMaker.

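A sketch of the deployment step. The endpoint name is hypothetical, and the deploy/describe calls are commented out because they assume the sagemaker SDK, an estimator object from the winning training job, and AWS credentials:

```python
# Deployment settings for the max_depth=5 model
deploy_config = {
    "initial_instance_count": 1,
    "instance_type": "ml.m4.xlarge",
    "endpoint_name": "videogames-xgb-max-depth-5",  # hypothetical name
}

# On a SageMaker notebook instance, with `estimator` being the winning job:
# predictor = estimator.deploy(**deploy_config)
#
# To confirm the endpoint from code rather than the console:
# import boto3
# status = boto3.client("sagemaker").describe_endpoint(
#     EndpointName=deploy_config["endpoint_name"])["EndpointStatus"]
```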
10. To validate that an endpoint was created, check it in the SageMaker console.

The endpoint is thus created and validated.

MLOps using the Console:

So far, we have seen how to deploy an ML model from a Jupyter notebook. We shall now see how the MLOps process is done through the console.

We repeat steps 1–7 from the earlier section. The console-driven deployment then proceeds as follows:

Navigate to the Search page and filter the search criteria.

Note that the filter value “sample” refers to the name of the training job for the video-games dataset.

The search returns the trained models and the metrics associated with them.

Based on these model metrics, we will deploy the training job with the highest AUC value; hence the model deployed is the XGBoost model with a maximum depth of 5.

Now that we have identified the training job, we create a model package from it so that it can be deployed.

We then need to create an endpoint to deploy the model.

Navigate to the Endpoints page and click “Create endpoint” to initiate the process.

On this page, enter an endpoint name, fill in the endpoint configuration details, and choose the model to be deployed.

Once the endpoint is created, a success message pops up confirming it.

Hurray! We have deployed our model into production. But what’s the fun in only us doing it? We want you to implement it yourself.

Happy Learning and Happy Coding! Stay tuned for more such hands-on ML Implementation Blogs.