Introduction
This is the fourth part of a 10-part blog article series about my Azure MLOps Template repository. For an overview of the whole series, read this article. I will present each of the 10 parts as a separate tutorial. This will allow an interested reader to easily replicate the setup of the template.
In the first part, we have seen how to provision our Azure infrastructure with the Infrastructure as Code (IaC) open source tool called Terraform. In the second part, we have covered how to set up reliable, reproducible and easily maintainable Python environments using the popular open source package management system Conda as well as the Azure Machine Learning (AML) Environments feature. In the third part, we have seen how to store and access data using AML. In this fourth part, we will train a benchmark model with Azure Custom Vision which we can later use to assess the performance of our custom Python model.
Prerequisites to follow along:
- You have an Azure Account
- You either have a Pay-As-You-Go subscription (in which case you will incur cost) or you have some free Azure credits
- You have followed along in the first part, the second part and the third part of the blog article series
Azure Custom Vision Overview
An important step in any model development and training process is to build a quick and simple benchmark model. This benchmark model can then be used to assess how other models are performing. Without a benchmark model, we would not really know what accuracy to strive for and what we can consider to be a good model. In order to build a benchmark model, we will use the Azure Custom Vision service.
In simple terms, the Azure Custom Vision service is a black-box AI offering from Microsoft that can be used to train image classification models on custom data without any Data Science knowledge. The trained models can then be published as endpoints. The only task for us (as AI developers) is to upload and label our data. With the click of a button we can then train a model. The model architecture and other training implementation aspects are automatically taken care of by the service backend.
While this allows us to build models rapidly, it of course comes with the downside that customization is limited and the developer has no possibility to fine-tune the model. As we will see in the upcoming blog articles, we can heavily outperform a trained Custom Vision model with our own custom Python model. Nonetheless, the Azure Custom Vision service is very well suited to building a first benchmark model and can be very valuable for Proofs of Concept or in situations where models are built by Citizen Data Scientists.
To access the Custom Vision GUI, go to the Custom Vision website and log in with your Azure AD account.
To use the Azure Custom Vision service, two Azure resources need to be created: a Custom Vision Training resource and a Custom Vision Prediction resource.
More information on Azure Custom Vision can be found in the documentation.
1. Start your AML Compute Instance & Set up the Environment Variables
In previous articles, we have provisioned an AML workspace and an AML compute instance with Terraform. Together with all other required Azure infrastructure (such as the Azure Custom Vision resources), we have also provisioned an Azure DevOps project and cloned the Azure MLOps Template Repository to this project. We have then manually cloned this repository from the DevOps project to the AML compute instance. Now, we simply make use of the AML compute instance and our cloned repository.
Log in to the Azure portal, search for the resource group “mlopstemplaterg”, and then click on and launch the Azure Machine Learning workspace:
Alternatively, you can also directly navigate to the Azure Machine Learning website and select the provisioned AML workspace.
In the AML workspace, navigate to the “Compute” tab and start your AML compute instance.
In order to run the Custom Vision model notebook, we will make use of the .env file for the first time (we will leverage this file again in future blog articles). The .env file is located in the template root directory and contains all environment variables. The environment variables from the .env file can be easily loaded into memory with the Python dotenv library. For security purposes, the .env file is excluded from git via the .gitignore file. Instead, there is a .env.example file checked in to the Azure MLOps Template repository.
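As a minimal illustration of how loading variables with the dotenv library works (the variable name below is an assumed example, not necessarily one from the template's .env file):

```python
import os

from dotenv import load_dotenv

# Read the .env file from the repository root and add its variables to os.environ
load_dotenv()

# The variables can then be accessed like any other environment variable
# (CUSTOM_VISION_ENDPOINT is an assumed name used for illustration)
endpoint = os.environ["CUSTOM_VISION_ENDPOINT"]
```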
We will have to rename this .env.example file to .env. For this purpose, open VS Code on your AML compute instance:
Then navigate to the directory to which you have cloned the repository and rename the file. In my case, I created a copy of the .env.example file and renamed the copy so as not to lose the original:
Note: All files that start with “.” are hidden files, so make sure that your VS Code settings enable visibility of hidden files.
After renaming the .env.example file to .env, we need to insert a couple of environment variables into the file. All environment variables required for this part are listed under the “Custom Vision Variables” heading.
Navigate to Azure Custom Vision, log in with your Azure AD account and click on “Settings” in the top right. Expand the resources to find all configuration values to insert into the .env file:
Now copy and paste the “Custom Vision Endpoint”, “Custom Vision Prediction Key”, “Custom Vision Prediction Resource ID” and “Custom Vision Training Key” into the .env file.
You can choose the Project Name and Publish Iteration Name. I will use “stanford_dogs” as Project Name and “iteration_1” as Publish Iteration Name.
2. Run the Custom Vision Model Notebook
Now it is time to run the 02_custom_vision_benchmark_model.ipynb notebook. As we have done previously with other notebooks, we open Jupyter or JupyterLab on our AML compute instance and choose our development environment Jupyter kernel for code execution. Now follow the instructions from the notebook’s Markdown cells step by step. In the rest of this article, I will give supporting explanations and additional information alongside the instructions of the Markdown cells. Feel free to skip reading through it if you prefer to go through the notebook yourself.
2.1 Custom Vision Model Notebook Overview
In the 02_custom_vision_benchmark_model.ipynb notebook, I will first show how to set up Custom Vision so that we can train a benchmark model. We will then upload our training data to the Custom Vision service and train a code-free model with it. To conclude this part of the blog article series, we will use the trained model to evaluate the performance on the test set. This will serve as our performance baseline for future development.
2.2 Setting Up Custom Vision
I have created an EnvVariables class in the env_variables.py file in the src/utils directory. By creating an object of this class, we can easily retrieve and use all environment variables from the .env file in the notebook.
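The exact implementation in the template may differ, but a minimal sketch of such a class could look like the following; the attribute and environment variable names are assumptions based on the .env settings described above, so check .env.example for the exact names:

```python
import os

from dotenv import load_dotenv


class EnvVariables:
    """Loads the variables from the .env file and exposes them as attributes."""

    def __init__(self):
        load_dotenv()  # read the .env file in the repository root into os.environ
        # Custom Vision variables (names are assumptions for illustration)
        self.custom_vision_endpoint = os.environ["CUSTOM_VISION_ENDPOINT"]
        self.custom_vision_training_key = os.environ["CUSTOM_VISION_TRAINING_KEY"]
        self.custom_vision_prediction_key = os.environ["CUSTOM_VISION_PREDICTION_KEY"]
        self.custom_vision_prediction_resource_id = os.environ["CUSTOM_VISION_PREDICTION_RESOURCE_ID"]
        self.custom_vision_project_name = os.environ["CUSTOM_VISION_PROJECT_NAME"]
        self.custom_vision_publish_iteration_name = os.environ["CUSTOM_VISION_PUBLISH_ITERATION_NAME"]
```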
Using the Custom Vision environment variables from the .env file, we then create a Custom Vision training and prediction client as well as a Custom Vision project. The Custom Vision project is then visible in the Custom Vision GUI:
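With the Custom Vision SDK for Python (azure-cognitiveservices-vision-customvision), this roughly looks as follows; the env attribute names come from the EnvVariables sketch above and are assumptions:

```python
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
from msrest.authentication import ApiKeyCredentials

env = EnvVariables()

# Training client, authenticated with the training key
training_credentials = ApiKeyCredentials(in_headers={"Training-key": env.custom_vision_training_key})
trainer = CustomVisionTrainingClient(env.custom_vision_endpoint, training_credentials)

# Prediction client, authenticated with the prediction key
prediction_credentials = ApiKeyCredentials(in_headers={"Prediction-key": env.custom_vision_prediction_key})
predictor = CustomVisionPredictionClient(env.custom_vision_endpoint, prediction_credentials)

# Create the Custom Vision project that will hold the images and trained iterations
project = trainer.create_project(env.custom_vision_project_name)
```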
2.3 Uploading Data
In the previous article of the Azure MLOps Template blog article series, we downloaded the Stanford Dogs dataset from the Internet to our AML file share. We will now upload the training data from the AML file share to the Custom Vision project. The upload is done in batches, with a maximum supported batch size of 64 images. We use the folder names to assign the images their labels (called tags in the Custom Vision service).
From a technical point of view, the create_images_from_files() method of the Custom Vision training client as well as the ImageFileCreateEntry and ImageFileCreateBatch classes are used to upload the images in batches to the Custom Vision project.
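Continuing the sketch above, the batch upload could look roughly like this; the local directory layout (one subfolder per breed under an assumed data/train path) is an assumption for illustration:

```python
from pathlib import Path

from azure.cognitiveservices.vision.customvision.training.models import (
    ImageFileCreateBatch,
    ImageFileCreateEntry,
)

BATCH_SIZE = 64  # maximum number of images per upload batch supported by the service

# Assumed layout: one subfolder per dog breed containing that breed's training images
train_dir = Path("data/train")

for breed_dir in sorted(train_dir.iterdir()):
    # Create a tag named after the folder, e.g. "n02085620-Chihuahua"
    tag = trainer.create_tag(project.id, breed_dir.name)

    # Build one ImageFileCreateEntry per image, tagged with the breed
    entries = []
    for image_path in sorted(breed_dir.glob("*.jpg")):
        with open(image_path, "rb") as image_file:
            entries.append(
                ImageFileCreateEntry(name=image_path.name, contents=image_file.read(), tag_ids=[tag.id])
            )

    # Upload the images of this breed in batches of at most 64
    for i in range(0, len(entries), BATCH_SIZE):
        batch = ImageFileCreateBatch(images=entries[i : i + BATCH_SIZE])
        upload_result = trainer.create_images_from_files(project.id, batch)
        if not upload_result.is_batch_successful:
            print(f"Upload of batch starting at index {i} for tag {tag.name} failed")
```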
After running the respective notebook cell, we can observe all images in our “stanford_dogs” Custom Vision project in the Custom Vision GUI.
2.4 Training the Model
We are now ready to train the benchmark model. This can be done with the train_project() method of the Custom Vision training client. We only have to specify the training type and the reserved budget in hours. More information on this method can be found in the documentation.
Every 30 seconds we check the status of the training to see when the training process is finished.
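A sketch of the training call and polling loop, assuming the trainer and project objects from the previous snippets; the training type and budget values are illustrative:

```python
import time

# Kick off a training run; "Advanced" training with a one-hour budget is an illustrative choice
iteration = trainer.train_project(
    project.id,
    training_type="Advanced",
    reserved_budget_in_hours=1,
)

# Poll the training status every 30 seconds until the iteration has finished training
while iteration.status != "Completed":
    print("Training status:", iteration.status)
    time.sleep(30)
    iteration = trainer.get_iteration(project.id, iteration.id)

print("Training finished with status:", iteration.status)
```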
2.5 Publishing the Endpoint
After the benchmark model training is complete, we can publish the trained model iteration to our Custom Vision prediction resource. This can be done with the publish_iteration() method of the Custom Vision training client. Having published the model, we can easily do inference with the model by calling the model endpoint of the Custom Vision prediction resource.
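Continuing the sketch, publishing the trained iteration could look like this; the publish name and prediction resource ID are read from the assumed EnvVariables attributes:

```python
# Publish the trained iteration to the Custom Vision prediction resource so that it
# can be called as an endpoint under the chosen publish name (e.g. "iteration_1")
trainer.publish_iteration(
    project.id,
    iteration.id,
    env.custom_vision_publish_iteration_name,
    env.custom_vision_prediction_resource_id,
)
```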
2.6 Evaluating the Model
The most important part of having trained a benchmark model is to evaluate how well the model performs. This will serve as a baseline which we try to improve upon by training and fine-tuning a custom Python model in a subsequent notebook and blog article.
For the purpose of model evaluation, we will calculate the accuracy on the test set. Accuracy is an appropriate metric in this case since our target classes are not imbalanced and making good predictions is equally important for all 120 classes (dog breeds). If that were not the case, we would have to look into alternative evaluation metrics for multiclass classification problems such as recall, precision or the F1-score.
Note: The test set accuracy should never be used for model selection, since you would then get a biased estimate of your model’s performance that does not generalize to unseen data. In the next blog article of the series, we will use the validation set for model selection. The test set performance is only used after the model is completely tuned, to get an understanding of the general performance on new, unseen data. We will see a large difference between the performance of a custom-trained model and that of the “black-box” model trained with Custom Vision.
From a technical point of view, we use the classify_image() method of the Custom Vision prediction client to send the binary contents of the images one by one to the model endpoint and retrieve a prediction. We then compare the predicted label with the ground truth label and store the result of the comparison in a list. After we have sent all images of the test set to the endpoint, we calculate the accuracy.
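A sketch of the evaluation loop, assuming the predictor object from above and a local data/test directory with one subfolder per breed (both assumptions):

```python
from pathlib import Path

# Assumed layout: one subfolder per dog breed containing that breed's test images
test_dir = Path("data/test")

results = []
for breed_dir in sorted(test_dir.iterdir()):
    for image_path in sorted(breed_dir.glob("*.jpg")):
        with open(image_path, "rb") as image_file:
            # Send the binary image contents to the published model endpoint
            response = predictor.classify_image(
                project.id,
                env.custom_vision_publish_iteration_name,
                image_file.read(),
            )
        # Take the tag with the highest predicted probability as the predicted label
        predicted_label = max(response.predictions, key=lambda p: p.probability).tag_name
        # The folder name serves as the ground truth label
        results.append(predicted_label == breed_dir.name)

accuracy = sum(results) / len(results)
print(f"Test set accuracy: {accuracy:.4f}")
```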
You should achieve a test set accuracy of around 80% with this benchmark model. This is the accuracy that we try to beat by developing a custom model with Python in AML.