A bit more than a year ago, I created and shared a GitHub repository containing a demo on customer churn prediction in Azure Machine Learning (you can find the blog article here). Since then, I have worked on a more advanced end-to-end template repository that demonstrates model development with PyTorch as well as MLOps in the broader Azure ecosystem. The purpose of this template repository is to serve as a blueprint and accelerator for MLOps solutions in Azure. It leverages Azure services such as Azure DevOps, Azure Machine Learning and Azure Kubernetes Service.
I am happy to share this template repository today. Some parts of it are still a work in progress, but in the coming weeks I will write a 10-part series of blog articles on what I learned while creating the template repository and explain how to use every part of it. With this series, I hope to contribute to a better understanding of MLOps in Azure in general, and also to empower people to use this repository for its intended purpose as a blueprint.
So stay tuned if you are interested in ML, MLOps and/or Azure! You can expect the following aspects to be covered in the template repository and the series of blog articles:
- Creating Azure Machine Learning (AML) infrastructure via Terraform, following Infrastructure as Code (IaC) principles
- Creating a Conda development environment and adding it as a Jupyter kernel as well as creating Azure Machine Learning environments for model development, training and deployment
- Downloading public data and uploading it to Azure Blob Storage connected to the Azure Machine Learning workspace
- Training a Custom Vision model with Azure Cognitive Services to serve as a benchmark model
- Training a PyTorch model with transfer learning on Azure Machine Learning
- Evaluating the trained models leveraging Azure Machine Learning capabilities
- Deploying the trained model to different compute targets in Azure, such as Azure Machine Learning Compute Instance, Azure Container Instances and Azure Kubernetes Service
- Creating a Flask frontend for model serving and deploying it to Azure App Service
- Creating Azure Machine Learning pipelines for trigger-based model training, evaluation, and registration
- Building CI/CD pipelines in Azure DevOps for unit and integration testing, automated model training, automated model deployment, building and pushing images to an Azure Container Registry, and automated deployment of the Flask frontend
- Running CI/CD pipelines within a Docker container on Azure Pipelines agents
Check out the GitHub repository here.