
Customer Churn Prediction with Azure ML

Introduction

Customer churn prediction is one of the most common value-generating use cases for machine learning across organizations. A machine learning model predicts how likely a particular customer is to churn, based on, for example, customer demographics and other customer characteristics. Combined with an estimate of customer lifetime value, this lets you target marketing campaigns at customers who have a high lifetime value but are also likely to churn.
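To make the targeting idea concrete, here is a minimal sketch in plain Python. It assumes you already have a churn probability and a lifetime value estimate per customer. The `Customer` class, the 0.5 threshold, and the sample values are made up for illustration and are not taken from the notebooks.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    churn_probability: float  # model output in [0, 1]
    lifetime_value: float     # estimated customer lifetime value


def select_campaign_targets(customers, min_churn_probability=0.5, top_n=2):
    """Return the at-risk customers with the highest expected value at risk."""
    # Keep only customers the model considers likely to churn.
    at_risk = [c for c in customers if c.churn_probability >= min_churn_probability]
    # Rank by expected loss: churn probability times the value that would be lost.
    at_risk.sort(key=lambda c: c.churn_probability * c.lifetime_value, reverse=True)
    return at_risk[:top_n]


# Illustrative values only.
customers = [
    Customer("A", 0.9, 100.0),
    Customer("B", 0.6, 2000.0),
    Customer("C", 0.1, 5000.0),
    Customer("D", 0.7, 800.0),
]

targets = select_campaign_targets(customers)
print([c.customer_id for c in targets])
```

Customer C has the highest lifetime value but is unlikely to churn, so it is not selected. Customer A is very likely to churn but has little value at risk, so it ranks below B and D.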

In this blog article, I will give a step-by-step walkthrough for setting up an end-to-end customer churn prediction model with Azure Machine Learning. You can find all resources in the following GitHub repository. Along the way, I will show different features of the Azure Machine Learning Studio while working on a Kaggle challenge to predict customer churn. The Kaggle challenge can be found here.

Prerequisites to follow along:

  • You have an Azure Account
  • You either have a Pay-As-You-Go subscription (in which case you will incur cost) or you have some free Azure credits
  • You have at least some foundational Azure and Data Science knowledge

Hands-on Tutorial

Step 1

Create an Azure resource group:

create-resource-group

Step 2

Create an Azure Machine Learning workspace:

create-azure-ml-workspace

Step 3

Enter your Azure Machine Learning workspace by clicking “Launch Now”:

launch-workspace

Step 4

Create a compute instance in your Azure Machine Learning workspace:

create-compute-instance

Step 5

Open Jupyter Notebooks in your compute instance:

open-jupyter

Step 6

Open the terminal, change to your user directory, and clone this repository:

enter-terminal

clone-git-repo

Step 7

Download the two data files from the “data” folder to your local machine:

download-data

Step 8

Create an Azure Data Lake Gen2. You do this by creating a storage account that has hierarchical namespace enabled:

create-data-lake-gen2-1

Enable hierarchical namespace:

create-data-lake-gen2-2

Create a container called “raw” in your Azure Data Lake Gen2:

create-container

Create two directories:

  • 2020/03/31
  • 2020/04/01

create-directory
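The two directory names follow a year/month/day layout. As a small sketch, the same names can be derived from dates in Python. The helper name `partition_directory` is hypothetical and only illustrates the naming scheme.

```python
from datetime import date


def partition_directory(snapshot_date: date) -> str:
    """Build the year/month/day directory name used in the "raw" container."""
    return snapshot_date.strftime("%Y/%m/%d")


# The two snapshot dates used in this tutorial.
snapshots = [date(2020, 3, 31), date(2020, 4, 1)]
directories = [partition_directory(d) for d in snapshots]
print(directories)
```

Zero-padded month and day values keep the directories in chronological order when they are sorted by name.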

Step 9

Upload the two data files to the respective directories in your Azure Data Lake Gen2 (according to the date):

upload-data

Step 10

Register your Azure Data Lake Gen2 as a datastore in the Azure Machine Learning workspace. Register it as a storage account datastore, not as an Azure Data Lake Gen2 datastore.

First retrieve your storage account key from the Azure Portal:

get-storage-account-key

create-datastore
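The screenshots above use the Studio UI. If you prefer to register the datastore from a notebook, the following is a rough sketch. It assumes the Azure ML Python SDK v1 (`azureml-core`) is installed and that `Workspace.from_config()` can find your workspace, which is normally the case on a compute instance. The datastore name `customer_churn_datastore` and the environment variable `STORAGE_ACCOUNT_KEY` are hypothetical choices, not names from the repository.

```python
import os


def blob_datastore_settings(account_name, account_key, container_name="raw",
                            datastore_name="customer_churn_datastore"):
    """Collect the arguments for a blob-container (storage account) datastore."""
    return {
        "datastore_name": datastore_name,  # hypothetical name
        "container_name": container_name,  # the "raw" container from Step 8
        "account_name": account_name,
        "account_key": account_key,
    }


def register_datastore(account_name):
    """Register the storage account as a datastore in the current workspace."""
    # Imported lazily so the helper above also works without the SDK installed.
    from azureml.core import Datastore, Workspace

    workspace = Workspace.from_config()
    # Read the key from an environment variable (name is an assumption)
    # instead of hard-coding it in the notebook.
    settings = blob_datastore_settings(account_name, os.environ["STORAGE_ACCOUNT_KEY"])
    return Datastore.register_azure_blob_container(workspace=workspace, **settings)


# Example call (requires Azure access, so it is left commented out):
# register_datastore("<your-storage-account-name>")
```

Keeping the account key out of the notebook avoids committing it to the cloned repository by accident.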

Step 11

Register a dataset using your datastore. Important: the dataset has to be named “customer-churn”.

create-dataset-1

create-dataset-2

create-dataset-3

create-dataset-4
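As an alternative to the UI, the dataset could also be registered from a notebook. This is only a sketch under several assumptions: the Azure ML Python SDK v1 (`azureml-core`) is installed, the data files are delimited text files ending in `.csv`, and the datastore was registered under the name you pass in. Only the dataset name "customer-churn" comes from this tutorial.

```python
DATASET_NAME = "customer-churn"  # the name required by this tutorial


def dataset_paths(datastore, directories, pattern="*.csv"):
    """Build (datastore, path) tuples, one per date directory."""
    return [(datastore, f"{directory}/{pattern}") for directory in directories]


def register_customer_churn_dataset(datastore_name):
    """Create a tabular dataset from both date directories and register it."""
    # Imported lazily so the helper above also works without the SDK installed.
    from azureml.core import Dataset, Datastore, Workspace

    workspace = Workspace.from_config()
    datastore = Datastore.get(workspace, datastore_name)
    paths = dataset_paths(datastore, ["2020/03/31", "2020/04/01"])
    # Assumes the files are delimited text (CSV); adjust if your files differ.
    dataset = Dataset.Tabular.from_delimited_files(path=paths)
    return dataset.register(workspace=workspace, name=DATASET_NAME,
                            create_new_version=True)


# Example call (requires Azure access, so it is left commented out):
# register_customer_churn_dataset("<your-datastore-name>")
```

Listing the two directories explicitly keeps the sketch independent of wildcard behavior across directory levels.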

Step 12

Install all necessary dependencies on your compute instance:

install_dependencies

Step 13

You can now run the notebooks, which cover the ML lifecycle from exploratory data analysis to model deployment. Specific explanations are included as comments in the notebooks. You can skip “03_customer_churn_train_decision_tree” and “04_customer_churn_train_automl” without affecting the downstream workflow.

Sebastian Birk

I’m a Data Scientist and Azure Data & AI Consultant. My passion is to solve the world’s toughest problems with data & AI and I love to build intelligent solutions in the cloud.
