How do we start achieving this goal?

To begin, it is crucial to run a simple A/B test: one group receives a discount, while a control group receives none.

After the experiment, we have three primary approaches to modeling uplift.

The first approach involves building two separate models: one for the control group (without any discount) and one for the treatment group (with a discount). Any type of ML model can be used for these two models.

By running each client through both models, we can calculate the uplift as the difference between the predicted outcomes.
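As a minimal sketch of the two-model idea, the snippet below uses a deliberately tiny stand-in model that predicts the purchase rate observed per customer segment. The `SegmentRateModel` class, the segment names, and the toy counts are all hypothetical and only serve the illustration; any classifier that outputs a purchase probability could take its place.

```python
from collections import defaultdict


class SegmentRateModel:
    """Toy stand-in for "any ML model": predicts the purchase rate per segment."""

    def fit(self, segments, purchases):
        totals, buyers = defaultdict(int), defaultdict(int)
        for segment, purchased in zip(segments, purchases):
            totals[segment] += 1
            buyers[segment] += purchased
        self.rates_ = {s: buyers[s] / totals[s] for s in totals}
        return self

    def predict_proba(self, segment):
        return self.rates_[segment]


def make_group(segment, n_users, n_buyers):
    """Hypothetical toy data: n_users in a segment, of which n_buyers purchased."""
    return [(segment, 1 if i < n_buyers else 0) for i in range(n_users)]


# Results of the A/B test, split by group (made-up numbers).
control = make_group("new", 10, 2) + make_group("loyal", 10, 5)    # no discount
treatment = make_group("new", 10, 6) + make_group("loyal", 10, 5)  # discount

# One model per group.
control_model = SegmentRateModel().fit(*zip(*control))
treatment_model = SegmentRateModel().fit(*zip(*treatment))


def predict_uplift(segment):
    """Uplift = P(purchase | discount) - P(purchase | no discount)."""
    return treatment_model.predict_proba(segment) - control_model.predict_proba(segment)


for segment in ("new", "loyal"):
    print(segment, round(predict_uplift(segment), 3))
```

On this toy data the discount appears to help only the "new" segment, which is exactly the kind of difference the two-model approach is meant to surface.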

**Pros:**

- It is easy to implement, since any off-the-shelf ML model can be used for each group.

**Cons:**

- It does not predict uplift directly. Each model estimates the probability of the user’s action (a purchase), and uplift is derived afterwards.
- The errors of the two models compound: each model has its own error, so the difference between their predictions can carry a larger overall error.

The second approach revolves around transforming the target variable itself. By creating a new target that represents uplift, we can calculate the desired outcome directly.

We introduce a new target variable using the following formula:

$$Z = Y \cdot W + (1 - Y) \cdot (1 - W)$$

Here, Y represents the original target variable, and W indicates whether the treatment was applied or not. In other words, Y represents whether a purchase was made or not, and W indicates whether the discount was given or not.

The transformed variable Z takes the value 1 in two cases:

- The user belongs to the treatment group (W = 1) and Y = 1 (the discount was given and the user purchased).
- The user belongs to the control group (W = 0) and Y = 0 (the discount wasn’t given and the user didn’t purchase).
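The two cases above amount to checking whether the treatment flag and the outcome agree. A minimal sketch, where `transform_target` is a hypothetical helper name and both `y` (purchase) and `w` (discount given) are assumed to be coded as 0 or 1:

```python
def transform_target(y, w):
    """Return Z = 1 when outcome and treatment flags agree, else 0.

    y: 1 if the user purchased, 0 otherwise.
    w: 1 if the user received the discount, 0 otherwise.
    """
    return y * w + (1 - y) * (1 - w)


# All four combinations of (discount given, purchased).
for w in (0, 1):
    for y in (0, 1):
        print(f"W={w} Y={y} -> Z={transform_target(y, w)}")
```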

Then we just need to train a model (for example, logistic regression) on the new target Z.

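To calculate uplift, we can use the following formula, which assumes that the treatment and control groups are of equal size:

$$\text{uplift}(X) = 2 \cdot P(Z = 1 \mid X) - 1$$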
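Here is a quick arithmetic check, on made-up counts, that the transformed target recovers the difference in purchase rates. It relies on the common form of this approach, uplift = 2 · P(Z = 1 | X) − 1, which assumes treatment and control groups of equal size. The function name and the numbers are hypothetical, and the observed share of Z = 1 stands in for a model's predicted probability.

```python
def uplift_from_proba(p_z):
    """Convert P(Z = 1 | X) into an uplift estimate (assumes a 50/50 split)."""
    return 2 * p_z - 1


# Hypothetical segment: 10 users with a discount, 10 without.
treated_buyers, treated_total = 6, 10
control_buyers, control_total = 2, 10

# Z = 1 for treated buyers and for control non-buyers.
z_ones = treated_buyers + (control_total - control_buyers)
p_z = z_ones / (treated_total + control_total)

# The same quantity computed directly as a difference of purchase rates.
direct_uplift = treated_buyers / treated_total - control_buyers / control_total

print(uplift_from_proba(p_z), direct_uplift)
```

Both numbers agree up to floating-point rounding. With unequal group sizes this shortcut no longer holds as written, and the target would need to be reweighted.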

**Pros:**

- It is still easy to implement.
- It’s more robust and stable than the first approach because only one model is involved.

**Cons:**

- It still does not predict uplift directly. The model predicts the transformed variable, from which uplift is then derived.

The third approach capitalizes on tree-based models.

The goal is to identify the subpopulations within a dataset that are most responsive to the treatment, thereby enabling targeted interventions for maximum impact.

An example decision tree for uplift is depicted in the image above. The red color indicates the uplift values. Looking at the image, the overall uplift difference at the root is 0.0127 (for an arbitrary example metric). However, as we descend the tree, we see certain subpopulations with higher uplift differences.

These subpopulations become our target as they hold the potential for maximum benefits.

**How to build this tree?**

There are numerous tutorials available on constructing decision trees, but here I will outline the basic approach.

- Select features and identify the target variable, which, in our case, is uplift.
- Choose a splitting criterion to determine how nodes are divided.
- Build the tree by recursively repeating the splitting process until a stopping criterion is met.
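The three steps above can be sketched in a few dozen lines. This is a simplified illustration, not a production implementation: the splitting criterion here is a hypothetical size-weighted squared uplift (in the spirit of a squared-difference criterion), the stopping rules are a maximum depth and a minimum node size, and the `age` feature and all counts are made up.

```python
def uplift(rows):
    """Treatment purchase rate minus control purchase rate within a node."""
    treated = [r["y"] for r in rows if r["w"] == 1]
    control = [r["y"] for r in rows if r["w"] == 0]
    if not treated or not control:
        return 0.0  # uplift is undefined without both groups; treat as zero
    return sum(treated) / len(treated) - sum(control) / len(control)


def split_score(left, right):
    """Simplified criterion: size-weighted squared uplift of the two children."""
    n = len(left) + len(right)
    return (len(left) * uplift(left) ** 2 + len(right) * uplift(right) ** 2) / n


def build_tree(rows, features, depth=0, max_depth=2, min_rows=4):
    node = {"n": len(rows), "uplift": uplift(rows)}
    if depth >= max_depth:  # stopping criterion: maximum depth
        return node
    best = None
    for feature in features:
        for threshold in sorted({r[feature] for r in rows}):
            left = [r for r in rows if r[feature] <= threshold]
            right = [r for r in rows if r[feature] > threshold]
            if len(left) < min_rows or len(right) < min_rows:
                continue  # stopping criterion: children too small
            score = split_score(left, right)
            if best is None or score > best[0]:
                best = (score, feature, threshold, left, right)
    # Keep the node as a leaf if no split beats its own squared uplift.
    if best is None or best[0] <= node["uplift"] ** 2:
        return node
    _, feature, threshold, left, right = best
    node.update(
        feature=feature,
        threshold=threshold,
        left=build_tree(left, features, depth + 1, max_depth, min_rows),
        right=build_tree(right, features, depth + 1, max_depth, min_rows),
    )
    return node


def make_rows(age, w, n_users, n_buyers):
    """Hypothetical toy data: n_users of a given age and group, n_buyers purchased."""
    return [{"age": age, "w": w, "y": 1 if i < n_buyers else 0} for i in range(n_users)]


rows = (
    make_rows(20, 0, 10, 2) + make_rows(20, 1, 10, 6)    # younger users respond
    + make_rows(40, 0, 10, 5) + make_rows(40, 1, 10, 5)  # older users do not
)
tree = build_tree(rows, ["age"])
print(tree["feature"], tree["threshold"], tree["left"]["uplift"], tree["right"]["uplift"])
```

On this toy data the tree splits on `age` and isolates the younger users as the subpopulation with the higher uplift, while the older users end up in a leaf with no uplift.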

It’s worth noting that there are three commonly used splitting criteria for building uplift trees, listed below in order of popularity:

- KL divergence
- Chi-Square
- Euclidean Distance
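For a binary outcome such as purchase / no purchase, each of these criteria measures how far apart the treatment and control purchase rates are inside a node, and a split is scored by how much it increases that distance. The sketch below uses the textbook binary forms of the three divergences; the function names, the `(p_treatment, p_control, n_users)` tuple layout, and the example rates are my own assumptions, and the normalization terms that some implementations add are left out.

```python
import math

EPS = 1e-9  # keeps probabilities away from 0 and 1 so logs and ratios stay finite


def _clip(p):
    return min(max(p, EPS), 1 - EPS)


def kl_divergence(p_t, p_c):
    """KL divergence between treatment and control purchase rates."""
    p_t, p_c = _clip(p_t), _clip(p_c)
    return p_t * math.log(p_t / p_c) + (1 - p_t) * math.log((1 - p_t) / (1 - p_c))


def chi_square(p_t, p_c):
    """Chi-square divergence, summed over the two outcome classes."""
    p_t, p_c = _clip(p_t), _clip(p_c)
    return (p_t - p_c) ** 2 / p_c + (p_t - p_c) ** 2 / (1 - p_c)


def euclidean(p_t, p_c):
    """Squared Euclidean distance, summed over the two outcome classes."""
    return 2 * (p_t - p_c) ** 2


def split_gain(divergence, parent, children):
    """Weighted divergence of the children minus the divergence of the parent.

    parent and each child are (p_treatment, p_control, n_users) tuples.
    """
    n_total = sum(n for _, _, n in children)
    after = sum(n / n_total * divergence(p_t, p_c) for p_t, p_c, n in children)
    return after - divergence(parent[0], parent[1])


# Hypothetical node and a candidate split into two equally sized children.
parent = (0.55, 0.35, 40)
children = [(0.6, 0.2, 20), (0.5, 0.5, 20)]
for criterion in (kl_divergence, chi_square, euclidean):
    print(criterion.__name__, round(split_gain(criterion, parent, children), 4))
```

All three criteria give this split a positive gain, because it separates a child where the discount changes behavior from one where it does not.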

**Pros:**

- It is one of the most accurate methods.
- Because the base learner is a decision tree, we can build forests and other ensembles of trees that increase accuracy and reduce variance.

**Cons:**

- It’s a decision tree method, so the algorithm tends to favor splits on categorical variables with many levels. To mitigate this, we can encode such variables numerically, for example with mean (target) encoding.

Now we know that addressing customer churn requires strategies that go beyond just estimating the probability of churn. The ultimate goal is to apply the most appropriate treatment to each user and deliver business impact instead of churn probability.

Uplift modeling, which can be applied to various business challenges beyond churn, offers a powerful solution with immediate business impact.

There are still a lot of intriguing questions about uplift modeling, such as handling multiple treatments, evaluating different uplift models, and using multi-armed bandits in production, but I will save the answers for the next post.