Uplift Modeling
I. Background
In promotions, subsidy, or coupon campaigns, accurately targeting the right customers both increases purchases and saves cost (people who would have bought anyway do not need an extra coupon).
II. Basics
Uplift modeling is used to identify exactly this kind of customer: the ones who will convert into an order if, and only if, they receive the subsidy. An uplift model compares treatment customers (who receive the subsidy) with control customers (who do not). Let Y denote the response: Y = 1 means the customer purchases, Y = 0 means they do not. Let W denote the group: W = 1 is treatment (subsidy sent), W = 0 is control (no subsidy). Of the four groups below, the Persuadables are the customers we want to find.
Customers fall into four groups:
- Will Take Action: buys whether or not a subsidy is sent (Y = 1 when W = 1, and Y = 1 when W = 0).
- Persuadable: buys if a subsidy is sent, otherwise does not (Y = 1 when W = 1, and Y = 0 when W = 0).
- Do Not Disturb: does not buy if a subsidy is sent, but buys without one (Y = 0 when W = 1, and Y = 1 when W = 0).
- Never Will Respond: does not buy either way (Y = 0 when W = 1, and Y = 0 when W = 0).
1) Superscript 1 denotes treatment and 0 denotes control; subscript i denotes the i-th sample.
2) Causal effect: $\tau _{i}=Y_{i}^{1}-Y_{i}^{0}$
3) Conditional average treatment effect: $CATE=E[Y_{i}^{1}|X_{i}]-E[Y_{i}^{0}|X_{i}]$, where $X_{i}$ are the covariates describing the subgroup sample i belongs to.
4) User i cannot both receive and not receive the subsidy at the same time, so only one potential outcome is observed: $Y_{i}^{obs} =W_{i}Y_{i}^{1}+(1-W_{i})Y_{i}^{0}$
5) From 4), the CATE is estimated as $\widehat{CATE}=E[Y_{i}^{obs}|X_{i}=x,W_{i}=1] - E[Y_{i}^{obs}|X_{i}=x,W_{i}=0]$. (Note that, conditional on $X_{i}=x$, $W_{i}$ must be independent of the potential outcomes $(Y_{i}^{1}, Y_{i}^{0})$, i.e. assignment to the treatment or control group is random.)
6) Propensity score: $P(X_{i}) = P(W_{i}=1|X_{i})$, the probability that user i is assigned to the treatment group.
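As a minimal sketch (the feature matrix X and treatment indicator W below are hypothetical), the propensity score can be estimated with any probabilistic classifier, e.g. logistic regression:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))          # hypothetical user features
W = rng.binomial(1, 0.5, size=1000)     # hypothetical random treatment assignment

# Estimate the propensity score P(W = 1 | X) with a classifier.
propensity_model = LogisticRegression().fit(X, W)
p_hat = propensity_model.predict_proba(X)[:, 1]   # \hat{p}(X_i)
```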
III. Uplift Modeling
1 Two-Model Approach
Train two models, one on the treatment data and one on the control data. The uplift score is then Prob(Treatment) - Prob(Control), i.e. the predicted purchase probability under treatment minus the predicted purchase probability under control.
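A minimal sketch of the idea (the column names `treatment` and `target` and the choice of XGBoost are assumptions here; the full pipeline on the Kaggle data is in Section V):

```python
import pandas as pd
import xgboost as xgb

def two_model_uplift(df_train: pd.DataFrame, df_test: pd.DataFrame, features: list):
    """Fit one model per group and score uplift as P(buy | treatment) - P(buy | control)."""
    treat = df_train[df_train.treatment == 1]
    ctrl = df_train[df_train.treatment == 0]
    model_t = xgb.XGBClassifier().fit(treat[features], treat.target)
    model_c = xgb.XGBClassifier().fit(ctrl[features], ctrl.target)
    p_t = model_t.predict_proba(df_test[features])[:, 1]   # P(target = 1) if treated
    p_c = model_c.predict_proba(df_test[features])[:, 1]   # P(target = 1) if not treated
    return p_t - p_c
```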
2 The Class Transformation Approach
This approach applies when $Y_{i}^{obs}$ takes only the values 0 or 1. Construct the target variable
1)$Z_{i}=W_{i}Y_{i}^{obs}+(1-W_{i})(1-Y_{i}^{obs})$
so that $Z_{i}=1$ when a treatment-group user purchases or a control-group user does not respond, and $Z_{i}=0$ otherwise. When $P(X_{i}) = 1/2$, one can show:
2)$\tau (X_{i})=2P(Z_{i}=1 | X_{i})-1$
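A brief check of this identity (assuming random assignment with $P(W_{i}=1|X_{i})=1/2$, so that $W_{i}$ is independent of the potential outcomes given $X_{i}$):

$$
\begin{aligned}
P(Z_{i}=1\mid X_{i}) &= P(Y_{i}^{obs}=1\mid X_{i},W_{i}=1)P(W_{i}=1\mid X_{i})+P(Y_{i}^{obs}=0\mid X_{i},W_{i}=0)P(W_{i}=0\mid X_{i})\\
&= \tfrac{1}{2}P(Y_{i}^{1}=1\mid X_{i})+\tfrac{1}{2}\left(1-P(Y_{i}^{0}=1\mid X_{i})\right),\\
\text{so}\quad 2P(Z_{i}=1\mid X_{i})-1 &= P(Y_{i}^{1}=1\mid X_{i})-P(Y_{i}^{0}=1\mid X_{i}) = E[Y_{i}^{1}-Y_{i}^{0}\mid X_{i}] = \tau(X_{i}).
\end{aligned}
$$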
It then suffices to model $P(Z_{i}=1 | X_{i})$. The drawback is the strict requirement that treatment and control each make up exactly half of the sample. When the assignment is unbalanced, use the transformed outcome:
3)$Y_{i}^{*} =\frac{W_{i}}{\hat{p}(X_{i})} Y_{i}^{1}-\frac{(1-W_{i})}{(1-\hat{p}(X_{i}))}Y_{i}^{0}$
After the transformation,
4)$\tau (X_{i})=E[Y_{i}^{*}|X_{i} ]$
In the balanced case $\hat{p}(X_{i})=1/2$, $Z_{i}$ can be rewritten in terms of $Y_{i}^{*}$:
5)$Z_{i}=\frac{1}{2} Y_{i}^{*}+(1-W_{i})$
and therefore
6)$2E[Z_{i}|X_{i}]=E[Y_{i}^{*} |X_{i}]+1$
which, combined with 2) and 4), recovers the class transformation result as the special balanced case.
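A minimal sketch of the transformed-outcome estimator for the unbalanced case (X, W, and Y are hypothetical arrays; the propensity estimate reuses the logistic-regression idea from Section II, and XGBoost is an arbitrary choice of regressor):

```python
import numpy as np
import xgboost as xgb
from sklearn.linear_model import LogisticRegression

def transformed_outcome_uplift(X, W, Y):
    """Regress the transformed outcome Y* on X; its conditional mean estimates tau(X)."""
    p_hat = LogisticRegression().fit(X, W).predict_proba(X)[:, 1]   # \hat{p}(X_i)
    p_hat = np.clip(p_hat, 1e-3, 1 - 1e-3)                          # avoid extreme weights
    y_star = W * Y / p_hat - (1 - W) * Y / (1 - p_hat)              # formula 3)
    model = xgb.XGBRegressor().fit(X, y_star)
    return model.predict(X)                                         # estimated tau(X_i)
```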
3 Modeling Uplift Directly
The splitting rule of a traditional tree model such as ID3 decides whether to split on a feature based on information gain:
1)$Gain(D|A)=H(D)-H(D|A)$
The same idea can be carried over to uplift modeling:
2)$Gain=D_{after\_split}(P^{T},P^{C})-D_{before\_split}(P^{T},P^{C})$
where D measures how different the treatment and control outcome distributions are; KL divergence, Euclidean distance, and other divergences can be used.
3.1 Decision Trees for Uplift Modeling
As an example, consider the causal tree from the paper "Decision trees for uplift modeling" (the paper can be downloaded here). One splitting method is to compute $\Delta \Delta P$.
The splitting criterion used in the paper is KL divergence:
1)$KL(P^{T}:P^{C})=\sum P^{T}\log\frac{P^{T}}{P^{C}}$
The conditional KL divergence given a candidate split (a test on an attribute, with outcomes a) is
2)$KL(P^{T}:P^{C}|Test)=\sum_{a} \frac{N^{T}(a)+N^{C}(a)}{N^{T}+N^{C}} KL(P^{T}(class|a):P^{C}(class|a))$
and the KL gain of the test is
3)$KL_{gain}(Test)=KL(P^{T}:P^{C}|Test)-KL(P^{T}:P^{C})$
i.e. the conditional KL divergence after the split minus the KL divergence before the split, computed over the node's samples.
Euclidean distance can likewise be used as the splitting criterion; it has better statistical properties (it is finite and symmetric):
4)$Euclid(P^{T}:P^{C})=\sum(P^{T}-P^{C})^{2}$
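A minimal sketch of how such a gain can be computed for one candidate binary split (plain NumPy, not the paper's or CausalML's implementation; y and w are binary outcome and treatment arrays, and each child node is assumed to contain both treatment and control samples):

```python
import numpy as np

def kl(p_t, p_c, eps=1e-6):
    """KL(P^T : P^C) for binary outcome distributions given as P(Y = 1)."""
    pt = np.clip(np.array([1 - p_t, p_t]), eps, 1)
    pc = np.clip(np.array([1 - p_c, p_c]), eps, 1)
    return float(np.sum(pt * np.log(pt / pc)))

def euclid(p_t, p_c):
    """Squared Euclidean distance between the two binary outcome distributions."""
    return (p_t - p_c) ** 2 + ((1 - p_t) - (1 - p_c)) ** 2

def split_gain(y, w, left_mask, divergence=kl):
    """Divergence gain of a candidate split: weighted child divergences minus the parent divergence."""
    def node_divergence(mask):
        y_t, y_c = y[mask & (w == 1)], y[mask & (w == 0)]
        return divergence(y_t.mean(), y_c.mean()), mask.sum()

    parent, n = node_divergence(np.ones_like(y, dtype=bool))
    left, n_left = node_divergence(left_mask)
    right, n_right = node_divergence(~left_mask)
    return (n_left / n) * left + (n_right / n) * right - parent
```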
Pruning. Define
5)$Diff(Class,node)=P^{T}(Class|node)-P^{C}(Class|node)$
6)$MD(node)=max_{Class}|Diff(Class,node)|$
7)$sign(node)=sgn(Diff(Class^{*},node))$
where $Class^{*}$ is the class attaining the maximum in 6). If replacing a subtree with a single leaf would decrease the MD value on the validation set, and the sign of MD agrees between the validation set and the training set, the subtree is kept. A useful Python package is CausalML; see the reference link for installation and usage.
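A hedged usage sketch of CausalML's uplift forest, which grows trees with exactly these divergence-based splits (class and parameter names follow the CausalML documentation as I recall it and may differ across versions; the data here is synthetic):

```python
import numpy as np
from causalml.inference.tree import UpliftRandomForestClassifier

# Synthetic placeholder data: features, binary conversion, and a string-valued treatment key.
X = np.random.rand(1000, 5)
y = np.random.binomial(1, 0.2, size=1000)
treatment = np.random.choice(['control', 'treatment'], size=1000)

# evaluationFunction='KL' selects the KL-divergence split criterion discussed above.
uplift_forest = UpliftRandomForestClassifier(control_name='control', evaluationFunction='KL')
uplift_forest.fit(X, treatment=treatment, y=y)
uplift_pred = uplift_forest.predict(X)   # estimated uplift of 'treatment' vs. 'control' per sample
```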
4 Multi-Classification Model
As at the beginning of the article, the samples can be divided into four classes:
- Control Non-Responders (CN)
- Control Responders (CR)
- Treatment Non-Responders (TN)
- Treatment Responders(TR)
The labelled data is then fed into a single multi-class model, and the uplift score is computed as $Uplift\_Score = \frac{P(TR)}{P(T)} + \frac{P(CN)}{P(C)} - \frac{P(TN)}{P(T)} - \frac{P(CR)}{P(C)}$.
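A minimal sketch of the scoring step (the class ordering 0=CN, 1=CR, 2=TN, 3=TR is an assumption that matches the labelling used in the code in Section V, where $P(T)$ and $P(C)$ are approximated from the model's own probabilities):

```python
import xgboost as xgb

def multi_class_uplift(X_train, target_class, X_test):
    """Fit a 4-class model (CN=0, CR=1, TN=2, TR=3) and turn its probabilities into an uplift score."""
    model = xgb.XGBClassifier().fit(X_train, target_class)
    p = model.predict_proba(X_test)
    p_cn, p_cr, p_tn, p_tr = p[:, 0], p[:, 1], p[:, 2], p[:, 3]
    return (p_tr / (p_tn + p_tr) + p_cn / (p_cn + p_cr)
            - p_tn / (p_tn + p_tr) - p_cr / (p_cn + p_cr))
```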
IV. Evaluation
1) Decile chart
After computing an uplift score for each user, sort the users by score from high to low, bucket them by percentile (e.g. into deciles), and report the average uplift score of each bucket. This shows how the scores are distributed across the ranking, but it cannot be used to compare models against each other.
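A minimal sketch with pandas (the column name `uplift_score` matches the code in Section V; everything else is illustrative):

```python
import pandas as pd

def decile_means(result: pd.DataFrame) -> pd.Series:
    """Average uplift score per decile, ranked from the highest-scored users down."""
    ranked = result.sort_values('uplift_score', ascending=False).reset_index(drop=True)
    ranked['decile'] = pd.qcut(ranked.index, 10, labels=range(1, 11))
    return ranked.groupby('decile')['uplift_score'].mean()
```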
2) AUUC
The most common evaluation metric is AUUC (area under the uplift curve). Ranking users by uplift score and taking the top i of them, the cumulative gain is $G(i)=(\frac{N_{y=1,t=1}}{N_{t=1}}-\frac{N_{y=1,t=0}}{N_{t=0}})\cdot(N_{t=1}+N_{t=0})$, where the counts are taken over those top-i users.
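A minimal sketch of the gain curve and its area (this mirrors the plotAuucValue function in Section V and adds a trapezoidal area as the AUUC summary; normalisation conventions vary between libraries):

```python
import numpy as np
import pandas as pd

def cumulative_gain(ranked: pd.DataFrame) -> np.ndarray:
    """G(i) over the top-i users of a dataframe already sorted by uplift_score (descending)."""
    t = ranked['treatment'].to_numpy()
    y = ranked['target'].to_numpy()
    n_t = np.cumsum(t)                # treated users among the top i
    n_c = np.cumsum(1 - t)            # control users among the top i
    r_t = np.cumsum(t * y)            # treated responders among the top i
    r_c = np.cumsum((1 - t) * y)      # control responders among the top i
    with np.errstate(divide='ignore', invalid='ignore'):
        gain = (r_t / n_t - r_c / n_c) * (n_t + n_c)
    return np.nan_to_num(gain, nan=0.0, posinf=0.0, neginf=0.0)

def auuc(ranked: pd.DataFrame) -> float:
    """Area under the cumulative-gain curve (trapezoidal rule, unnormalised)."""
    gain = cumulative_gain(ranked)
    return float(((gain[1:] + gain[:-1]) / 2).sum())
```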
Using the example in Section V, the AUUC of each model is computed below; the Two-Model approach performs worst and the multi-classification model performs best.
V. Example
As a worked example, take a Kaggle dataset on a retail promotion campaign: https://www.kaggle.com/datasets/davinwijaya/customer-retention
CODE:
```python
import warnings

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import xgboost as xgb
from sklearn.model_selection import train_test_split

warnings.filterwarnings('ignore')

data_raw = pd.read_csv("C:\\Users\\nan.wu2\\PycharmProjects\\job\\study\\data\\data.csv")
df_model = data_raw.copy()
"""
['recency', 'history', 'used_discount', 'used_bogo', 'zip_code',
 'is_referral', 'channel', 'offer', 'conversion']
"""
# Rename target column
df_model = df_model.rename(columns={'conversion': 'target'})
# Rename & label-encode treatment column
df_model = df_model.rename(columns={'offer': 'treatment'})
df_model.treatment = df_model.treatment.map({'No Offer': 0, 'Buy One Get One': -1, 'Discount': 1})
df_model = pd.get_dummies(df_model)

df_model_bogo = df_model.copy().loc[df_model.treatment <= 0].reset_index(drop=True)
df_model_discount = df_model.copy().loc[df_model.treatment >= 0].reset_index(drop=True)


def declare_tc(df: pd.DataFrame):
    """Label each sample as CN / CR / TN / TR for the multi-classification model."""
    # CN: control non-responder
    df['target_class'] = 0
    # CR: control responder
    df.loc[(df.treatment == 0) & (df.target != 0), 'target_class'] = 1
    # TN: treatment non-responder
    df.loc[(df.treatment != 0) & (df.target == 0), 'target_class'] = 2
    # TR: treatment responder
    df.loc[(df.treatment != 0) & (df.target != 0), 'target_class'] = 3
    return df


df_model_bogo = declare_tc(df_model_bogo)
df_model_discount = declare_tc(df_model_discount)


def uplift_split(df: pd.DataFrame):
    """Build the class-transformation target Z and split into train/test sets."""
    df.treatment = df.treatment.map({-1: 1, 0: 0})   # BOGO -> 1, control -> 0
    df['Z'] = df['treatment'] * df['target'] + (1 - df['treatment']) * (1 - df['target'])
    X = df.copy()
    y = df.Z
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    return X_train, X_test, y_train, y_test


def uplift_class_transformation_method(X_train, X_test, y_train, y_test):
    uplift_model = xgb.XGBClassifier().fit(
        X_train.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1), y_train)
    result = pd.DataFrame(X_test.copy())
    # Column 1 of predict_proba is P(Z = 1)
    result['Z_prob'] = uplift_model.predict_proba(
        X_test.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1))[:, 1]
    result['uplift_score'] = 2 * result['Z_prob'] - 1
    return result


def uplift_split_two_model_approach(X_train, X_test, y_train, y_test):
    df_treatment_train = X_train.copy().loc[X_train.treatment != 0].reset_index(drop=True)
    df_control_train = X_train.copy().loc[X_train.treatment == 0].reset_index(drop=True)
    df_treatment_test = X_test.copy().loc[X_test.treatment != 0].reset_index(drop=True)
    df_control_test = X_test.copy().loc[X_test.treatment == 0].reset_index(drop=True)
    Xt_train = df_treatment_train.copy()
    yt_train = df_treatment_train.target
    Xt_test = df_treatment_test.copy()
    yt_test = df_treatment_test.target
    Xc_train = df_control_train.copy()
    yc_train = df_control_train.target
    Xc_test = df_control_test.copy()
    yc_test = df_control_test.target
    return Xt_train, Xt_test, yt_train, yt_test, Xc_train, Xc_test, yc_train, yc_test


def uplift_two_model_approach_model(X_train: pd.DataFrame, y_train: pd.DataFrame):
    uplift_model = xgb.XGBClassifier().fit(X_train.copy(), y_train)
    return uplift_model


def uplift_multi_classification_model(X_train, X_test, y_train, y_test):
    result = pd.DataFrame(X_test.copy())
    y_train = X_train.target_class
    uplift_model = xgb.XGBClassifier().fit(
        X_train.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1), y_train)
    uplift_prob = uplift_model.predict_proba(
        X_test.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1))
    result['prob_CN'] = uplift_prob[:, 0]
    result['prob_CR'] = uplift_prob[:, 1]
    result['prob_TN'] = uplift_prob[:, 2]
    result['prob_TR'] = uplift_prob[:, 3]
    result['uplift_score'] = result.eval(
        'prob_CN/(prob_CN+prob_CR) + prob_TR/(prob_TN+prob_TR)'
        ' - prob_TN/(prob_TN+prob_TR) - prob_CR/(prob_CN+prob_CR)')
    return result


def uplift_two_model_approach(Xt_train, Xt_test, yt_train, yt_test,
                              Xc_train, Xc_test, yc_train, yc_test):
    t_model = uplift_two_model_approach_model(
        Xt_train.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1), yt_train)
    c_model = uplift_two_model_approach_model(
        Xc_train.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1), yc_train)
    result_t = pd.DataFrame(Xt_test.copy())
    # Column 1 of predict_proba is P(target = 1)
    tt_prob = t_model.predict_proba(Xt_test.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1))
    tc_prob = c_model.predict_proba(Xt_test.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1))
    result_t['uplift_score'] = tt_prob[:, 1] - tc_prob[:, 1]
    result_t['target'] = yt_test
    result_c = pd.DataFrame(Xc_test.copy())
    ct_prob = t_model.predict_proba(Xc_test.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1))
    cc_prob = c_model.predict_proba(Xc_test.copy().drop(['treatment', 'target', 'target_class', 'Z'], axis=1))
    result_c['uplift_score'] = ct_prob[:, 1] - cc_prob[:, 1]
    result_c['target'] = yc_test
    result = pd.concat([result_t, result_c])
    result['n'] = result.uplift_score.rank(pct=True, ascending=False)
    result = result.sort_values(by='n').reset_index(drop=True)
    return result


def qini_rank(uplift: pd.DataFrame):
    """Rank users by uplift score, highest first."""
    ranked = pd.DataFrame(uplift.copy())
    ranked['n'] = ranked.uplift_score.rank(pct=True, ascending=False)
    ranked = ranked.sort_values(by='n').reset_index(drop=True)
    return ranked


def decile_chart(uplift_result, title):
    decile_list = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
    res = []
    last = 0.0
    for decile in decile_list:
        decile_df = uplift_result.copy().loc[
            (uplift_result.n > last) & (uplift_result.n <= decile)].reset_index(drop=True)
        score_sum = decile_df.uplift_score.sum()
        score_count = decile_df.uplift_score.count()
        last = decile
        res.append(score_sum / score_count)
    sns.set_style('whitegrid')
    plt.figure()
    ax = sns.barplot(x=decile_list, y=res)
    for p in ax.patches:
        percentage = f'{p.get_height():.3f}'
        x = p.get_x() + p.get_width() / 2
        y = p.get_height() + 0.05
        ax.text(x, y, percentage, ha='center', va='center', fontsize=10,
                bbox=dict(facecolor='none', edgecolor='black', boxstyle='round', linewidth=0.3))
    plt.title(title)
    plt.show()


def plotAuucValue(ranked_result, label):
    """Plot the cumulative gain G(i) over the ranked users."""
    treatment = list(ranked_result['treatment'])
    target = list(ranked_result['target'])
    nt, nc, ntr, ncr = 0.0, 0.0, 0.0, 0.0
    res = []
    for i in range(0, len(target)):
        if treatment[i] == 1:
            nt += 1.0
            if target[i] == 1:
                ntr += 1.0
        if treatment[i] == 0:
            nc += 1.0
            if target[i] == 1:
                ncr += 1.0
        if nc == 0 or nt == 0:
            res.append(0)
        else:
            res.append((ntr / nt - ncr / nc) * (nt + nc))
    plt.plot([ind for ind in range(0, len(res))], res, label=label)
    plt.xlabel("nums")
    plt.ylabel("gain_mean")
    plt.legend()
    return res


X_train, X_test, y_train, y_test = uplift_split(df_model_bogo)
Xt_train, Xt_test, yt_train, yt_test, Xc_train, Xc_test, yc_train, yc_test = uplift_split_two_model_approach(
    X_train, X_test, y_train, y_test)

result_bogo_two_model_approach = uplift_two_model_approach(
    Xt_train, Xt_test, yt_train, yt_test, Xc_train, Xc_test, yc_train, yc_test)
result_bogo_two_model_approach_ranked = qini_rank(result_bogo_two_model_approach)

result_bogo_class_transformation = uplift_class_transformation_method(X_train, X_test, y_train, y_test)
result_bogo_class_transformation_ranked = qini_rank(result_bogo_class_transformation)

result_bogo_multi_classification = uplift_multi_classification_model(X_train, X_test, y_train, y_test)
result_bogo_multi_classification_ranked = qini_rank(result_bogo_multi_classification)

plt.figure()
plotAuucValue(result_bogo_two_model_approach_ranked, "two_model_approach")
plotAuucValue(result_bogo_class_transformation_ranked, "class_transformation")
plotAuucValue(result_bogo_multi_classification_ranked, "multi_classification")
plotAuucValue(X_test, "base_line")   # unranked test set as a baseline
plt.title("AUUC")
plt.show()

# decile_chart(result_bogo_class_transformation_ranked, "decile chart class transformation")
# decile_chart(result_bogo_two_model_approach_ranked, "decile chart two model approach")
```