時(shí)間:2023-03-16 00:24:02 | 來(lái)源:電子商務(wù)
時(shí)間:2023-03-16 00:24:02 來(lái)源:電子商務(wù)
字段 | 說(shuō)明 |
---|---|
user_id | 整數(shù)型,用戶ID |
item_id | 整數(shù)型,商品ID |
category_id | 整數(shù)型,商品品類ID |
behavior_type | 字符型,行為類型('pv','buy','cart','fav') |
timestamp | 整數(shù)型,時(shí)間戳 |
行為類型 | 說(shuō)明 |
---|---|
pv | page view,瀏覽 |
buy | 購(gòu)買(mǎi) |
cart | 加入購(gòu)物車(chē) |
fav | 收藏 |
select *from userbehaviorgroup by user_id,item_id,behavior_type,timestamphaving count(*) > 1;
查找出來(lái)所有字段均為Null值,說(shuō)明該數(shù)據(jù)集中沒(méi)有重復(fù)值。select *from userbehaviorwhere user_id is NULL or category_id is NULLor item_id is NULL or behavior_type is NULLor timestamp is NULL;
查找出來(lái)所有字段均為Null值,說(shuō)明該數(shù)據(jù)集中沒(méi)有空值。-- 給表添加一列叫dates,用于記錄行為發(fā)生的日期,如'2017-11-25'為2017年11月25日alter table userbehavior add datesvarchar(255);-- 給新列dates賦值update userbehaviorset dates =from_unixtime(timestamp,'%Y-%m-%d');-- 給表添加一列叫hours,用于記錄發(fā)生的小時(shí)時(shí)間,如'21'為晚上9點(diǎn)alter table userbehavior add hoursvarchar(255);-- 給新列hours賦值-- 特別注意,此處的時(shí)間格式中,"小時(shí)“部分的字符需要用大寫(xiě)的H,才能轉(zhuǎn)為24小時(shí)制update userbehaviorset hours = hour(from_unixtime(timestamp,'%H:%i:%s'));-- 給表添加一列叫datetime,用于記錄行為發(fā)生的日期和時(shí)間,如'2017-11-25 21:30:00'為2017年11月25日晚上9點(diǎn)30分alter table userbehavior add datetimevarchar(255);-- 給新列datetime賦值update userbehaviorset datetime =from_unixtime(timestamp,'%Y-%m-%d %H:%i:%s');-- 給表添加一列叫weekday,用于記錄行為發(fā)生在星期幾,如'1'為星期一alter table userbehavior add weekdayvarchar(255);/*-mysql中,可以通過(guò)函數(shù)WEEKDAY()將日期轉(zhuǎn)化為星期幾,但是索引是從0開(kāi)始,即0表示星期一、1表示星期二.....-為了使該字段更符合閱讀習(xí)慣,可以將其進(jìn)行加1操作,結(jié)果即為1對(duì)應(yīng)星期一,2對(duì)應(yīng)星期二...*/update userbehaviorset weekday = WEEKDAY(datetime) + 1;
接著,我們將超出分析范圍的時(shí)間范圍剔除掉:-- 將超出指定時(shí)間范圍的數(shù)據(jù)剔除掉DELETEfrom userbehaviorwhere dates < '2017-11-25'or dates > '2017-12-03';
經(jīng)過(guò)處理后,數(shù)據(jù)集如下(以下給出部分?jǐn)?shù)據(jù)):select count(distinct user_id) as 會(huì)員數(shù)from userbehavior;
即在2017年11月25日至2017年12月3日之間,總共記錄了983名會(huì)員的行為數(shù)據(jù)。A(Awareness)認(rèn)知:廣告曝光、頁(yè)面瀏覽人群AIPL營(yíng)銷模型可以將會(huì)員行為與相應(yīng)的運(yùn)營(yíng)周期聯(lián)系起來(lái)。
I(Interested)興趣:關(guān)注互動(dòng)、瀏覽店鋪主頁(yè)、瀏覽產(chǎn)品詳情頁(yè)、收藏加購(gòu)
P(Purchase)購(gòu)買(mǎi):支付
L(Loyalty)忠誠(chéng):復(fù)購(gòu)、評(píng)論、分享
create view behavior ASselect user_id,datetime,dates,hours,max(case behavior_type when 'pv' then 1 else 0 end) as view,max(case behavior_type when 'fav' then 1 else 0 end) as favor,max(case behavior_type when 'cart' then 1 else 0 end) as cart,max(case behavior_type when 'buy' then 1 else 0 end) as buyfrom userbehaviorgroup by user_id,datetime;
創(chuàng)建出的視圖如下所示(給出部分?jǐn)?shù)據(jù)):select sum(view) as 'A',sum(favor)+sum(cart) as 'I',sum(buy) as 'P'from behavior;
計(jì)算L的值:select user_id,datetime,buy,dense_rank() over (partition by user_id order by datetime) as n_consumefrom behaviorwhere buy = 1order by user_id,datetime;
第二步,剔除掉用戶第一次購(gòu)買(mǎi)的記錄并計(jì)算復(fù)購(gòu)次數(shù)select sum(buy) as Lfrom(select user_id,datetime,buy,dense_rank() over (partition by user_id order by datetime asc) as n_consumefrom behaviorwhere buy = 1) as twhere n_consume > 1;
3、總結(jié)#第一步,查詢出每個(gè)小時(shí)會(huì)員的瀏覽次數(shù)select hour(datetime) as '小時(shí)',count(*) as 'A-瀏覽'from behaviorwhere view = 1group by hour(datetime)order by hour(datetime) ASC;
#第二步,查詢出每個(gè)小時(shí)會(huì)員進(jìn)行收藏與加入購(gòu)物車(chē)行為的次數(shù)select hour(datetime) as '小時(shí)',count(*) as 'I-興趣'from behaviorwhere favor = 1 or cart = 1group by hour(datetime)order by hour(datetime) asc;
#第三步,根據(jù)前兩步計(jì)算出A→I轉(zhuǎn)化率的均值select a.`小時(shí)`,avg(`A-瀏覽`) as `A-瀏覽 均值`,avg(`I-興趣`) as `I-興趣 均值`,concat(round(avg(`I-興趣`)/avg(`A-瀏覽`),3)*100,'%') as 'A→I轉(zhuǎn)化率'from(select hour(datetime) as '小時(shí)',count(*) as 'A-瀏覽'from behaviorwhere view = 1group by hour(datetime)) as aleft join(select hour(datetime) as '小時(shí)',count(*) as 'I-興趣'from behaviorwhere favor = 1 or cart = 1group by hour(datetime)) as ion a.`小時(shí)` = i.`小時(shí)`group by a.`小時(shí)`order by a.`小時(shí)`;
可以看到,在查詢結(jié)果中,有很多不同的值,根據(jù)對(duì)比分析方法,我們需要【比較基準(zhǔn)】才知道這些轉(zhuǎn)化率哪些高,哪些低,才可進(jìn)一步判斷。這里,以常見(jiàn)的“平均值”作為對(duì)比的基準(zhǔn),代碼實(shí)現(xiàn)如下(即上一步去掉分組的步驟):select a.`小時(shí)`,avg(`A-瀏覽`) as `A-瀏覽 均值`,avg(`I-興趣`) as `I-興趣 均值`,concat(round(avg(`I-興趣`)/avg(`A-瀏覽`),3)*100,'%') as 'A→I轉(zhuǎn)化率'from(select hour(datetime) as '小時(shí)',count(*) as 'A-瀏覽'from behaviorwhere view = 1group by hour(datetime)) as aleft join(select hour(datetime) as '小時(shí)',count(*) as 'I-興趣'from behaviorwhere favor = 1 or cart = 1group by hour(datetime)) as ion a.`小時(shí)` = i.`小時(shí)`order by a.`小時(shí)`;
每個(gè)時(shí)段的瀏覽人數(shù)都不同,選擇瀏覽值大于平均值3733的時(shí)段來(lái)觀察,在這些時(shí)段中,我們選擇轉(zhuǎn)化率大于平均值9.1%的時(shí)段。select count(distinct item_id) as 'A'from userbehaviorwhere behavior_type = 'pv';
結(jié)果如圖所示,共有60592款商品被有效瀏覽。select count(distinct item_id) as 'I'from userbehaviorwhere behavior_type in ('favor','cart');
接下來(lái),結(jié)合【假設(shè)檢驗(yàn)分析方法】,針對(duì)有效瀏覽與產(chǎn)生興趣的商品數(shù)差異較大的情況,提出以下的假設(shè)。#第一步,查詢出被瀏覽最多的前100款商品,按降序排列select item_id,count(*) as 'A'from userbehaviorwhere behavior_type = 'pv'group by item_idorder by A DESClimit 100;
查詢結(jié)果如下(給出部分?jǐn)?shù)據(jù)預(yù)覽):#第二步,查詢出用戶最感興趣的前100款商品,按降序排列select item_id,count(*) as 'I'from userbehaviorwhere behavior_type in ('favor','cart')group by item_idorder by I DESClimit 100;
查詢結(jié)果如下(給出部分?jǐn)?shù)據(jù)預(yù)覽):#第三步,計(jì)算交叉數(shù)量select count(*) as 'AI交叉商品數(shù)'from(select item_id,count(*) as 'A'from userbehaviorwhere behavior_type = 'pv'group by item_idorder by A DESClimit 100) as ACinner JOIN(select item_id,count(*) as 'I'from userbehaviorwhere behavior_type in ('favor','cart')group by item_idorder by I DESClimit 100) as ICon AC.item_id = IC.item_id;
結(jié)果顯示,僅有5款商品既是高瀏覽量,也是高收藏/加購(gòu)的,活動(dòng):天貓雙12年終品牌盛典
預(yù)熱時(shí)間:2017年12月7日00:00:00-2017年12月11日23:59:59;
上線時(shí)間:2017年12月12日00:00:00-2017年12月12日23:59:59。
#創(chuàng)建一個(gè)視圖,記錄所有用戶的購(gòu)買(mǎi)記錄CREATE view consumeas select `user_id`,`dates`,dense_rank() OVER (PARTITION BY `user_id` ORDER BY `dates`) AS `consume_num` from `behavior` where `buy` = 1group by `user_id`,`dates` order by `user_id`,`dates`;
結(jié)果如下(給出部分?jǐn)?shù)據(jù)預(yù)覽):-- 計(jì)算出每個(gè)用戶每次消費(fèi)的回購(gòu)周期select a.user_id,a.dates,a.consume_num,datediff(a.dates,b.dates) as '回購(gòu)周期(天)'FROM(select * from consume) as ainner join(select * from consume) as bon a.user_id = b.user_idand b.consume_num = a.consume_num - 1;
結(jié)果如下(給出部分?jǐn)?shù)據(jù)預(yù)覽):-- 將每人次的回購(gòu)周期進(jìn)行平均,取得最終的平均回購(gòu)周期select avg(datediff(a.dates,b.dates)) as '平均回購(gòu)周期(天)'from(select * from consume) as ainner join(select * from consume) as bwhere a.user_id = b.user_idand b.consume_num = a.consume_num - 1;
結(jié)論:根據(jù)計(jì)算結(jié)果,預(yù)計(jì)在用戶購(gòu)買(mǎi)行為發(fā)生后的2-3天內(nèi)對(duì)其進(jìn)行觸達(dá)會(huì)有較好的效果。#創(chuàng)建視圖,統(tǒng)計(jì)每個(gè)客戶的購(gòu)買(mǎi)率與其他數(shù)據(jù)create view user_featureASselectuser_id,sum(view) as '瀏覽',sum(favor) as '收藏',sum(cart) as '加購(gòu)',sum(buy) as '購(gòu)買(mǎi)',(sum(cart) + sum(favor))/sum(view) as '加購(gòu)率',sum(buy)/(sum(view) + sum(cart) + sum(favor)) as '購(gòu)買(mǎi)率',dense_rank() over (order by sum(buy)/(sum(view) + sum(cart) + sum(favor)) desc) as '購(gòu)買(mǎi)率排序'from behaviorgroup by user_id;
創(chuàng)建完的視圖如下所示(給出部分?jǐn)?shù)據(jù)預(yù)覽):select *from user_featureorder by 購(gòu)買(mǎi)率 desc;
計(jì)算完每個(gè)客戶的購(gòu)買(mǎi)率之后,接著挖掘高購(gòu)買(mǎi)率客戶與低購(gòu)買(mǎi)率客戶的特征。select count(distinct 購(gòu)買(mǎi)率排序) as 總名次from user_feature;
根據(jù)387 * 20% ≈ 77,高購(gòu)買(mǎi)率客戶可以被定義為前77名的客戶,而根據(jù)387 - 77 = 310,低購(gòu)買(mǎi)率客戶可以被定義為310名之后的客戶。select avg(瀏覽) as '平均瀏覽數(shù)',avg(收藏 + 加購(gòu)) as '平均興趣數(shù)',avg(加購(gòu)率) as '平均加購(gòu)率'from user_featurewhere '購(gòu)買(mǎi)率排序' <= 77;
select count(distinct category_id) as '購(gòu)買(mǎi)品類集中度' from userbehaviorwhere behavior_type ='buy'and user_id in (select distinct user_id from user_featurewhere `購(gòu)買(mǎi)率排序` <= 77);
第三步,查詢出低購(gòu)買(mǎi)率客戶的特征select avg(`瀏覽`) as '平均瀏覽數(shù)',avg(`收藏`+`加購(gòu)`) as '平均興趣數(shù)',avg(`加購(gòu)率`) as '平均加購(gòu)率'from user_featurewhere `購(gòu)買(mǎi)率排序` > 310 and `購(gòu)買(mǎi)率` > 0;
select count(distinct category_id) as '購(gòu)買(mǎi)品類集中度' from userbehaviorwhere behavior_type ='buy'and user_id in (select distinct user_id from user_featurewhere `購(gòu)買(mǎi)率排序` > 310);
總結(jié):#第一步,計(jì)算每位用戶最近一次購(gòu)買(mǎi)行為的間隔天數(shù)select user_id,datediff('2017-12-03',dates) as 'R'from consume as awhere consume_num = (select max(consume_num)from consume as bwhere a.user_id = b.user_id);
查詢結(jié)果如下(給出部分?jǐn)?shù)據(jù)):select user_id,(sum(favor) + sum(cart)) as 'F',sum(buy) as 'M'from behaviorgroup by user_id;
查詢結(jié)果如下(給出部分?jǐn)?shù)據(jù)):create view RFMasselect r.user_id,R,F,Mfrom(select user_id,datediff('2017-12-03',dates) as 'R'from consume as awhere consume_num = (select max(consume_num)from consume as bwhere a.user_id = b.user_id)) as rleft join (select user_id,(sum(favor) + sum(cart)) as 'F',sum(buy) as 'M'from behaviorgroup by user_id) as fmon r.user_id = fm.user_id;
查詢結(jié)果如下(給出部分?jǐn)?shù)據(jù)):#第四步,計(jì)算出RFM的閾值(即比較基準(zhǔn))select avg(R),avg(F),avg(M)from RFM;
第五步,根據(jù)RFM的比較基準(zhǔn)以及RFM的定義,將每位可以劃分到不同的用戶價(jià)值組中create view user_RFM asselect user_id, case when R < 2.4501 and F > 9.3115 and M > 2.8018 then '重要價(jià)值用戶' when R < 2.4501 and F > 9.3115 and M < 2.8018 then '一般價(jià)值用戶' when R < 2.4501 and F < 9.3115 and M > 2.8018 then '重要發(fā)展用戶' when R < 2.4501 and F < 9.3115 and M < 2.8018 then '一般發(fā)展用戶' when R > 2.4501 and F > 9.3115 and M > 2.8018 then '重要喚回用戶' when R > 2.4501 and F > 9.3115 and M < 2.8018 then '一般喚回用戶' when R > 2.4501 and F < 9.3115 and M > 2.8018 then '重要挽留用戶' when R > 2.4501 and F < 9.3115 and M < 2.8018 then '一般挽留用戶' end as '用戶價(jià)值族群'from RFM;
結(jié)果如下所示(給出部分?jǐn)?shù)據(jù)預(yù)覽):select `用戶價(jià)值族群`, count(user_id) '用戶數(shù)', concat(round(count(user_id)/983*100,3),'%') as '用戶數(shù)占比'from user_RFMgroup by `用戶價(jià)值族群`order by `用戶價(jià)值族群`;
結(jié)果如下:關(guān)鍵詞:數(shù)據(jù),分析,用戶,平臺(tái)
客戶&案例
營(yíng)銷資訊
關(guān)于我們
客戶&案例
營(yíng)銷資訊
關(guān)于我們
微信公眾號(hào)
版權(quán)所有? 億企邦 1997-2025 保留一切法律許可權(quán)利。