I am using pandas in Python. I have two DataFrames, df1 and df2. df1 contains header-level data, such as the card ID and the issued-on date. df2 has granular data: each row is a transaction performed by a specific card ID. The card_id column is common to both DataFrames.
df1:
first_active_month card_id feature_1 feature_2 feature_3
2017-06 C_ID_92a2005557 5 2 1
2017-01 C_ID_3d0044924f 4 1 0
2016-08 C_ID_d639edf6cd 2 2 0
2017-09 C_ID_186d6a6901 4 3 0
2017-11 C_ID_cdbd2c0db2 1 3 0
df2:
junk_id authorized_flag card_id city_id Authorized
13292136 Y C_ID_92a2005557 101 N
20069042 Y C_ID_7a238b3713 69 N
5029656 Y C_ID_92a2005557 17 N
16356907 N C_ID_3d0044924f -1 Y
8203441 Y C_ID_fcf33361c2 17 N
I want to add a column "frequency" to df1 that shows how many times each card_id of df1 occurs in df2. df1 should then look like this:
df1 (after executing the command):
first_active_month card_id feature_1 feature_2 feature_3 frequency
2017-06 C_ID_92a2005557 5 2 1 2
2017-01 C_ID_3d0044924f 4 1 0 5
2016-08 C_ID_d639edf6cd 2 2 0 3
2017-09 C_ID_186d6a6901 4 3 0 1
2017-11 C_ID_cdbd2c0db2 1 3 0 7
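For illustration, here is a minimal sketch of one way such a column could be built, assuming the sample rows shown above are loaded as-is. It counts the card_id values in df2 with `value_counts()` and maps those counts onto df1. The variable name `counts` is just a placeholder. Because only five rows of df2 are shown here, the resulting numbers differ from the table above.

```python
import pandas as pd

# Sample rows copied from the question.
df1 = pd.DataFrame({
    "first_active_month": ["2017-06", "2017-01", "2016-08", "2017-09", "2017-11"],
    "card_id": ["C_ID_92a2005557", "C_ID_3d0044924f", "C_ID_d639edf6cd",
                "C_ID_186d6a6901", "C_ID_cdbd2c0db2"],
    "feature_1": [5, 4, 2, 4, 1],
    "feature_2": [2, 1, 2, 3, 3],
    "feature_3": [1, 0, 0, 0, 0],
})
df2 = pd.DataFrame({
    "junk_id": [13292136, 20069042, 5029656, 16356907, 8203441],
    "authorized_flag": ["Y", "Y", "Y", "N", "Y"],
    "card_id": ["C_ID_92a2005557", "C_ID_7a238b3713", "C_ID_92a2005557",
                "C_ID_3d0044924f", "C_ID_fcf33361c2"],
    "city_id": [101, 69, 17, -1, 17],
    "Authorized": ["N", "N", "N", "Y", "N"],
})

# Number of rows in df2 per card_id (a Series indexed by card_id).
counts = df2["card_id"].value_counts()

# Look up each df1 card_id in that Series; ids absent from df2 become 0.
df1["frequency"] = df1["card_id"].map(counts).fillna(0).astype(int)

print(df1[["card_id", "frequency"]])
# frequency for the five sample rows: 2, 1, 0, 0, 0
```

Card IDs that appear only in df2 (for example C_ID_7a238b3713) are simply ignored, since the lookup is driven by df1.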
Please note: I am new to Python and pandas. I have already gone through multiple threads on this site, but all of them dealt with counting within the same DataFrame. I am looking for a way to count using join/merge functionality.
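To make the join/merge idea concrete, here is a hedged sketch of how it might look, assuming only the card_id column matters for the count. It aggregates df2 per card_id first and then left-merges the result onto df1, so every row of df1 is kept. The names `freq` and `merged` are placeholders, and the frames are trimmed to the relevant columns of the sample data.

```python
import pandas as pd

# Trimmed versions of the sample frames: only the columns needed here.
df1 = pd.DataFrame({
    "card_id": ["C_ID_92a2005557", "C_ID_3d0044924f", "C_ID_d639edf6cd",
                "C_ID_186d6a6901", "C_ID_cdbd2c0db2"],
    "feature_1": [5, 4, 2, 4, 1],
})
df2 = pd.DataFrame({
    "card_id": ["C_ID_92a2005557", "C_ID_7a238b3713", "C_ID_92a2005557",
                "C_ID_3d0044924f", "C_ID_fcf33361c2"],
})

# One row per card_id in df2, with the number of transactions.
freq = df2.groupby("card_id").size().reset_index(name="frequency")

# Left merge keeps all rows of df1, in their original order.
merged = df1.merge(freq, on="card_id", how="left")

# Cards with no transactions get NaN from the merge; turn that into 0.
merged["frequency"] = merged["frequency"].fillna(0).astype(int)

print(merged)
```

Aggregating before merging keeps the result at one row per card; merging the raw df2 directly would instead repeat each df1 row once per matching transaction.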