Take the following dataframe:
import pandas as pd
df = pd.DataFrame({'col_1':[0, 1], 'col_2':['here 123', 'here 456']})
Result:
   col_1     col_2
0      0  here 123
1      1  here 456
I need to create a third column (broadcasting), using a condition on col_1 and splitting the string in col_2. This works:
df['col_3'] = float('NaN')
df.loc[df['col_1'] == 1, ['col_3']] = df['col_2'].str.slice(5, 8)
Result:
   col_1     col_2 col_3
0      0  here 123   NaN
1      1  here 456   456
But I need to specify dynamic indexes to split the string in col_2, instead of the fixed (5, 8).
When I try to run the following code, it does not work, because df['col_2'] is a Series, not a single string:
df.loc[df['col_1'] == 1, ['col_3']] = df['col_2'].split(' ')[0]
I've spent a lot of time trying to solve this without iterating over the dataframe.
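One possible vectorized approach, sketched here under assumptions, is to go through the .str accessor, which applies string methods element-wise to a Series. The sketch assumes the wanted part is the token after the space (index 1, matching the slice(5, 8) result above, rather than the [0] in the failing line), and it uses Series.where to apply the col_1 condition instead of assigning through df.loc. The names mask and tokens are introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [0, 1], 'col_2': ['here 123', 'here 456']})

# Boolean mask for the rows that satisfy the condition on col_1.
mask = df['col_1'] == 1

# .str.split(' ') splits every string in the column into a list of tokens;
# the following .str[1] picks the second token of each list
# (assumption: the part after the space is the one wanted).
tokens = df['col_2'].str.split(' ').str[1]

# Series.where keeps the values where mask is True and fills the rest
# with a missing value, so no explicit loop over rows is needed.
df['col_3'] = tokens.where(mask)

print(df)
```

Because the split is done per element, the position of the separator can differ from row to row, which is what a fixed slice(5, 8) cannot handle.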
question from:
https://stackoverflow.com/questions/65893903/how-can-i-use-split-in-a-string-when-broadcasting-a-dataframes-column