Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - CSV conditional changes in rows and column

I have data like below:

idx A B C D
0 0.0 0.0 0.0 apple
1 0.5 0.5 0.6 car
2 0.7 0.7 0.2 vegetables
3 0.8 0.9 0.4 fruits
4 0.9 1.0 0.8 metal
idx E 
0 0.000006
idx A B C D
0 1.0 1.1 0.1 computer
1 0.8 1.6 1.0 books
2 0.9 1.9 1.1 textile
idx E
0 1.000009
idx A B C D
0 0.7 2.5 2 mouse
1 0.6 2.9 3 animals
2 0.5 3.0 2 birds
3 0.9 3.3 4 flower
4 1.0 3.4 5 garden
5 1.0 3.8 1 desk
6 0.85 3.9 8 tea
7 0.2 4.2 9 bread
8 0.1 4.9 3 paper
9 0.7 7.6 6 butter
idx E
0 0.9

Wherever an idx E block appears, I want to remove the repeated header, repeat the last row above it with a dot in place of the column D value (and 0.0 in column C), and move E into its own column, with its value repeated for every row of the corresponding block. I want to make these changes conditionally with Python, so that the result looks like this:

idx A B C D E
0 0.0 0.0 0.0 apple 0.000006
1 0.5 0.5 0.6 car 0.000006
2 0.7 0.7 0.2 vegetables 0.000006
3 0.8 0.9 0.4 fruits 0.000006
4 0.9 1.0 0.8 metal 0.000006
5 0.9 1.0 0.0 . 0.000006
6 1.0 1.1 0.1 computer 1.000009
7 0.8 1.6 1.0 books 1.000009
8 0.9 1.9 1.1 textile 1.000009
9 0.9 1.9 0.0 . 1.000009
10 0.7 2.5 2 mouse 0.9
11 0.6 2.9 3 animals 0.9
12 0.5 3.0 2 birds 0.9
13 0.9 3.3 4 flower 0.9
14 1.0 3.4 5 garden 0.9
15 1.0 3.8 1 desk 0.9
16 0.85 3.9 8 tea 0.9
17 0.2 4.2 9 bread 0.9
18 0.1 4.9 3 paper 0.9
19 0.7 7.6 6 butter 0.9
20 0.7 7.6 0.0 . 0.9

Is there a way to do this with conditional looping over such a dataframe?

question from:https://stackoverflow.com/questions/65902248/csv-conditional-changes-in-rows-and-column

1 Reply

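First remove the repeated header rows (those with A in column A) by Series.isin, with the mask inverted by ~ in boolean indexing, then create a default index. Rows with E in column A are kept for now, because the next step uses them as markers:

df = df[~df['A'].isin(['A'])].reset_index(drop=True)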
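
The answer does not show how df was loaded. A minimal sketch of one way to get there, assuming the file is whitespace-separated, the idx column is used as the index, and everything is read as strings (the raw variable and the dtype=str choice are my assumptions, not part of the original post). It also assumes only the repeated idx A B C D header rows are dropped at this stage, so the E marker rows are still there for the masks that follow:

```python
import io

import pandas as pd

# First two blocks of the sample data from the question
raw = """idx A B C D
0 0.0 0.0 0.0 apple
1 0.5 0.5 0.6 car
2 0.7 0.7 0.2 vegetables
3 0.8 0.9 0.4 fruits
4 0.9 1.0 0.8 metal
idx E
0 0.000006
idx A B C D
0 1.0 1.1 0.1 computer
1 0.8 1.6 1.0 books
2 0.9 1.9 1.1 textile
idx E
0 1.000009
"""

# Short lines ("idx E", "0 0.000006") are padded with NaN in B, C and D
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", dtype=str, index_col="idx")

# Drop the repeated header rows only; rows with 'E' in column A stay as markers
df = df[~df["A"].isin(["A"])].reset_index(drop=True)
print(df)
```

After this, column A holds the data values, the literal E on each marker row, and the E value on the row right below each marker.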

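Then build two boolean masks: m1 marks the rows with E in column A, and m marks the rows directly below them, which hold the E values. Create column E by keeping only the values under m with Series.where and back filling the missing values. Next, set the marker and value rows of A, B and C to missing values with DataFrame.mask and forward fill them, so they repeat the last row above. Finally set . in column D and 0 in column C for the rows in m, and drop the marker rows: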
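
To see what the masks and the two fills do in isolation, here is a small sketch on a single toy Series (the Series a and the names e and filled are made up for illustration; the values are strings, as they would be in a column that also contains the text E):

```python
import pandas as pd

a = pd.Series(["0.8", "0.9", "E", "0.000006", "1.0", "E", "1.000009"])

m1 = a.eq("E")          # the marker rows
m = a.shift().eq("E")   # rows right below a marker: these hold the E value

# Keep only the E values, then spread each one upwards over its block
e = a.where(m).bfill()

# Blank the marker and value rows, then copy the last data value downwards
filled = a.mask(m | m1).ffill()

print(pd.DataFrame({"a": a, "m1": m1, "m": m, "e": e, "filled": filled}))
```

Here e becomes 0.000006 for the first four rows and 1.000009 for the last three, while filled repeats 0.9 and 1.0 over the rows that originally held a marker or an E value.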

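# rows holding the E value sit directly below an 'E' marker row
m = df['A'].shift().eq('E')
# the 'E' marker rows themselves
m1 = df['A'].eq('E')

# keep only the E values, then back fill them over each block above
df['E'] = df['A'].where(m).bfill()

# blank A, B, C on marker and value rows, then repeat the last data row
df[['A','B', 'C']] = df[['A','B', 'C']].mask(m | m1).ffill()
df.loc[m, 'D'] = '.'
df.loc[m, 'C'] = 0

# drop the marker rows and renumber
df = df[~m1].reset_index(drop=True)
print(df)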
       A    B    C           D         E
0    0.0  0.0  0.0       apple  0.000006
1    0.5  0.5  0.6         car  0.000006
2    0.7  0.7  0.2  vegetables  0.000006
3    0.8  0.9  0.4      fruits  0.000006
4    0.9  1.0  0.8       metal  0.000006
5    0.9  1.0    0           .  0.000006
6    1.0  1.1  0.1    computer  1.000009
7    0.8  1.6  1.0       books  1.000009
8    0.9  1.9  1.1     textile  1.000009
9    0.9  1.9    0           .  1.000009
10   0.7  2.5    2       mouse       0.9
11   0.6  2.9    3     animals       0.9
12   0.5  3.0    2       birds       0.9
13   0.9  3.3    4      flower       0.9
14   1.0  3.4    5      garden       0.9
15   1.0  3.8    1        desk       0.9
16  0.85  3.9    8         tea       0.9
17   0.2  4.2    9       bread       0.9
18   0.1  4.9    3       paper       0.9
19   0.7  7.6    6      butter       0.9
20   0.7  7.6    0           .       0.9
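
Since the question asks about conditional looping: the same result can also be built with a plain loop over the raw lines before any dataframe exists. This is only an alternative sketch, not part of the answer above. The function name reshape_blocks is hypothetical, and it assumes whitespace-separated lines, no spaces inside the D values, and that every data block is closed by an idx E block (rows after the last E block would be dropped). All values stay strings.

```python
import pandas as pd


def reshape_blocks(lines):
    """Collect data rows per block; when the E value arrives, add the
    filler row and stamp every row of the block with that E value."""
    rows, block = [], []
    expect_e = False
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "idx":
            # "idx E" announces that the next line carries the E value;
            # "idx A B C D" is just a repeated header and is skipped
            expect_e = parts[1:] == ["E"]
            continue
        if expect_e:
            e_value = parts[1]
            if block:
                last = block[-1]
                # repeat A and B of the last row, C = 0, D = "."
                block.append([last[0], last[1], "0", "."])
            rows.extend(r + [e_value] for r in block)
            block, expect_e = [], False
        else:
            block.append(parts[1:5])  # drop the per-block idx
    return pd.DataFrame(rows, columns=["A", "B", "C", "D", "E"])


# First two blocks of the sample data from the question
sample = """idx A B C D
0 0.0 0.0 0.0 apple
1 0.5 0.5 0.6 car
2 0.7 0.7 0.2 vegetables
3 0.8 0.9 0.4 fruits
4 0.9 1.0 0.8 metal
idx E
0 0.000006
idx A B C D
0 1.0 1.1 0.1 computer
1 0.8 1.6 1.0 books
2 0.9 1.9 1.1 textile
idx E
0 1.000009
"""

out = reshape_blocks(sample.splitlines())
print(out)
```

The loop is easier to follow step by step, while the mask-based version above stays inside pandas and avoids the explicit Python loop.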
