Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
185 views
in Technique[技术] by (71.8m points)

python - Replace duplicate values across columns in Pandas

I have a simple dataframe as such:

df = [    {'col1' : 'A', 'col2': 'B', 'col3':   'C', 'col4':'0'},
          {'col1' : 'M', 'col2':   '0', 'col3': 'M', 'col4':'0'},
          {'col1' : 'B', 'col2':  'B', 'col3':  '0', 'col4':'B'},
          {'col1' : 'X', 'col2':  '0', 'col3':  'Y', 'col4':'0'}
          ]
df = pd.DataFrame(df)
df = df[['col1', 'col2', 'col3', 'col4']]
df  

Which looks like this:

| col1 | col2 | col3 | col4 |
|------|------|------|------|
| A    | B    | C    | 0    |
| M    | 0    | M    | 0    |
| B    | B    | 0    | B    |
| X    | 0    | Y    | 0    |

I just want to replace repeated characters with the character '0', across the rows. It boils down to keeping the first duplicate value we come across, as like this:

| col1 | col2 | col3 | col4 |
|------|------|------|------|
| A    | B    | C    | 0    |
| M    | 0    | 0    | 0    |
| B    | 0    | 0    | 0    |
| X    | 0    | Y    | 0    |

This seems so simple but I'm stuck. Any nudges in the right direction would be really appreciated.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can use the duplicated method to return a boolean indexer of whether elements are duplicates or not:

In [214]: pd.Series(['M', '0', 'M', '0']).duplicated()
Out[214]:
0    False
1    False
2     True
3     True
dtype: bool

Then you could create a mask by mapping this across the rows of your dataframe, and using where to perform your substitution:

is_duplicate = df.apply(pd.Series.duplicated, axis=1)
df.where(~is_duplicate, 0)

  col1 col2 col3 col4
0    A    B    C    0
1    M    0    0    0
2    B    0    0    0
3    X    0    Y    0

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...