Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - DF from list of tuples encoded by first value of tuple

I have a list of tuples where each tuple has the following structure:

(domain, field_1, field_2, field_3,..., field_n)

Each field_x either has some value (string or integer) or is None. The domain field has a limited number of unique values. I would like to end up with a dataframe where each column represents a unique value of the domain and each row gives the fraction of None values in field_x for that domain.

Example:

[(domain1, None, str,  str,  None),
 (domain2, int , str,  None, str),
 (domain1, int , None, str,  str)]

becomes:

          domain1     domain2
field_1    0.5          0
field_2    0.5          0
field_3     0           1
field_4    0.5          0

Thanks for any help!

question from:https://stackoverflow.com/questions/65841853/df-from-list-of-tuples-encoded-by-first-value-of-tuple

1 Reply

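First pass the list to the DataFrame constructor, set the first column (0) as the index, test for None with DataFrame.isna, take the mean per index value to get the fraction of None values, add the prefix field_ to the column names, remove the index name 0 with rename_axis(None), and finally transpose: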

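import pandas as pd

# L is the list of tuples from the question.
# mean(level=0) was removed in pandas 2.0; groupby(level=0).mean() is the equivalent.
df = (pd.DataFrame(L)
        .set_index(0)          # domain becomes the index
        .isna()                # True where a field is None
        .groupby(level=0)
        .mean()                # fraction of None values per domain
        .add_prefix('field_')
        .rename_axis(None)     # drop the index name 0
        .T)
print(df)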
         domain1  domain2
field_1      0.5      0.0
field_2      0.5      0.0
field_3      0.0      1.0
field_4      0.5      0.0
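
The chain above never defines L, so here is a minimal self-contained sketch that can be run as-is. The sample values in rows and the function name none_fraction_by_domain are made up for illustration and are not from the original post. It assumes a recent pandas version and therefore uses groupby(level=0).mean() as the per-domain mean.

```python
import pandas as pd

# Hypothetical sample data with the same None pattern as the question's example.
rows = [
    ("domain1", None, "a", "b", None),
    ("domain2", 1, "c", None, "d"),
    ("domain1", 2, None, "e", "f"),
]


def none_fraction_by_domain(rows):
    """Return the fraction of None values: one row per field, one column per domain."""
    return (
        pd.DataFrame(rows)
        .set_index(0)           # first tuple element (the domain) becomes the index
        .isna()                 # boolean frame: True where the value is None
        .groupby(level=0)
        .mean()                 # mean of booleans = fraction of None per domain
        .add_prefix("field_")   # columns 1..n become field_1..field_n
        .rename_axis(None)      # remove the leftover index name 0
        .T                      # fields as rows, domains as columns
    )


result = none_fraction_by_domain(rows)
print(result)

# Multiply by 100 if real percentages are wanted instead of fractions.
print(result * 100)
```

The values are fractions between 0 and 1, which matches the expected output in the question; the last line shows one way to turn them into percentages.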
