Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Iteratively saving outputs in a pandas dataframe

My dataset contains a column with names that I want to check in a for loop:

Name      Age

John      32
Luke      23
Christine  54
Mary      39
AnneMarie  42
Eoin      23

I need to check each name against a website that returns pairs of the form ('name', score), where score is a number. The pairs come from the following code (it is not runnable as shown; it is an excerpt that only illustrates how I obtain the data I want in my dataframe):

for name in df['Name']:

    # missing code
    for c in zip(names, scores):
        print(c)

For example, when name = John, printing c gives the following output:

('Julie', 6.7)
('Michael', 3.4)
('John John', 3.1)
('Ludo', 3.0)
('Chris', 3.0)

When name = Luke, printing c gives the following output:

('Mary', 2.7)
('Michael', 2.1)
('Bill', 3.5)
('Jess', 3.2)

and so on.

I would like to add this information to my dataframe, so that it looks something like this:

Name       Age  Friends                                   Score

John       32   [Julie, Michael, John John, Ludo, Chris]  [6.7, 3.4, 3.1, 3.0, 3.0]
Luke       23   [Mary, Michael, Bill, Jess]               [2.7, 2.1, 3.5, 3.2]
Christine  54
Mary       39
AnneMarie  42   ....
Eoin       23

I would appreciate help on how to build a dataframe like this from the results c obtained for each name in the Name column.



1 Reply


Try:

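# create object-dtype columns so that each cell can hold a list
df['Friends'] = None
df['Score'] = None

# iterate with the index; Series.iteritems() was removed in pandas 2.0, so use items()
for idx, name in df['Name'].items():

    # missing code
    for c in zip(names, scores):
        print(c)

    # .at sets exactly one cell; .loc with a list value can fail with a length-mismatch error
    df.at[idx, 'Friends'] = names
    df.at[idx, 'Score'] = scores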
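
For a version that runs end to end, here is a minimal sketch of the per-row approach. The website call is not shown in the question, so `fake_lookup` and `SAMPLE_RESULTS` are hypothetical stand-ins that simply return the sample pairs quoted above; replace them with the real lookup. The sketch uses `Series.items()` and `DataFrame.at`, both of which exist in current pandas releases.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Luke"],
    "Age": [32, 23],
})

# Hypothetical stand-in for the website lookup (the "missing code").
SAMPLE_RESULTS = {
    "John": (["Julie", "Michael", "John John", "Ludo", "Chris"],
             [6.7, 3.4, 3.1, 3.0, 3.0]),
    "Luke": (["Mary", "Michael", "Bill", "Jess"],
             [2.7, 2.1, 3.5, 3.2]),
}


def fake_lookup(name):
    """Return (names, scores) for a name; empty lists if nothing is known."""
    return SAMPLE_RESULTS.get(name, ([], []))


# Object-dtype columns, so that a single cell can store a whole list.
df["Friends"] = None
df["Score"] = None

for idx, name in df["Name"].items():
    names, scores = fake_lookup(name)
    # .at addresses one cell by row label and column name.
    df.at[idx, "Friends"] = names
    df.at[idx, "Score"] = scores

print(df)
```

Creating the two columns before the loop matters here: writing a list into a cell is only straightforward when the column already exists with object dtype.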

Alternatively, and preferably, collect all the names and scores in lists and assign them once after the for loop:

# initialization
name_lists, score_lists = [], []

for name in df['Name']:

    # missing code
    for c in zip(names, scores):
        print(c)

    name_lists.append(names)
    score_lists.append(scores)

# update the data frame
df['Friends'] = name_lists
df['Score'] = score_lists

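The second version is usually somewhat faster, because it writes each new column once instead of updating the dataframe one cell at a time. Appending to a plain Python list is cheap; it is the repeated cell-by-cell assignment that can become very slow on larger dataframes.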
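
The collect-then-assign version can be sketched end to end in the same way. As before, `fake_lookup` and `SAMPLE_RESULTS` are hypothetical placeholders for the website code that the question leaves out, and the data are the sample values from the question. Names with no known result get empty lists here, which is an assumption made for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Luke", "Christine"],
    "Age": [32, 23, 54],
})

# Hypothetical stand-in for the website lookup (the "missing code").
SAMPLE_RESULTS = {
    "John": (["Julie", "Michael", "John John", "Ludo", "Chris"],
             [6.7, 3.4, 3.1, 3.0, 3.0]),
    "Luke": (["Mary", "Michael", "Bill", "Jess"],
             [2.7, 2.1, 3.5, 3.2]),
}


def fake_lookup(name):
    """Return (names, scores) for a name; empty lists if nothing is known."""
    return SAMPLE_RESULTS.get(name, ([], []))


# One list entry per row, in the same order as df["Name"].
name_lists, score_lists = [], []
for name in df["Name"]:
    names, scores = fake_lookup(name)
    name_lists.append(names)
    score_lists.append(scores)

# Assign each column once; every cell ends up holding one list.
df["Friends"] = name_lists
df["Score"] = score_lists

print(df)
```

Because the lists are built in row order, their length always matches the number of rows, which is what the final column assignment requires.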

