Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Retrieving Unknown Column Names from DataFrame.apply

How can I retrieve the column names produced by a call to DataFrame.apply without knowing them in advance?

What I'm trying to do is apply a mapping from column names to functions to arbitrary DataFrames. Those functions might return multiple columns. I would like to end up with a DataFrame that contains the original columns as well as the new ones, whose number and names I don't know ahead of time.

Other solutions here are Series-based. I'd like to do the whole frame at once, if possible.

What am I missing here? Are the columns coming back from apply lost in destructuring unless I know their names? It looks like assign might be useful, but will likely require a lot of boilerplate.

import pandas as pd

def fxn(col):
    return pd.Series(col * 2, name=col.name+'2')

df = pd.DataFrame({'A': range(0, 10), 'B': range(10, 0, -1)})
print(df)

# [Edit:]
#    A   B
# 0  0  10
# 1  1   9
# 2  2   8
# 3  3   7
# 4  4   6
# 5  5   5
# 6  6   4
# 7  7   3
# 8  8   2
# 9  9   1

df = df.apply(fxn)
print(df)

# [Edit:]
# Observed: columns changed in-place.
#     A   B
# 0   0  20
# 1   2  18
# 2   4  16
# 3   6  14
# 4   8  12
# 5  10  10
# 6  12   8
# 7  14   6
# 8  16   4
# 9  18   2

df[['A2', 'B2']] = df.apply(fxn)
print(df)

# [Edit: I am doubling column values, so missing something, but the question about the column counts stands.]
# Expected: new columns added. How can I do this at runtime without knowing column names?
#     A   B  A2  B2
# 0   0  40   0  80
# 1   4  36   8  72
# 2   8  32  16  64
# 3  12  28  24  56
# 4  16  24  32  48
# 5  20  20  40  40
# 6  24  16  48  32
# 7  28  12  56  24
# 8  32   8  64  16
# 9  36   4  72   8
Question from: https://stackoverflow.com/questions/65516973/retrieving-unknown-column-names-from-dataframe-apply


1 Reply


Answer on behalf of the OP:

This code does what I wanted:

import pandas as pd

# Simulated business logic: for an input row, return a number of columns
# related to the input, and generate names for them, such that we don't
# know the shape of the output or the names of its columns before the call.
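def fxn(row):
    # Use .iloc for positional access; row[0] on a label-indexed Series
    # is deprecated in recent pandas versions.
    length = int(row.iloc[0])
    indices = [row.index[0] + str(i) for i in range(length)]
    series = pd.Series(list(range(length)), index=indices)
    return series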

# Sample data: 0 to 18, inclusive, counting by 2.
df1 = pd.DataFrame(list(range(0, 20, 2)), columns=['A'])

# Randomize the rows to simulate different input shapes.
df1 = df1.sample(frac=1)

# Apply fxn to rows to get new columns (with expand). Concat to keep inputs.
df1 = pd.concat([df1, df1.apply(fxn, axis=1, result_type='expand')], axis=1)
print(df1)
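
For the column-wise case described in the question (a mapping from column names to functions, each of which may return one or several columns), one possible approach is to skip apply, call each function on its column directly, and concatenate the results. In the question's output, the names set inside fxn never show up after df.apply(fxn), so collecting the pieces yourself keeps whatever names the functions chose. This is only a sketch: doubled, min_max_scaled and add_derived_columns are hypothetical names made up for illustration, not pandas functions.

```python
import pandas as pd

# Hypothetical per-column functions. Each one picks its own output names,
# so the caller does not know them in advance.
def doubled(col):
    # Returns a single new column as a named Series.
    return col.mul(2).rename(f"{col.name}2")

def min_max_scaled(col):
    # Returns two new columns as a DataFrame.
    return pd.DataFrame({
        f"{col.name}_minus_min": col - col.min(),
        f"{col.name}_pct_of_max": col / col.max(),
    })

def add_derived_columns(df, funcs_by_column):
    """Return df plus whatever columns the mapped functions produce."""
    pieces = [df]
    for name, func in funcs_by_column.items():
        out = func(df[name])
        # Normalize a Series result to a one-column frame so that
        # every piece can be concatenated the same way.
        pieces.append(out.to_frame() if isinstance(out, pd.Series) else out)
    return pd.concat(pieces, axis=1)

df = pd.DataFrame({'A': range(0, 10), 'B': range(10, 0, -1)})
result = add_derived_columns(df, {'A': doubled, 'B': min_max_scaled})

# The new column names are only discovered here, at runtime.
new_columns = [c for c in result.columns if c not in df.columns]
print(new_columns)  # ['A2', 'B_minus_min', 'B_pct_of_max']
```

Because every piece shares the original index, pd.concat with axis=1 lines the rows up without any extra bookkeeping. If two functions happen to generate the same column name, the result will contain duplicate labels, so some prefixing scheme would be needed in that case.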
