python - Pandas: Sampling columns based on weights

posted Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have a pandas data frame with three columns containing probabilities:

Prob0 Prob1 Prob2
 0.1   0.6    0.3
 0.2   0.1    0.7

I need to generate a column that contains, for each row, the value 0 with probability Prob0, the value 1 with probability Prob1 and the value 2 with probability Prob2.

Alternatively, I am happy if I generate a column that contains the value Prob0 with probability Prob0, the value Prob1 with probability Prob1 and the value Prob2 with probability Prob2.

I have tried with the sample function, but it does not work:

population['ChoiceProba'] = population[['Prob0', 'Prob1', 'Prob2']].sample(weights=population[['Prob0', 'Prob1', 'Prob2']], axis=1)

I receive the error message:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I have also considered the numpy.random.choice function, but did not manage to combine it with a pandas statement without a loop. I would like to avoid a loop, as I have 1,000,000 rows.

Question from: https://stackoverflow.com/questions/65897751/pandas-sampling-columns-based-on-weights

1 Reply

深蓝 · Answer 1 · 2021-10-06T19:16:03+0000

Hope I got your point.

Since I have no information about your factors, I followed your description and used 0, 1 and 0.5.

Here are three options to create this:

Using dot() (recommended)

weights = [0,1,0.5]
df['ChoiceProba'] = df.dot(weights)

Using sum()

weights = [0,1,0.5]
df['ChoiceProba'] = (df * weights).sum(1)

Using apply()

weights = [0,1,0.5]
df['ChoiceProba'] = df.apply(lambda x: weights[0]*x['Prob0']+weights[1]*x['Prob1']+weights[2]*x['Prob2'], axis=1)
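To check that the three options agree, here is a small self-contained sketch. It assumes df holds only the three probability columns from the question; the names via_dot, via_sum and via_apply are just illustrative.

```python
import numpy as np
import pandas as pd

# Example frame from the question (assumed to contain only the three columns).
df = pd.DataFrame({
    "Prob0": [0.1, 0.2],
    "Prob1": [0.6, 0.1],
    "Prob2": [0.3, 0.7],
})
weights = [0, 1, 0.5]

# Option 1: matrix-vector product.
via_dot = df.dot(weights)

# Option 2: column-wise multiplication, then a row sum.
via_sum = (df * weights).sum(axis=1)

# Option 3: row-wise apply (much slower on large frames).
via_apply = df.apply(
    lambda x: weights[0] * x["Prob0"] + weights[1] * x["Prob1"] + weights[2] * x["Prob2"],
    axis=1,
)

# All three should give the same weighted sum per row.
assert np.allclose(via_dot, via_sum)
assert np.allclose(via_dot, via_apply)
```

With 1,000,000 rows the dot() and sum() variants stay vectorized, while apply() calls the lambda once per row.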

Output

The output is the same for all variations:

   Prob0  Prob1  Prob2  ChoiceProba
0    0.1    0.6    0.3         0.75
1    0.2    0.1    0.7         0.45
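Note that the options above compute a weighted sum per row, not a random draw. If what is needed is an actual draw per row (0 with probability Prob0, 1 with probability Prob1, 2 with probability Prob2), one possible loop-free approach is inverse-CDF sampling: take the cumulative sum across the columns and compare it with one uniform random number per row. This is only a sketch under assumptions: the helper sample_choice and the columns Choice and ChoiceProba as used here are hypothetical names, and each row is assumed to hold non-negative weights with a positive sum.

```python
import numpy as np
import pandas as pd

def sample_choice(probs: pd.DataFrame, rng: np.random.Generator) -> np.ndarray:
    """Draw one column index per row, using each row as probabilities."""
    p = probs.to_numpy(dtype=float)
    cdf = np.cumsum(p, axis=1)
    cdf /= cdf[:, -1:]                    # normalize so the last entry is exactly 1
    u = rng.random(len(p))[:, None]       # one uniform draw in [0, 1) per row
    # The chosen index is the number of cumulative thresholds that u has reached.
    return (u >= cdf).sum(axis=1)

cols = ["Prob0", "Prob1", "Prob2"]
population = pd.DataFrame({
    "Prob0": [0.1, 0.2],
    "Prob1": [0.6, 0.1],
    "Prob2": [0.3, 0.7],
})

rng = np.random.default_rng(0)            # seeded only for reproducibility
idx = sample_choice(population[cols], rng)
population["Choice"] = idx

# Alternative from the question: store the probability of the drawn column.
population["ChoiceProba"] = population[cols].to_numpy()[np.arange(len(population)), idx]
```

Everything here works on whole arrays, so it should scale to a million rows without a Python-level loop.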
