Creating a tag from one DataFrame column according to a group from another column in pandas python

Question

posted Oct 7, 2021 in Technique by 深蓝 (71.8m points)

The following is a simplified version of my data frame. I have thousands of gene pairs, each of which repeats across different cell_types values. There are 3 cell types, so 9 combinations are possible.

| Gene pairs  | cell_types  | other_data |
|:------------|:------------|:-----------|
| gene4_gene5 | cell1_cell2 |            |
| gene1_gene2 | cell1_cell1 |            |
| gene1_gene2 | cell1_cell3 |            |
| gene2_gene3 | cell3_cell2 |            |
| gene4_gene5 | cell2_cell2 |            |
| gene4_gene5 | cell1_cell2 |            |

Question from: https://stackoverflow.com/questions/65942626/creating-a-tag-from-one-dataframe-column-according-to-a-group-from-another-colum

1 Reply

深蓝 · Answer 1 · 2021-10-06T18:56:37+0000

If you'd like to keep your original data structure, one solution is to use df.loc to select every value in the cell_types column whose row matches the current value of the 'Gene pairs' column, convert the result to a list, and check whether that list contains every entry of a predefined list of cell types that defines a "universal sender":

import pandas as pd

# Example data: one row per (gene pair, cell-type combination)
data = [
    {"Gene pairs": "gene4_gene5", "cell_types": "cell1_cell2"},
    {"Gene pairs": "gene1_gene2", "cell_types": "cell1_cell1"},
    {"Gene pairs": "gene1_gene2", "cell_types": "cell1_cell3"},
    {"Gene pairs": "gene2_gene3", "cell_types": "cell3_cell2"},
    {"Gene pairs": "gene4_gene5", "cell_types": "cell1_cell1"},
    {"Gene pairs": "gene4_gene5", "cell_types": "cell1_cell3"},
]
df = pd.DataFrame(data)

# Cell-type combinations that together define a "universal sender"
required = ["cell1_cell2", "cell1_cell3", "cell1_cell1"]

# For each row, collect all cell_types values recorded for that row's gene pair
# and tag the row only if every required combination is among them
df['new column'] = df['Gene pairs'].apply(
    lambda x: "universal sender"
    if all(item in df.loc[df['Gene pairs'] == x, 'cell_types'].tolist() for item in required)
    else None
)

Output:

|    | Gene pairs   | cell_types   | new column       |
|---:|:-------------|:-------------|:-----------------|
|  0 | gene4_gene5  | cell1_cell2  | universal sender |
|  1 | gene1_gene2  | cell1_cell1  |                  |
|  2 | gene1_gene2  | cell1_cell3  |                  |
|  3 | gene2_gene3  | cell3_cell2  |                  |
|  4 | gene4_gene5  | cell1_cell1  | universal sender |
|  5 | gene4_gene5  | cell1_cell3  | universal sender |
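
To see why only the gene4_gene5 rows are tagged, the check inside the lambda can be run for one gene pair at a time. This is a small sketch on the same example data. The helper name `has_all_required` is made up for illustration and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Gene pairs": ["gene4_gene5", "gene1_gene2", "gene1_gene2",
                   "gene2_gene3", "gene4_gene5", "gene4_gene5"],
    "cell_types": ["cell1_cell2", "cell1_cell1", "cell1_cell3",
                   "cell3_cell2", "cell1_cell1", "cell1_cell3"],
})

# Cell-type combinations that define a "universal sender"
required = ["cell1_cell2", "cell1_cell3", "cell1_cell1"]

def has_all_required(gene_pair):
    # Hypothetical helper: the same test the lambda performs, for one gene pair
    cells = df.loc[df["Gene pairs"] == gene_pair, "cell_types"].tolist()
    return all(item in cells for item in required)

# Report the result once per distinct gene pair
for pair in df["Gene pairs"].unique():
    print(pair, has_all_required(pair))
```

In this data, gene1_gene2 appears with cell1_cell1 and cell1_cell3 but never with cell1_cell2, so it fails the check. gene2_gene3 appears only with cell3_cell2.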

Or you can wrap it in a function for better readability or if you want to add additional filters:

def lookup(row):
    # All cell_types values recorded for this row's gene pair
    cells = df.loc[df['Gene pairs'] == row['Gene pairs'], 'cell_types'].tolist()
    # Tag the row only if every required combination is present
    if all(item in cells for item in ["cell1_cell2", "cell1_cell3", "cell1_cell1"]):
        return "universal sender"
    return None

df['new column'] = df.apply(lookup, axis=1)
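
Both versions above filter the whole data frame again for every row, which may get slow with thousands of gene pairs. A possible alternative, not taken from the original answer, is to group once and reuse the result. The sketch below assumes the same column names and example data; `required`, `cell_sets`, `senders` and `mask` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "Gene pairs": ["gene4_gene5", "gene1_gene2", "gene1_gene2",
                   "gene2_gene3", "gene4_gene5", "gene4_gene5"],
    "cell_types": ["cell1_cell2", "cell1_cell1", "cell1_cell3",
                   "cell3_cell2", "cell1_cell1", "cell1_cell3"],
})

# Cell-type combinations that define a "universal sender"
required = {"cell1_cell2", "cell1_cell3", "cell1_cell1"}

# One pass over the data: the set of cell types seen for each gene pair
cell_sets = df.groupby("Gene pairs")["cell_types"].agg(set)

# Gene pairs whose cell types cover every required combination
senders = [pair for pair, cells in cell_sets.items() if required <= cells]

# Tag every row of a qualifying gene pair; all other rows get None
mask = df["Gene pairs"].isin(senders)
df["new column"] = ["universal sender" if m else None for m in mask]
```

Using sets makes the "contains all of" test a single subset comparison, and additional filters can be added to the list comprehension that builds `senders`.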
