python - Select multiple columns by labels in pandas

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

I've been looking around for ways to select columns through the pandas documentation and the forums, but every example of indexing columns is too simplistic.

Suppose I have a 10 x 10 dataframe (with DataFrame imported from pandas and randn from numpy.random)

df = DataFrame(randn(10, 10), index=range(0,10), columns=['A', 'B', 'C', 'D','E','F','G','H','I','J'])

So far, all the documentation gives is a simple example of indexing like

subset = df.loc[:,'A':'C']

or

subset = df.loc[:,'C':]

But I get an error when I try to index multiple, non-sequential columns, like this

subset = df.loc[:,('A':'C', 'E')]

How would I index in pandas if I wanted to select columns A to C, E, and G to I? It appears that this logic will not work

subset = df.loc[:,('A':'C', 'E', 'G':'I')]

I feel that the solution is pretty simple, but I can't get around this error. Thanks!

1 Reply

深蓝 · Answer 1 · 2021-10-17T00:10:13+0000

Name- or Label-Based (using regular expression syntax)

df.filter(regex='[A-CEG-I]')   # does NOT depend on the column order

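Note that any regular expression is allowed here, so this approach can be very general. E.g. if you wanted all columns starting with a capital or lowercase "A" you could use: df.filter(regex='^[Aa]'). The pattern is searched for anywhere in each column name, so with names longer than one character you may want to anchor it, e.g. '^[A-CEG-I]$'.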
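As a quick check of the regex approach, here is a minimal sketch. It assumes a frame shaped like the one in the question. The second frame, wide, and its column names are made up for illustration and show why anchoring matters once labels are longer than one character.

```python
import numpy as np
import pandas as pd

# Frame shaped like the one in the question (values are random).
df = pd.DataFrame(np.random.randn(10, 10), columns=list("ABCDEFGHIJ"))

# Unanchored character class: keeps every column whose name contains
# at least one of the letters A-C, E, or G-I.
subset = df.filter(regex="[A-CEG-I]")
print(list(subset.columns))

# Hypothetical frame with a longer label, to show the search semantics.
wide = pd.DataFrame(np.zeros((2, 3)), columns=["A", "D", "DATA"])
loose = list(wide.filter(regex="[A-C]").columns)     # "DATA" contains an "A"
strict = list(wide.filter(regex="^[A-C]$").columns)  # whole label must match
print(loose, strict)
```

With single-letter labels the two patterns select the same columns; the anchored form only matters when a label could contain one of the letters without being exactly that letter.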

Location-Based (depends on column order)

df[ list(df.loc[:,'A':'C']) + ['E'] + list(df.loc[:,'G':'I']) ]

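Note that, unlike the regex method, this depends on the order of your columns: a slice such as 'A':'C' returns every column positioned from A through C, so it gives A, B, C only when the columns appear in that order (here, alphabetical). This is not necessarily a problem, however. For example, if your columns go ['A','C','B'], then you could replace 'A':'C' above with 'A':'B'.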
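To make the order dependence concrete, here is a small sketch. The shuffled frame is an assumption added for illustration. The last lines show a purely positional variant using np.r_, which is not part of the answer above but follows the same idea of concatenating ranges.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 10), columns=list("ABCDEFGHIJ"))

# Iterating a DataFrame yields its column labels, so list(...) of a
# label slice gives the labels covered by that slice.
cols = list(df.loc[:, "A":"C"]) + ["E"] + list(df.loc[:, "G":"I"])
subset = df[cols]
print(cols)

# Same data with B and C swapped: the slice follows position, not alphabet.
shuffled = df[["A", "C", "B", "D", "E", "F", "G", "H", "I", "J"]]
a_to_c = list(shuffled.loc[:, "A":"C"])  # stops at C, so B is left out
a_to_b = list(shuffled.loc[:, "A":"B"])  # runs through to B
print(a_to_c, a_to_b)

# Positional alternative (an addition, not from the answer):
# np.r_ concatenates integer ranges into one index array.
pos = df.iloc[:, np.r_[0:3, 4, 6:9]]
print(list(pos.columns))
```

The np.r_ form hard-codes integer positions, so it is even more sensitive to column order than the label slices.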

The Long Way

And for completeness, you always have the option shown by @Magdalena of simply listing each column individually, although it could be much more verbose as the number of columns increases:

df[['A','B','C','E','G','H','I']]   # does NOT depend on the column order
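One practical property of the explicit list, sketched below under the assumption of a recent pandas version (1.0 or later): the result follows the order of the list, and a label that is not in the frame raises a KeyError rather than being silently skipped. The label 'Z' is a made-up example of a missing column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 10), columns=list("ABCDEFGHIJ"))

wanted = ["A", "B", "C", "E", "G", "H", "I"]
subset = df[wanted]

# The output order is the order of the list, not the order in df.
reordered = list(df[["I", "A"]].columns)
print(reordered)

# A label that does not exist is reported instead of being dropped.
try:
    df[["A", "Z"]]
    missing_raised = False
except KeyError as exc:
    missing_raised = True
    print("missing column:", exc)
```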

Results for any of the above methods

          A         B         C         E         G         H         I
0 -0.814688 -1.060864 -0.008088  2.697203 -0.763874  1.793213 -0.019520
1  0.549824  0.269340  0.405570 -0.406695 -0.536304 -1.231051  0.058018
2  0.879230 -0.666814  1.305835  0.167621 -1.100355  0.391133  0.317467
