Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Is it possible in Pandas to drop just empty cells?

Is it possible to drop empty cells (I mean just the cells, not the whole row or column)? I've got a DataFrame where I need all the numbers shifted to the left side, similar to this one:

[Image: frame with empty cells]

[Image: frame as it should look]


1 Reply

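One way to do it is to build a boolean mask with np.isnan and argsort it row by row. Since False sorts before True, this pushes every NaN to the end of its row, and with a stable sort the non-NaN values keep their original order. Then drop the columns in which every value is NaN.

import numpy as np
import pandas as pd

# kind="stable" is needed to guarantee the original order of the non-NaN values;
# NumPy's default sort is not stable.
idx = np.isnan(df.values).argsort(axis=1, kind="stable")
df = pd.DataFrame(
    df.values[np.arange(df.shape[0])[:, None], idx],
    index=df.index,
    columns=df.columns,
).dropna(how="all", axis=1)
print(df)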

      0     1     2     3     4     5     6     7
0   1.0   3.0   4.0   5.0   6.0   8.0   9.0   NaN
1  11.0  12.0  15.0  16.0  17.0  19.0  20.0   NaN
2  22.0  23.0  24.0  25.0  27.0  28.0  29.0  30.0
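
The question's input frame was only posted as an image, so the snippet above cannot be run as it stands. Below is a minimal self-contained sketch of the same approach. The input values are an assumption, reconstructed from the NaN mask and the output shown in this answer, and the names `mask`, `rows` and `result` are only illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Assumed input: rebuilt from the mask and output in the answer, not from the original image.
df = pd.DataFrame([
    [1, nan, 3, 4, 5, 6, nan, 8, 9, nan],
    [11, 12, nan, nan, 15, 16, 17, nan, 19, 20],
    [nan, 22, 23, 24, 25, nan, 27, 28, 29, 30],
])

mask = np.isnan(df.values)                  # True where a cell is NaN
idx = mask.argsort(axis=1, kind="stable")   # per row: non-NaN positions first, NaN positions last
rows = np.arange(df.shape[0])[:, None]      # column vector of row numbers, broadcast against idx

result = pd.DataFrame(
    df.values[rows, idx],                   # reorder each row by its own index list
    index=df.index,
    columns=df.columns,
).dropna(how="all", axis=1)                 # drop columns that are now entirely NaN

print(result)
```

With this assumed input, the printed frame should match the output shown above.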

Details

np.isnan(df.values)

#      non-nan val  nan value
#          |        |
# array([[False,  True, False, False, False, False,  True, False, False, True],
#        [False, False,  True,  True, False, False, False,  True, False, False],
#        [ True, False, False, False, False,  True, False, False, False, False]])

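# False ⟶ 0, True ⟶ 1
# When each row is sorted, all True values (i.e. NaN) are pushed to the right.
# kind="stable" keeps the non-NaN positions in their original order.

idx = np.isnan(df.values).argsort(axis=1, kind="stable")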

# array([[0, 2, 3, 4, 5, 7, 8, 1, 6, 9],
#        [0, 1, 4, 5, 6, 8, 9, 2, 3, 7],
#        [1, 2, 3, 4, 6, 7, 8, 9, 0, 5]], dtype=int64)

# Now index `df.values` with `idx`, one row at a time

pd.DataFrame(
    df.values[np.arange(df.shape[0])[:, None], idx],
    index=df.index,  # This is important if you have a custom index, say a, b, c...
    columns=df.columns,  # Likewise if you have custom column names
)

#       0     1     2     3     4     5     6     7   8   9
# 0   1.0   3.0   4.0   5.0   6.0   8.0   9.0   NaN NaN NaN
# 1  11.0  12.0  15.0  16.0  17.0  19.0  20.0   NaN NaN NaN
# 2  22.0  23.0  24.0  25.0  27.0  28.0  29.0  30.0 NaN NaN
#                                                    ⤓ ⤓ 
#                             `_.dropna(how='all', axis=1)`: these columns contain
#                              only NaN, so they are dropped
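
One caveat worth noting: np.isnan only works on numeric arrays, so it raises a TypeError when the frame holds strings or other objects. A possible variation, sketched here as an assumption and not part of the original answer, builds the mask with df.isna() and reorders with np.take_along_axis. The helper name `shift_left` and the sample frame are hypothetical.

```python
import numpy as np
import pandas as pd


def shift_left(df: pd.DataFrame) -> pd.DataFrame:
    """Move the non-missing cells of each row to the left (hypothetical helper)."""
    values = df.to_numpy()
    mask = df.isna().to_numpy()                        # works for object columns too
    idx = mask.argsort(axis=1, kind="stable")          # missing cells last, order preserved
    shifted = np.take_along_axis(values, idx, axis=1)  # apply each row's ordering
    return pd.DataFrame(shifted, index=df.index, columns=df.columns).dropna(
        how="all", axis=1
    )


# Hypothetical sample frame with string cells and gaps
sample = pd.DataFrame({"a": [None, "x"], "b": ["y", None], "c": ["z", "w"]})
shifted = shift_left(sample)
print(shifted)
```

The column labels stay in place while the values move, so after shifting, a label no longer says which column a value originally came from.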
