python - transform dataframe to dataframe with continuous index and columns

posted Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

Let's assume that we have the following DataFrame p:

x    1     2     5
y                 
1  0.5  0.00  0.25
3  0.0  0.25  0.00

As you can see, the x values skip 3 and 4, and the y values skip 2, so neither the column labels nor the index labels are continuous. Since I want to plot the array via imshow, I need to extend the DataFrame p with the missing labels, resulting in:

x    1     2  3  4     5
y                       
1  0.5  0.00  0  0  0.25
2  0.0  0.00  0  0  0.00
3  0.0  0.25  0  0  0.00

I can achieve this by writing custom functions:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

def make_columns_continuous(p):
    p = p.copy()  # avoid modifying the caller's DataFrame
    for val in range(p.columns.min(), p.columns.max()):
        if val not in p.columns:
            p[val] = 0

    # sort the column labels so the inserted ones fall into place
    return p.sort_index(axis=1)

def make_rows_continuous(p):
    p = p.copy()  # avoid modifying the caller's DataFrame
    for val in range(p.index.min(), p.index.max()):
        if val not in p.index:
            # DataFrame.append was removed in pandas 2.0; add the row by label instead
            p.loc[val] = 0

    # sort the row labels so the inserted ones fall into place
    return p.sort_index(axis=0)

df = pd.DataFrame({'x':[1,1,2,5],'y':[1,1,3,1]})

p = pd.crosstab(df.y,df.x,normalize=True)

#creates the data frame
#x    1     2     5
#y                 
#1  0.5  0.00  0.25
#3  0.0  0.25  0.00 


p = make_columns_continuous(p)
p = make_rows_continuous(p)

#yields:
#x    1     2  3  4     5
#y                       
#1  0.5  0.00  0  0  0.25
#2  0.0  0.00  0  0  0.00
#3  0.0  0.25  0  0  0.00

Is there a better way to achieve this transformation? Are there even built-in pandas functions? Something like DataFrame to sparse matrix?

Question from: https://stackoverflow.com/questions/65830215/transform-dataframe-to-dataframe-with-continuous-index-and-columns

1 Reply

深蓝 · Answer 1 · 2021-10-06T19:37:16+0000

Another option would be to create a new DataFrame of the final size you want, and then fill it with the data from p:

p.columns = p.columns.astype(int)
newrows = range(p.index.min(), p.index.max()+1)
newcols = range(p.columns.min(), p.columns.max()+1)

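# use a float fill value so the float data from p can be assigned without a dtype change
df = pd.DataFrame(index=newrows, columns=newcols, data=0.0)

#      1    2    3    4    5
# 1  0.0  0.0  0.0  0.0  0.0
# 2  0.0  0.0  0.0  0.0  0.0
# 3  0.0  0.0  0.0  0.0  0.0

# assignment aligns on the row and column labels of p
df.loc[p.index, p.columns] = p

#      1     2    3    4     5
# 1  0.5  0.00  0.0  0.0  0.25
# 2  0.0  0.00  0.0  0.0  0.00
# 3  0.0  0.25  0.0  0.0  0.00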
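
On the question of built-in functions: one candidate is DataFrame.reindex, which can take a full set of row and column labels plus a fill_value for any cell that did not exist before. The following is only a minimal sketch, assuming integer labels as in the question; the names full_rows, full_cols and filled are made up for illustration and do not appear in the original post.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({'x': [1, 1, 2, 5], 'y': [1, 1, 3, 1]})
p = pd.crosstab(df.y, df.x, normalize=True)

# Full label ranges, inclusive of the largest label (hypothetical helper names).
full_rows = pd.Index(range(int(p.index.min()), int(p.index.max()) + 1),
                     name=p.index.name)
full_cols = pd.Index(range(int(p.columns.min()), int(p.columns.max()) + 1),
                     name=p.columns.name)

# reindex keeps the existing cells and fills every newly created cell with fill_value.
filled = p.reindex(index=full_rows, columns=full_cols, fill_value=0.0)
print(filled)
```

Unlike the loop-based functions, this leaves p untouched and returns a new DataFrame, so filled.to_numpy() could then be handed to imshow.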
