Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
202 views
in Technique[技术] by (71.8m points)

python - Construct NetworkX graph from Pandas DataFrame

I'd like to create some NetworkX graphs from a simple Pandas DataFrame:

        Loc 1   Loc 2   Loc 3   Loc 4   Loc 5   Loc 6   Loc 7
Foo     0       0       1       1       0       0           0
Bar     0       0       1       1       0       1           1
Baz     0       0       1       0       0       0           0
Bat     0       0       1       0       0       1           0
Quux    1       0       0       0       0       0           0

Where Foo… is the index, and Loc 1 to Loc 7 are the columns. But converting to Numpy matrices or recarrays doesn't seem to work for generating input for nx.Graph(). Is there a standard strategy for achieving this? I'm not averse the reformatting the data in Pandas --> dumping to CSV --> importing to NetworkX, but it seems as if I should be able to generate the edges from the index and the nodes from the values.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

NetworkX expects a square matrix (of nodes and edges), perhaps* you want to pass it:

In [11]: df2 = pd.concat([df, df.T]).fillna(0)

Note: It's important that the index and columns are in the same order!

In [12]: df2 = df2.reindex(df2.columns)

In [13]: df2
Out[13]: 
       Bar  Bat  Baz  Foo  Loc 1  Loc 2  Loc 3  Loc 4  Loc 5  Loc 6  Loc 7  Quux
Bar      0    0    0    0      0      0      1      1      0      1      1     0
Bat      0    0    0    0      0      0      1      0      0      1      0     0
Baz      0    0    0    0      0      0      1      0      0      0      0     0
Foo      0    0    0    0      0      0      1      1      0      0      0     0
Loc 1    0    0    0    0      0      0      0      0      0      0      0     1
Loc 2    0    0    0    0      0      0      0      0      0      0      0     0
Loc 3    1    1    1    1      0      0      0      0      0      0      0     0
Loc 4    1    0    0    1      0      0      0      0      0      0      0     0
Loc 5    0    0    0    0      0      0      0      0      0      0      0     0
Loc 6    1    1    0    0      0      0      0      0      0      0      0     0
Loc 7    1    0    0    0      0      0      0      0      0      0      0     0
Quux     0    0    0    0      1      0      0      0      0      0      0     0

In[14]: graph = nx.from_numpy_matrix(df2.values)

This doesn't pass the column/index names to the graph, if you wanted to do that you could use relabel_nodes (you may have to be wary of duplicates, which are allowed in pandas' DataFrames):

In [15]: graph = nx.relabel_nodes(graph, dict(enumerate(df2.columns))) # is there nicer  way than dict . enumerate ?

*It's unclear exactly what the columns and index represent for the desired graph.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...