Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Label Ordering in Scipy Dendrogram

In Python, I have an N by N distance matrix dmat, where dmat[i,j] encodes the distance from entity i to entity j. I'd like to view it as a dendrogram, so I did:

from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pylab as plt

labels = [...]  # names of entities 1, 2, 3, ...

Z=linkage(dmat)
dn=dendrogram(Z,labels=labels)
plt.show()

But the label ordering looks wrong. Some entities are very close according to dmat, but the dendrogram does not place them near each other. What's going on?


1 Reply

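The first argument to linkage must be either the distances in condensed format (a 1-D array holding the N(N-1)/2 upper-triangle entries of the distance matrix), or the 2-D array of points being clustered. If you pass the square (N x N) distance matrix, linkage interprets it as N points in N-dimensional space and clusters the Euclidean distances between its rows, which is why the dendrogram does not match dmat.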
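To make that misinterpretation concrete, here is a small sketch. The 4 x 4 matrix is a made-up example, not data from the question. It checks that passing the square matrix gives the same linkage as clustering the Euclidean distances between its rows (computed with scipy.spatial.distance.pdist). Recent SciPy versions may emit a warning when the input looks like a square distance matrix, so the sketch silences warnings around that call.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

# Hypothetical square distance matrix (symmetric, zero diagonal).
dmat = np.array([
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
])

with warnings.catch_warnings():
    # Some SciPy versions warn that this input looks like a distance matrix.
    warnings.simplefilter("ignore")
    # The rows are treated as 4 points in 4-dimensional space.
    Z_square = linkage(dmat)

# Explicitly cluster the Euclidean distances between the rows of dmat.
Z_rows = linkage(pdist(dmat))

print(np.allclose(Z_square, Z_rows))  # True
print(Z_square[0, 2])                 # first merge height is 2.0, not dmat[0, 1] == 1.0
```

The first merge happens at height 2.0 (the Euclidean distance between rows 0 and 1) rather than at the intended distance 1.0, so the tree no longer reflects dmat.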

You can convert from your square matrix to the condensed form with scipy.spatial.distance.squareform.
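As a quick illustration of what the condensed form looks like, the sketch below converts a made-up 3 x 3 matrix (an assumption for illustration only). squareform returns the upper-triangle entries row by row, and calling it again on the condensed vector restores the square matrix.

```python
import numpy as np
from scipy.spatial.distance import squareform

# Hypothetical 3 x 3 distance matrix (symmetric, zero diagonal).
dmat = np.array([
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 3.0],
    [4.0, 3.0, 0.0],
])

condensed = squareform(dmat)      # d(0,1), d(0,2), d(1,2)
print(condensed)                  # [1. 4. 3.]

n = dmat.shape[0]
print(condensed.shape == (n * (n - 1) // 2,))  # True

# squareform is its own inverse: condensed vector -> square matrix.
restored = squareform(condensed)
print(np.array_equal(restored, dmat))          # True
```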

Add this import at the top of your file:

from scipy.spatial.distance import squareform

and replace this

Z=linkage(dmat)

with

Z = linkage(squareform(dmat))
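Putting the pieces together, a minimal end-to-end sketch might look like the following. The matrix and the labels "a" to "d" are hypothetical stand-ins for your own data. It passes no_plot=True so the leaf order can be inspected without opening a figure; drop that argument and call plt.show() to draw the tree.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import squareform

# Hypothetical labels and distance matrix; replace with your own data.
labels = ["a", "b", "c", "d"]
dmat = np.array([
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 6.0],
    [4.0, 3.0, 0.0, 2.0],
    [5.0, 6.0, 2.0, 0.0],
])

# Convert to condensed form so linkage treats the values as distances.
Z = linkage(squareform(dmat))   # default method is 'single'

# Merge heights now come straight from dmat: 1.0 (a-b), 2.0 (c-d), 3.0 (b-c).
print(Z[:, 2])

# no_plot=True returns the layout without drawing; 'ivl' is the leaf label order.
dn = dendrogram(Z, labels=labels, no_plot=True)
print(dn["ivl"])
```

With the condensed input, the closest pairs in dmat ("a" with "b", and "c" with "d") end up as neighboring leaves.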
