We don't implement a proximity matrix in Scikit-Learn (yet).
However, you can compute one yourself by relying on the apply method provided by our implementation of decision trees. That is, for every pair of samples in your dataset, iterate over the decision trees in the forest (through forest.estimators_) and count the number of times the two samples fall in the same leaf, i.e., the number of times apply returns the same node id for both samples in the pair.
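As a rough illustration, the counting described above might look like the sketch below. The function name proximity_counts and the toy dataset are made up for this example and are not part of Scikit-Learn. The sketch assumes a fitted forest whose trees expose apply, and it compares all pairs at once with NumPy broadcasting instead of looping over pairs explicitly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def proximity_counts(forest, X):
    """Count, for each pair of samples, how many trees put them in the same leaf.

    Hypothetical helper (not a Scikit-Learn function).
    """
    n_samples = X.shape[0]
    counts = np.zeros((n_samples, n_samples), dtype=np.int64)
    for tree in forest.estimators_:
        # apply returns the id of the leaf each sample ends up in
        leaves = tree.apply(X)
        # Boolean (n_samples, n_samples) matrix: True where leaf ids match
        counts += leaves[:, None] == leaves[None, :]
    return counts


# Toy data, purely for illustration
X, y = make_classification(n_samples=30, n_features=5, random_state=0)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

P = proximity_counts(forest, X)
# Optionally turn counts into a fraction of trees
P_normalized = P / len(forest.estimators_)
```

Note that the result is an n_samples by n_samples array, so memory grows quadratically with the dataset size. For large datasets you would probably need to process the samples in blocks.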
Hope this helps.