I have a large set of data in which I need to compare the distances of a set of samples from this array with all the other elements of the array. Below is a very simple example of my data set.
import numpy as np
import scipy.spatial.distance as sd
data = np.array(
[[ 0.93825827, 0.26701143],
[ 0.99121108, 0.35582816],
[ 0.90154837, 0.86254049],
[ 0.83149103, 0.42222948],
[ 0.27309625, 0.38925281],
[ 0.06510739, 0.58445673],
[ 0.61469637, 0.05420098],
[ 0.92685408, 0.62715114],
[ 0.22587817, 0.56819403],
[ 0.28400409, 0.21112043]]
)
sample_indexes = [1,2,3]
# I'd rather not make this
other_indexes = list(set(range(len(data))) - set(sample_indexes))
sample_data = data[sample_indexes]
other_data = data[other_indexes]
# compare them
dists = sd.cdist(sample_data, other_data)
Is there a way to index a numpy array for indexes that are NOT the sample indexes? In my above example I make a list called other_indexes. I'd rather not have to do this for various reasons (large data set, threading, a very VERY low amount of memory on the system this is running on etc. etc. etc.). Is there a way to do something like..
other_data = data[ indexes not in sample_indexes]
I read that numpy masks can do this but I tried...
other_data = data[~sample_indexes]
And this gives me an error. Do I have to create a mask?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…