Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - How to use an ndarray of stored ndarrays with memmap as a big ndarray tensor

I recently started using numpy memmap in my project because I need a 3-dimensional tensor with about 133 billion values in total for one graph of the dataset I am using as an example.

I am trying to compute the heat kernel signature of a graph with 5748 nodes (the 21st graph of the DD dataset). My code for computing the projectors (where I use memmap) is:

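from pathlib import Path
import numpy as np

# Assumed from context: n is the number of nodes, evecs is the (n, n)
# eigenvector matrix, and L groups eigenvector indices by distinct eigenvalue.
Path('D:/hks_temp').mkdir(parents=True, exist_ok=True)
for l, ll in enumerate(L):
    pl = np.zeros((n, n))
    for k in ll:
        pl += np.outer(evecs[:, k], evecs[:, k])
    # One file per projector, stored as raw float32 values.
    fp = np.memmap('D:/hks_temp/{}_hks.npy'.format(l), dtype='float32', mode='w+', shape=(n, n))
    fp[:] = pl[:]
    fp.flush()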

Each X_hks.npy file contains an n-by-n ndarray (5748 by 5748 in this example).

Then I want all these computed arrays to form the 3-dimensional tensor, so I "link" them (I don't know if that's the right term) like this:
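One detail worth keeping in mind: np.memmap writes the raw array bytes with no .npy header, so despite the .npy extension these files have to be reopened with np.memmap using the same dtype and shape. A minimal sketch of the round trip, using a small n and a temporary directory as stand-ins for the real values (both are assumptions made to keep the example self-contained):

```python
import tempfile
from pathlib import Path

import numpy as np

n = 4  # stand-in for the real n = 5748
tmp_dir = Path(tempfile.mkdtemp())  # stand-in for D:/hks_temp
path = tmp_dir / "0_hks.npy"

rng = np.random.default_rng(0)
pl = rng.standard_normal((n, n))  # float64, like the accumulator in the loop

# Write: values are cast from float64 to float32 on assignment.
fp = np.memmap(path, dtype="float32", mode="w+", shape=(n, n))
fp[:] = pl
fp.flush()
del fp  # drop the writer before reopening the file

# Read back: dtype and shape must be given again, the file stores neither.
back = np.memmap(path, dtype="float32", mode="r", shape=(n, n))
file_size = path.stat().st_size  # n * n * 4 bytes, no header
roundtrip_ok = np.allclose(back, pl.astype("float32"))
```

At the real size, the same arithmetic gives 5748 * 5748 * 4 bytes per file, which is about 126 MiB.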

P = np.array([None] * len(L))    # len(L) = 4043
for l in range(len(L)):
    P[l] = np.memmap('D:/hks_temp/{}_hks.npy'.format(l), dtype='float32', mode='r', shape=(n, n))

Later, P is used only inside a loop, to compute H = np.einsum('ijk,i->jk', P, np.exp(-unique_eval * t)).

However, that raises an error: ValueError: einstein sum subscripts string contains too many subscripts for operand 0. Since the method works for smaller graphs that don't require memmap, my guess is that P isn't structured the way numpy expects and that I must rearrange the data, maybe with a reshape. So I tried P.reshape(len(L), n, n), but that fails with ValueError: cannot reshape array of size 4043 into shape (4043,5748,5748). How can I make it work?
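For reference, the contraction 'ijk,i->jk' is a weighted sum of the n-by-n slices along the first axis, H = sum_i w[i] * P[i]. The sketch below uses made-up sizes and random data (the names m and w and the explicit loop are illustrative, not part of the original code) to show the equivalence, including a slice-by-slice accumulation that never needs the whole 3-D tensor at once:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 4  # stand-ins for len(L) = 4043 and n = 5748
P = rng.standard_normal((m, n, n)).astype("float32")
unique_eval = rng.random(m)  # stand-in for the distinct eigenvalues
t = 0.5
w = np.exp(-unique_eval * t)  # one weight per slice

H_einsum = np.einsum("ijk,i->jk", P, w)

# Same result as a plain loop over slices; each P[i] could equally be
# a separate (n, n) memmap opened from disk.
H_loop = np.zeros((n, n))
for i in range(m):
    H_loop += w[i] * P[i]

# Loose tolerance because the slices are stored as float32.
same = np.allclose(H_einsum, H_loop, atol=1e-5)
```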
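A minimal sketch of what seems to be going on, with tiny in-memory arrays standing in for the memmaps (an assumption, so that the example runs anywhere): an array created from [None] * len(L) has dtype object and shape (len(L),), so numpy sees a single axis of Python objects rather than three numeric axes, which would explain both errors.

```python
import numpy as np

m, n = 3, 4  # stand-ins for len(L) and n
slices = [np.full((n, n), float(i), dtype="float32") for i in range(m)]
w = np.ones(m)

# Same construction as in the question: a 1-D container of array objects.
P = np.array([None] * m)
for i in range(m):
    P[i] = slices[i]

info = (P.dtype, P.shape, P.ndim)  # object dtype, one axis of length m

errors = []
try:
    np.einsum("ijk,i->jk", P, w)
except ValueError as exc:
    errors.append(("einsum", str(exc)))
try:
    P.reshape(m, n, n)
except ValueError as exc:
    errors.append(("reshape", str(exc)))

# np.stack builds a genuine (m, n, n) numeric array, but it copies every
# slice into memory, so it only suits tensors that fit in RAM.
P3 = np.stack(slices)
H = np.einsum("ijk,i->jk", P3, w)
```

With 4043 slices of 5748 by 5748 values, the np.stack route is not an option, since the stacked copy would need roughly as much RAM as the files occupy on disk.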

I already found this question but it doesn't fit this case. I think I can't store everything in one big object, since the memmap files add up to 497 GB (126 MB each). If that can be done, please tell me.

If it is impossible, I will reduce the use case; however, I would really like to make it work for the general case.
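On the single-file idea: np.memmap does accept a 3-D shape, so one possible layout is a single file of shape (len(L), n, n) filled slice by slice, with the disk footprint unchanged at len(L) * n * n * 4 bytes for float32. The sketch below uses small stand-in sizes, a temporary path, and random matrices in place of the projectors; the file name hks_all.dat, the helper weighted_sum, and the chunk size are all hypothetical. It reduces over the first axis in chunks so that only a few slices are touched at a time.

```python
import tempfile
from pathlib import Path

import numpy as np

# Size check for the real case: 4043 slices of 5748 x 5748 float32 values.
total_values = 4043 * 5748 * 5748
total_gib = total_values * 4 / 2**30

m, n = 6, 4  # small stand-ins for the demo
path = Path(tempfile.mkdtemp()) / "hks_all.dat"  # hypothetical file name
rng = np.random.default_rng(0)

# Write all slices into one 3-D memmap.
big = np.memmap(path, dtype="float32", mode="w+", shape=(m, n, n))
for i in range(m):
    big[i] = rng.standard_normal((n, n))  # one projector per slice
big.flush()
del big

# Reopen read-only as a genuine 3-D array backed by the file.
P = np.memmap(path, dtype="float32", mode="r", shape=(m, n, n))


def weighted_sum(P, w, chunk=2):
    """Compute einsum('ijk,i->jk', P, w) a few slices at a time."""
    H = np.zeros(P.shape[1:], dtype=np.float64)
    for start in range(0, P.shape[0], chunk):
        stop = start + chunk
        H += np.einsum("ijk,i->jk", P[start:stop], w[start:stop])
    return H


w = np.exp(-rng.random(m) * 0.5)
H = weighted_sum(P, w)
H_direct = np.einsum("ijk,i->jk", np.asarray(P), w)
```

At the real sizes every evaluation still has to read the whole file from disk, so the run time would likely be dominated by I/O rather than arithmetic.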

question from:https://stackoverflow.com/questions/66050844/how-to-use-a-ndarray-of-stored-ndarrays-with-memmap-as-a-big-ndarray-tensor
