I have a 256x256x256 NumPy array in which each element is a matrix. I need to do some calculations on each of these matrices, and I want to use the multiprocessing module to speed things up.
The results of these calculations must be stored in a 256x256x256 array like the original one, so that the result for the matrix at element [i,j,k] of the original array ends up at element [i,j,k] of the new array.
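
To make the setup concrete, here is a small toy version of what I mean (sizes shrunk, and 3x3 matrices assumed purely for illustration; the real matrices are different):

import numpy as np

N = 8  # stands in for 256 to keep the example small

# Toy version of the data: an NxNxN grid of 3x3 matrices,
# stored as a 5-D float array.
data = np.random.rand(N, N, N, 3, 3)

# Goal: for every index (i, j, k), compute something from
# data[i, j, k] and store it at output[i, j, k].
output = np.empty((N, N, N))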
To do this, I want to build a list whose entries could be written, in a pseudo-ish way, as [array[i,j,k], (i, j, k)] and pass it to a function to be "multiprocessed".
Assuming that matrices is a list of all the matrices extracted from the original array and myfunc is the function doing the calculations, the code would look somewhat like this:
import multiprocessing
import numpy as np
from itertools import izip

def myfunc(finput):
    # Do some calculations on the matrix in finput[0]...
    ...
    # ... and return the result together with the index:
    return (result, finput[1])

# Make indices:
inds = np.rollaxis(np.indices((256, 256, 256)), 0, 4).reshape(-1, 3)

# Make function input from the matrices and the indices:
finput = izip(matrices, inds)

pool = multiprocessing.Pool()
async_results = np.asarray(pool.map_async(myfunc, finput).get(999999))
However, it seems like map_async actually creates this huge finput list first: my CPUs aren't doing much, but the memory and swap get completely consumed in a matter of seconds, which is obviously not what I want.
Is there a way to pass this huge list to a multiprocessing function without having to create it explicitly first?
Or do you know another way of solving this problem?
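
In case it helps to clarify what I am after, here is a sketch of the kind of lazy variant I have in mind, using pool.imap instead of map_async (shrunk sizes, a placeholder computation in myfunc, and a guessed chunksize; I do not know for certain that imap avoids materializing the input):

import multiprocessing
import numpy as np
from itertools import izip  # Python 2; on Python 3 the built-in zip is already lazy

N = 8  # shrunk from 256 so the sketch runs quickly

def myfunc(finput):
    matrix, ind = finput
    return (matrix.sum(), ind)  # placeholder computation

if __name__ == '__main__':
    data = np.random.rand(N, N, N, 3, 3)           # toy stand-in for the real array
    inds = np.rollaxis(np.indices((N, N, N)), 0, 4).reshape(-1, 3)
    matrices = (data[tuple(ind)] for ind in inds)  # a generator, nothing materialized
    finput = izip(matrices, inds)                  # still lazy at this point

    pool = multiprocessing.Pool()
    output = np.empty((N, N, N))
    # imap pulls items from finput as workers become free instead of
    # listifying the whole input first; chunksize=100 is a guess to
    # reduce per-item IPC overhead.
    for result, (i, j, k) in pool.imap(myfunc, finput, chunksize=100):
        output[i, j, k] = result
    pool.close()
    pool.join()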
Thanks a bunch! :-)