Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

python - multiprocessing.Pool seems to work on Windows but not on Ubuntu?

SOLVED: The problem was the Wingware Python IDE. The natural follow-up question is how an IDE can cause this and how it could be fixed.

I asked a question yesterday (Problem with multiprocessing.Pool in Python), and this one is almost the same, except that I have now found that the code seems to work on a Windows computer but not on my Ubuntu machine. At the end of this post I include a slightly different version of the code that does the same thing.

Short summary of my problem: When using multiprocessing.Pool in Python, I am not always able to get the number of workers that I ask for. When this happens, the program just stalls.

I have been working on a solution all day, and then I remembered Noah's comment on my previous question. He said that it worked on his machine, so I gave the code to my colleague, who runs a Windows machine with Enthought's 64-bit Python 2.7.1 distribution. I have the same distribution, with the big difference that mine runs on Ubuntu. I should also mention that we both use the Wingware Python IDE, but I doubt that this is of any importance?

There are two problems with my code that don't arise when my colleague runs the code on his machine.

  1. I am not always able to get the four workers I am asking for (although my machine has 12 hardware threads). When this happens, the process just stalls and does not continue. No exception or error is raised.

  2. When I am able to get the four workers I ask for (which happens approximately 1 out of 5 times), the figures that are produced (plain random numbers) are EXACTLY the same for all four pictures. This is not the case for my colleague.

Something is very fishy, and I would be very thankful for any help you can offer.

The code:

import multiprocessing as mp
import scipy as sp
import scipy.stats as spstat
import pylab

def testfunc(x0, N):
    print 'working with x0 = %s' % x0
    x = [x0]
    for i in xrange(1,N):
        x.append(spstat.norm.rvs(size = 1)) # stupid appending to make it slower
        if i % 10000 == 0:
            print 'x0 = %s, i = %s' % (x0, i)
    return sp.array(x)

def testfuncParallel(fargs):
    return testfunc(*fargs)


# Define Number of tasks.
nTasks = 4
N = 100000

if __name__ == '__main__':

    """
    Try number 1. Using multiprocessing.Pool together with Pool.map_async
    """
    pool = mp.Pool(processes = nTasks) # I have 12 threads (six cores) available, so I am surprised that it does not get nTasks = 4 workers

    # Define tasks:
    tasks = [(x, n) for x, n in enumerate(nTasks*[N])] # nTasks different tasks

    # Compute in parallel: map_async returns immediately; results still come back in task order.
    result = pool.map_async(testfuncParallel, tasks)

    pool.close() # No more tasks will be submitted; close() must be called before join()
    pool.join()  # Wait for all workers to finish

    # Get results:
    sim = sp.zeros((N, nTasks)) 

    for nn, res in enumerate(result.get()):    
        sim[:, nn] = res

    pylab.figure()
    for i in xrange(nTasks):
        pylab.subplot(nTasks,1, i + 1)
        pylab.plot(sim[:, i])

    pylab.show()

Thanks in advance.

Sincerely, Matias


1 Reply

I don't have a solution for your first problem. In fact, I can run your code repeatedly without failure on my 64-bit Ubuntu box with Enthought's Python 2.7.1 [EPD 7.0-2 (64-bit)]. Edit: it turns out the problem was caused by your IDE (Wingware). The obvious workaround is to run the script from outside the IDE.

As to the second question: on Unix the worker processes are created by forking, so every worker inherits the same random number generator state from the parent process. That is why they generate identical pseudo-random sequences. On Windows each worker starts as a fresh interpreter, which would explain why your colleague does not see this. To fix it, call scipy.random.seed at the top of testfunc so that each worker reseeds itself:

def testfunc(x0, N):
    sp.random.seed()
    print 'working with x0 = %s' % x0
    ...
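
If you also want the runs to be reproducible, a related option is to hand each task its own explicit seed instead of reseeding from system entropy. The following is only a rough sketch of that idea, not the code from the question: it is written for Python 3 and uses random.Random from the standard library rather than scipy.random so that it runs without SciPy. The names simulate, make_tasks, run_parallel and base_seed are made up for illustration.

```python
import multiprocessing as mp
import random


def simulate(task):
    """Draw n standard-normal samples from a generator private to this task."""
    seed, n = task
    # Explicit per-task seed: no generator state is inherited from the parent.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def make_tasks(n_tasks, n_samples, base_seed=12345):
    """Build (seed, n_samples) tuples; base_seed is an arbitrary, hypothetical value."""
    return [(base_seed + k, n_samples) for k in range(n_tasks)]


def run_parallel(n_tasks=4, n_samples=5):
    """Run one simulation per task in a pool of n_tasks workers."""
    tasks = make_tasks(n_tasks, n_samples)
    with mp.Pool(processes=n_tasks) as pool:
        # map blocks until every task is done and keeps the task order.
        return pool.map(simulate, tasks)


if __name__ == "__main__":
    for series in run_parallel():
        print(series)
```

Because every task carries a different seed, the four series differ from each other, and rerunning the script with the same base_seed should give the same four series again. If you do not need reproducibility, calling seed() with no argument in each worker, as shown above, is the simpler fix.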
