Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.2k views
in Technique[技术] by (71.8m points)

shell - Python, running command line tools in parallel

I am using Python as a script language to do some data processing and call command-line tools for number crunching. I wish to run command-line tools in parallel since they are independent with each other. When one command-line tool is finished, I can collect its results from the output file. So I also need some synchronization mechanism to notify my main Python program that one task is finished so that the result could be parsed into my main program.

Currently, I use os.system(), which works fine for one-thread, but cannot be parallelized.

Thanks!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If you want to run commandline tools as separate processes, just use os.system (or better: The subprocess module) to start them asynchronously. On Unix/linux/macos:

subprocess.call("command -flags arguments &", shell=True)

On Windows:

subprocess.call("start command -flags arguments", shell=True)

As for knowing when a command has finished: Under unix you could get set up with wait etc., but if you're writing the commandline scripts, I'd just have them write a message into a file, and monitor the file from the calling python script.

@James Youngman proposed a solution to your second question: Synchronization. If you want to control your processes from python, you could start them asynchronously with Popen.

p1 = subprocess.Popen("command1 -flags arguments")
p2 = subprocess.Popen("command2 -flags arguments")

Beware that if you use Popen and your processes write a lot of data to stdout, your program will deadlock. Be sure to redirect all output to a log file.

p1 and p2 are objects that you can use to keep tabs on your processes. p1.poll() will not block, but will return None if the process is still running. It will return the exit status when it is done, so you can check if it is zero.

while True:
    time.sleep(60)
    for proc in [p1, p2]:
        status = proc.poll()
        if status == None:
            continue
        elif status == 0:
            # harvest the answers
        else:
            print "command1 failed with status", status

The above is just a model: As written, it will never exit, and it will keep "harvesting" the results of completed processes. But I trust you get the idea.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...