python - Using subprocess.Popen for Process with Large Output

posted Oct 17, 2021 in Technique by 深蓝

I have some Python code that executes an external app which works fine when the app has a small amount of output, but hangs when there is a lot. My code looks like:

p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
errcode = p.wait()
retval = p.stdout.read()
errmess = p.stderr.read()
if errcode:
    log.error('cmd failed <%s>: %s' % (errcode,errmess))

There are comments in the docs that seem to indicate the potential issue. Under wait, there is:

Warning: This will deadlock if the child process generates enough output to a stdout or stderr pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.

though under communicate, I see:

Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.

So it is unclear to me whether I should use either of these if I have a large amount of data. The docs don't say which method to use in that case.

I need the exit code from the process, and I parse and use both stdout and stderr.

So what is an equivalent method in Python to exec an external app that is going to have large output?

1 Reply

深蓝 · Answer 1 · 2021-10-17T01:04:04+0000

You're doing blocking reads on two pipes, and the first has to complete before the second starts. If the application writes a lot to stderr and nothing to stdout, your process sits waiting for data on stdout that isn't coming, while the program you're running sits waiting for what it wrote to stderr to be read (which it never will be, since you're waiting on stdout). In the code as posted, p.wait() also comes before either read, so the same stall occurs as soon as the child fills either pipe's buffer.

There are a few ways you can fix this.

The simplest is to not intercept stderr; leave stderr=None. Errors then go straight to your own process's stderr, so you can't capture them or display them as part of your own message. For command-line tools, this is often OK. For other apps, it can be a problem.
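A minimal sketch of this first option, under some assumptions: the helper name run_capture_stdout and the child command are made up for illustration, and the command is passed as an argument list rather than through shell=True. The point it shows is the ordering: read stdout to EOF first, and only then call wait().

```python
import subprocess
import sys

def run_capture_stdout(cmd):
    """Run cmd, capturing stdout only; stderr goes to the parent's stderr."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=None)
    out = p.stdout.read()   # read to EOF first, so the pipe can never fill up
    p.stdout.close()
    errcode = p.wait()      # safe now: there is no unread pipe left
    return errcode, out

# Hypothetical child process that writes about 1 MB to stdout.
child = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000000)"]
errcode, out = run_capture_stdout(child)
```

This still holds the whole output in memory, so for truly unbounded output you would want to process it in chunks instead of calling read() once.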

Another simple approach is to redirect stderr to stdout, so you only have one incoming pipe: set stderr=subprocess.STDOUT. This means you can't distinguish regular output from error output. This may or may not be acceptable, depending on how the application writes output.
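Something like the following might work for the merged approach. The names run_merged and handle_line are hypothetical, and the child script is only there to generate output on both streams. Lines are handed to a callback as they arrive, so nothing forces the full output into memory at once.

```python
import subprocess
import sys

def run_merged(cmd, handle_line):
    """Run cmd with stderr folded into stdout; pass each line to handle_line."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    with p.stdout:
        for line in p.stdout:   # each line is a bytes object
            handle_line(line)
    return p.wait()             # the single pipe is drained, so this is safe

# Hypothetical child that writes to both streams.
child_code = (
    "import sys\n"
    "for i in range(20000):\n"
    "    print('out', i)\n"
    "    print('err', i, file=sys.stderr)\n"
)
lines = []
errcode = run_merged([sys.executable, "-c", child_code], lines.append)
```

Because the child buffers the two streams differently, the relative order of stdout and stderr text in the merged pipe is not guaranteed, and a line from one stream can be split by text from the other.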

The complete and complicated way of handling this is select (http://docs.python.org/library/select.html). This lets you wait on both pipes at once and read from whichever one has data, so neither fills up while you are blocked on the other. I'd only recommend this if it's really necessary. It doesn't work on Windows, where select accepts only sockets, not pipes.
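For the select route, a rough POSIX-only sketch could look like this. The helper name run_select, the 64 KB read size, and the child script are assumptions for illustration. It reads the raw file descriptors with os.read so that Python's own buffering on p.stdout and p.stderr never comes into play.

```python
import os
import select
import subprocess
import sys

def run_select(cmd):
    """Collect stdout and stderr separately without deadlocking (POSIX only)."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd, err_fd = p.stdout.fileno(), p.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}
    open_fds = [out_fd, err_fd]
    while open_fds:
        # Block until at least one pipe has data or has reached EOF.
        readable, _, _ = select.select(open_fds, [], [])
        for fd in readable:
            data = os.read(fd, 65536)
            if data:
                chunks[fd].append(data)
            else:
                open_fds.remove(fd)   # empty read means EOF on this pipe
    out = b"".join(chunks[out_fd])
    err = b"".join(chunks[err_fd])
    p.stdout.close()
    p.stderr.close()
    return p.wait(), out, err

# Hypothetical child: fills stderr first, then stdout, then exits with code 3.
child_code = (
    "import sys\n"
    "sys.stderr.write('e' * 200000)\n"
    "sys.stdout.write('o' * 200000)\n"
    "sys.exit(3)\n"
)
errcode, out, err = run_select([sys.executable, "-c", child_code])
```

The child here is exactly the case that hangs the original code: it writes far more to stderr than a pipe buffer holds before it writes anything to stdout.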
