Thanks to the helpful suggestions below:
It seems to be fixed when I
- separate the piped commands into individual calls to Popen
- pass stderr=subprocess.PIPE as an argument to each Popen in the chain
The new code:
import subprocess
import shlex
import logging

def run_shell_commands(cmds):
    """Run a pipeline of commands and return the last Popen object.

    For usage see the example below.
    """
    # split the pipeline into individual commands
    cmds = cmds.split("|")
    cmds = list(map(shlex.split, cmds))
    logging.info('%s' % (cmds,))
    # run the commands, chaining each one's stdout to the next one's stdin
    stdout_old = None
    stderr_old = None
    p = []
    for cmd in cmds:
        logging.info('%s' % (cmd,))
        p.append(subprocess.Popen(cmd, stdin=stdout_old,
                                  stdout=subprocess.PIPE, stderr=subprocess.PIPE))
        stdout_old = p[-1].stdout
        stderr_old = p[-1].stderr
    return p[-1]
pattern = '"^85567 "'
file = "j"
cmd1 = 'grep %s %s | sort -g -k3 | head -10 | cut -d" " -f2,3' % (pattern, file)
p = run_shell_commands(cmd1)
out = p.communicate()
print(out)
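For comparison, the subprocess documentation's "Replacing shell pipeline" recipe builds the same kind of chain by hand and closes each intermediate stdout in the parent, so an upstream command can receive SIGPIPE when its reader exits early. A rough sketch of the pipeline above written that way (the pattern and the filename j are just the placeholder values from my example):

import subprocess

# grep "^85567 " j | sort -g -k3 | head -10 | cut -d" " -f2,3
p1 = subprocess.Popen(['grep', '^85567 ', 'j'], stdout=subprocess.PIPE)
p2 = subprocess.Popen(['sort', '-g', '-k3'], stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()  # let grep get SIGPIPE if sort exits first
p3 = subprocess.Popen(['head', '-10'], stdin=p2.stdout, stdout=subprocess.PIPE)
p2.stdout.close()  # let sort get SIGPIPE if head exits first
p4 = subprocess.Popen(['cut', '-d', ' ', '-f2,3'], stdin=p3.stdout, stdout=subprocess.PIPE)
p3.stdout.close()
out, _ = p4.communicate()
print(out.decode())

Closing the parent's copy of each read end is what lets the earlier process see the pipe as closed; without it the parent still holds the pipe open and the upstream command never gets the broken-pipe signal.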
Original Post:
I've spent too long trying to get a simple shell pipeline working through subprocess.Popen.
Code:
import subprocess

cmd = 'cat file | sort -g -k3 | head -20 | cut -f2,3'
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
for line in p.stdout:
    print(line.decode().strip())
Output for a file ~1000 lines long:
...
sort: write failed: standard output: Broken pipe
sort: write error
Output for a file >241 lines long:
...
sort: fflush failed: standard output: Broken pipe
sort: write error
Output for a file <241 lines long is fine.
I have been reading the docs and googling like mad, but there is something fundamental about the subprocess module that I'm missing, maybe to do with buffers. I've tried p.stdout.flush(), playing with the buffer size, and p.wait(). I've also tried to reproduce this with commands like 'sleep 20; cat moderatefile', but that runs without error.
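In hindsight (and as the fix at the top suggests), the messages are almost certainly sort complaining on its own stderr after head exits early and closes its end of the pipe; because only stdout is captured, that stderr goes straight to the terminal. A minimal sketch that keeps shell=True but also captures stderr, assuming the same test file j as in the edit above:

import subprocess

cmd = 'cat j | sort -g -k3 | head -20 | cut -f2,3'
p = subprocess.Popen(cmd, shell=True,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()
for line in out.decode().splitlines():
    print(line)
# err now holds any "Broken pipe" noise instead of it reaching the terminal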