I promise there is a solution at the end :P
Alright... so here we are, 10 days later, and I believe that I have solved this issue. I didn't want to add to an already longish post, so I'll include in this answer some of the things that I tried.
Taking @sym's advice, and reading more into the documentation and the comments on the documentation, the pcntl_waitpid() description states :
If a child as requested by pid has already exited by the time of the call (a so-called
"zombie" process), the function returns immediately. Any system resources used by the child
are freed...
So I set up my pcntl_signal() handler like this -
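function sig_handler($signo){
    global $childProcesses;
    $pid = pcntl_waitpid(-1, $status, WNOHANG);
    echo "Sound the alarm! ";
    // $pid > 0 : a child was reaped; 0 : none has exited yet; -1 : error.
    // Never pass -1 on to posix_kill(), it would signal every process we may signal.
    if ($pid > 0){
        if (posix_kill($pid, 9)){
            echo "Child {$pid} has tragically died!".PHP_EOL;
            unset($childProcesses[$pid]);
        }
    }
}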
// These define the signal handling
// pcntl_signal(SIGTERM, "sig_handler");
// pcntl_signal(SIGHUP, "sig_handler");
// pcntl_signal(SIGINT, "sig_handler");
pcntl_signal(SIGCHLD, "sig_handler");
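For comparison, a handler that only reaps could look like the sketch below. Once pcntl_waitpid() has returned a pid, that child is already gone, so there is nothing left to kill. Standard signals are not queued, so several children can exit before the handler runs once, which is why it loops until nothing is left to collect. The name reapChildren, the pid-keyed $childProcesses array and the use of pcntl_async_signals() (PHP 7.1+; older code would rely on declare(ticks=1)) are assumptions for illustration, not the code from the server.

```php
<?php
// Assumption: children are tracked as $childProcesses[$pid] = $pid.
$childProcesses = [];

// Hypothetical handler name; reaps every child that has already exited.
function reapChildren(int $signo = SIGCHLD, $siginfo = null): void
{
    global $childProcesses;
    // WNOHANG makes the call return immediately:
    //   > 0 : pid of a reaped child, 0 : children exist but none has exited,
    //   -1  : error, typically no children at all.
    while (($pid = pcntl_waitpid(-1, $status, WNOHANG)) > 0) {
        unset($childProcesses[$pid]);
    }
}

pcntl_async_signals(true);            // deliver signals without declare(ticks=1)
pcntl_signal(SIGCHLD, 'reapChildren');
```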
For completeness, I'll include the actual code I'm using to fork a child process -
function broadcastData($socketArray, $data){
    global $db, $childProcesses;
    $pid = pcntl_fork();
    if ($pid == -1) {
        // Something went wrong (handle errors here)
        // Log error, email the admin, pull emergency stop, etc...
        echo "Could not fork()!!";
        return; // no child was created, so there is nothing to track
    } elseif ($pid == 0) {
        // This part is only executed in the child
        foreach ($socketArray as $socket) {
            // There's more happening here but the essence is this
            socket_write($socket, $msg, strlen($msg));
            // TODO : Consider additional forking here for each client.
        }
        // This is where the signal is fired
        exit(0);
    }
    // If the child process did not exit above, then this code would be
    // executed by both parent and child. In my case, the child will
    // never reach these commands.
    // Key by pid so that unset($childProcesses[$pid]) in the handler finds the entry.
    $childProcesses[$pid] = $pid;
    // The child process is now occupying the same database
    // connection as its parent (in my case mysql). We have to
    // reinitialize the parent's DB connection in order to continue using it.
    $db = dbEngine::factory(_dbEngine);
}
Yea... That's a ratio of 1:1 comments to code :P
So this was looking great and I saw the echo of :
Sound the alarm! Child 12345 has tragically died!
However, when the socket server loop did its next iteration, the socket_select() function failed, throwing this error :
PHP Warning: socket_select(): unable to select [4]: Interrupted system call...
The server would now hang and not respond to any requests other than manual kill commands from a root terminal.
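A side note for anyone hitting the same warning: the [4] is errno 4, which is EINTR on Linux, meaning the select call was interrupted by a signal (here, presumably the SIGCHLD handler firing). A common way to tolerate that is to retry the call. Below is a minimal sketch, assuming a hypothetical helper selectReadable() that is not part of the original server; it does not claim to explain the hang itself.

```php
<?php
/**
 * Hypothetical helper: wait until at least one socket is readable,
 * retrying when socket_select() is interrupted by a signal (EINTR).
 *
 * @param array $sockets    sockets to watch
 * @param int   $timeoutSec timeout per attempt, in seconds
 * @return array            sockets ready for reading (empty on timeout)
 */
function selectReadable(array $sockets, int $timeoutSec = 1): array
{
    if ($sockets === []) {
        return []; // socket_select() needs at least one non-empty array
    }
    while (true) {
        // Rebuild the arrays on every attempt: socket_select() modifies them.
        $read = $sockets;
        $write = null;
        $except = null;
        $ready = @socket_select($read, $write, $except, $timeoutSec);
        if ($ready !== false) {
            return $ready > 0 ? $read : [];
        }
        $errno = socket_last_error(); // no argument: last global socket error
        socket_clear_error();
        if ($errno !== SOCKET_EINTR) {
            throw new RuntimeException('socket_select failed: ' . socket_strerror($errno));
        }
        // EINTR: a signal arrived mid-call, so try again.
        // Note that the timeout starts over on each retry.
    }
}
```

In a server loop this would replace the bare socket_select() call, for example `$readable = selectReadable($clientSockets);` where $clientSockets is whatever array of sockets the loop already maintains.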
I'm not going to get into why this was happening or what I did after that to debug it... let's just say it was a frustrating week...
much coffee, sore eyes and 10 days later...
Drum roll please
TL;DR - The Solution :
As mentioned here in a comment from 2007 in the PHP sockets documentation and in this tutorial on stuporglue (search for "good parenting"), one can simply "ignore" signals coming in from the child processes (SIGCHLD) by passing SIG_IGN to the pcntl_signal() function -
pcntl_signal(SIGCHLD, SIG_IGN);
Quoting from that linked blog post :
If we are ignoring SIGCHLD, the child processes will be reaped automatically upon completion.
Believe it or not - I included that pcntl_signal() line, deleted all the other handlers and things dealing with the children, and it worked! There were no more <defunct> processes left hanging around!
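For anyone who wants to see the automatic reaping for themselves, here is a small standalone sketch (not the server code, and the variable names are just for illustration). It ignores SIGCHLD, forks three children that exit straight away, and then does a blocking wait. On Linux I'd expect that wait to return -1 with PCNTL_ECHILD once the children are gone, because the kernel has already cleaned them up and there is nothing left to collect.

```php
<?php
// Ask the kernel to reap exited children automatically.
pcntl_signal(SIGCHLD, SIG_IGN);

$spawned = 0;
for ($i = 0; $i < 3; $i++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        fwrite(STDERR, "Could not fork()" . PHP_EOL);
        break;
    }
    if ($pid === 0) {
        exit(0); // child: no real work in this sketch
    }
    $spawned++;
}

// With SIGCHLD ignored, a blocking wait only returns after every child
// has terminated, and then reports that there is no child to wait for.
$result = pcntl_waitpid(-1, $status);
$errno = pcntl_get_last_error();

echo "forked {$spawned} children, pcntl_waitpid() returned {$result}: "
    . pcntl_strerror($errno) . PHP_EOL;
```

The trade-off is that exit statuses of the children can no longer be collected, so this only fits cases where the parent does not care how a child ended.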
In my case, I really didn't need to know exactly when a child process died or which one it was. All I cared about was that they didn't hang around and crash my entire server :P