Please consider the following fork()/SIGCHLD pseudo-code.
// main program excerpt
for (;;) {
    if ( is_time_to_make_babies ) {
        pid = fork();
        if (pid == -1) {
            /* fail */
        } else if (pid == 0) {
            /* child stuff */
            print "child started"
            exit
        } else {
            /* parent stuff */
            print "parent forked new child ", pid
            children.add(pid);
        }
    }
}
// SIGCHLD handler
sigchld_handler(signo) {
    // waitpid(-1, ...) reaps any child; plain wait() takes no WNOHANG flag
    while ( (pid = waitpid(-1, &status, WNOHANG)) > 0 ) {
        print "parent caught SIGCHLD from ", pid
        children.remove(pid);
    }
}
In the above example there's a race condition. It's possible for "/* child stuff */" to finish before "/* parent stuff */" starts. In that case the SIGCHLD handler runs (and tries to remove the pid) before the parent has added it, so the child's pid is added to the list of children after the child has exited and is never removed. When the time comes for the app to close down, the parent will wait endlessly for the already-finished child to finish.
One solution I can think of to counter this is to have two lists: started_children and finished_children. I'd add to started_children in the same place I'm adding to children now. But in the signal handler, instead of removing from children, I'd add to finished_children. When the app closes down, the parent can simply wait until the difference between started_children and finished_children is zero.
Another possible solution I can think of is using shared memory, e.g. share the parent's list of children and let the children .add and .remove themselves? But I don't know too much about this.
EDIT: Another possible solution, which was the first thing that came to mind, is to simply add a sleep(1) at the start of /* child stuff */, but that smells funny to me, which is why I left it out. I'm also not even sure it's a 100% fix.
So, how would you correct this race condition? And if there's a well-established, recommended pattern for this, please let me know!
Thanks.