When a Heroku worker is restarted (either on command or as the result of a deploy), Heroku sends `SIGTERM` to the worker process. In the case of `delayed_job`, the `SIGTERM` signal is caught and the worker stops executing once the current job (if any) has finished.
If the worker takes too long to finish, Heroku sends `SIGKILL`. In the case of `delayed_job`, this leaves a locked job in the database that won't get picked up by another worker.
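To make the failure mode concrete, here is a rough sketch of the behavior as I understand it. This is not `delayed_job`'s actual code: `SketchWorker`, `install_term_handler`, and the use of plain callables as jobs are all hypothetical names I made up for illustration.

```ruby
# Hypothetical stand-in for a worker loop; not delayed_job's actual code.
class SketchWorker
  attr_reader :completed

  def initialize(jobs)
    @jobs = jobs.dup   # queue of callables standing in for jobs
    @stop = false
    @completed = []
  end

  # Ask the loop to exit once the current job returns.
  def stop
    @stop = true
  end

  # Route SIGTERM to the stop flag. The handler only sets a flag,
  # so the job that is currently running is never interrupted part-way.
  def install_term_handler
    Signal.trap('TERM') { stop }
  end

  # Runs jobs until the queue is empty or a stop was requested.
  # Returns the number of jobs that ran to completion.
  def run
    until @stop || @jobs.empty?
      job = @jobs.shift
      job.call            # if SIGKILL arrives here, nothing below runs,
      @completed << job   # so a real job row would keep its lock
    end
    @completed.size
  end
end
```

The point of the sketch is that the stop flag is only checked between jobs, so a long-running job can easily outlast the grace period before `SIGKILL`.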
I'd like to ensure that jobs eventually finish (unless there's an error). Given that, what's the best way to approach this?
I see two options, but I'd like to get other input:
- Modify `delayed_job` to stop working on the current job (and release the lock) when it receives a `SIGTERM`.
- Figure out a (programmatic) way to detect orphaned locked jobs and then unlock them.
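
For the second option, this is roughly the logic I have in mind, written against plain Ruby objects so it runs on its own. `JobRow` and `release_orphaned_locks` are hypothetical names, and the `locked_at`, `locked_by`, and `failed_at` fields are assumed to mirror the columns of the jobs table. The age threshold is also an assumption: it would have to be longer than the longest legitimate job, or a job that is still running would be unlocked.

```ruby
# Hypothetical in-memory stand-in for a row in the jobs table.
JobRow = Struct.new(:id, :locked_at, :locked_by, :failed_at)

# Clears the lock on every job that has been locked for longer than
# max_lock_age seconds and has not been marked as failed.
# Returns the ids of the jobs that were unlocked.
def release_orphaned_locks(jobs, max_lock_age:, now: Time.now)
  cutoff = now - max_lock_age
  orphaned = jobs.select do |job|
    job.locked_at && job.failed_at.nil? && job.locked_at < cutoff
  end
  orphaned.each do |job|
    job.locked_at = nil
    job.locked_by = nil
  end
  orphaned.map(&:id)
end

now = Time.now
jobs = [
  JobRow.new(1, now - 6 * 3600, 'host:worker.1', nil), # locked 6 hours ago
  JobRow.new(2, now - 60, 'host:worker.2', nil),       # locked a minute ago
  JobRow.new(3, nil, nil, nil)                         # never locked
]
release_orphaned_locks(jobs, max_lock_age: 4 * 3600, now: now) # => [1]
```

In a real app this would presumably be a single update query run on a schedule, but I'm not sure that's safe if a slow job is still legitimately holding its lock.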
Any thoughts?