Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.2k views
in Technique[技术] by (71.8m points)

django - Issues with Celery configuration on AWS Elastic Beanstalk - "No config updates to processes"

I've a Django 2 application deployed on AWS Elastic Beanstalk and I'm trying to configure Celery in order to exec async tasks on the same machine.

My files:

02_packages.config

files:
  "/usr/local/share/pycurl-7.43.0.tar.gz" :
    mode: "000644"
    owner: root
    group: root
    source: https://pypi.python.org/packages/source/p/pycurl/pycurl-7.43.0.tar.gz

packages:
  yum:
    python34-devel: []
    libcurl-devel: []

commands:
  01_download_pip3:
    # run this before PIP installs requirements as it needs to be compiled with OpenSSL
    command: 'curl -O https://bootstrap.pypa.io/get-pip.py'
  02_install_pip3:
    # run this before PIP installs requirements as it needs to be compiled with OpenSSL
    command: 'python3 get-pip.py'

container_commands:
  03_pycurl_reinstall:
    # run this before PIP installs requirements as it needs to be compiled with OpenSSL
    # the upgrade option is because it will run after PIP installs the requirements.txt file.
    # and it needs to be done with the virtual-env activated
    command: 'source /opt/python/run/venv/bin/activate && pip3 install /usr/local/share/pycurl-7.43.0.tar.gz --global-option="--with-nss" --upgrade'

03_django.config

container_commands:
  01_migrate_db:
    command: "django-admin.py migrate --noinput"
    leader_only: true
  02_createsu: # custom django-admin command to create the "admin" superuser
    command: "source /opt/python/run/venv/bin/activate && python manage.py createsu"
    leader_only: true
  03_update_permissions: # custom django-admin command to update user perms
    command: "source /opt/python/run/venv/bin/activate && python manage.py update_permissions"
    leader_only: true
  04_collectstatic:
    command: "django-admin.py collectstatic --noinput"
  05_pip_upgrade:
    command: /opt/python/run/venv/bin/pip install --upgrade pip
    ignoreErrors: false

option_settings:
  aws:elasticbeanstalk:application:environment:
    DJANGO_SETTINGS_MODULE: "my_proj.settings_prod"
    APP_ENV: "test"
    PYCURL_SSL_LIBRARY: "nss"
  aws:elasticbeanstalk:container:python:
    WSGIPath: myproj/wsgi.py
    NumProcesses: 3
    NumThreads: 20
  aws:elasticbeanstalk:container:python:staticfiles:
    "/static/": "static/"

requirements.txt

boto3==1.6.3
botocore==1.9.3
Django==2.0.3
django-cors-headers==2.2.0
django-filter==1.1.0
django-storages==1.6.5
djangorestframework==3.7.7
djangorestframework-jwt==1.11.0
docutils==0.14
jmespath==0.9.3
Markdown==2.6.11
olefile==0.44
Pillow==5.0.0
psycopg2==2.7.3.2
PyJWT==1.5.3
python-dateutil==2.6.1
pytz==2018.3
reportlab==3.4.0
s3transfer==0.1.13
six==1.11.0
Wand==0.4.4
uwsgi==2.0.17 # WSGI for production deployment
gevent==1.2.2 # Non-blocking Python network library, required by uWSGI
celery==4.1.0
django_celery_beat==1.1.1
django_celery_results==1.0.1

celery_conf/config.py

AWS_ACCESS_KEY_ID = ...
AWS_SECRET_ACCESS_KEY = ...

CELERY_BROKER_TRANSPORT = 'sqs'
CELERY_BROKER_URL = 'sqs://' # 'sqs://%s:%s@' % (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)

CELERY_BROKER_USER = AWS_ACCESS_KEY_ID
CELERY_BROKER_PASSWORD = AWS_SECRET_ACCESS_KEY
CELERY_WORKER_STATE_DB = '/var/run/celery/worker.db'
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
CELERY_WORKER_PREFETCH_MULTIPLIER = 0 # See https://github.com/celery/celery/issues/3712

CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TASK_SERIALIZER = 'json'

CELERY_DEFAULT_QUEUE = 'myproj-django' # Queue name
CELERY_QUEUES = {
    CELERY_DEFAULT_QUEUE: {
        'exchange': CELERY_DEFAULT_QUEUE,
        'binding_key': CELERY_DEFAULT_QUEUE,
    }
}

CELERY_BROKER_TRANSPORT_OPTIONS = {
    "region": "us-east-1", # US East (N. Virginia)
    'visibility_timeout': 360,
    'polling_interval': 1
}

CELERY_RESULT_BACKEND = 'django-db'

myproj/celery.py

from __future__ import absolute_import, unicode_literals
import os

from celery import Celery
from celery.schedules import crontab

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproj.settings_prod')

app = Celery('myproj')

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

if __name__ == '__main__':
    app.start()

@app.task(bind=True)
def debug_task(self):
        print('Request: {0!r}'.format(self.request))

myproj/myapp/tasks.py

from __future__ import absolute_import, unicode_literals
from celery.decorators import task

from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)

@task()
def do_something():
    logger.info('******** CALLING ASYNC TASK WITH CELERY **********')

settings_prod.py

# Importing base settings
from .settings import *

DEBUG = False

# Importing Celery configurations
from celery_conf.config import *
INSTALLED_APPS += ('django_celery_beat',)

UPDATE 1

Since according to /var/log/celery-beat.log, it seems that celery is not able to find my project module. I think my project structure is not the one that Celery is expecting. How I can make it works without changing the whole project structure?

My project structure is the following:

-- myprof-folder/
   -- requirements.txt
   -- .ebextensions/
   -- celery_conf/
      -- __init__.py
      -- config.py
   -- myproj/
      -- __init__.py
      -- settings.py # base settings
      -- settings_prod.py # production settings
      -- urls.py
      -- wsgi.py
      -- myapp1/
         -- models.py
         -- urls.py
         -- apps.py
         -- views.py
         -- tasks.py # here my app's tasks
         -- ...
      -- myapp2/
      -- myapp3/
      -- ...
      -- myappN/

UPDATE 2

99_celery.config was using the --workdir option with /tmp as directory. That option is not needed. I also applied a few changes to that file.

99_celery.config

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Create required directories
      sudo mkdir -p /var/log/celery/
      sudo mkdir -p /var/run/celery/

      # Create group called 'celery'
      sudo groupadd -f celery
      # add the user 'celery' if it doesn't exist and add it to the group with same name
      id -u celery &>/dev/null || sudo useradd -g celery celery
      # add permissions to the celery user for r+w to the folders just created
      sudo chown -R celery:celery /var/log/celery/
      sudo chown -R celery:celery /var/run/celery/

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '
' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
      celeryenv=${celeryenv%?}

      # Create celery configuration script
      celeryconf="[program:celeryd-worker]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery worker -A myproj --loglevel=INFO --logfile="/var/log/celery/%%n%%I.log" --pidfile="/var/run/celery/%%n.pid"

      directory=/opt/python/current/app
      user=celery
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 600

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv

      [program:celeryd-beat]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery beat -A myproj --loglevel=INFO --logfile="/var/log/celery/celery-beat.log" --pidfile="/var/run/celery/celery-beat.pid"

      directory=/opt/python/current/app
      user=celery
      numprocs=1
      stdout_logfile=/var/log/celery-beat.log
      stderr_logfile=/var/log/celery-beat.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 600

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf

      # Add configuration script to supervisord conf (if not there already)
      if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
        then
        echo "[include]" | tee -a /opt/python/etc/supervisord.conf
        echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Enable supervisor to listen for HTTP/XML-RPC requests.
      # supervisorctl will use XML-RPC to communicate with supervisord over port 9001.
      # Source: https://askubuntu.com/questions/911994/supervisorctl-3-3-1-http-localhost9001-refused-connection
      if ! grep -Fxq "[inet_http_server]" /opt/python/etc/supervisord.conf
        then
        echo "[inet_http_server]" | tee -a /opt/python/etc/supervisord.conf
        echo "port = 127.0.0.1:9001" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-beat
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-worker


container_commands:
  00_celery_tasks_run:
    command: "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
    leader_only: true

My logs:

I SSH my EC2 instance and the following are the log files:

/var/log/celery-worker.log

Traceback (most recent call last):
  File "/opt/python/run/venv/bin/celery", line 11, in <module>
    sys.exit(main())
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/celery/__main__.py", line 14, in main
    _main()
  File "/opt/python/run/venv/local/lib/python3.6/site-packages/celery/bin/celery.py", line 326, in main
    cmd.execute_f

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

So there were multiple issues with your configs

1. Incorrect class for celery

command=/opt/python/run/venv/bin/celery worker -A myproj --loglevel=INFO --logfile="/var/log/celery/%%n%%I.log" --pidfile="/var/run/celery/%%n.pid"

As per your structure it should have been

command=/opt/python/run/venv/bin/celery worker -A celery_conf.celery_app:app --loglevel=INFO --logfile="/var/log/celery/%%n%%I.log" --pidfile="/var/run/celery/%%n.pid"

2. Unicode dash instead of -

You had a unicode en-dash or something in the config, may be you copied from some website and it was not - in your below command

command=/opt/python/run/venv/bin/celery worker -A myproj --loglevel=INFO --logfile="/var/log/celery/%%n%%I.log" --pidfile="/var/run/celery/%%n.pid"

3. Not append celerybeat.conf and celery.conf

Below code was not working

if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
 then
    echo "[include]" | tee -a /opt/python/etc/supervisord.conf
    echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
fi

As include was already a part from uswgi.conf

4. Missing inet_server config

Although you had below in your question

if ! grep -Fxq "[inet_http_server]" /opt/python/etc/supervisord.conf
    then
        echo "[inet_http_server]" | tee -a /opt/python/etc/supervisord.conf
        echo "port = 127.0.0.1:9001" | tee -a /opt/python/etc/supervisord.conf
fi

But you were not using it and this causes the below issue

[Instance: i-00c786a77c1f5ec11] Command failed on instance. Return code: 2 Output: (TRUNCATED)... ERROR: already shutting down error: , : file: /usr/lib64/python2.7/xmlrpclib.py line: 800 error: , : file: /usr/lib64/python2.7/xmlrpclib.py line: 800. Hook /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh failed.

Final Config

Below is the final config that you needed to use

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Create required directories
      sudo mkdir -p /var/log/celery/
      sudo mkdir -p /var/run/celery/

      # Create group called 'celery'
      sudo groupadd -f celery
      # add the user 'celery' if it doesn't exist and add it to the group with same name
      id -u celery &>/dev/null || sudo useradd -g celery celery
      # add permissions to the celery user for r+w to the folders just created
      sudo chown -R celery:celery /var/log/celery/
      sudo chown -R celery:celery /var/run/celery/

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '
' ',' | sed 's/%/%%/g' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}

      # Create CELERY configuration script
      celeryconf="[program:celeryd]
      directory=/opt/python/current/app
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery worker -A celery_conf.celery_app:app --loglevel=INFO --logfile="/var/log/celery/%%n%%I.log" --pidfile="/var/run/celery/%%n.pid"

      user=celery
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv"


      # Create CELERY BEAT configuraiton script
      celerybeatconf="[program:celerybeat]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery beat -A celery_conf.celery_app:app --loglevel=INFO --logfile="/var/log/celery/celery-beat.log" --pidfile="/var/run/celery/celery-beat.pid"

      directory=/opt/python/current/app
      user=celery
      numprocs=1
      stdout_logfile=/var/log/celerybeat.log
      stderr_logfile=/var/log/celerybeat.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=999

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf
      echo "$celerybeatconf" | tee /opt/python/etc/celerybeat.conf

      # Add configuration script to supervisord conf (if not there already)
      if ! grep -Fxq "celery.conf" /opt/python/etc/supervisord.conf
        then
          echo "[include]" | tee -a /opt/python/etc/supervisord.conf
          echo "files: uwsgi.conf celery.conf celerybeat.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Enable supervisor to listen for HTTP/XML-RPC requests.
      # supervisorctl will use XML-RPC to communicate with supervisord over port 9001.
      # Source: https://askubuntu.com/questions/911994/supervisorctl-3-3-1-http-localhost9001-refused-connection
      if ! grep -Fxq "[inet_http_server]" /opt/python/etc/supervisord.conf
        then
          echo "[inet_http_server]" | tee -a /opt/python/etc/supervisord.conf
          echo "port = 127.0.0.1:9001" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
      supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat


commands:
  01_killotherbeats:
    command: "ps auxww | grep 'celery beat' | awk '{print $2}' | sudo xargs kill -9 || true"
    ignoreErrors: true
  02_restartbeat:
    command: "supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat"
    leader_only: true

Case Resolved!!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...