    # models.py
    from django.db import models

    class Person(models.Model):
        first_name = models.CharField(max_length=30)
        last_name = models.CharField(max_length=30)
        text_blob = models.CharField(max_length=50000)
    # tasks.py
    import celery

    @celery.task
    def my_task(person):
        # example operation: does something to person;
        # needs only a few of the attributes of person,
        # not the entire bulky record
        person.first_name = person.first_name.title()
        person.last_name = person.last_name.title()
        person.save()
In my application somewhere I have something like:
    from models import Person
    from tasks import my_task
    import celery

    g = celery.group([my_task.s(p) for p in Person.objects.all()])
    g.apply_async()
- Celery pickles `p` to send it to the worker, right?
- If the workers are running on multiple machines, would the entire Person object (along with the bulky `text_blob`, which is mostly not needed) be transmitted over the network? Is there a way to avoid that?
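For context on the first two questions: whether `p` is pickled depends on the configured task serializer. Older Celery releases default to pickle, which is what allows a full model instance (`text_blob` included) to travel in the message; Celery 4+ defaults to JSON, which cannot serialize model instances at all and therefore forces plain arguments such as pks. A minimal configuration sketch, assuming a Celery app object named `app` and an AMQP broker, both of which are assumptions (on Celery 3.x the setting is spelled `CELERY_TASK_SERIALIZER`):

    # celery_app.py -- a minimal sketch; app name and broker URL are assumed
    import celery

    app = celery.Celery('myproject', broker='amqp://localhost//')

    # With JSON, only plain values such as pks can be task arguments,
    # so a bulky model instance can never end up on the wire.
    app.conf.task_serializer = 'json'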
How can I efficiently and evenly distribute the Person records to workers running on multiple machines?
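On the distribution question, one knob worth knowing is the worker prefetch count: by default each worker reserves several messages ahead of time, which can leave a slow machine hoarding tasks while a fast one sits idle. A hedged sketch, again assuming an app object named `app` (the Celery 3.x equivalents are `CELERYD_PREFETCH_MULTIPLIER` and `CELERY_ACKS_LATE`):

    # fairer distribution of many short tasks; `app` is an assumed name
    app.conf.worker_prefetch_multiplier = 1  # reserve one message at a time
    app.conf.task_acks_late = True           # acknowledge after the task finishes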
Could something like the following be a better approach? Wouldn't it overwhelm the database if `Person` has a few million records?
    # tasks.py
    import celery
    from models import Person

    @celery.task
    def my_task(person_pk):
        # example operation that does not need text_blob
        person = Person.objects.get(pk=person_pk)
        person.first_name = person.first_name.title()
        person.last_name = person.last_name.title()
        person.save()

    # In my application somewhere
    from models import Person
    from tasks import my_task
    import celery

    g = celery.group([my_task.s(p.pk) for p in Person.objects.all()])
    g.apply_async()
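Passing pks avoids shipping `text_blob` to the workers, but the dispatch loop above still instantiates every `Person` (blob included) just to read its pk, and it creates one message per row. A sketch of one way around both problems, using Django's `values_list(...).iterator()` to stream bare pks and Celery's `chunks` primitive to batch them; the batch size of 1000 is an arbitrary assumption:

    from models import Person
    from tasks import my_task

    # Streams only the pks from the database; no full Person rows
    # (and no text_blob) are ever materialized on the dispatching side.
    pks = Person.objects.values_list('pk', flat=True).iterator()

    # chunks() expects an iterable of argument tuples and packs 1000 of
    # them into each message, so a few million rows become a few
    # thousand tasks rather than a few million.
    my_task.chunks(((pk,) for pk in pks), 1000).group().apply_async()

With batching like this, the load on the database is driven by how many workers run `Person.objects.get(...)` and `save()` concurrently, not by the number of rows enumerated up front.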