The bug is reported here but it's not solved yet even in Django 1.11. The issue is related to joining two tables in reverse relations.
Notice that distinct parameter works well for Count but not for Sum. So you can use a trick and write an ORM like below:
Projects.objects.annotate(
temp_tasks_duration=Sum('task__duration'),
temp_subtasks_duration=Sum('task__subtask__duration'),
tasks_count=Count('task'),
tasks_count_distinct=Count('task', distinct=True),
task_subtasks_count=Count('task__subtask'),
task_subtasks_count_distinct=Count('task__subtask', distinct=True),
).annotate(
tasks_duration=F('temp_tasks_duration')*F('tasks_count_distinct')/F('tasks_count'),
subtasks_duration=F('temp_subtasks_duration')*F('subtasks_count_distinct')/F('subtasks_count'),
)
Update:
I found that you need to use Subquery. In the following solution, firstly you filter tasks for related to the outerref (OuterRef references to the outer query, so the tasks are filtered for each Project), then you group the tasks by 'project', so that the Sum applies on all the tasks of each projects and returns just one result if any task exists for the project (you have filtered by 'project' and then grouped by that same field; That's why just one group can be there.) or None otherwise. The result would be None if the project has no task, that means we can not use [0] to select the calculated sum.
from django.db.models import Subquery, OuterRef
Projects.objects.annotate(
tasks_duration=Subquery(
Task.objects.filter(
project=OuterRef('pk')
).values(
'project'
).annotate(
the_sum=Sum('task__duration'),
).values('the_sum')[:1]
),
subtasks_duration=Sum('task__subtask__duration')
)
Running this code will send just one query to the database, so the performance is great.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…