Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


select - django select_related - when to use it

I'm trying to optimize my ORM queries in Django. I use connection.queries to view the queries that Django generates for me.

Assuming I have these models:

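from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=50)

class Book(models.Model):
    name = models.CharField(max_length=50)
    # on_delete is a required argument since Django 2.0
    author = models.ForeignKey(Author, on_delete=models.CASCADE)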

Let's say that when I generate a specific web page, I want to display all books with their author's name next to each of them. I also display all the authors separately.

So should I use

Book.objects.all().select_related("author")

which will result in a JOIN query, even if I run the following on the line before?

Author.objects.all()

Obviously, in the template I will write something like {{ book.author.name }}.
So the question is: when I access a foreign key value (author), if Django already has that object from another query, will that still result in an additional query (for each book)? If not, does using select_related actually create performance overhead in that case?

Question from: https://stackoverflow.com/questions/33230540/django-select-related-when-to-use-it


1 Reply


You are actually asking two different questions:

1. Does using select_related actually create performance overhead?

See the Django documentation on QuerySet evaluation and caching:

Understand QuerySet evaluation

To avoid performance problems, it is important to understand:

  • that QuerySets are lazy.

  • when they are evaluated.

  • how the data is held in memory.

In summary, once a QuerySet object has been evaluated, Django caches its results in memory and reuses them within that same object. For example:

books = Book.objects.all().select_related("author")
for book in books:  # Iterating evaluates the queryset and caches the results
    print(book.author.name)  # Author was loaded by the JOIN; no extra query
first_book = books[0]  # Served from the result cache; does not hit the db
print(first_book.author.name)  # Does not hit the db

This hits the database only once: because select_related fetched the authors together with the books, everything above is served by a single query with an INNER JOIN.
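To see what the JOIN saves at the SQL level without setting up a Django project, here is a minimal sketch using Python's built-in sqlite3 module instead of the ORM. The table layout, the sample rows, and the helper names (query, titles_with_lazy_authors, titles_with_join) are assumptions made for illustration; the first function approximates what accessing book.author per book does without select_related, the second approximates the single joined query.

```python
import sqlite3

# In-memory database with a schema resembling the Book/Author models (assumed layout).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES author(id)
    );
    INSERT INTO author (id, name) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book (name, author_id) VALUES ('A1', 1), ('A2', 1), ('B1', 2);
""")

query_log = []  # every SELECT we send is recorded here


def query(sql, params=()):
    """Run one SELECT and record it, so database round trips can be counted."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_with_lazy_authors():
    # One query for the books, then one more per book to look up its author.
    rows = []
    for book_name, author_id in query("SELECT name, author_id FROM book ORDER BY id"):
        (author_name,) = query("SELECT name FROM author WHERE id = ?", (author_id,))[0]
        rows.append((book_name, author_name))
    return rows


def titles_with_join():
    # Books and authors fetched together in a single statement.
    return query(
        "SELECT book.name, author.name FROM book "
        "INNER JOIN author ON author.id = book.author_id ORDER BY book.id"
    )


query_log.clear()
lazy_rows = titles_with_lazy_authors()
lazy_count = len(query_log)

query_log.clear()
join_rows = titles_with_join()
join_count = len(query_log)

print(lazy_count, join_count)
```

Both functions return the same rows; the difference is one query per book plus one for the list, versus a single query regardless of how many books there are.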

BUT nothing is cached between QuerySet objects, not even when they represent the same query:

books = Book.objects.all().select_related("author")
books2 = Book.objects.all().select_related("author")
first_book = books[0]  # Hits the db: books has not been evaluated yet
first_book2 = books2[0]  # Hits the db again: books2 has its own, empty cache

This is actually pointed out in the docs:

We will assume you have done the obvious things above. The rest of this document focuses on how to use Django in such a way that you are not doing unnecessary work. This document also does not address other optimization techniques that apply to all expensive operations, such as general purpose caching.
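The per-object caching behaviour described above can be modelled with a small toy class. This is a simplified sketch and not Django's implementation: LazyQuery, fetch_books, and the counter are hypothetical names, and unlike Django (which issues a LIMITed query when an unevaluated queryset is indexed) the toy simply re-runs the whole fetch.

```python
counter = {"hits": 0}  # how many times the pretend database was queried


def fetch_books():
    """Stand-in for a database round trip."""
    counter["hits"] += 1
    return ["Book A", "Book B", "Book C"]


class LazyQuery:
    """Toy stand-in for a QuerySet: lazy, with a cache owned by each object."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable that "hits the database"
        self._result_cache = None  # filled on the first full evaluation

    def __iter__(self):
        if self._result_cache is None:
            self._result_cache = list(self._fetch())
        return iter(self._result_cache)

    def __getitem__(self, index):
        if self._result_cache is not None:
            return self._result_cache[index]  # served from memory
        return self._fetch()[index]           # nothing cached: fetch again


books = LazyQuery(fetch_books)
hits_after_create = counter["hits"]  # building the object runs no query

for _ in books:                      # first iteration fills the cache
    pass
books[0]                             # reuses the cache
list(books)                          # reuses the cache
hits_after_reuse = counter["hits"]

books2 = LazyQuery(fetch_books)      # same query, separate object, separate cache
books2[0]                            # fetches
books2[0]                            # fetches again, since nothing was cached
hits_after_second = counter["hits"]

print(hits_after_create, hits_after_reuse, hits_after_second)
```

The cache lives on the object, so reusing one evaluated queryset variable is what avoids repeated queries; building an equal queryset a second time does not.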

2. If Django already has that object from another query, will that still result in an additional query (for each book)?

What you are really asking is whether Django caches ORM queries, which is a very different matter. ORM query caching means that if you run a query and later run the same query again, and the database has not changed in between, the result comes from a cache rather than from an expensive database lookup.

The answer is no: Django itself does not do this, so without select_related each book.author access still runs its own query, even if the same authors were already loaded by Author.objects.all(). It is possible unofficially, through third-party apps. The most relevant third-party apps that enable this type of caching are:

  1. Johnny-Cache (older, not supporting django>1.6)
  2. Django-Cachalot (newer, supports 1.6, 1.7, and still in dev 1.8)
  3. Django-Cacheops (newer, supports Python 2.7 or 3.3+, Django 1.8+ and Redis 2.6+ (4.0+ recommended))
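To make the idea of query caching concrete, here is a minimal sketch built on Python's sqlite3 module. It is not how any of the apps above are implemented: CachingConnection and its select and write methods are hypothetical names, and the invalidation is deliberately crude (any write drops the whole cache).

```python
import sqlite3


class CachingConnection:
    """Hypothetical wrapper: caches SELECT results and clears them on writes."""

    def __init__(self, conn):
        self._conn = conn
        self._cache = {}
        self.db_reads = 0  # SELECTs that actually reached the database

    def select(self, sql, params=()):
        key = (sql, tuple(params))
        if key not in self._cache:
            self.db_reads += 1
            self._cache[key] = self._conn.execute(sql, params).fetchall()
        return self._cache[key]

    def write(self, sql, params=()):
        self._conn.execute(sql, params)
        self._conn.commit()
        self._cache.clear()  # crude invalidation: every cached result is dropped


db = CachingConnection(sqlite3.connect(":memory:"))
db.write("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
db.write("INSERT INTO author (name) VALUES (?)", ("Ann",))

first = db.select("SELECT name FROM author ORDER BY id")
second = db.select("SELECT name FROM author ORDER BY id")  # served from the cache
reads_before_write = db.db_reads

db.write("INSERT INTO author (name) VALUES (?)", ("Bob",))  # invalidates the cache
third = db.select("SELECT name FROM author ORDER BY id")    # goes to the database
reads_after_write = db.db_reads

print(reads_before_write, reads_after_write)
```

The hard part in practice is the invalidation step: a cache that is not cleared when the data changes returns stale results, which is why this is usually left to a dedicated library.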

Take a look at those if you are looking for query caching, and remember: profile first, find the bottlenecks, and optimize only if they are causing a problem.

The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming. Donald Knuth.

