Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

performance - ElasticSearch 2.3 _search for more than 10,000 paged items

In ElasticSearch 2.3 (and in the latest releases) there is an index.max_result_window setting which, at its default of 10,000, restricts a search query to a from + size value of at most 10,000 entries, e.g.:

from: 0 size: 10,000 is ok
from: 0 size: 10,001 is not ok
from: 9,000 size: 1,001 is not ok
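
The rule behind these three cases can be sketched as a small check. This is only an illustration of the arithmetic, not ElasticSearch code; the function name `within_result_window` is made up for this example and 10,000 is assumed to be the configured window.

```python
# Assumed value of index.max_result_window (10,000 is the default).
MAX_RESULT_WINDOW = 10_000


def within_result_window(from_: int, size: int, window: int = MAX_RESULT_WINDOW) -> bool:
    """Return True if a from/size page fits inside the result window."""
    return from_ + size <= window


# The three cases listed above.
for from_, size in [(0, 10_000), (0, 10_001), (9_000, 1_001)]:
    print(from_, size, within_result_window(from_, size))
```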

In the latest release, 7.10, the documentation says this limit can be worked around by using search_after. However, due to legacy data, I need something similar in ES 2.3. Are there any good options?

Why do I need this? Our data has a child/parent hierarchy. One query we run against it determines all the unique parents over a certain date range. Currently we retrieve this information using an aggregation query, i.e.:

{
  "query": { "match_all_in_date_range": {} },
  "aggs": {
    "parents": {
      "terms": {
        "field": "parentId"
      }
    }
  }
}

Interestingly, this returns all the parents even if there are more than 10,000, i.e. it does not appear to be affected by the index.max_result_window limit.
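
For illustration, a more concrete form of such a request might be built as below. This is a sketch under assumptions: the date field name (`timestamp` in the usage line), the helper name `unique_parents_request` and the range query standing in for the placeholder query are all hypothetical. The top-level `"size": 0` asks for no hits at all, so from + size stays far below the window, and in the 2.x line a terms aggregation `"size"` of 0 requests all buckets rather than the default top 10.

```python
import json
from typing import Any, Dict


def unique_parents_request(date_field: str, start: str, end: str) -> Dict[str, Any]:
    """Build a search body that returns only the parentId buckets.

    date_field is a hypothetical field name; substitute the real one.
    """
    return {
        "size": 0,  # request no hits, only the aggregation result
        "query": {"range": {date_field: {"gte": start, "lte": end}}},
        "aggs": {
            "parents": {
                # size 0 on a terms aggregation means "all buckets" in ES 2.x
                "terms": {"field": "parentId", "size": 0}
            }
        },
    }


print(json.dumps(unique_parents_request("timestamp", "2021-01-01", "2021-01-31"), indent=2))
```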

But this aggregation is expensive and time-consuming. As a result, I'm evaluating whether it's possible to remove it and "aggregate" the data in our own code, i.e. retrieve all the objects, read their parentId field, and record the unique ids.
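
The client-side "aggregation" described here could look roughly like the following. It is a minimal sketch that assumes each hit carries its document under `_source` (the usual shape of entries in `hits.hits`) and that `parentId` is a top-level field; the function name `unique_parent_ids` is invented for this example.

```python
from typing import Any, Dict, Iterable, Set


def unique_parent_ids(hits: Iterable[Dict[str, Any]]) -> Set[Any]:
    """Collect the distinct parentId values from a stream of search hits."""
    parents: Set[Any] = set()
    for hit in hits:
        parent_id = hit.get("_source", {}).get("parentId")
        if parent_id is not None:  # skip documents without a parentId
            parents.add(parent_id)
    return parents


# Made-up sample hits, only to show the call.
sample_hits = [
    {"_id": "1", "_source": {"parentId": "p1"}},
    {"_id": "2", "_source": {"parentId": "p2"}},
    {"_id": "3", "_source": {"parentId": "p1"}},
]
print(sorted(unique_parent_ids(sample_hits)))
```

Because the function accepts any iterable, it does not care how the hits were fetched, so the paging strategy can be changed independently.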

But, unless I'm mistaken, the index.max_result_window limit may break that idea. Two ideas I had to work around this are:

  • Rather than paging, modify the query to exclude the parentIds I've already retrieved (the downside being that each request could take longer to run and the query will keep growing until the end)
  • Move over to the more heavy-duty scroll API (which may be more suitable for other usages)
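
The second idea, the scroll API, follows a simple loop: open a scroll with the initial search, then keep requesting the next page with the returned `_scroll_id` until a page comes back empty. The sketch below shows only that control flow. The two callables `start_scroll` and `next_page` are hypothetical wrappers around whatever HTTP client is in use (the initial `_search?scroll=...` request and the follow-up scroll request respectively); they are assumptions, not part of any ElasticSearch client library.

```python
from typing import Any, Callable, Dict, Iterator

Response = Dict[str, Any]


def scroll_hits(
    start_scroll: Callable[[], Response],
    next_page: Callable[[str], Response],
) -> Iterator[Dict[str, Any]]:
    """Yield every hit from a scrolled search, one page at a time."""
    response = start_scroll()
    while True:
        hits = response["hits"]["hits"]
        if not hits:  # an empty page means the scroll is exhausted
            break
        for hit in hits:
            yield hit
        # Always use the most recent scroll id returned by the server.
        response = next_page(response["_scroll_id"])
```

Since the result is a plain iterator of hits, it is not subject to the from + size check, and the unique parentId values can be collected while the pages stream in rather than after loading everything into memory.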

But I'd be curious to hear if there are other options available to me?

Question from: https://stackoverflow.com/questions/65908713/elasticsearch-2-3-search-for-more-than-10-000-paged-items

He who fights too long with dragons becomes a dragon himself; gaze too long into the abyss, and the abyss will gaze back…

1 Reply

0 votes
by (71.8m points)
Waiting for answers


...