Using dynamic mappings (lots of fields)
It looks like you adjusted the doc structure compared to your original question.
The query above was intended for nested fields, which cannot easily be iterated in a script for performance reasons. Having said that, the above is an even slower workaround which accesses the docs' _source and iterates its contents. But keep in mind that accessing _source in scripts is not recommended!
If your docs aren't nested anymore, you can access the so-called doc values which are much more optimized for query-time access:
{
  "query": {
    "function_score": {
      ...
      "functions": [
        {
          ...
          "script_score": {
            "script": {
              "lang": "painless",
              "source": """
                try {
                  if (doc['boost.boostType.keyword'].value == params.preferredBoostType) {
                    return doc['boost.boostFactor'].value;
                  } else {
                    throw new Exception();
                  }
                } catch(Exception e) {
                  return doc['fallbackBoostFactor'].value;
                }
              """,
              "params": {
                "preferredBoostType": "Type1"
              }
            }
          }
        }
      ]
    }
  }
}
This avoids loading and parsing _source for every scored document, thus speeding up your function score query.
Alternative using an ordered list of values
Since the nested iteration is slow and dynamic mappings are blowing up your index, you could store your boosts in a standardized ordered list in each document:
"boostValues": [1.0001, 1.002, 1.0005, ..., 1.1]
and keep track of the corresponding boost types' order in the backend where you construct the queries:
var boostTypes = ["Type1", "Type2", "Type3", ..., "TypeN"]
So something like n-hot vectors: each position in the list always refers to the same boost type.
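As a rough sketch of the indexing side, the backend could build each document's list from a map of boosts, following the fixed order of the types. The helper name `toBoostValues` and the neutral default of 1.0 for missing types are assumptions made for illustration, not something taken from your setup:

```javascript
// Hypothetical backend helper: turns a {type: factor} map into the ordered
// list stored in each document. The neutral default of 1.0 is an assumption.
const boostTypes = ["Type1", "Type2", "Type3"];

function toBoostValues(boostsByType, types, neutral = 1.0) {
  // One slot per known type, in the fixed backend order.
  return types.map((type) =>
    Object.prototype.hasOwnProperty.call(boostsByType, type)
      ? boostsByType[type]
      : neutral
  );
}

// Type2 is missing from the map, so its slot gets the neutral value.
const exampleDoc = {
  boostValues: toBoostValues({ Type1: 1.0001, Type3: 1.0005 }, boostTypes),
};
```

Filling every slot keeps all lists the same length, so a position never shifts when a document lacks a particular type.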
Then, as you construct the Elasticsearch query, you'd look up the array index of the boostValues based on the boostType and pass this array index to the script query from above, which would then access the corresponding boostValues doc value.
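The lookup described here might look roughly like this in the backend. The function name `buildScriptScore`, the parameter name `boostIndex` and the exact Painless body are illustrative assumptions; only `boostValues` and `fallbackBoostFactor` come from the text above:

```javascript
// Sketch only: buildScriptScore and params.boostIndex are illustrative names.
function buildScriptScore(types, preferredBoostType) {
  // Position of the preferred type in the backend's ordered list (-1 if unknown).
  const boostIndex = types.indexOf(preferredBoostType);
  return {
    script_score: {
      script: {
        lang: "painless",
        // Fall back when the type is unknown or the document's list is too short.
        source: `
          if (params.boostIndex >= 0 && doc['boostValues'].size() > params.boostIndex) {
            return doc['boostValues'][params.boostIndex];
          }
          return doc['fallbackBoostFactor'].value;
        `,
        params: { boostIndex },
      },
    },
  };
}

const scoreFn = buildScriptScore(["Type1", "Type2", "Type3"], "Type2");
console.log(scoreFn.script_score.script.params); // { boostIndex: 1 }
```

One thing worth checking before relying on positional access: doc values of a multi-valued numeric field come back sorted rather than in indexing order, so it is worth testing that position i in the script really maps to boostTypes[i] with your mapping.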
This is guaranteed to be faster than _source access. But it's required that you always keep your boostTypes and boostValues in sync -- preferably append-only (as you add new boostTypes, the list grows in one dimension).
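The append-only rule can be enforced with a small guard in the backend. This is a minimal sketch; `registerBoostType` is a hypothetical name, and in practice the list would presumably live in shared storage rather than in memory:

```javascript
// Hypothetical append-only registry; registerBoostType is an illustrative name.
function registerBoostType(types, newType) {
  // Existing positions never change; unknown types go to the end.
  const existing = types.indexOf(newType);
  if (existing !== -1) return existing;
  types.push(newType);
  return types.length - 1;
}

const registry = ["Type1", "Type2"];
registerBoostType(registry, "Type3"); // returns 2
registerBoostType(registry, "Type1"); // returns 0
```

Documents indexed before a new type was appended will have shorter lists, so the script should keep a fallback for positions a document does not have yet.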