The point is that the ElasticSearch regex you are using requires a full string match:
Lucene’s patterns are always anchored. The pattern provided must match the entire string.
Thus, to match any character (but a newline), you can use the .* pattern:
    match: { text: '.*google.*'}
                    ^^      ^^
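The anchoring rule can be tried outside Elasticsearch. The following is a minimal sketch that uses Python's re.fullmatch as a stand-in for Lucene's implicit anchoring. Python's regex syntax is not the Lucene syntax, and the helper name anchored_match is hypothetical, so this only illustrates the "whole string must match" behavior:

```python
import re


def anchored_match(pattern: str, text: str) -> bool:
    """Return True only if the pattern matches the entire text.

    re.fullmatch is used here to imitate Lucene's implicit anchoring;
    it is an analogy, not the Lucene engine itself.
    """
    return re.fullmatch(pattern, text) is not None


# A bare substring pattern fails because it does not cover the whole string.
print(anchored_match("google", "search google now"))      # False
# Padding both sides with .* lets the pattern span the whole string.
print(anchored_match(".*google.*", "search google now"))  # True
```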
One more variation is for cases when your string can have newlines:
    match: { text: '(.|\n)*google(.|\n)*'}
This awful (.|\n)* is a must in ElasticSearch because this regex flavor does not allow any [\s\S] workarounds, nor any DOTALL/Singleline flags. "The Lucene regular expression engine is not Perl-compatible but supports a smaller range of operators."
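To see why the newline variation matters, here is a small sketch in Python. Python's dot also skips newlines by default, which matches the behavior described above, but Python is only an analogy here and not the Lucene engine. The helper name full_match is hypothetical:

```python
import re


def full_match(pattern: str, text: str) -> bool:
    """Whole-string match, imitating an always-anchored regex flavor."""
    return re.fullmatch(pattern, text) is not None


multiline = "first line\nsome google text\nlast line"

# The dot does not cross line breaks, so the pattern cannot span the string.
print(full_match(".*google.*", multiline))            # False
# The alternation (.|\n) consumes either a regular character or a newline.
print(full_match("(.|\\n)*google(.|\\n)*", multiline))  # True
```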
However, if you do not need complicated patterns or word boundary checks, a plain substring search is better performed with a wildcard query than with a regex:
    {
      "query": {
        "wildcard": {
          "text": {
            "value": "*google*",
            "boost": 1.0,
            "rewrite": "constant_score"
          }
        }
      }
    }
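If the request body is assembled in code, the same structure can be built as a plain dictionary and serialized. This is a sketch using only the Python standard library. The function name build_wildcard_query and its parameters are hypothetical, and sending the body to a cluster is left out:

```python
import json


def build_wildcard_query(field: str, value: str,
                         boost: float = 1.0,
                         rewrite: str = "constant_score") -> dict:
    """Build a wildcard query body shaped like the JSON shown above."""
    return {
        "query": {
            "wildcard": {
                field: {
                    "value": value,
                    "boost": boost,
                    "rewrite": rewrite,
                }
            }
        }
    }


body = build_wildcard_query("text", "*google*")
# Serialize to JSON, ready to be used as a search request body.
print(json.dumps(body, indent=2))
```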
See Wildcard search for more details.
NOTE: The wildcard pattern also needs to match the whole input string, thus:
google* finds all strings starting with google
*google* finds all strings containing google
*google finds all strings ending with google
Also, bear in mind the only two special characters in wildcard patterns:
?, which matches any single character
*, which matches zero or more characters, including an empty sequence
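The wildcard rules above can be checked with a small emulation. The sketch below translates a wildcard pattern into a Python regex and requires a whole-string match. It is an approximation for experimenting, not Elasticsearch's implementation: the helper names are hypothetical, and escaping of literal ? or * is not handled:

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Translate ? and * into regex equivalents; escape everything else."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def wildcard_match(pattern: str, text: str) -> bool:
    """Whole-string wildcard match; DOTALL lets * and ? cover newlines."""
    regex = wildcard_to_regex(pattern)
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


print(wildcard_match("google*", "google maps"))     # True: starts with google
print(wildcard_match("*google*", "my google maps")) # True: contains google
print(wildcard_match("*google", "google maps"))     # False: does not end with google
```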