Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Finding the number of matched keywords in a text index in MongoDB

Sample collection "test" (with a text index on field1, field2, and field3):

/* Item 1 */
{
    "_id" : ObjectId("6011862888de9cd2347828e6"),
    "field1" : "Denver Segment1",
    "field2" : "student1 zero1",
    "field3" : "cat2 dog0"
}
/* Item 2 */
{
    "_id" : ObjectId("6011866b88de9cd234782906"),
    "field1" : "meow cap",
    "field2" : "teacher eleven1",
    "field3" : "cat2 cow"
}

/* Item 3 */
{
    "_id" : ObjectId("6011868b88de9cd234782909"),
    "field1" : "bark cake",
    "field2" : "admin hey",
    "field3" : "bird chirp"
}

I am trying to sort the records by the number of keywords matched. I have the following query (the keywords are cat2 and student1):

db.getCollection('test').find(
   { $text: { $search: "cat2 student1" } },
   { score: { $meta: "textScore" } }
).sort( { score: { $meta: "textScore" } } )

The result is the following:

/* 1 */
{
    "_id" : ObjectId("6011862888de9cd2347828e6"),
    "field1" : "Denver Segment1",
    "field2" : "student1 zero1",
    "field3" : "cat2 dog0",
    "score" : 1.5
}

/* 2 */
{
    "_id" : ObjectId("6011866b88de9cd234782906"),
    "field1" : "meow cap",
    "field2" : "teacher eleven1",
    "field3" : "cat2 cow",
    "score" : 0.75
}

This is fine, but I do not know how to show the actual matched keywords as well. For example, the expected result would be the following:

/* 1 */
{
    "_id" : ObjectId("6011862888de9cd2347828e6"),
    "field1" : "Denver Segment1",
    "field2" : "student1 zero1",
    "field3" : "cat2 dog0",
    "score" : 1.5,
    "matched-keywords" : ["cat2", "student1"]
}

/* 2 */
{
    "_id" : ObjectId("6011866b88de9cd234782906"),
    "field1" : "meow cap",
    "field2" : "teacher eleven1",
    "field3" : "cat2 cow",
    "score" : 0.75,
    "matched-keywords" : ["cat2"]
}

How can I do this?
Question from: https://stackoverflow.com/questions/65922538/finding-the-number-of-matched-keywords-in-text-index-in-mongodb


1 Reply


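You can get this result with the aggregation pipeline. The example searches for "cat2 cow"; note that the search string appears twice, once in $search and once as the $regexMatch input.

Stages:

  1. Run the text search with $match and $text.
  2. Sort by descending score, using the $meta aggregation expression in the $sort stage.
  3. Split field1, field2, and field3 on spaces and concatenate the words into a new field called matched-keywords.
  4. Filter matched-keywords with $regexMatch so that only words found in the search string remain.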

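db.collection.aggregate([
  // 1. Text search (uses the text index on field1, field2, field3).
  {
    "$match": {
      "$text": {
        "$search": "cat2 cow"
      }
    }
  },
  // 2. Sort by text score, highest first.
  {
    "$sort": {
      score: {
        "$meta": "textScore"
      }
    }
  },
  // 3. Collect every space-separated word from the three fields.
  {
    "$addFields": {
      "matched-keywords": {
        "$concatArrays": [
          { "$split": ["$field1", " "] },
          { "$split": ["$field2", " "] },
          { "$split": ["$field3", " "] }
        ]
      }
    }
  },
  // 4. Keep only the words that occur in the search string. Each word is
  //    used as a regex, so this is a case-sensitive substring test.
  {
    "$addFields": {
      "matched-keywords": {
        "$filter": {
          "input": "$matched-keywords",
          "as": "word",
          "cond": {
            "$regexMatch": {
              "input": "cat2 cow",
              "regex": "$$word"
            }
          }
        }
      }
    }
  }
])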
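
The pipeline above hard-codes the search string in two places and does not return the score the question asks for. As a rough sketch, the same stages could be generated from a single search string. The helper name buildMatchedKeywordsPipeline and its parameters are hypothetical (they are not part of MongoDB), and the sketch swaps the $regexMatch filter for a whole-word comparison with $setIntersection:

```javascript
// Hypothetical helper (not a MongoDB API): builds the aggregation pipeline
// for a given search string and list of text-indexed field names.
function buildMatchedKeywordsPipeline(search, fields) {
  // Lowercased search terms, split on whitespace, empty strings dropped.
  const keywords = search.toLowerCase().split(/\s+/).filter(Boolean);

  // One lowercased word list per field; $toLower turns a missing field into "".
  const wordsPerField = fields.map((f) => ({
    $split: [{ $toLower: "$" + f }, " "],
  }));

  return [
    // Text search against the collection's text index.
    { $match: { $text: { $search: search } } },
    // Highest text score first.
    { $sort: { score: { $meta: "textScore" } } },
    {
      $addFields: {
        // Expose the score in the output documents.
        score: { $meta: "textScore" },
        // Whole-word overlap between the document's words and the search terms.
        "matched-keywords": {
          $setIntersection: [{ $concatArrays: wordsPerField }, keywords],
        },
      },
    },
  ];
}

// Assumed usage in the mongo shell:
// db.getCollection('test').aggregate(
//   buildMatchedKeywordsPipeline("cat2 student1", ["field1", "field2", "field3"])
// )
```

Some caveats with this sketch: $setIntersection removes duplicates and does not guarantee element order, the matched words come back lowercased, and it ignores $text features such as stemming, quoted phrases, and negated terms, so a stemmed match can score without showing up in matched-keywords.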

Result:

[
  {
    "_id": 1,
    "field1": "Denver Segment1",
    "field2": "student1 zero1",
    "field3": "cat2 dog0",
    "matched-keywords": [
      "cat2"
    ]
  },
  {
    "_id": 2,
    "field1": "meow cap",
    "field2": "teacher eleven1",
    "field3": "cat2 cow",
    "matched-keywords": [
      "cat2",
      "cow"
    ]
  }
]
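
To sanity-check the matched-keywords lists without a database, the split-and-filter logic can be approximated in plain JavaScript. This is only a sketch: matchedKeywords is a hypothetical helper, the two documents are copied from the sample collection, and the comparison is whole-word and case-insensitive rather than the regex test used in the pipeline.

```javascript
// Hypothetical client-side helper: splits the given fields on spaces and
// keeps the words that equal one of the search terms (case-insensitive).
function matchedKeywords(doc, search, fields) {
  const keywords = new Set(search.toLowerCase().split(/\s+/).filter(Boolean));
  return fields
    .flatMap((f) => String(doc[f] ?? "").split(" "))
    .filter((word) => keywords.has(word.toLowerCase()));
}

// The two documents that match the text search, from the sample collection.
const docs = [
  { field1: "Denver Segment1", field2: "student1 zero1", field3: "cat2 dog0" },
  { field1: "meow cap", field2: "teacher eleven1", field3: "cat2 cow" },
];
const fields = ["field1", "field2", "field3"];

const results = docs.map((d) => matchedKeywords(d, "cat2 cow", fields));
console.log(results); // [ [ 'cat2' ], [ 'cat2', 'cow' ] ]
```

The same helper could also be applied to the documents returned by the original find() query, which already carry the score.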
