Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Google App Engine Search API

When querying a search index with the Python version of the GAE Search API, what is the best practice for returning documents whose title matches the search words first, followed by documents whose body matches them?

For example given:

from google.appengine.api import search

body = """This is the body of the document,
with a set of words"""

my_document = search.Document(
  fields=[
    search.TextField(name='title', value='A Set Of Words'),
    search.TextField(name='body', value=body),
  ])

If it is possible, how might one perform a search on an index of Documents of the above form with results returned in this priority, where the phrase being searched for is in the variable qs:

  1. Documents whose title matches qs; then
  2. Documents whose body matches the words in qs.

It seems the correct solution is to use a MatchScorer, but I may be off the mark, as I have not used this search functionality before. The documentation does not make clear how to use MatchScorer. I presume one subclasses it and overrides some method, but since this is not documented and I have not delved into the code, I cannot say for sure.

Is there something here that I am missing, or is this the correct strategy? Did I miss where this sort of thing is documented?


For clarity, here is a more elaborate example of the desired outcome:

documents = [
  dict(title="Alpha", body="A"),          # "Alpha"
  dict(title="Beta", body="B Two"),       # "Beta"
  dict(title="Alpha Two", body="A"),      # "Alpha2"
]

for doc in documents:
  # Build a search.Document from each dict; dict values need key access.
  document = search.Document(
    fields=[
       search.TextField(name="title", value=doc["title"]),
       search.TextField(name="body", value=doc["body"]),
    ]
  )
  index.put(document)  # for some search.Index

# Then when we search, we search the Title and Body.
index.search("Alpha")
# returns [Alpha, Alpha2]

# Results where the search is found in the Title are given higher weight.
index.search("Two")
# returns [Alpha2, Beta]  -- note Alpha2 has 'Two' in the title.
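
The ordering described above can also be sketched without any custom scorer: search each field separately, in priority order, and concatenate the results while dropping duplicates. The block below is only a minimal in-memory illustration of that merge. DOCUMENTS, field_search and prioritized_search are hypothetical names, and field_search is a stand-in for a real index query, not part of the Search API. With a real search.Index, the two passes would presumably be field-restricted queries such as title:Two and body:Two; that mapping is an assumption and is not exercised here.

```python
from collections import OrderedDict

# Stand-in data mirroring the example above; this is NOT a real search.Index.
DOCUMENTS = OrderedDict([
    ("Alpha", {"title": "Alpha", "body": "A"}),
    ("Beta", {"title": "Beta", "body": "B Two"}),
    ("Alpha2", {"title": "Alpha Two", "body": "A"}),
])


def field_search(field, qs):
    """Hypothetical stand-in for a field-restricted query.

    Returns the ids of documents whose `field` contains every word of
    `qs`, compared case-insensitively on whole words.
    """
    words = qs.lower().split()
    return [
        doc_id
        for doc_id, doc in DOCUMENTS.items()
        if all(word in doc[field].lower().split() for word in words)
    ]


def prioritized_search(qs, fields=("title", "body")):
    """Query each field in priority order and concatenate the results,
    keeping only the first occurrence of each document id."""
    seen = set()
    ordered = []
    for field in fields:
        for doc_id in field_search(field, qs):
            if doc_id not in seen:
                seen.add(doc_id)
                ordered.append(doc_id)
    return ordered


print(prioritized_search("Alpha"))  # ['Alpha', 'Alpha2']
print(prioritized_search("Two"))    # ['Alpha2', 'Beta']
```

The trade-off of this approach is one query per field rather than a single scored query, and any paging or result limits would have to be applied after the merge.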

