I have a text that I have tokenized (in general, any list of words is fine as well). For example:
>>> from nltk.tokenize import word_tokenize
>>> s = '''Good muffins cost $3.88
... in New York. Please buy me
... two of them.
... Thanks.'''
>>> word_tokenize(s)
['Good', 'muffins', 'cost', '$', '3.88', 'in', 'New', 'York', '.',
'Please', 'buy', 'me', 'two', 'of', 'them', '.', 'Thanks', '.']
If I have a Python dict whose keys are single words as well as multi-word phrases, how can I efficiently and correctly check for their presence in the text? The ideal output would be key: location_in_text pairs, or anything equally convenient.
Thanks in advance!
P.S. To explain "correctly": if my dict contains the key "lease", I do not want "Please" to be matched. Recognizing plurals is also required. I am wondering whether this can be solved elegantly, without many if-else clauses.
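For reference, here is a minimal sketch of what I have in mind (my own illustration, names are made up): slide a window over the token list for each key, comparing whole tokens, so "lease" can never match inside "Please". This does not yet handle plurals; I assume that would need stemming or lemmatization on both the tokens and the keys.

```python
def find_keys(tokens, keys):
    """Return {key: [start_index, ...]} for each key present in tokens."""
    matches = {}
    for key in keys:
        key_tokens = key.split()              # "New York" -> ["New", "York"]
        n = len(key_tokens)
        for i in range(len(tokens) - n + 1):
            # compare whole tokens, never substrings
            if tokens[i:i + n] == key_tokens:
                matches.setdefault(key, []).append(i)
    return matches

tokens = ['Good', 'muffins', 'cost', '$', '3.88', 'in', 'New', 'York', '.',
          'Please', 'buy', 'me', 'two', 'of', 'them', '.', 'Thanks', '.']
print(find_keys(tokens, ['muffins', 'New York', 'lease']))
# 'lease' is absent from the result: exact token comparison, not substring matching
```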