I am using NLTK to POS-tag hundreds of tweets in a web request. As you know, Django instantiates a request handler for each request.
I noticed that within a single request (~200 tweets), the first tweet takes ~18 seconds to tag, while every subsequent tweet takes ~120 milliseconds. What can I do to speed this up?
Can I do a "pre-warming request" so that the module data is already loaded for each request?
import nltk

class MyRequestHandler(BaseHandler):
    def read(self, request):  # this runs for a GET request
        # ... in a loop over the tweets:
        tokens = nltk.word_tokenize(tweet)
        tagged = nltk.pos_tag(tokens)
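What I have in mind is loading the tagger model once at module import time, so the cost is paid when the worker process starts rather than on the first tweet of a request. Here is a minimal sketch of the idea, assuming NLTK 3.1+ (where nltk.pos_tag uses the averaged perceptron tagger) and a hypothetical tag_tweet helper:

import nltk
from nltk.tag import PerceptronTagger

# Load the pretrained model once, when this module is imported, so the
# ~18 s hit happens at worker start-up instead of inside a request.
# Assumes the 'averaged_perceptron_tagger' and 'punkt' data packages
# are already downloaded.
_TAGGER = PerceptronTagger()

def tag_tweet(tweet):
    # Tokenize and tag a single tweet, reusing the preloaded tagger.
    tokens = nltk.word_tokenize(tweet)
    return _TAGGER.tag(tokens)

read() would then call tag_tweet(tweet) in its loop. Would that keep the tagger warm across requests, or is there a better way?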