I have a TfidfVectorizer that vectorizes a collection of articles, followed by a feature selection step.
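from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(corpus)
selector = SelectKBest(chi2, k=5000)
X_train_sel = selector.fit_transform(X_train, y_train)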
Now I want to store the fitted vectorizer and selector and reuse them in other programs, without re-running TfidfVectorizer() and the feature selector on the training dataset. How do I do that? I know how to make a model persistent using joblib, but I wonder whether persisting these two transformers works the same way.
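To make the goal concrete, here is a minimal sketch of what I have in mind. It assumes the fitted vectorizer and selector can be dumped with joblib like any other estimator. The toy corpus, the labels, the small k, and the file name tfidf_selector.joblib are placeholders for illustration, not my real data.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical toy data standing in for the real articles and labels.
corpus = [
    "the cat sat on the mat",
    "dogs chase cats",
    "stocks fell sharply today",
    "markets rallied on earnings",
]
y_train = [0, 0, 1, 1]

# Fit both steps once on the training data.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(corpus)
selector = SelectKBest(chi2, k=5)  # small k only because the toy vocabulary is tiny
X_train_sel = selector.fit_transform(X_train, y_train)

# Persist the fitted objects together in one file (placeholder path).
path = os.path.join(tempfile.mkdtemp(), "tfidf_selector.joblib")
joblib.dump({"vectorizer": vectorizer, "selector": selector}, path)

# In another program: load and call transform only, never fit again.
loaded = joblib.load(path)
new_docs = ["cats and dogs", "earnings lifted stocks"]
X_new = loaded["vectorizer"].transform(new_docs)
X_new_sel = loaded["selector"].transform(X_new)
```

The point of the sketch is that the second program only calls transform on the loaded objects, so new documents end up in the same selected feature space as the training data.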