python - How to use k-fold cross-validation in scikit with a naive Bayes classifier and NLTK

Question

1 Reply

深蓝 · Answer 1 · 2021-10-23T18:43:51+0000

NLTK doesn't directly support cross-validation for machine learning algorithms, so your options are to either set this up yourself or use something like NLTK-Trainer.

I'd recommend just using another module to do this for you, but if you really want to write your own code, you could do something like the following.

Supposing you want 10-fold, you would partition your training set into 10 subsets, train on 9/10 of the data, test on the remaining 1/10, and repeat this 10 times so that each subset serves as the test set exactly once.

Assuming your training set is in a list named training, a simple way to accomplish this would be:

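num_folds = 10
# Integer division: in Python 3, / returns a float, which cannot be used in a slice.
# If len(training) is not a multiple of num_folds, the leftover items at the end
# are never tested on (they stay in every training split).
subset_size = len(training) // num_folds
for i in range(num_folds):
    start, end = i * subset_size, (i + 1) * subset_size
    testing_this_round = training[start:end]
    training_this_round = training[:start] + training[end:]
    # train using training_this_round
    # evaluate against testing_this_round
    # save accuracy

# find mean accuracy over all rounds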
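The loop above leaves training, evaluation, and averaging as comments. A minimal sketch of how those steps might be filled in is below. It assumes the data is a list of (features, label) pairs, as NLTK classifiers expect. The helper names (k_fold_indices, cross_validate) and the majority-label stand-in classifier are hypothetical, chosen only so the example runs without NLTK installed. Unlike the loop above, it spreads any remainder across the first folds so every item is tested exactly once.

```python
from collections import Counter
from statistics import mean


def k_fold_indices(n_items, num_folds):
    """Yield (start, end) bounds of each test fold; fold sizes differ by at most one."""
    base, extra = divmod(n_items, num_folds)
    start = 0
    for i in range(num_folds):
        # The first `extra` folds each take one additional item.
        end = start + base + (1 if i < extra else 0)
        yield start, end
        start = end


def cross_validate(data, num_folds, train_fn, accuracy_fn):
    """Return the per-fold accuracies for the (features, label) pairs in `data`."""
    accuracies = []
    for start, end in k_fold_indices(len(data), num_folds):
        test_fold = data[start:end]
        train_fold = data[:start] + data[end:]
        model = train_fn(train_fold)
        accuracies.append(accuracy_fn(model, test_fold))
    return accuracies


# Hypothetical stand-in classifier: always predicts the most common training label.
def train_majority(train_fold):
    return Counter(label for _, label in train_fold).most_common(1)[0][0]


def majority_accuracy(model, test_fold):
    return sum(label == model for _, label in test_fold) / len(test_fold)


# Toy data: 23 (features, label) pairs, so the folds cannot all be the same size.
training = [({"n": i}, "pos" if i % 3 else "neg") for i in range(23)]
scores = cross_validate(training, 5, train_majority, majority_accuracy)
mean_accuracy = mean(scores)
```

With NLTK installed, the same function should work by passing nltk.NaiveBayesClassifier.train as train_fn and nltk.classify.accuracy as accuracy_fn. If the list is sorted by label, it is probably worth shuffling it (for example with random.shuffle) before splitting, so that each fold sees a mix of classes.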
