The problem is that when you specify a training set -t train.arff
and a test set test.arff
, the mode of operation is to calculate the performance of the model based on the test set. But you can't calculate a performance of any kind without knowing the actual class. Without the actual class, how will you know if your prediction if right or wrong?
I used the data you gave as train.arff
and as test.arff
with arbitrary class labels assigned by me. The relevant output lines are:
=== Error on training data ===
Correctly Classified Instances 4 80 %
Incorrectly Classified Instances 1 20 %
Kappa statistic 0.6154
Mean absolute error 0.2429
Root mean squared error 0.4016
Relative absolute error 50.0043 %
Root relative squared error 81.8358 %
Total Number of Instances 5
=== Confusion Matrix ===
a b <-- classified as
2 1 | a = 1
0 2 | b = -1
and
=== Error on test data ===
Total Number of Instances 0
Ignored Class Unknown Instances 5
=== Confusion Matrix ===
a b <-- classified as
0 0 | a = 1
0 0 | b = -1
Weka can give you those statistics for the training set, because it knows the actual class labels and the predicted ones (applying the model on the training set). For the test set, it can't get any information about the performance, because it doesn't know about the true class labels.
What you might want to do is:
java -cp weka.jar weka.classifiers.bayes.NaiveBayes -t train.arff -T test.arff -p 1-4
which in my case would give you:
=== Predictions on test data ===
inst# actual predicted error prediction (feature1,feature2,feature3,feature4)
1 1:? 1:1 1 (1,7,1,0)
2 1:? 1:1 1 (1,5,1,0)
3 1:? 2:-1 0.786 (-1,1,1,0)
4 1:? 2:-1 0.861 (1,1,1,1)
5 1:? 2:-1 0.861 (-1,1,1,1)
So, you can get the predictions, but you can't get a performance, because you have unlabeled test data.