
machine learning - svm scaling input values

I am using libSVM. Say my feature values are in the following format:

                         instance1 : f11, f12, f13, f14
                         instance2 : f21, f22, f23, f24
                         instance3 : f31, f32, f33, f34
                         instance4 : f41, f42, f43, f44
                         ..............................
                         instanceN : fN1, fN2, fN3, fN4

I think there are two kinds of scaling that can be applied:

  1. Scale each instance vector so that it has zero mean and unit variance:

        ( (f11, f12, f13, f14) - mean((f11, f12, f13, f14)) ) ./ std((f11, f12, f13, f14))

  2. Scale each column of the above matrix to a range, for example [-1, 1] (a short sketch of both options follows this list).
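
For concreteness, here is a small NumPy sketch of what I mean by the two options (the matrix values and variable names are made up purely for illustration):

    import numpy as np

    # N x 4 feature matrix: rows are instances, columns are features
    X = np.array([[1.0, 200.0, 0.5, 30.0],
                  [2.0, 180.0, 0.7, 25.0],
                  [3.0, 220.0, 0.2, 40.0],
                  [4.0, 210.0, 0.9, 35.0]])

    # Option 1: scale each instance (row) to zero mean and unit variance
    X_rows = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

    # Option 2: scale each column (feature) linearly to the range [-1, 1]
    col_min, col_max = X.min(axis=0), X.max(axis=0)
    X_cols = 2.0 * (X - col_min) / (col_max - col_min) - 1.0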

According to my experiments with the RBF kernel in libSVM, the second scaling (2) improves the results by about 10%, and I do not understand why.

Could anybody explain the reason for applying scaling and why the second option gives improved results?


1 Reply


The standard thing to do is to make each dimension (attribute, or column in your example) have zero mean and unit variance.

This brings all dimensions of the SVM input to the same order of magnitude. From http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf:

The main advantage of scaling is to avoid attributes in greater numeric ranges dominating those in smaller numeric ranges. Another advantage is to avoid numerical difficulties during the calculation. Because kernel values usually depend on the inner products of feature vectors, e.g. the linear kernel and the polynomial kernel, large attribute values might cause numerical problems. We recommend linearly scaling each attribute to the range [-1, +1] or [0, 1].
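
If you are working in Python rather than with the svm-scale tool that ships with libSVM, a minimal sketch using scikit-learn's preprocessing module (the toy data and variable names here are only assumptions for illustration) could look like this:

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler, StandardScaler

    # Toy data: four features on very different numeric ranges (illustrative only)
    rng = np.random.default_rng(0)
    X_train = rng.random((100, 4)) * np.array([1.0, 1000.0, 0.01, 50.0])
    X_test = rng.random((20, 4)) * np.array([1.0, 1000.0, 0.01, 50.0])

    # Zero mean / unit variance per column (dimension)
    std = StandardScaler().fit(X_train)   # learn per-column mean and std on the training data
    X_train_std, X_test_std = std.transform(X_train), std.transform(X_test)

    # Or: linearly scale each column to [-1, +1], as the guide recommends
    mm = MinMaxScaler(feature_range=(-1, 1)).fit(X_train)
    X_train_mm, X_test_mm = mm.transform(X_train), mm.transform(X_test)

Whichever variant you choose, fit the scaler on the training data only and apply the same parameters to the test data, so that training and test features end up in the same range.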

