
Which normalization method, min-max or z-scaling (zero mean, unit variance), works best for deep learning?

I have data representing relative counts (0.0-1.0), as shown in the example below. Each value is calculated with the formula

cell value (e.g. 23) / column sum (e.g. 1200) = 0.01916

Example data

 f1       f2         f3        f5        f6      f7      f8     class  
0.266    0.133     0.200     0.133    0.066    0.133    0.066     1 
0.250    0.130     0.080     0.160    0.002    0.300    0.111     0 
0.000    0.830     0.180     0.016    0.002    0.059    0.080     1
0.300    0.430     0.078     0.100    0.082    0.150    0.170     0
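For clarity, here is a minimal sketch of that column-wise normalization with pandas; the raw counts below are made up purely for illustration:

import pandas as pd

# hypothetical raw counts; column names mirror the example above
raw = pd.DataFrame({"f1": [23, 30, 16],
                    "f2": [12, 15, 100],
                    "f3": [24, 10, 21]})

# divide every cell by its column sum to get relative counts in 0.0-1.0
relative = raw / raw.sum(axis=0)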

Before applying the deep learning algorithm, I remove features that show high correlation.

I am confused about which normalization method is correct before building the model:

  1. Use the data directly, because it is already scaled to 0.0-1.0.
  2. Perform min-max scaling (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html).
  3. Perform z-scaling / standardization (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html). Both scalers are sketched after this list.
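A minimal sketch of options 2 and 3 with scikit-learn; X here is assumed to be the feature matrix built from the columns f1-f8 above, and the scalers would normally be fit on the training split only:

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[0.266, 0.133, 0.200],
              [0.250, 0.130, 0.080],
              [0.000, 0.830, 0.180]])  # toy slice of the example data

# option 2: rescale each feature to [0, 1]
X_minmax = MinMaxScaler().fit_transform(X)

# option 3: z-scaling, i.e. zero mean and unit variance per feature
X_zscaled = StandardScaler().fit_transform(X)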

With classical supervised algorithms, both min-max scaling and z-scaling improve performance. But with deep learning using TensorFlow-GPU, I cannot see any significant difference between the two.

Thank you.


1 Reply


Z-scaling is a good idea when your data is approximately normally distributed, which is often the case.

Min-max scaling is the right choice when you expect a largely uniform distribution.

In short, it depends on your data and your neural network.

Both methods are sensitive to outliers, though; in that case you could try median-MAD (median absolute deviation) scaling.
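As a rough illustration, a median-MAD scaler can be written in a few lines of NumPy (a sketch, not a tuned implementation; scikit-learn's RobustScaler is related but uses the interquartile range by default):

import numpy as np

def median_mad_scale(X, eps=1e-9):
    # centre each column on its median and divide by the median
    # absolute deviation (MAD), which is far less sensitive to outliers
    median = np.median(X, axis=0)
    mad = np.median(np.abs(X - median), axis=0)
    return (X - median) / (mad + eps)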

See also: https://stats.stackexchange.com/questions/7757/data-normalization-and-standardization-in-neural-networks

