I have data representing relative counts (0.0-1.0), as shown in the example below. Each value is calculated with the formula (a short pandas sketch of this calculation follows the example data):

cell value (e.g. 23) / sum of the column (e.g. 1200) ≈ 0.01916
Example data:

| f1    | f2    | f3    | f5    | f6    | f7    | f8    | class |
|-------|-------|-------|-------|-------|-------|-------|-------|
| 0.266 | 0.133 | 0.200 | 0.133 | 0.066 | 0.133 | 0.066 | 1     |
| 0.250 | 0.130 | 0.080 | 0.160 | 0.002 | 0.300 | 0.111 | 0     |
| 0.000 | 0.830 | 0.180 | 0.016 | 0.002 | 0.059 | 0.080 | 1     |
| 0.300 | 0.430 | 0.078 | 0.100 | 0.082 | 0.150 | 0.170 | 0     |
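Here is roughly how that per-column calculation looks in pandas (a sketch only; the raw counts below are made up for illustration):

```python
import pandas as pd

# Made-up raw counts; in the real data each column is divided by its own sum.
counts = pd.DataFrame({
    "f1": [23, 577, 600],
    "f2": [16, 150, 34],
    "f3": [24, 10, 22],
})

# cell value / column sum, e.g. 23 / 1200 ≈ 0.0192 for the first cell of f1.
relative = counts / counts.sum(axis=0)
print(relative.round(5))
```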
Before applying a deep learning algorithm, I remove features that show a high correlation with each other.
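Something along these lines (a sketch only; the 0.9 threshold is a placeholder, and the data is just the example rows above):

```python
import numpy as np
import pandas as pd

# The example rows from above (f4 is absent in the original listing).
df = pd.DataFrame(
    [[0.266, 0.133, 0.200, 0.133, 0.066, 0.133, 0.066, 1],
     [0.250, 0.130, 0.080, 0.160, 0.002, 0.300, 0.111, 0],
     [0.000, 0.830, 0.180, 0.016, 0.002, 0.059, 0.080, 1],
     [0.300, 0.430, 0.078, 0.100, 0.082, 0.150, 0.170, 0]],
    columns=["f1", "f2", "f3", "f5", "f6", "f7", "f8", "class"],
)

def drop_highly_correlated(features: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop one feature from each pair whose absolute correlation exceeds the threshold."""
    corr = features.corr().abs()
    # Keep only the upper triangle so every pair is checked exactly once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return features.drop(columns=to_drop)

reduced = drop_highly_correlated(df.drop(columns=["class"]), threshold=0.9)
```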
I am confused about the normalization step: which of the following is the correct approach before building the model? (A short sketch of options 2 and 3 follows the list.)
- Use the data directly, because it is already scaled to 0.0-1.0.
- Apply min-max scaling with MinMaxScaler (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html).
- Apply standardization (z-scaling) with StandardScaler (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html).
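For clarity, applying options 2 and 3 looks like this (a sketch; the random arrays are placeholders for my real features, and the scalers are fit on the training split only):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Placeholder data in the same 0.0-1.0 range as the relative counts above.
rng = np.random.default_rng(0)
X = rng.random((100, 7))
y = rng.integers(0, 2, size=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Option 2: min-max scaling, fit on the training data only.
minmax = MinMaxScaler()
X_train_mm = minmax.fit_transform(X_train)
X_test_mm = minmax.transform(X_test)

# Option 3: standardization (zero mean, unit variance per feature).
standard = StandardScaler()
X_train_std = standard.fit_transform(X_train)
X_test_std = standard.transform(X_test)
```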
When I use classical supervised algorithms, both min-max scaling and z-scaling improve performance. With a deep learning model (TensorFlow-GPU), however, I am not able to see any significant difference between the two.
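This is roughly how the comparison looks on the classical side (a sketch only; logistic regression and the random placeholder data stand in for my actual classifier and dataset):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Placeholder data again; in practice these are the relative-count features.
rng = np.random.default_rng(0)
X = rng.random((100, 7))
y = rng.integers(0, 2, size=100)

# Cross-validated comparison of the two scalers with a classical model.
for scaler in (MinMaxScaler(), StandardScaler()):
    pipeline = make_pipeline(scaler, LogisticRegression(max_iter=1000))
    scores = cross_val_score(pipeline, X, y, cv=5)
    print(type(scaler).__name__, round(scores.mean(), 3))
```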
Thank you.