I have a denoising autoencoder model, and I'm running into inconsistencies. The goal of this project is to "clean" 1-D vectors of 2000 features each: the predicted vectors should match the target vectors as measured by cosine similarity. I'm using a U-Net built from Conv1D layers. I have ~1.3 million samples for training/testing, and my network has ~900k parameters. Batch size is 256 and the learning rate is 0.0001.
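For context, a minimal sketch of a Conv1D U-Net of the rough shape described (the layer widths and kernel sizes here are assumptions for illustration, not my exact network, so the parameter count won't match ~900k):

```python
import tensorflow as tf

# Toy 1-D U-Net denoiser: two downsampling stages, a bottleneck,
# and two upsampling stages with skip connections.
inp = tf.keras.Input(shape=(2000, 1))
e1 = tf.keras.layers.Conv1D(16, 9, padding="same", activation="relu")(inp)
p1 = tf.keras.layers.MaxPooling1D(2)(e1)                       # length 1000
e2 = tf.keras.layers.Conv1D(32, 9, padding="same", activation="relu")(p1)
p2 = tf.keras.layers.MaxPooling1D(2)(e2)                       # length 500
b = tf.keras.layers.Conv1D(64, 9, padding="same", activation="relu")(p2)
u2 = tf.keras.layers.UpSampling1D(2)(b)                        # length 1000
c2 = tf.keras.layers.Concatenate()([u2, e2])                   # skip from e2
d2 = tf.keras.layers.Conv1D(32, 9, padding="same", activation="relu")(c2)
u1 = tf.keras.layers.UpSampling1D(2)(d2)                       # length 2000
c1 = tf.keras.layers.Concatenate()([u1, e1])                   # skip from e1
d1 = tf.keras.layers.Conv1D(16, 9, padding="same", activation="relu")(c1)
out = tf.keras.layers.Conv1D(1, 1, padding="same")(d1)         # (2000, 1)
model = tf.keras.Model(inp, out)
```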
adam = tf.keras.optimizers.Adam(learning_rate=0.0001)
cosine_loss = tf.keras.losses.CosineSimilarity(axis=1)
metric = tf.keras.metrics.CosineSimilarity(name='cosine_similarity', axis=1)
model.compile(optimizer=adam, loss=cosine_loss, metrics=[metric])
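One thing worth ruling out is a mismatch between what the Keras loss computes and what the post-hoc analysis computes. A small sketch (the shapes are assumptions matching a (batch, 2000) layout) confirming that `CosineSimilarity(axis=1)` is the batch-averaged per-sample cosine similarity, negated so it can be minimized:

```python
import numpy as np
import tensorflow as tf

# With inputs shaped (batch, 2000), axis=1 is the feature axis,
# so the loss reduces each 2000-dim vector pair to one cosine score.
rng = np.random.default_rng(0)
y_true = rng.normal(size=(4, 2000)).astype("float32")
y_pred = rng.normal(size=(4, 2000)).astype("float32")

keras_loss = float(tf.keras.losses.CosineSimilarity(axis=1)(y_true, y_pred))

# Manual per-sample cosine similarity, averaged over the batch.
num = np.sum(y_true * y_pred, axis=1)
den = np.linalg.norm(y_true, axis=1) * np.linalg.norm(y_pred, axis=1)
manual = float(np.mean(num / den))

# Keras negates the similarity: loss of -1 means perfectly aligned.
print(keras_loss, -manual)
```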
The loss drops from roughly -0.969 to -0.9712 over the course of training (similar for both training and validation).
However, when I compute the average cosine similarity between the input, target, and predicted vectors on data the model has already seen, the results are as follows:
Input vs. Target (original data) = 0.61; Target vs. Predicted = 0.62;
Input vs. Predicted = 0.98
Given how low the loss gets, this seems wrong: shouldn't the cosine similarity between the target and predicted vectors be much higher? Why do the predictions look more like the input data, and what can I do to diagnose the problem further?
The distributions of all the cosine similarity scores tell a similar story.
So far I've verified that all the data is preprocessed and normalized the same way, and I've looked at a PCA of all three sets (input, target, predicted) to check whether anything unusual is happening.
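For reference, this is the kind of routine I'd use to recompute all three averages with a single shared axis and normalization, so the post-hoc numbers can't diverge from each other (the arrays here are random placeholders standing in for my (n_samples, 2000) data; the `preds = inputs + noise` line mimics a model that has learned to copy its input):

```python
import numpy as np

def mean_cosine(a, b, eps=1e-12):
    """Average per-sample cosine similarity between rows of a and b."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(num / np.maximum(den, eps)))

rng = np.random.default_rng(0)
inputs = rng.normal(size=(8, 2000))
targets = rng.normal(size=(8, 2000))
preds = inputs + 0.01 * rng.normal(size=(8, 2000))  # identity-like model

print(mean_cosine(inputs, targets))  # near 0 for unrelated random vectors
print(mean_cosine(targets, preds))   # near 0 as well
print(mean_cosine(inputs, preds))    # near 1: predictions track the input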
question from:
https://stackoverflow.com/questions/65929233/loss-function-inconsistent-with-post-hoc-analysis