I am trying to train an autoencoder in PyTorch.
When I build the dataset for the DataLoader like this, the autoencoder trains without problems:
total_set = Dataset(df.iloc[:15000,1:], df.iloc[:15000,0])
However, when I shuffle or sample the DataFrame before loading it, the autoencoder's training loss is always NaN:
df = df.sample(frac = 1).reset_index(drop=True)
df = df[(df != 0).all(1)].dropna().reset_index(drop=True)
total_set = Dataset(df.iloc[:15000,1:], df.iloc[:15000,0])
I made sure to remove every row containing a 0 or NaN value from the training set. Am I missing something here?
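One way to double-check that the filtered data is really clean is a small diagnostic like the sketch below. It is only an illustration under assumptions: `summarize_features` is a hypothetical helper, and the toy frame stands in for the real df (column 0 as the label, the remaining columns as float features). It counts NaN and infinite values and reports the largest finite magnitude, since dropna() does not remove inf by default and very large inputs can also make a loss overflow.

```python
import numpy as np
import pandas as pd


def summarize_features(features: pd.DataFrame) -> dict:
    """Count values that commonly lead to a NaN loss (hypothetical helper)."""
    values = features.to_numpy(dtype=np.float64)
    finite = np.isfinite(values)
    return {
        "nan": int(np.isnan(values).sum()),
        "inf": int(np.isinf(values).sum()),
        # Largest finite magnitude; huge values can overflow during training.
        "max_abs": float(np.abs(values[finite]).max()) if finite.any() else float("nan"),
    }


# Toy frame standing in for df: column 0 is a label, the rest are features.
toy = pd.DataFrame({0: ["a", "b", "c"], 1: [1.0, np.inf, 2.0], 2: [3.0, 4.0, 1e30]})

# Same filtering as in the question: drop rows containing a 0 or a NaN.
filtered = toy[(toy != 0).all(axis=1)].dropna().reset_index(drop=True)

report = summarize_features(filtered.iloc[:, 1:])
print(report)
```

Running the same helper on df.iloc[:15000, 1:] before and after shuffling would show whether the shuffled slice contains values the unshuffled first 15000 rows do not.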
df.info() before and after shuffling:
before:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 513408 entries, 0 to 513407
Columns: 673 entries, 0 to 672
dtypes: float64(672), object(1)
memory usage: 2.6+ GB
None
after:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 513408 entries, 0 to 513407
Columns: 673 entries, 0 to 672
dtypes: float64(672), object(1)
memory usage: 2.6+ GB
None
question from:
https://stackoverflow.com/questions/65898821/loading-pandas-df-to-pytorch-dataset-loader-results-in-nan-training-loss-if-df