I am trying to convert a pandas DataFrame to TFRecords.
I use the following code to convert my pandas DataFrames to TF datasets. train_x, val_x, and test_x are my training, validation, and test data (all pandas DataFrames), and train_y, val_y, and test_y are the labels. They are all time series data.
import tensorflow as tf

shift_window_size = 125
window_size = 250
# sliding window
train_ds = tf.data.Dataset.from_tensor_slices((train_x, train_y)).window(size=window_size, shift=shift_window_size, drop_remainder=True)
val_ds = tf.data.Dataset.from_tensor_slices((val_x, val_y)).window(size=window_size, shift=shift_window_size, drop_remainder=True)
test_ds = tf.data.Dataset.from_tensor_slices((test_x, test_y)).window(size=window_size, shift=shift_window_size, drop_remainder=True)
# use flat_map to match feature and label
train_ds = train_ds.flat_map(lambda feature, label: tf.data.Dataset.zip((feature, label))).batch(window_size, drop_remainder=True)
val_ds = val_ds.flat_map(lambda feature, label: tf.data.Dataset.zip((feature, label))).batch(window_size, drop_remainder=True)
test_ds = test_ds.flat_map(lambda feature, label: tf.data.Dataset.zip((feature, label))).batch(window_size, drop_remainder=True)
Then, when I run print(train_ds), I get <BatchDataset shapes: ((250, 6), (250, 1)), types: (tf.float64, tf.int64)>.
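As a sanity check on the windowing, here is a plain-Python sketch of which rows each window should cover with size 250 and shift 125. The helper name window_bounds and the row count of 1000 are made up for illustration; only full windows are kept, which is meant to mirror drop_remainder=True.

```python
def window_bounds(n_rows, size=250, shift=125):
    """Return (start, stop) row ranges, keeping full windows only.

    Hypothetical helper, intended to mirror window(..., drop_remainder=True).
    """
    # A window starting at `start` is full only if start + size <= n_rows.
    return [(start, start + size) for start in range(0, n_rows - size + 1, shift)]


bounds = window_bounds(1000)  # 1000 rows is an assumed example length
print(len(bounds))
print(bounds[:3])
```

Because consecutive windows overlap by half, most rows appear in two windows, so the number of batches is larger than n_rows / 250.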
Then I try to convert my values to bytes using:
def _bytes_feature(value):
    """Returns a bytes_list from a string / byte."""
    if isinstance(value, type(tf.constant(0))):
        value = value.numpy()  # BytesList won't unpack a string from an EagerTensor.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
When I run print(_bytes_feature(train_ds)), I get a type error:
TypeError: Failed to convert object of type <class 'tensorflow.python.data.ops.dataset_ops._BatchDataset'> to Tensor. Contents: <BatchDataset shapes: ((250, 6), (250, 1)), types: (tf.float64, tf.int64)>. Consider casting elements to a supported type.
it shows:
<_VariantDataset shapes: ((250, 6), (250, 1)), types: (tf.float64, tf.int64)>
I also tried value = tf.io.serialize_tensor(value) to convert my dataset to a tensor, but that raises an error as well:
TypeError: Failed to convert object of type <class 'tensorflow.python.data.ops.dataset_ops._VariantDataset'> to Tensor. Contents: <_VariantDataset shapes: ((250, 6), (250, 1)), types: (tf.float64, tf.int64)>. Consider casting elements to a supported type.
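One possible reading of both errors is that the helper expects a single tensor or byte string rather than a whole dataset object. Below is a minimal sketch of that idea, serializing one (feature, label) pair at a time. The zero-filled arrays, the names dummy_x, dummy_y, and dummy_ds, and the keys "feature" and "label" are placeholders for illustration, not the real data, and this may not be the intended way to build the records.

```python
import numpy as np
import tensorflow as tf


def _bytes_feature(value):
    """Returns a bytes_list from a string / byte."""
    if isinstance(value, type(tf.constant(0))):
        value = value.numpy()  # BytesList won't unpack a string from an EagerTensor.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


# Placeholder data with the same shapes and dtypes as one batch of train_ds.
dummy_x = np.zeros((250, 6), dtype=np.float64)
dummy_y = np.zeros((250, 1), dtype=np.int64)
dummy_ds = tf.data.Dataset.from_tensors((dummy_x, dummy_y))

serialized_examples = []
for feature, label in dummy_ds:  # each element is a pair of tensors, not a dataset
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                # Hypothetical keys; serialize_tensor yields a scalar string tensor.
                "feature": _bytes_feature(tf.io.serialize_tensor(feature)),
                "label": _bytes_feature(tf.io.serialize_tensor(label)),
            }
        )
    )
    serialized_examples.append(example.SerializeToString())

print(len(serialized_examples))
```

Each serialized string could then be written out with tf.io.TFRecordWriter, and read back with tf.io.parse_tensor using the matching dtype.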
I have no idea which part went wrong, so I am asking for help here.
Thanks in advance.