I extracted training and testing data from GEE to TFRecord using the code below (after using sampleRegions):
Export.table.toCloudStorage({
  collection: trainingPartition,
  description: 'Training_Export',
  fileNamePrefix: trainFilePrefix,
  bucket: outputBucket,
  fileFormat: 'TFRecord'
});
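For context, the dataset referenced below was created along these lines — a minimal sketch using a local stand-in file, since the actual gs:// path depends on my bucket and prefix (GEE table exports are gzip-compressed by default, hence the GZIP option):

```python
import tensorflow as tf

# Write one record locally to stand in for the exported ".tfrecord.gz" file;
# the file name here is a placeholder, not the actual export prefix.
path = 'training_demo.tfrecord.gz'
options = tf.io.TFRecordOptions(compression_type='GZIP')
with tf.io.TFRecordWriter(path, options) as writer:
    writer.write(b'stand-in record')

# Opening the real export looks the same, just with a gs:// path.
trainDataset = tf.data.TFRecordDataset(path, compression_type='GZIP')
print(next(iter(trainDataset)).numpy())
```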
Later, I created a dataset from the TFRecord file on Colab and tried to parse the data:
size = 128
feature_columns = [
    tf.io.FixedLenFeature(shape=[size, size], dtype=tf.float32) for k in featureNames
]
features_dict = dict(zip(featureNames, feature_columns))
def parse_tfrecord(example_proto):
    parsed_features = tf.io.parse_single_example(example_proto, features_dict)
    labels = parsed_features.pop(label)
    return parsed_features, tf.cast(labels, tf.int32)
# Map the function over the dataset.
parsedDataset = trainDataset.map(parse_tfrecord, num_parallel_calls=4)
from pprint import pprint

# Print the first parsed record to check.
pprint(next(iter(parsedDataset)))
The error I'm getting:
InvalidArgumentError: Key: 3_B12_min. Can't parse serialized Example.
[[{{node ParseSingleExample/ParseExample/ParseExampleV2}}]]
This only happens when I set size to any value larger than 1.
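The behavior can be reproduced without GEE at all. A minimal sketch, assuming each exported record holds a single float per band (sampleRegions samples one value per point, so the band names and value below are stand-ins): parsing with shape [1] succeeds, while any larger fixed shape fails with the same message.

```python
import tensorflow as tf

# Build a serialized Example holding ONE float for a band, like a
# sampleRegions table row would ('B12_min' and 0.5 are hypothetical).
example = tf.train.Example(features=tf.train.Features(feature={
    'B12_min': tf.train.Feature(float_list=tf.train.FloatList(value=[0.5])),
}))
serialized = example.SerializeToString()

# Shape [1] matches the single stored value and parses fine.
ok = tf.io.parse_single_example(
    serialized, {'B12_min': tf.io.FixedLenFeature([1], tf.float32)})
print(ok['B12_min'].numpy())

# Shape [2, 2] expects 4 values from a record that holds 1, so the
# parse fails with InvalidArgumentError: "Can't parse serialized Example."
try:
    tf.io.parse_single_example(
        serialized, {'B12_min': tf.io.FixedLenFeature([2, 2], tf.float32)})
except tf.errors.InvalidArgumentError as e:
    print(type(e).__name__)
```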
question from:
https://stackoverflow.com/questions/66060130/cant-parse-serialized-example-tfrecord