Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
539 views
in Technique[技术] by (71.8m points)

tensorflow - What are the advantages of using tf.train.SequenceExample over tf.train.Example for variable length features?

Recently I read this guide on undocumented featuers in TensorFlow, as I needed to pass variable length sequences as input. However, I found the protocol for tf.train.SequenceExample relatively confusing (especially due to lack of documentation), and managed to build an input pipe using tf.train.Example just fine instead.

Are there any advantages to using tf.train.SequenceExample? Using the standard example protocol when there is a dedicated one for variable length sequences seems like a cheat, but does it bear any consequence?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Here are the definitions of the Example and SequenceExample protocol buffers, and all the protos they may contain:

message BytesList { repeated bytes value = 1; }
message FloatList { repeated float value = 1 [packed = true]; }
message Int64List { repeated int64 value = 1 [packed = true]; }
message Feature {
    oneof kind {
        BytesList bytes_list = 1;
        FloatList float_list = 2;
        Int64List int64_list = 3;
    }
};
message Features { map<string, Feature> feature = 1; };
message Example { Features features = 1; };

message FeatureList { repeated Feature feature = 1; };
message FeatureLists { map<string, FeatureList> feature_list = 1; };
message SequenceExample {
  Features context = 1;
  FeatureLists feature_lists = 2;
};

An Example contains a Features, which contains a mapping from feature name to Feature, which contains either a bytes list, or a float list or an int64 list.

A SequenceExample also contains a Features, but it also contains a FeatureLists, which contains a mapping from list name to FeatureList, which contains a list of Feature. So it can do everything an Example can do, and more. But do you really need that extra functionality? What does it do?

Since each Feature contains a list of values, a FeatureList is a list of lists. And that's the key: if you need lists of lists of values, then you need SequenceExample.

For example, if you handle text, you can represent it as one big string:

from tensorflow.train import BytesList

BytesList(value=[b"This is the first sentence. And here's another."])

Or you could represent it as a list of words and tokens:

BytesList(value=[b"This", b"is", b"the", b"first", b"sentence", b".", b"And", b"here",
                 b"'s", b"another", b"."])

Or you could represent each sentence separately. That's where you would need a list of lists:

from tensorflow.train import BytesList, Feature, FeatureList

s1 = BytesList(value=[b"This", b"is", b"the", b"first", b"sentence", b"."])
s2 = BytesList(value=[b"And", b"here", b"'s", b"another", b"."])
fl = FeatureList(feature=[Feature(bytes_list=s1), Feature(bytes_list=s2)])

Then create the SequenceExample:

from tensorflow.train import SequenceExample, FeatureLists

seq = SequenceExample(feature_lists=FeatureLists(feature_list={
    "sentences": fl
}))

And you can serialize it and perhaps save it to a TFRecord file.

data = seq.SerializeToString()

Later, when you read the data, you can parse it using tf.io.parse_single_sequence_example().


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...