Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

python - Keras misinterprets training data shape

Each sample in my training data has the shape (?, 15), where ? is a variable length.

When creating my model I specify this:

inp = Input(shape=(None,15))
conv = Conv1D(32,3,padding='same',activation='relu')(inp)
...

My training data has the shape (35730,?,15).

Checking this in Python, I get:

X.shape

Outputs: (35730,)

X[0].shape

Outputs: (513, 15)

When I try to fit my model on my training data I get the ValueError:

Error when checking input: expected input_1 to have 3 dimensions, but got array with shape (35730, 1)

I can only train my model by calling model.train_on_batch() on a single sample.

How can I solve this? It seems like Keras thinks the shape of my input data is (35730, 1) when it actually is (35730, ?, 15).

Is this a bug in Keras, or did I do something wrong?

I am using the TensorFlow backend, if that matters. This is Keras 2.


1 Reply


(Edited, according to OP's comment on this question, where they posted this link: https://github.com/fchollet/keras/issues/1920)


Your X is not a single numpy array; it's an array of arrays. (Otherwise its shape would be X.shape = (35730, 513, 15).)
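
To make the difference concrete, here is a minimal sketch using only numpy. The sample lengths (513 and 200) and the variable names are made up for illustration; only the 15 features per timestep come from the question.

```python
import numpy as np

# Three samples with different numbers of timesteps, 15 features each.
samples = [np.zeros((513, 15)), np.zeros((200, 15)), np.zeros((513, 15))]

# Build a 1-D object array explicitly: each element is itself a 2-D array.
ragged = np.empty(len(samples), dtype=object)
for i, s in enumerate(samples):
    ragged[i] = s

print(ragged.shape)     # (3,)
print(ragged[0].shape)  # (513, 15)

# Samples of equal length can be stacked into one real 3-D array.
same_length = np.stack([samples[0], samples[2]])
print(same_length.shape)  # (2, 513, 15)
```

The first array is what the question's X looks like: numpy only sees one dimension, so Keras cannot find the timestep and feature axes.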

The fit method needs a single numpy array. Because your samples have variable length, you cannot put all of your data into one array; you have to divide it into smaller arrays, each containing samples of the same length.

For that, you could build a dictionary keyed by shape and loop over it manually (there may be better ways to do this):

#code in python 3.5
xByShapes = {}
yByShapes = {}
for itemX,itemY in zip(X,Y):
    if itemX.shape in xByShapes:
        xByShapes[itemX.shape].append(itemX)
        yByShapes[itemX.shape].append(itemY)
    else:
        xByShapes[itemX.shape] = [itemX] #initially a list, because we're going to append items
        yByShapes[itemX.shape] = [itemY]

Finally, loop over this dictionary for training:

import numpy as np

for shape in xByShapes:
    model.fit(
        np.asarray(xByShapes[shape]),  # one array of shape (samples, length, 15)
        np.asarray(yByShapes[shape]),
        # ... remaining fit arguments
    )
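
Since the question mentions that model.train_on_batch() already works, the same grouping could also be sketched as a generator that yields one stacked batch per shape. This is only an illustration: the helper name batches_by_shape, the toy data, and the scalar labels are assumptions, and the commented-out call assumes a compiled Keras model named model.

```python
from collections import defaultdict

import numpy as np


def batches_by_shape(X, Y):
    """Yield one (x_batch, y_batch) pair per distinct sample shape."""
    groups = defaultdict(lambda: ([], []))
    for item_x, item_y in zip(X, Y):
        xs, ys = groups[item_x.shape]
        xs.append(item_x)
        ys.append(item_y)
    for xs, ys in groups.values():
        # np.stack works here because every item in a group has the same shape.
        yield np.stack(xs), np.asarray(ys)


# Toy data: lengths 4, 7 and 4 with 15 features; labels are placeholders.
X = [np.zeros((4, 15)), np.ones((7, 15)), np.zeros((4, 15))]
Y = [0, 1, 0]

for x_batch, y_batch in batches_by_shape(X, Y):
    print(x_batch.shape, y_batch.shape)
    # model.train_on_batch(x_batch, y_batch)  # assumes a compiled Keras model
```

With many distinct lengths, some groups will hold only a few samples, so batch sizes can vary a lot between groups.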

Masking

Alternatively, you can pad your data with zeros or some other dummy value so that all samples have the same length.

Then, as the first layer of your model, you can add a Masking layer that will ignore these padded segments. (Warning: some types of layers don't support masking.)
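
The padding step might look like the sketch below, written with numpy only. The helper name pad_to_longest and the toy lengths are assumptions for illustration. Keras also ships a pad_sequences helper in keras.preprocessing.sequence that may serve the same purpose.

```python
import numpy as np


def pad_to_longest(samples, value=0.0):
    """Pad each (length, features) array along the time axis to the longest length."""
    max_len = max(s.shape[0] for s in samples)
    n_features = samples[0].shape[1]
    padded = np.full((len(samples), max_len, n_features), value, dtype="float32")
    for i, s in enumerate(samples):
        padded[i, : s.shape[0], :] = s  # real timesteps first, padding after
    return padded


# Toy data: two samples of length 5 and 3, with 15 features each.
samples = [np.ones((5, 15)), np.ones((3, 15))]
X_padded = pad_to_longest(samples)
print(X_padded.shape)  # (2, 5, 15)
```

A Masking(mask_value=0.0) layer placed right after the input would then skip every timestep whose features all equal the pad value. Note that a genuine all-zero timestep in the data would be skipped as well, so choose a pad value that cannot occur in real samples. As the warning above says, check that the later layers in your model accept a mask.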

