The functional API would do it for you. Essentially you give every layer a unique handle, then link it back to the previous layer by calling it with that handle in brackets at the end:
layer_handle = Layer(params)(prev_layer_handle)
Note that the first layer must be an Input(shape=(x, y)) with no prior connection.
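For example, a minimal single-input chain using this pattern (the layer sizes here are just illustrative):
from keras.layers import Input, Dense
from keras.models import Model

inp = Input(shape=(784,))                      # first layer, no prior connection
hidden = Dense(64, activation='relu')(inp)     # linked back to inp via the trailing brackets
out = Dense(10, activation='softmax')(hidden)
model = Model(inputs=inp, outputs=out)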
Then, when you build your model, you need to tell it to expect multiple inputs by passing a list:
model = Model(inputs=[in_layer1, in_layer2, ..], outputs=[out_layer1, out_layer2, ..])
Finally, when you train it, you need to provide lists of input and output data that correspond with your definition:
model.fit([x_train1, x_train2, ..], [y_train1, y_train2, ..])
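As an aside, if relying on list order feels fragile, Keras also accepts dicts keyed by layer names, assuming you give each Input a name= when you create it (the names below are hypothetical):
in1 = Input(shape=(28, 28, 1), name='image_in')  # hypothetical layer names
in2 = Input(shape=(784,), name='flat_in')
# ...
model.fit({'image_in': x_train1, 'flat_in': x_train2}, y_train)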
Everything else works the same, so you just need to combine the above to build the network layout you want:
from keras.models import Model
from keras.layers import Input, Conv2D, Flatten, Dense, Concatenate

# Note: Keras 2.0.2, channels-last dimension ordering

# Branch 1: small convnet over 28x28 grayscale images
in1 = Input(shape=(28, 28, 1))
model_one_conv_1 = Conv2D(32, (3, 3), activation='relu')(in1)
model_one_flat_1 = Flatten()(model_one_conv_1)
model_one_dense_1 = Dense(128, activation='relu')(model_one_flat_1)

# Branch 2: fully connected stack over flat 784-dimensional vectors
in2 = Input(shape=(784,))
model_two_dense_1 = Dense(128, activation='relu')(in2)
model_two_dense_2 = Dense(128, activation='relu')(model_two_dense_1)

# Merge: concatenate the two branches, then classify
model_final_concat = Concatenate(axis=-1)([model_one_dense_1, model_two_dense_2])
model_final_dense_1 = Dense(10, activation='softmax')(model_final_concat)

model = Model(inputs=[in1, in2], outputs=model_final_dense_1)
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit([X_train_one, X_train_two], Y_train,
          batch_size=32, epochs=10, verbose=1)
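In case it helps, here is one way the training arrays above could be prepared, assuming MNIST-style data fed to both branches (the dataset choice and preprocessing are my assumption, not part of the original answer):
from keras.datasets import mnist
from keras.utils import to_categorical

(x_train, y_train), _ = mnist.load_data()
X_train_one = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0  # (n, 28, 28, 1) for the conv branch
X_train_two = x_train.reshape(-1, 784).astype('float32') / 255.0        # (n, 784) for the dense branch
Y_train = to_categorical(y_train, 10)                                   # one-hot labels for the softmax output

model.summary()  # quick sanity check that both branches wire into the Concatenate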
Documentation can be found in the Functional Model API. I'd also recommend reading other questions and checking out the Keras repo, since the documentation currently doesn't have many examples.