I am trying to merge two networks. I can accomplish this by doing the following:
merged = Merge([CNN_Model, RNN_Model], mode='concat')
But I get a warning:
merged = Merge([CNN_Model, RNN_Model], mode='concat')
__main__:1: UserWarning: The `Merge` layer is deprecated and will be removed after 08/2017. Use instead layers from `keras.layers.merge`, e.g. `add`, `concatenate`, etc.
So I tried this:
merged = Concatenate([CNN_Model, RNN_Model])
model = Sequential()
model.add(merged)
and got this error:
ValueError: The first layer in a Sequential model must get an `input_shape` or `batch_input_shape` argument.
Can anyone give me the syntax for how I would get this to work?
Don't use Sequential models for models with branches.
Use the Functional API:
from keras.models import Model
You're right in using the Concatenate layer, but you must pass tensors to it. First you create the layer, then you call it with the input tensors (that's why there are two pairs of parentheses):
concatOut = Concatenate()([CNN_Model.output, RNN_Model.output])
For creating a model out of that, you need to define the path from inputs to outputs:
model = Model([CNN_Model.input, RNN_Model.input], concatOut)
This answer assumes your existing models have only one input and output each.
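To see what that Concatenate layer actually computes, here is a small numpy sketch (the shapes are made up for illustration): each branch emits a 2-D (batch, features) tensor, and concatenation along the last axis stacks the feature dimensions while preserving the batch dimension.

```python
import numpy as np

# Stand-ins for CNN_Model.output and RNN_Model.output on a batch of 4 samples
cnn_out = np.random.rand(4, 10)  # (batch, cnn_features)
rnn_out = np.random.rand(4, 6)   # (batch, rnn_features)

# Concatenate() joins along the last axis by default, so the
# feature dimensions add up while the batch dimension is preserved
merged = np.concatenate([cnn_out, rnn_out], axis=-1)

print(merged.shape)  # (4, 16)
```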
Mathematically, the difference is this:
An embedding layer performs a select operation. In Keras, this layer is equivalent to:
K.gather(self.embeddings, inputs)  # just one matrix
A dense layer performs a dot-product operation, plus an optional activation:
outputs = matmul(inputs, self.kernel)   # a kernel matrix
outputs = bias_add(outputs, self.bias)  # a bias vector
return self.activation(outputs)         # an activation function
You can emulate an embedding layer with a fully-connected layer via one-hot encoding, but the whole point of dense embeddings is to avoid the one-hot representation. In NLP, the word vocabulary size can be on the order of 100k (sometimes even a million). On top of that, it's often necessary to process sequences of words in a batch. Processing a batch of sequences of word indices is much more efficient than processing a batch of sequences of one-hot vectors. In addition, the gather operation itself is faster than a matrix dot-product, both in the forward and the backward pass.
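A quick numpy check of that equivalence (the embedding matrix and indices are made-up examples): selecting rows from the embedding matrix gives the same result as a dot-product with one-hot vectors.

```python
import numpy as np

vocab_size, embed_dim = 10, 4
embeddings = np.random.rand(vocab_size, embed_dim)  # the embedding "kernel"
inputs = np.array([3, 7, 0])                        # a batch of word indices

# Embedding layer: a select (gather) operation
gathered = embeddings[inputs]                       # like K.gather(embeddings, inputs)

# Dense-layer emulation: one-hot encode, then a dot-product
one_hot = np.eye(vocab_size)[inputs]                # (3, 10)
dotted = one_hot @ embeddings                       # (3, 4)

print(np.allclose(gathered, dotted))  # True
```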
You can use the functional API Model and separate the input into four distinct groups:
from keras.models import Model
from keras.layers import Dense, Input, Concatenate, Lambda

inputTensor = Input((8,))
First, we can use Lambda layers to split this input in four:
group1 = Lambda(lambda x: x[:,:2], output_shape=((2,)))(inputTensor)
group2 = Lambda(lambda x: x[:,2:4], output_shape=((2,)))(inputTensor)
group3 = Lambda(lambda x: x[:,4:6], output_shape=((2,)))(inputTensor)
group4 = Lambda(lambda x: x[:,6:], output_shape=((2,)))(inputTensor)
Now we follow the network:
#second layer in your image
group1 = Dense(1)(group1)
group2 = Dense(1)(group2)
group3 = Dense(1)(group3)
group4 = Dense(1)(group4)
Before we connect the last layer, we concatenate the four tensors above:
outputTensor = Concatenate()([group1, group2, group3, group4])
Finally, the last layer:
outputTensor = Dense(2)(outputTensor)

#create the model:
model = Model(inputTensor, outputTensor)
Beware of the biases. If you want any of those layers to have no bias, use use_bias=False.
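As a sanity check on the wiring above, here is a plain-numpy sketch of the same computation (the weights are random stand-ins for the Dense kernels, and biases are omitted):

```python
import numpy as np

x = np.random.rand(5, 8)                            # batch of 5, 8 features

# Lambda layers: split the input into four groups of 2
groups = [x[:, 0:2], x[:, 2:4], x[:, 4:6], x[:, 6:8]]

# Second layer: one Dense(1) per group (random 2x1 kernels, no bias)
kernels = [np.random.rand(2, 1) for _ in range(4)]
groups = [g @ k for g, k in zip(groups, kernels)]   # each (5, 1)

# Concatenate the four tensors back into one
merged = np.concatenate(groups, axis=-1)            # (5, 4)

# Last layer: Dense(2)
out = merged @ np.random.rand(4, 2)                 # (5, 2)
print(out.shape)  # (5, 2)
```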
Old answer: backwards
Sorry, I saw your image backwards the first time I answered. I'm keeping this here just because it's done...
from keras.models import Model
from keras.layers import Dense, Input, Concatenate

inputTensor = Input((2,))

#four groups of layers, all of them taking the same input tensor
group1 = Dense(1)(inputTensor)
group2 = Dense(1)(inputTensor)
group3 = Dense(1)(inputTensor)
group4 = Dense(1)(inputTensor)

#the next layer in each group takes the output of the previous layers
group1 = Dense(2)(group1)
group2 = Dense(2)(group2)
group3 = Dense(2)(group3)
group4 = Dense(2)(group4)

#now we join the results in a single tensor again:
outputTensor = Concatenate()([group1, group2, group3, group4])

#create the model:
model = Model(inputTensor, outputTensor)
First, the backend:
Backend functions are supposed to be used "inside" layers. You'd only use this in Lambda layers, custom layers, custom loss functions, custom metrics, etc.
It works directly on tensors.
It's not the choice if you're not going deep on customizing. (And it was a bad choice in your example code -- see details at the end.)
If you dive deep into the Keras code, you will notice that the Concatenate layer uses this function internally:
import keras.backend as K

class Concatenate(_Merge):
    #blablabla
    def _merge_function(self, inputs):
        return K.concatenate(inputs, axis=self.axis)
    #blablabla
Then, the Concatenate layer:
As with any other Keras layer, you instantiate it and call it on tensors.
#in a functional API model:
inputTensor1 = Input(shape)   #or some tensor coming out of any other layer
inputTensor2 = Input(shape2)  #or some tensor coming out of any other layer

#first parentheses are creating an instance of the layer
#second parentheses are "calling" the layer on the input tensors
outputTensor = keras.layers.Concatenate(axis=someAxis)([inputTensor1, inputTensor2])
This is not suited for Sequential models, unless the previous layer outputs a list (this is possible but not common).
Finally, the concatenate function from the layers module:
This is not a layer. This is a function that will return the tensor produced by an internal Concatenate layer.
The code is simple:
def concatenate(inputs, axis=-1, **kwargs):
    #blablabla
    return Concatenate(axis=axis, **kwargs)(inputs)
In Keras 1, people had functions that were meant to receive "layers" as input and return an output "layer". Their names were related to the merge word.
But since Keras 2 doesn't mention or document these, I'd probably avoid using them; and if old code is found, I'd probably update it to proper Keras 2 code.
This backend function was not supposed to be used in high-level code. The coder should have used a Concatenate layer:
atoms_bonds_features = Concatenate(axis=-1)([atoms, summed_bond_features]) #just this line is perfect
Keras layers add the _keras_shape property to all their output tensors, and Keras uses this property for inferring the shapes of the entire model.
If you use any backend function "outside" a layer or loss/metric, your output tensor will lack this property and an error will appear telling you that _keras_shape doesn't exist.
The coder is creating a bad workaround by adding the property manually, when it should have been added by a proper Keras layer. (This may work now, but in case of Keras updates this code will break, while proper code will remain OK.)
I had to ask this question on the Keras GitHub page and someone helped me with how to implement it properly... here's the issue on GitHub...