I'm using Keras to solve two problems separately with the same input:
- Sequence tagging (NER)
- Topic classification
I have a dataset of sentences drawn from different topics, and each topic has its own set of named entities.
Given a new sentence, I would like the model to predict both the topic and the named entities in the sentence.
E.g.
Sentence:
"Coronavirus [ Disease ] has hit the UK [ Country ] hard, with the country recording more than 3m [ Total Cases ] cases and 90,000 [ Death Count ] deaths linked to the disease."
Topic:
Medicine
Entities:
{
'Coronavirus' : 'Disease',
'UK' : 'Country',
'3m' : 'Total_Cases',
'90,000' : 'Death_Count'
}
Assuming I have roughly 15 topics, what architecture would solve both problems in one model?
Inputs:
- tokenized sentence (BERT tokenizer): 512 tokens per sentence
Outputs per sentence:
- 512 tags (100 classes altogether)
- sentence topic: 15 categories
Data size: ~70,000 sentences with topics.
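For concreteness, the array shapes I'm working with look roughly like this (`X_tr` is the name used in my code below; `y_tags` and `y_topic` are just placeholder names):

```python
import numpy as np

N_SENT, MAX_LEN, N_TAGS, N_TOPICS = 70000, 512, 100, 15

# BERT token ids, one row per sentence, padded/truncated to 512 tokens
X_tr = np.zeros((N_SENT, MAX_LEN), dtype=np.int32)
# one integer tag id per token (100 tag classes altogether)
y_tags = np.zeros((N_SENT, MAX_LEN), dtype=np.int32)
# one-hot topic vector per sentence (15 topics)
y_topic = np.zeros((N_SENT, N_TOPICS), dtype=np.float32)
```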
The architecture I used for each problem independently:
Sequence tagging
from keras.models import Model
from keras.layers import Input, Embedding, Bidirectional, LSTM, TimeDistributed, Dense
from keras_contrib.layers import CRF

inputter = Input(shape=(512,))  # max number of BERT tokens per sentence
model = Embedding(input_dim=X_tr.max() + 1,  # input_dim = vocab size
                  output_dim=20,
                  input_length=512)(inputter)
model = Bidirectional(LSTM(units=50, return_sequences=True,
                           recurrent_dropout=0.1))(model)
model = TimeDistributed(Dense(50, activation="tanh"))(model)
crf = CRF(y_tr.shape[-1])  # number of tag classes
out = crf(model)  # output
model = Model(inputter, out)
model.compile(optimizer="adam", loss=crf.loss_function, metrics=[crf.accuracy])
Topic classification
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dense

model = Sequential()
model.add(Embedding(X_tr.max() + 1, 20, input_length=512))
model.add(Bidirectional(LSTM(50, activation='tanh', recurrent_dropout=0.2)))
model.add(Dense(36))
model.add(Dense(15, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
How can I get the best of both worlds: a many-to-many (sequence tagging) architecture and a many-to-one (topic classification) architecture, given the same input? I want a single model that produces both outputs (topic and named entities), ideally with some cross connections between the two heads so that the topic prediction can inform the sequence tagging.
I'm using Keras on the TensorFlow backend.
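Something like this sketch is what I have in mind: a shared embedding + BiLSTM encoder, a pooled topic head, and a tagging head that also sees the topic distribution broadcast back over the sequence. (Assumptions on my part: a per-token softmax head instead of the CRF to keep the sketch self-contained, a fixed vocab size instead of `X_tr.max() + 1`, and the loss weight of 0.5 is arbitrary.)

```python
import numpy as np
from tensorflow.keras.layers import (Input, Embedding, Bidirectional, LSTM,
                                     TimeDistributed, Dense, GlobalMaxPooling1D,
                                     Concatenate, RepeatVector)
from tensorflow.keras.models import Model

MAX_LEN = 512       # BERT tokens per sentence
VOCAB_SIZE = 30522  # placeholder; in practice X_tr.max() + 1
N_TAGS = 100
N_TOPICS = 15

tokens = Input(shape=(MAX_LEN,), name="tokens")
emb = Embedding(VOCAB_SIZE, 20)(tokens)
# shared encoder feeding both heads
shared = Bidirectional(LSTM(50, return_sequences=True))(emb)

# many-to-one head: pool the shared sequence down to one topic vector
pooled = GlobalMaxPooling1D()(shared)
topic_out = Dense(N_TOPICS, activation="softmax", name="topic")(pooled)

# cross connection: repeat the topic distribution over the sequence
# and concatenate it onto every timestep before tagging
topic_seq = RepeatVector(MAX_LEN)(topic_out)
tagger_in = Concatenate()([shared, topic_seq])
tagger = TimeDistributed(Dense(50, activation="tanh"))(tagger_in)
tag_out = TimeDistributed(Dense(N_TAGS, activation="softmax"),
                          name="tags")(tagger)

model = Model(tokens, [tag_out, topic_out])
model.compile(optimizer="adam",
              loss={"tags": "sparse_categorical_crossentropy",
                    "topic": "categorical_crossentropy"},
              loss_weights={"tags": 1.0, "topic": 0.5})
```

Is this the right shape for the problem, and is repeating the topic softmax over the timesteps a sensible way to do the cross connection, or should the topic head instead feed an intermediate representation back into the tagger?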
question from:
https://stackoverflow.com/questions/65888039/how-to-combine-topic-prediction-and-sequence-tagging-ner-in-keras-architectur