Does anyone have working examples of how to use the AttentionCell in the GluonNLP package? How general-purpose is it, and can these cells be stacked to make a hierarchical attention model?
You can find an example usage of the AttentionCell in the Google Neural Machine Translation System example found here.
@Sergey has been working on a model using attention, so he may be able to give some additional advice if you need it.
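If you just want to see the cell in isolation, here is a minimal sketch of calling an MLPAttentionCell directly on random inputs; the shapes and the `units` value are arbitrary choices for illustration, not something prescribed by the library:

```python
import mxnet as mx
import gluonnlp as nlp

batch_size, query_len, key_len, dim = 2, 4, 6, 8

# an MLP (additive) attention cell with an arbitrarily chosen hidden size
attention = nlp.model.MLPAttentionCell(units=32)
attention.initialize()

query = mx.nd.random.normal(shape=(batch_size, query_len, dim))
key = mx.nd.random.normal(shape=(batch_size, key_len, dim))

# context_vec: (batch_size, query_len, dim), a weighted sum of the keys
# att_weights: (batch_size, query_len, key_len), one distribution per query position
context_vec, att_weights = attention(query, key)
print(context_vec.shape, att_weights.shape)
```

Since the cell is an ordinary HybridBlock that maps a (query, key) pair to a context vector and a weight matrix, it composes with other Gluon layers like any block.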
I was working with MLPAttentionCell(). Please check the code below:
```python
from mxnet import gluon
from mxnet.gluon import nn, rnn
import gluonnlp as nlp


class SentClassificationModelAtt(gluon.HybridBlock):
    def __init__(self, vocab_size, num_embed, seq_len, hidden_size, **kwargs):
        super(SentClassificationModelAtt, self).__init__(**kwargs)
        self.seq_len = seq_len
        self.hidden_size = hidden_size
        with self.name_scope():
            self.embed = nn.Embedding(input_dim=vocab_size, output_dim=num_embed)
            self.drop = nn.Dropout(0.3)
            self.bigru = rnn.GRU(self.hidden_size, dropout=0.2, bidirectional=True)
            self.attention = nlp.model.MLPAttentionCell(30, dropout=0.2)
            self.dense = nn.Dense(2)

    def hybrid_forward(self, F, inputs):
        # inputs are time-major: (seq_len, batch_size), matching the GRU's
        # default 'TNC' layout
        em_out = self.drop(self.embed(inputs))
        # (seq_len, batch, 2 * hidden_size) -> (batch, seq_len, 2 * hidden_size)
        bigruout = self.bigru(em_out).transpose((1, 0, 2))
        # self-attention: the GRU outputs serve as both query and key
        ctx_vector, weight_vector = self.attention(bigruout, bigruout)
        outs = self.dense(ctx_vector)
        return outs, weight_vector
```
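To sanity-check the block, a forward pass with dummy data might look like the following; the hyperparameters and batch size here are placeholders, not values from the original post:

```python
import mxnet as mx

# placeholder hyperparameters, chosen only for this smoke test
vocab_size, num_embed, seq_len, hidden_size = 1000, 50, 30, 64
batch_size = 8

model = SentClassificationModelAtt(vocab_size, num_embed, seq_len, hidden_size)
model.initialize()

# time-major batch of random token ids: (seq_len, batch_size)
dummy_batch = mx.nd.random.uniform(0, vocab_size, shape=(seq_len, batch_size))
outs, weight_vector = model(dummy_batch)
print(outs.shape)           # (batch_size, 2): class scores
print(weight_vector.shape)  # (batch_size, seq_len, seq_len): attention weights
```

The returned `weight_vector` is handy for visualizing which time steps the classifier attended to for each sentence.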
You can check the full code at this link, but sorry, it is in Korean.