Attention Mechanisms in GluonNLP


#1

Does anyone have working examples of how to use the attention cell in the GluonNLP package? How general-purpose is it, and can attention cells be stacked to make a hierarchical attention model?


#2

Hi @dustin.holloway,

You can find an example of AttentionCell usage in the Google Neural Machine Translation System example found here.

Specifically, you’ll want to look at this line for usage. The selection of the different types of cells is defined here.

@Sergey has been working on a model using attention, so may be able to give some additional advice if you need it.


#3

I was working with MLPAttentionCell().

Please check the code below.

from mxnet import gluon
from mxnet.gluon import nn, rnn
import gluonnlp as nlp

class SentClassificationModelAtt(gluon.HybridBlock):
    def __init__(self, vocab_size, num_embed, seq_len, hidden_size, **kwargs):
        super(SentClassificationModelAtt, self).__init__(**kwargs)
        self.seq_len = seq_len
        self.hidden_size = hidden_size
        with self.name_scope():
            self.embed = nn.Embedding(input_dim=vocab_size, output_dim=num_embed)
            self.drop = nn.Dropout(0.3)
            self.bigru = rnn.GRU(self.hidden_size, dropout=0.2, bidirectional=True)
            # MLP (additive) attention with 30 hidden units
            self.attention = nlp.model.MLPAttentionCell(30, dropout=0.2)
            self.dense = nn.Dense(2)

    def hybrid_forward(self, F, inputs):
        em_out = self.drop(self.embed(inputs))
        # Swap the first two axes of the GRU output before attention
        bigruout = F.transpose(self.bigru(em_out), axes=(1, 0, 2))
        # Self-attention: the GRU states serve as both query and key
        ctx_vector, weight_vector = self.attention(bigruout, bigruout)
        # Dense flattens its input by default, so all positions feed the classifier
        outs = self.dense(ctx_vector)
        return outs, weight_vector

You can check the full code at this link; sorry that it is in Korean. :sweat_smile:
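
For anyone wondering what an MLP (additive) attention cell computes when the same states are passed as query and key, here is a minimal pure-Python sketch. It is not GluonNLP's implementation: the names mlp_attention, w_q, w_k and v are hypothetical, the weights are random rather than learned, and bias, dropout, masking and the batch axis are left out. It assumes the values are the same states as the keys.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(mat, vec):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in mat]


def mlp_attention(queries, keys, values, w_q, w_k, v):
    """Additive attention for a single sequence (hypothetical helper).

    score(q, k) = v . tanh(w_q q + w_k k); the weights are a softmax over
    the keys; each context vector is the weighted sum of the values.
    """
    proj_keys = [matvec(w_k, k) for k in keys]
    dim = len(values[0])
    contexts, weights = [], []
    for q in queries:
        proj_q = matvec(w_q, q)
        scores = [
            sum(vi * math.tanh(a + b) for vi, a, b in zip(v, proj_q, pk))
            for pk in proj_keys
        ]
        att = softmax(scores)
        ctx = [sum(a * val[d] for a, val in zip(att, values)) for d in range(dim)]
        contexts.append(ctx)
        weights.append(att)
    return contexts, weights


# Toy sizes (assumptions): 4 time steps, 6 features per state, 3 attention units.
random.seed(0)
seq_len, dim, units = 4, 6, 3
states = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]
w_q = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(units)]
w_k = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(units)]
v = [random.uniform(-1, 1) for _ in range(units)]

# Self-attention: the same states act as queries, keys and values.
contexts, weights = mlp_attention(states, states, states, w_q, w_k, v)
print(len(contexts), len(contexts[0]), len(weights), len(weights[0]))
```

Each query position gets one context vector and one row of attention weights, so with self-attention the output has as many context vectors as the input has time steps. That is why the model above can hand all of them to a single Dense layer.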