I have a one-hot tensor like this:
The first dimension is the batch size, the second is the character position, and the third is the one-hot vector. One batch looks like this:
And an embedding tensor like this:
Each element is in a certain range:
Then I use char_embed = nd.batch_dot(one_hot, bert_embed) to look up my embeddings. The result contains nan, which has confused me for a whole day.
If there is no nan in bert_embed, why does batch_dot with a one-hot tensor produce nan?
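
One thing that may be worth ruling out: a dot-product lookup multiplies every row of the embedding by 0 or 1, so a single inf (or nan) in a row that is never selected still turns into nan, because 0 * inf is nan in floating point. The sketch below shows this effect with NumPy standing in for nd.batch_dot. The shapes, the random data, and the helper name lookup_by_dot are all made up for illustration, and this is not necessarily what is happening in my case.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch, chars, vocab, dim = 2, 3, 4, 5
rng = np.random.default_rng(0)

# Character ids in [0, vocab - 2], so the last embedding row is never selected.
ids = rng.integers(0, vocab - 1, size=(batch, chars))
one_hot = np.eye(vocab, dtype=np.float32)[ids]            # (batch, chars, vocab)
embed = rng.uniform(-1, 1, size=(batch, vocab, dim)).astype(np.float32)


def lookup_by_dot(one_hot, embed):
    """Batched dot product written out as an explicit sum of products."""
    with np.errstate(invalid="ignore"):                   # silence the 0 * inf warning
        return (one_hot[:, :, :, None] * embed[:, None, :, :]).sum(axis=2)


# Finite embedding: the dot-product lookup is clean.
clean = lookup_by_dot(one_hot, embed)
clean_has_nan = bool(np.isnan(clean).any())

# Put one inf into a row that no character id points at.
embed_bad = embed.copy()
embed_bad[0, vocab - 1, 0] = np.inf
dirty = lookup_by_dot(one_hot, embed_bad)
dirty_has_nan = bool(np.isnan(dirty).any())

# Index-based lookup never touches the unselected row, so it stays finite.
gathered = embed_bad[np.arange(batch)[:, None], ids]      # (batch, chars, dim)

print("all finite:", bool(np.isfinite(embed_bad).all()))
print("nan after dot lookup (clean embed):", clean_has_nan)
print("nan after dot lookup (one inf):", dirty_has_nan)
print("nan after index lookup:", bool(np.isnan(gathered).any()))
```

If a check like np.isfinite on the real embedding comes back all true, this explanation does not apply and the cause is somewhere else. An index-based lookup also avoids the multiplication by zero entirely.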