Word embedding training example



I am following the steps in

https://gluon-nlp.mxnet.io/examples/word_embedding/word_embedding_training.html

and my results match the tutorial's until the following step:

example_token = "vector"
get_k_closest_tokens(vocab, embedding, 10, example_token)

which does not return tokens similar to "vector".

closest tokens to "vector": is, in, zero, a, one, two, of, the, and, to

It appears that the word vector for the example token is all zeros, and presumably the vectors for all other words in the vocabulary are as well. I thought the initialization was supposed to be based on the word's ngrams?
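If I understand the ngram-based initialization correctly, the word vector is composed from the vectors of its character ngrams, so even a freshly initialized model should not produce an all-zero vector. Here is a toy numpy sketch of that fastText-style idea (the helper and variable names are mine, and this is not the GluonNLP implementation):

import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    # Character ngrams of the word padded with boundary markers (fastText-style).
    padded = '<' + word + '>'
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

emsize = 8
rng = np.random.RandomState(0)

# Toy ngram table with small random (non-zero) initial vectors.
ngrams = char_ngrams('vector')
ngram_table = {ng: rng.uniform(-1.0 / emsize, 1.0 / emsize, emsize) for ng in ngrams}

# FastText-style composition: the word vector is the sum of its ngram vectors,
# so with any non-degenerate initialization its norm should already be non-zero.
word_vec = np.sum([ngram_table[ng] for ng in ngrams], axis=0)
print(np.linalg.norm(word_vec))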

When I set the model to train, the result is the same at the end of training.
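A quick way to check whether the vectors really are all zeros, before and after training, would be something like the following. I am assuming the trained embedding object can be looked up with a list of tokens (that indexing syntax is my assumption); if that is not the right API, the same norm check applies to whatever lookup get_k_closest_tokens uses internally:

import mxnet as mx

# Look up the raw vector for the example token (lookup syntax assumed) and
# print its L2 norm; ~0 before and after training would confirm the suspicion.
example_vec = embedding[[example_token]]
print('norm of the "vector" embedding:', mx.nd.norm(example_vec).asscalar())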