Word embedding training example

NiallRooney · December 5, 2018, 2:11pm

I am following the steps in

https://gluon-nlp.mxnet.io/examples/word_embedding/word_embedding_training.html

which returns similar results until the following step:

example_token = "vector"
get_k_closest_tokens(vocab, embedding, 10, example_token)

which does not return similar tokens to “vector”.

closest tokens to “vector”: is, in, zero, a, one, two, of, the, and, to

It appears that the word vector for the example token is all zeros, and presumably the vectors for all other words in the vocabulary are likewise. I thought that the initialization is based on the token's n-grams?

When I train the model, the result is the same at the end of training.
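
A rough sketch of why all-zero vectors would produce a list like the one above, assuming the closest tokens are ranked by cosine similarity. This is not the tutorial's get_k_closest_tokens; the helper names, the toy vocabulary, and the vector values below are made up for illustration. If every vector is zero, every similarity is 0, so the ranking is decided by ties and simply falls back to vocabulary order, which would be consistent with frequent words such as "the", "of", and "and" showing up.

```python
import math


def cosine_similarity(u, v, eps=1e-10):
    """Cosine similarity with a small eps so zero vectors do not divide by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def is_all_zero(vector):
    """True if every component of the vector is exactly zero."""
    return all(x == 0 for x in vector)


def k_closest(tokens, vectors, query, k):
    """Return the k tokens most similar to `query`, excluding the query itself.

    sorted() is stable, so tied scores keep their vocabulary order.
    """
    q = vectors[tokens.index(query)]
    scored = [(cosine_similarity(q, v), t)
              for t, v in zip(tokens, vectors) if t != query]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [t for _, t in ranked[:k]]


# Hypothetical toy vocabulary, ordered from most to least frequent.
tokens = ["the", "of", "and", "vector", "matrix"]

# Case 1: every embedding is zero (the situation described in the post).
zero_vectors = [[0.0, 0.0, 0.0] for _ in tokens]

# Case 2: made-up non-zero embeddings where "vector" and "matrix" point the same way.
trained_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.0, 0.9, 0.2],
]

print(is_all_zero(zero_vectors[tokens.index("vector")]))
print(k_closest(tokens, zero_vectors, "vector", 3))
print(k_closest(tokens, trained_vectors, "vector", 1))
```

Running the same is_all_zero check on the vector the real model returns for "vector" would confirm whether the embedding is actually zero or whether something else is going on.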

safrooze · December 13, 2018, 3:08am

I just downloaded and ran the example and it worked fine for me. Do you have the latest gluonnlp pip package installed? I’m using gluonnlp-0.5.0.post0 with mxnet 1.3.1.

NiallRooney · December 18, 2018, 9:49am

Yes, I have gluonnlp==0.5.0 and mxnet==1.3.1. I am running this on Python 2 on Ubuntu 16.04 LTS. Maybe it's an issue with other libraries?

safrooze · December 31, 2018, 6:15pm

That’s very strange. What’s your development environment? I used a SageMaker notebook.
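
One way to collect the environment details being compared here is a small script like the following. It is only a sketch: report_versions is a hypothetical helper, not part of gluonnlp or mxnet, and it assumes each package exposes a __version__ attribute (it reports "unknown" otherwise). It avoids f-strings so that it also runs on Python 2.

```python
import importlib
import platform
import sys


def report_versions(package_names):
    """Map each package name to its __version__, or a note if it is missing."""
    versions = {}
    for name in package_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            # Assumption: the package exposes __version__; fall back otherwise.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


print("python {0}".format(sys.version.split()[0]))
print("platform {0}".format(platform.platform()))
for name, version in sorted(report_versions(["mxnet", "gluonnlp", "numpy"]).items()):
    print("{0} {1}".format(name, version))
```

Posting this output from both machines makes it easier to spot a difference beyond the gluonnlp and mxnet versions themselves.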
