I asked this in the GitHub issues, and Anirudh was kind enough to point me to this forum.
I’m attempting to train a recommender system with MXNet 1.0.0 in Python 3, but I’m running into the following problem: the dataset has roughly 5M items and 200k users, which means I cannot use an embedding size larger than 100, since otherwise the model would not fit into memory:
```python
user = mx.symbol.Variable("user")
item = mx.symbol.Variable("item")
score = mx.symbol.Variable("score")

user_embed = mx.symbol.Embedding(name="user_embed", data=user, input_dim=5000000, output_dim=100)
item_embed = mx.symbol.Embedding(name="item_embed", data=item, input_dim=200000, output_dim=100)
# element-wise product of the two embeddings; summing over axis 1 gives the dot product
pred = user_embed * item_embed
pred = mx.symbol.sum_axis(pred, axis=1)
pred = mx.symbol.Flatten(data=pred)
pred = mx.symbol.LinearRegressionOutput(data=pred, label=score)
```
I know this is an overly simplistic model, but if even this one doesn’t fit into GPU memory, a more complex one certainly won’t either. The GPU I’m working with has 12 GB.
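For reference, my back-of-the-envelope numbers (assuming float32 weights and an Adam-style optimizer that keeps two extra state arrays per parameter — the optimizer choice is just my assumption here):

```python
# rough memory estimate for the big embedding table alone
weight = 5_000_000 * 100 * 4        # float32 table: ~2.0 GB
total  = weight * 4                 # weight + gradient + 2 optimizer states
print(total / 1024**3)              # ~7.45 GiB, before activations and the second table
```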
Results with an embedding size of 100 are OK on a smaller subset of items/users, but on the full dataset they are not anymore, so I assume the problem is that the embeddings don’t have enough expressive power.
Is there a way to reduce that memory footprint? Perhaps by loading the embeddings “on-demand”, i.e. only those rows that are actually required for a specific batch?
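To make the “on-demand” idea concrete, this is roughly what I have in mind — just an NDArray sketch with made-up names, which ignores how gradients would flow back into the CPU-side table:

```python
import numpy as np
import mxnet as mx

# keep the big embedding table in CPU memory; only the rows needed
# for the current batch are gathered and copied to the GPU
n_users, dim = 5_000_000, 100
user_table_cpu = mx.nd.random.normal(shape=(n_users, dim), ctx=mx.cpu())

def lookup_batch(user_ids):
    idx = mx.nd.array(user_ids, ctx=mx.cpu())
    rows = mx.nd.take(user_table_cpu, idx)   # gather only the rows for this batch
    return rows.copyto(mx.gpu(0))            # ship just those rows to the GPU

batch_embed = lookup_batch(np.array([3, 17, 42]))
print(batch_embed.shape)   # (3, 100)
```

Is there a built-in way to do something like this during training (i.e. with gradients getting back into the CPU table), or would I have to roll my own?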
Thanks in advance!!