Hi, I'm training this public Gluon example on a p2 notebook instance: https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_applying_machine_learning/gluon_recommender_system

Looking at nvidia-smi, both the GPU and GPU memory are under-utilized: memory sits at about 340 MiB out of ~11 GiB, and GPU utilization oscillates between 23% and 25%.

Is this expected?
I had a look at the example and you are right that the GPU utilization is rather low. One way to increase it is to increase the batch size and set num_workers=4 in gluon.data.DataLoader().
The main reason for the low GPU utilization is that the model in this example is very simple: it consists of just two embedding layers, a dropout layer, and one dense layer:

```python
with self.name_scope():
    self.user_embeddings = gluon.nn.Embedding(max_users, num_emb)
    self.item_embeddings = gluon.nn.Embedding(max_items, num_emb)
    self.dropout = gluon.nn.Dropout(dropout_p)
    self.dense = gluon.nn.Dense(num_emb, activation='relu')
```
So feeding data fast enough into the GPU probably becomes the major bottleneck in this example.