Is it possible to update only certain weights of embedding layers?

robert · December 7, 2017, 3:31am

Hi Guys,

I have a matrix factorization network defined by the graph below, and I can get the weights of its embedding layers. Say I have saved those weights. If I reload them into a new instance of the same network graph, can I fix certain weights (e.g., so that only some of the weights are updated)?

For example, when training the new network instance, I only want to update the last of the 8 latent_dim weights and leave the first 7 fixed.

Thanks!

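import mxnet as mx

# The function header was not part of the original post; its name and
# signature are assumed here so that the trailing `return` is valid.
def build_network(n_users, n_items, latent_dim=8):
    y_true = mx.symbol.Variable("label")
    user = mx.symbol.Variable("member")
    user = mx.symbol.Embedding(name='member_embedding', data=user, input_dim=n_users, output_dim=latent_dim)

    book = mx.symbol.Variable("book")
    book = mx.symbol.Embedding(name='book_embedding', data=book, input_dim=n_items, output_dim=latent_dim)

    # Element-wise product summed over the latent axis: one dot product per row.
    dot = user * book
    dot = mx.symbol.sum_axis(dot, axis=1)
    dot = mx.symbol.Flatten(dot)
    dot = 1 - dot

    return mx.symbol.LinearRegressionOutput(data=dot, label=y_true)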

safrooze · December 7, 2017, 11:11pm

If you’re using the module API for optimization, the easy way to freeze weights is to pass fixed_param_names to the module’s constructor. However, that freezes whole parameters, and what you are looking for is freezing part of one parameter. An easy way to achieve this is to split your latent space into two subsets. In your specific example, you’d write something like this:

user = mx.symbol.Variable("member")
user1 = mx.symbol.Embedding(name='member_embedding1', data=user, input_dim=n_users, output_dim=latent_dim-1)
user2 = mx.symbol.Embedding(name='member_embedding2', data=user, input_dim=n_users, output_dim=1)
user = mx.symbol.concat(user1, user2, dim=1)

Using the above trick, you get two weight sets (member_embedding1_weight and member_embedding2_weight) and you can freeze one set and optimize the other set.
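
To see why the split leaves the model's output unchanged, here is a small NumPy sketch. It is not MXNet code, and the array names and sizes are made up for illustration. It checks that looking up two narrower tables and concatenating the results gives the same rows as one full table, and that an update applied only to the second table changes only the last latent dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, latent_dim = 5, 8  # illustrative sizes, not from the thread

# Full embedding table, standing in for the weights saved after training.
full = rng.normal(size=(n_users, latent_dim))

# Split column-wise: first 7 dims (to freeze) and the last dim (to train).
frozen_part = full[:, :latent_dim - 1].copy()
trainable_part = full[:, latent_dim - 1:].copy()

ids = np.array([3, 0, 3])

# Lookup in each table, then concat along axis 1, reproduces the full lookup.
combined = np.concatenate([frozen_part[ids], trainable_part[ids]], axis=1)
same_lookup = np.allclose(combined, full[ids])

# A made-up gradient step that touches only the trainable table.
trainable_part -= 0.1 * np.ones_like(trainable_part)
updated = np.concatenate([frozen_part, trainable_part], axis=1)
first7_unchanged = np.array_equal(updated[:, :-1], full[:, :-1])
last_changed = not np.allclose(updated[:, -1], full[:, -1])
```

When reloading saved weights into the split network, the same column split would presumably be applied to the saved member_embedding_weight to initialize the two new parameters.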

robert · December 7, 2017, 11:43pm

That’s a really cool idea/trick! I learned some new stuff from it. Thanks Safrooze!

After reading your reply, I realized that my question was misleading; sorry about that. What I really want is this: after training the embedding layers, I want to add some new ‘words’ to the model while keeping the weights of the existing ‘vocabulary’ unchanged.

In my case, say I use 2 latent dims and have three books: 0, 1, 2. The book_embedding_weight matrix may look like this:

book_embedding_weight =
[[0.3, 0.4],
[0.8, 0.5],
[0.2, 0.3]]

Now, I want to introduce one new ‘book’.
I could create a new instance of the network, reload the old weights, and initialize the new weights as zeros:

[[0.3, 0.4], # book 0's weights, keep unchanged
[0.8, 0.5], # book 1's weights, keep unchanged
[0.2, 0.3], # book 2's weights, keep unchanged
[0, 0]] # book 3, only learn this weight vector

When I train the network, I only want to learn the weights for the new book, and ideally those weights should be comparable to the weights of the ‘old’ books.
Is this even possible in MXNet? I would want to add new ‘users’ as well.

Thanks!

safrooze · December 8, 2017, 10:48pm

To achieve what you want, you’d need to compose the embedding layer manually by stacking one_hot, two instances of slice_axis, two FullyConnected layers, and a sum. Example:

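# n_items counts the new book as well; book holds the item indices.
book = mx.sym.one_hot(indices=book, depth=n_items, name='one_hot')
# Columns for the existing books feed the layer whose weights stay frozen.
book1 = mx.sym.slice_axis(data=book, axis=-1, begin=0, end=n_items - 1)
book1 = mx.sym.FullyConnected(data=book1, num_hidden=latent_dim, no_bias=True)
# The last column (the new book) feeds the layer that is trained.
book2 = mx.sym.slice_axis(data=book, axis=-1, begin=n_items - 1, end=n_items)
book2 = mx.sym.FullyConnected(data=book2, num_hidden=latent_dim, no_bias=True)
book = book1 + book2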

Now you have fullyconnected0_weight and fullyconnected1_weight and you can freeze one and let the other train.
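
As a sanity check on the idea, the following NumPy sketch (not MXNet code; variable names are illustrative) shows that a one-hot vector sliced into "old" and "new" columns, passed through two bias-free linear maps, and summed gives the same result as a plain embedding lookup. It uses the 3-book example from the question, with the new book initialized to zeros.

```python
import numpy as np

n_items, latent_dim = 4, 2  # three existing books plus one new book

# Rows 0-2 are the trained books from the question; row 3 is the new book.
old_rows = np.array([[0.3, 0.4], [0.8, 0.5], [0.2, 0.3]])
new_row = np.zeros((1, latent_dim))
table = np.vstack([old_rows, new_row])

ids = np.array([1, 3])
one_hot = np.eye(n_items)[ids]   # shape (2, n_items)

# Slice the one-hot columns into "existing books" and "new book".
old_cols = one_hot[:, :-1]       # shape (2, n_items - 1)
new_cols = one_hot[:, -1:]       # shape (2, 1)

# Each slice goes through its own bias-free linear map; the sum
# equals an ordinary embedding lookup into the stacked table.
out = old_cols @ old_rows + new_cols @ new_row
matches_lookup = np.allclose(out, table[ids])
```

One detail to keep in mind: FullyConnected stores its weight with shape (num_hidden, input_dim), so a saved embedding matrix of shape (n_items, latent_dim) would need to be transposed before being loaded into fullyconnected0_weight.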

robert · December 10, 2017, 3:13am

This is an elegant solution Safrooze!
Really appreciate your help!
