Alternated training parts of a model with mx.sym

dingran · March 20, 2018, 4:27pm

Suppose you have a model made of two parts, partA and partB (partA + partB = the whole model). You want to train partA for a while with partB held fixed, and then train partB with partA held fixed.

In Gluon, it seems I can do this with something like the following:

from mxnet import autograd, gluon

# One trainer per part, each seeing only that part's parameters
partA_trainer = gluon.Trainer(net.partA.collect_params(), 'adam', {'learning_rate': lr})
partB_trainer = gluon.Trainer(net.partB.collect_params(), 'adam', {'learning_rate': lr})

Then

with autograd.record():
    loss = net(data)
loss.backward()  # compute gradients; only the trainer stepped below applies them

if (epoch // alternate_epochs) % 2 == 0:
    training_info = '<<< partA training only >>>'
    partA_trainer.step(data.shape[0])
else:
    training_info = '<<< partB training only >>>'
    partB_trainer.step(data.shape[0])

I wonder if there’s an easy way to do something similar with the mx.sym API.

ThomasDelteil · March 20, 2018, 6:39pm

This GitHub issue seems to be what you are looking for:

Freeze the gradients of the parts that you don’t want to update
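
To make the freezing idea concrete, here is a minimal sketch in plain Python of how the per-phase gradient settings could be derived. It is not taken from the linked issue. It assumes the symbol's arguments are named with a hypothetical `partA_` / `partB_` prefix, and that the input names are `data` and `softmax_label`; the helper names `active_part`, `frozen_param_names` and `grad_req_for` are made up for illustration. The MXNet calls are mentioned only in comments.

```python
def active_part(epoch, alternate_epochs):
    """Return the part to train this epoch (same schedule as the Gluon snippet)."""
    return 'partA' if (epoch // alternate_epochs) % 2 == 0 else 'partB'


def frozen_param_names(arg_names, active, input_names=('data', 'softmax_label')):
    """Parameters that should NOT be updated while `active` is training."""
    prefix = active + '_'  # assumed naming convention, e.g. 'partA_fc_weight'
    return [n for n in arg_names
            if n not in input_names and not n.startswith(prefix)]


def grad_req_for(arg_names, active, input_names=('data', 'softmax_label')):
    """Map every argument to 'write' (trainable) or 'null' (frozen / input)."""
    frozen = set(frozen_param_names(arg_names, active, input_names))
    return {n: 'null' if (n in frozen or n in input_names) else 'write'
            for n in arg_names}


# Stand-in for sym.list_arguments(); the names are hypothetical.
arg_names = ['data', 'partA_fc_weight', 'partA_fc_bias',
             'partB_fc_weight', 'partB_fc_bias', 'softmax_label']

part = active_part(epoch=0, alternate_epochs=2)
frozen = frozen_param_names(arg_names, part)
grad_req = grad_req_for(arg_names, part)

# With MXNet available, either result could be handed to the symbolic API:
#   mx.mod.Module(sym, fixed_param_names=frozen)            # Module API
#   sym.simple_bind(ctx=mx.cpu(), grad_req=grad_req, ...)   # executor API
```

As far as I know both settings are fixed when the module or executor is created, so switching between partA and partB would mean rebinding (or keeping two modules and copying the parameters across) at each phase change.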
