Is this a correct way to copy features from a pretrained GluonCV model?


I would like to perform feature extraction with gluoncv. Could I do it as shown below (pseudocode)?

import mxnet as mx

from mxnet import gluon, nd
from mxnet.gluon import nn

from gluoncv.model_zoo import get_model

# Get the model ResNet50_v2 with pre-trained weights
gluon_net = get_model('ResNet50_v2', pretrained=True)

#after printing the net, I found it is composed of two blocks: features (itself 13 blocks) and output

#get the features part
features = gluon_net.features
new_features_net = nn.HybridSequential()

#copy first 11 blocks
for i in range(11):
    new_features_net.add(features[i])

#fix weights of first 11 blocks
for _, w in new_features_net.collect_params().items():
    w.grad_req = 'null'

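def my_block():
    my_net = nn.HybridSequential()
    #add custom layers here
    return my_net

#stack the frozen feature blocks and the custom block
net = nn.HybridSequential()
net.add(new_features_net, my_block())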
#load data, train, test blah blah blah

Am I missing anything? Thanks.

By the way, if I want to fine-tune, how can I set the learning rate of each layer?


That pseudocode looks fine to me. You cannot set the exact learning rate of each layer directly, but you can set a learning rate multiplier for each parameter. To do that for all parameters of a layer, you can do:

block.collect_params().setattr('lr_mult', 0.5)

Alternatively, you can have multiple Trainer objects, each initialized with a subset of the network parameters. That gives you full flexibility to use not only different learning rates for different layers, but also different optimizers, different optimization schedules, etc.