Layer access in a pre-trained model


#1

Hi, there!
I want to access the first layer of a pre-trained ResNet50 in gluoncv.

from gluoncv.model_zoo import get_model
model_name = 'ResNet50_v1c'
finetune_net = get_model(model_name, pretrained=True)

The first conv layer is

Parameter resnetv1b0_conv0_weight (shape=(32, 3, 3, 3), dtype=<class 'numpy.float32'>)

I want to change the shape of this layer to (32, 4, 3, 3) and use one channel of resnetv1b0_conv0_weight, say the first channel [:,0,:,:], to initialize the fourth channel.

How can I do this?
Thank you!


#2

hi @JWarlock, you have several ways to achieve this.

One way is the following:

import mxnet as mx
from gluoncv.model_zoo import get_model
model_name = "ResNet50_v1c"
finetune_net = get_model(model_name, pretrained=True)

# Prepare the new weight data
new_weight_data = mx.nd.zeros(shape=(32, 4, 3, 3))
new_weight_data[:,:3,:,:] = finetune_net.conv1[0].weight.data().copy() # Copy 3 existing channels
new_weight_data[:,3,:,:] = new_weight_data[:,0,:,:].copy() # Use first channel for last

# Reset the weights of the old conv
finetune_net.conv1[0]._in_channels = 4
finetune_net.conv1[0].weight._shape = None
finetune_net.conv1[0].weight.initialize(force_reinit=True)
finetune_net.conv1[0].weight.set_data(new_weight_data)

Testing to make sure it doesn’t crash with 4 channel inputs:

finetune_net(mx.nd.ones((1,4,224,224)))
[[-1.14802015e+00  1.68654275e+00 -1.14919551e-01 -1.33685505e+00
  -7.57148862e-01  8.47965479e-02 -1.19770670e+00 -1.06845006e-01
  -1.00977755e+00 -1.67431593e-01  7.82547235e-01 -9.10857320e-01
  -7.77129710e-01 -5.41859925e-01 -1.37483168e+00 -8.41120005e-01
................................
   1.32182956e+00 -4.63133484e-01 -1.43275237e+00 -5.06953001e-01
  -3.07285666e-01  3.21178883e-02 -1.22683585e+00 -2.23049730e-01
  -1.76394299e-01 -6.99617863e-02  6.65577471e-01  2.79300362e-02
  -7.27643251e-01 -1.12396431e+00 -7.62938499e-01 -1.30449688e+00
   2.36235097e-01 -1.21347117e+00 -2.82195270e-01  9.15250182e-03]]
<NDArray 1x1000 @cpu(0)>
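
For anyone who wants to see just the weight surgery without loading the model, here is a minimal sketch. It uses NumPy as a stand-in for mx.nd, `expand_input_channels` is a hypothetical helper name, and the random array only stands in for the pre-trained (32, 3, 3, 3) weight:

```python
import numpy as np

def expand_input_channels(weight, source_channel=0):
    """Return a copy of `weight` (out, in, kh, kw) with one extra input channel.

    The new channel is initialised from `source_channel` of the original weight.
    """
    out_c, in_c, kh, kw = weight.shape
    new_weight = np.zeros((out_c, in_c + 1, kh, kw), dtype=weight.dtype)
    new_weight[:, :in_c] = weight                     # keep the existing channels
    new_weight[:, in_c] = weight[:, source_channel]   # seed the new channel
    return new_weight

# Stand-in for the pre-trained weight: random values, not real ResNet weights
rng = np.random.default_rng(0)
old_weight = rng.standard_normal((32, 3, 3, 3)).astype(np.float32)
new_weight = expand_input_channels(old_weight)
print(new_weight.shape)  # (32, 4, 3, 3)
```

The resulting array has the same layout as the new_weight_data passed to set_data above.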

#4

If you don’t want to reference private member variables, with a bit more typing you can recreate the HybridSequential block and replace the one in finetune_net.

The existing HybridBlock looks like:

# finetune_net.conv1:
HybridSequential(
  (0): Conv2D(3 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
  (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
  (2): Activation(relu)
  (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
  (5): Activation(relu)
  (6): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)

To replicate this with the alternate first layer, create the new Conv2D layer with 4 input channels:

from mxnet.gluon import nn

# Create a Conv2D with 4 input channels
new_conv2d = nn.Conv2D(32, (3,3), strides=(2,2), padding=(1,1), in_channels=4, use_bias=False)

From @ThomasDelteil’s post, with one of the copies removed since we’ll be re-using the existing data:

new_weight_data = mx.nd.zeros(shape=(32, 4, 3, 3))
new_weight_data[:,:3,:,:] = finetune_net.conv1[0].weight.data() # 3 existing channels
new_weight_data[:,3,:,:] = new_weight_data[:,0,:,:].copy() # Use first channel for last

# finetune_net.conv1[0]._in_channels = 4 # <- no need to do this now
new_conv2d.weight.initialize()
new_conv2d.weight.set_data(new_weight_data)

Create a replacement HybridSequential block with the required layers:

conv1 = nn.HybridSequential()
conv1.add(new_conv2d)
for i in range(6):  # copy layers 1 through 6; range(5) would drop the last Conv2D
    conv1.add(finetune_net.conv1[i+1])

Finally, update finetune_net with the new HybridBlock:

finetune_net.conv1 = conv1

Vishaal


#5

Thanks to both of you!
This helps me a lot.


#6

The network definition works, but when I try to train the modified ResNet:

epochs = 40
lr_factor = 0.75
lr_steps = [10, 20, 30]
lr_counter = 0
num_batch = len(train_data_loader)

for epoch in range(epochs):
    # Decay the learning rate at each step; the bounds check avoids an
    # IndexError once the last entry of lr_steps has been used
    if lr_counter < len(lr_steps) and epoch == lr_steps[lr_counter]:
        trainer.set_learning_rate(trainer.learning_rate*lr_factor)
        lr_counter += 1

    tic = time.time()
    train_loss = 0
    metric.reset()

    for i, batch in enumerate(train_data_loader):
        data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0, even_split=False)
        label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0, even_split=False)
        with ag.record():
            outputs = [finetune_net(X) for X in data]
            loss = [L(yhat, y) for yhat, y in zip(outputs, label)]
        for l in loss:
            l.backward()

        trainer.step(batch_size)
        train_loss += sum([l.mean().asscalar() for l in loss]) / len(loss)

        metric.update(label, outputs)

    _, train_acc = metric.get()
    train_loss /= num_batch

    _, val_acc = test(finetune_net, val_data, ctx)

    print('[Epoch %d] Train-acc: %.3f, loss: %.3f | Val-acc: %.3f | time: %.1f' %
          (epoch, train_acc, train_loss, val_acc, time.time() - tic))

I get the following error:


MXNetError Traceback (most recent call last)
in ()
14 metric.reset()
15
---> 16 for i, batch in enumerate(train_data_loader):
17 data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0, even_split=False)
18 label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0, even_split=False)

~/anaconda3/lib/python3.7/site-packages/mxnet/gluon/data/dataloader.py in same_process_iter()
345 def same_process_iter():
346 for batch in self._batch_sampler:
--> 347 ret = self._batchify_fn([self._dataset[idx] for idx in batch])
348 if self._pin_memory:
349 ret = _as_in_context(ret, context.cpu_pinned())

~/anaconda3/lib/python3.7/site-packages/mxnet/gluon/data/dataloader.py in (.0)
345 def same_process_iter():
346 for batch in self._batch_sampler:
--> 347 ret = self._batchify_fn([self._dataset[idx] for idx in batch])
348 if self._pin_memory:
349 ret = _as_in_context(ret, context.cpu_pinned())

in __getitem__(self, idx)
42 image = mx.nd.concat(image,mx.image.imread(image_filepath,flag=0).reshape(0,0,1),dim=-1)
43 if self._transform is not None:
---> 44 image = self._transform(image)
45 if self._istrain:
46 label = self._labels.loc[ self.filenames[idx] ]["Target"]

~/anaconda3/lib/python3.7/site-packages/mxnet/gluon/block.py in __call__(self, *args)
539 hook(self, args)
540
--> 541 out = self.forward(*args)
542
543 for hook in self._forward_hooks.values():

~/anaconda3/lib/python3.7/site-packages/mxnet/gluon/nn/basic_layers.py in forward(self, x)
51 def forward(self, x):
52 for block in self._children.values():
---> 53 x = block(x)
54 return x
55

~/anaconda3/lib/python3.7/site-packages/mxnet/gluon/block.py in __call__(self, *args)
539 hook(self, args)
540
--> 541 out = self.forward(*args)
542
543 for hook in self._forward_hooks.values():

~/anaconda3/lib/python3.7/site-packages/mxnet/gluon/block.py in forward(self, x, *args)
906 with x.context as ctx:
907 if self._active:
--> 908 return self._call_cached_op(x, *args)
909
910 try:

~/anaconda3/lib/python3.7/site-packages/mxnet/gluon/block.py in _call_cached_op(self, *args)
812 i._finish_deferred_init()
813 cargs.append(i.data())
--> 814 out = self._cached_op(*cargs)
815 if isinstance(out, NDArray):
816 out = [out]

~/anaconda3/lib/python3.7/site-packages/mxnet/_ctypes/ndarray.py in __call__(self, *args, **kwargs)
148 ctypes.byref(num_output),
149 ctypes.byref(output_vars),
--> 150 ctypes.byref(out_stypes)))
151
152 if original_output is not None:

~/anaconda3/lib/python3.7/site-packages/mxnet/base.py in check_call(ret)
250 """
251 if ret != 0:
--> 252 raise MXNetError(py_str(_LIB.MXGetLastError()))
253
254

MXNetError: Error in operator normalize0_normalize0: [22:01:13] src/operator/image/./image_random-inl.h:110: Check failed: nchannels == 3 || nchannels == 1 The first dimension of input tensor must be the channel dimension with either 1 or 3 elements, but got input with shape [4,256,256]

It seems that MXNet does not allow 4 input channels. If so, how can I make it work?


#7

Hi @JWarlock

A normalization step is being applied that expects either 1 or 3 channels, as is customary (greyscale/RGB).

Would you be able to include a snippet that describes the transforms you’re using, if any, and the data iterators going into the data loader?

There are a few possibilities and workarounds for each of those possibilities and it’d be good to narrow down.

However, if you’re using ImageIter, unfortunately only 3 channels are supported, so you’d have to work around it by using datasets and data loaders.

  • data_shape ( tuple ) – Data shape in (channels, height, width) format. For now, only RGB image with 3 channels is supported.

If you’re applying any transforms that don’t support 4 channels, you’d have to write one that does.
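
As a rough illustration of what such a transform has to compute, here is a minimal sketch of per-channel normalization for an arbitrary number of channels. It uses NumPy rather than mx.nd, `to_tensor_and_normalize` is a hypothetical helper name, and the image, mean and std values are made up for the example. How it gets wired into a dataset or a Compose pipeline is left open:

```python
import numpy as np

def to_tensor_and_normalize(image_hwc, mean, std):
    """Convert an (H, W, C) uint8 image to a normalised (C, H, W) float32 array."""
    # HWC -> CHW and scale pixel values to [0, 1]
    chw = image_hwc.astype(np.float32).transpose(2, 0, 1) / 255.0
    # Reshape the statistics to (C, 1, 1) so they broadcast over height and width
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    return (chw - mean) / std

# Hypothetical 4-channel input and statistics, chosen only for illustration
image = np.full((256, 256, 4), 255, dtype=np.uint8)
mean = [0.1, 0.2, 0.3, 0.4]
std = [0.5, 0.5, 0.5, 0.5]
normalized = to_tensor_and_normalize(image, mean, std)
print(normalized.shape)  # (4, 256, 256)
```

Nothing in this arithmetic depends on the channel count being 1 or 3; the only requirement is that mean and std have one entry per channel.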

In any case, we’ll see what’s going on and address the specific issue.

Vishaal


#8

You are right. That is the problem here.
I wrote my own dataset class, and here is the __getitem__ method:

def __getitem__(self, idx):
    image = None
    # 4 channels: RGBY
    for colour in self._colours:
        image_filepath = os.path.join(self._root, self._image_list[self.filenames[idx]][colour])
        if image is None:
            # flag=0 reads the file as a gray image
            image = mx.image.imread(image_filepath, flag=0).reshape(0, 0, 1)
        else:
            # Append the next channel along the last axis
            image = mx.nd.concat(image, mx.image.imread(image_filepath, flag=0).reshape(0, 0, 1), dim=-1)
    if self._transform is not None:
        image = self._transform(image)
    if self._istrain:
        label = self._labels.loc[self.filenames[idx]]["Target"]
        # Multi-hot encoding; plain float replaces the deprecated np.float alias
        label = np.eye(len(name_labels_dict), dtype=float)[label].sum(axis=0)
        return image, label
    else:
        return image

And I merely use some basic transforms:

transform = transforms.Compose([
    transforms.Resize(size),
    transforms.ToTensor(),
    transforms.Normalize([0.08069, 0.05258, 0.05487, 0.08282], [0.13704, 0.10145, 0.15313, 0.13814])
])

And yet it seems that the transforms in mxnet.gluon.data do not support 4 channels either.
I’ve searched the documentation, and it seems that the augmenters in mx.image (link below) may be able to do it:

But is there any way to do it using mx.gluon.data.transform?
Thanks!

By the way, I’ve just found out that image.CreateAugmenter only supports a mean and std with 1 or 3 channels, yet mx.image.ColorNormalizeAug does support other channel counts. I’m quite confused. Shouldn’t they all behave the same?