If you don’t want to reference private member variables, then with a bit more typing you can recreate the HybridSequential block and swap it in for the one in finetune_net.
The existing HybridBlock looks like:
# finetune_net.conv1:
HybridSequential(
  (0): Conv2D(3 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
  (1): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
  (2): Activation(relu)
  (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (4): BatchNorm(fix_gamma=False, use_global_stats=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
  (5): Activation(relu)
  (6): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
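For context, this stem matches the deep-stem ResNets in GluonCV (e.g. ResNet50_v1d), so if you're following along from scratch, a sketch like the one below should reproduce it (the model name is my assumption, since finetune_net is defined earlier in this thread):

# Hypothetical setup: finetune_net comes from earlier in the thread;
# any GluonCV deep-stem ResNet (v1c/v1d) has a conv1 block like the one above.
from gluoncv.model_zoo import get_model

finetune_net = get_model('ResNet50_v1d', pretrained=True)
print(finetune_net.conv1)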
To replicate this with the alternate first layer, create the new Conv2D layer with 4 input channels:
import mxnet as mx
from mxnet.gluon import nn

# Create a Conv2D with 4 input channels
new_conv2d = nn.Conv2D(32, (3, 3), strides=(2, 2), padding=(1, 1), in_channels=4, use_bias=False)
From @ThomasDelteil’s post, with one of the copies removed since we’re re-using the existing weight data:
new_weight_data = mx.nd.zeros(shape=(32, 4, 3, 3))
new_weight_data[:,:3,:,:] = finetune_net.conv1[0].weight.data() # 3 existing channels
new_weight_data[:,3,:,:] = new_weight_data[:,0,:,:].copy() # Use first channel for last
# finetune_net.conv1[0]._in_channels = 4 # <- no need to do this now
new_conv2d.weight.initialize()  # allocate the parameter before overwriting it
new_conv2d.weight.set_data(new_weight_data)
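A quick optional sanity check that the copy did what we expect:

# Optional check: the first 3 input channels should match the pretrained
# weights exactly, and the 4th should duplicate the 1st.
old_w = finetune_net.conv1[0].weight.data()
new_w = new_conv2d.weight.data()
print((new_w[:, :3] == old_w).sum())       # expect 864 (= 32*3*3*3)
print((new_w[:, 3] == new_w[:, 0]).sum())  # expect 288 (= 32*3*3)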
Create a replacement HybridSequential block with the required layers:
conv1 = nn.HybridSequential()
conv1.add(new_conv2d)
for i in range(1, 7):  # copy the remaining six layers (indices 1-6) over
    conv1.add(finetune_net.conv1[i])
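Printing the rebuilt block should show the same layout as before, with only the first layer changed:

print(conv1)  # identical to the original, except (0) is now Conv2D(4 -> 32, ...)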
Finally, update finetune_net with the new HybridBlock:
finetune_net.conv1 = conv1
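As a final smoke test, a forward pass with a 4-channel batch should now run cleanly (224x224 is just an assumed input size here). If you had hybridized the network before the swap, call finetune_net.hybridize() again afterwards.

# Forward a dummy 4-channel batch through the patched network
x = mx.nd.random.uniform(shape=(1, 4, 224, 224))
out = finetune_net(x)
print(out.shape)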
Vishaal