Initialization of a custom block


#1

I’m working on a Convolutional Sentiment Analysis model with gluon using a custom block. When trying to pass in data to my custom block, I encountered an error:

RuntimeError: Parameter convolutionlayer0_conv0_bias has not been initialized. Note that you should initialize parameters and create Trainer with Block.collect_params() instead of Block.params because the later does not include Parameters of nested child Blocks

My custom block is implemented like this:

class ConvolutionLayer(Block):
    def __init__(self, **kwargs):
        super(ConvolutionLayer, self).__init__(**kwargs)
        self.conv_blocks = []
        self.max_blocks = []
        # self.ngram_conv = []
        with self.name_scope():
            for sz in filter_sizes:
                conv = gluon.nn.Conv2D(channels=num_filters, kernel_size=(sz, 400), strides=(1, 400))
                max = gluon.nn.MaxPool2D(pool_size=(2, 2))
                self.conv_blocks.append(conv)
                self.max_blocks.append(max)
            self.out = gluon.nn.Dense(5)

    def forward(self, x):
        for conv, max in zip(self.conv_blocks, self.max_blocks):
            conv0 = nd.relu(conv(x)).reshape(0, -1)
            max0 = max(conv0)
            x = nd.concat(x, max0, dim=1)

        x = self.out(x)
        return x

And I initialized it like this:

    net = ConvolutionLayer()
    #initialize
    print('initializing')
    net.collect_params().initialize(mx.init.Xavier(magnitude=2.24, rnd_type='gaussian'), ctx=ctx)

    #Softmax
    softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()

    #Optimizer
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': .1})

Can somebody help me with the error? Ignore the hardcoded dimensions for now since I just want to make it work… Thank you so much!


#2

Hi,

The problem is that you are wrapping the gluon.nn.XXX layers in a plain Python list. This, in my experience, doesn’t work: a list isn’t derived from Block, so it has no initialize of its own, and the layers inside it are not registered as children of your block, which means collect_params() never sees their parameters. There are two ways around this. First, define a custom initialize operation that loops over the elements of self.conv_blocks and initializes each one individually. This is not recommended, because you would need to do something similar whenever you want to save your network. Second, construct a custom layer for conv-max-concat and add instances of it to a gluon.nn.Sequential() inside your class. The code in the rest of this post is a minimal (running) example of the second approach. I’ve modified the kernel size, pooling and stride to make something simple and reproducible (I do not know the particulars of your problem).
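As an aside before that example, here is a toy, pure-Python sketch of why the list hides the layers. It is not MXNet's actual implementation: ToyBlock, ListNet and AttrNet are hypothetical names, and the only assumption it models is that a block registers a child when the child is assigned directly as an attribute.

```python
class ToyBlock:
    """Toy stand-in for a Gluon-style Block (hypothetical, not MXNet code)."""

    def __init__(self, own_params=()):
        # Use object.__setattr__ so the registry itself skips our hook.
        object.__setattr__(self, "_children", {})
        object.__setattr__(self, "_own_params", tuple(own_params))

    def __setattr__(self, name, value):
        # Only values that are ToyBlock instances are registered as children.
        # A list of ToyBlocks is a list, not a ToyBlock, so it is skipped.
        if isinstance(value, ToyBlock):
            self._children[name] = value
        object.__setattr__(self, name, value)

    def collect_params(self, prefix=""):
        # Own parameter names first, then those of registered children.
        names = [prefix + p for p in self._own_params]
        for child_name, child in self._children.items():
            names += child.collect_params(prefix + child_name + "_")
        return names


class ListNet(ToyBlock):
    def __init__(self):
        super().__init__()
        # Layers hidden inside a plain list: never registered.
        self.convs = [ToyBlock(["weight", "bias"]) for _ in range(3)]


class AttrNet(ToyBlock):
    def __init__(self):
        super().__init__()
        # Layers assigned directly as attributes: registered.
        self.conv0 = ToyBlock(["weight", "bias"])
        self.conv1 = ToyBlock(["weight", "bias"])


list_params = ListNet().collect_params()
attr_params = AttrNet().collect_params()
print(list_params)  # []
print(attr_params)  # ['conv0_weight', 'conv0_bias', 'conv1_weight', 'conv1_bias']
```

In the toy model, the list-based network reports no parameters at all, which mirrors the "has not been initialized" error: there was nothing for the initializer to reach.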

Import the essentials:

import mxnet as mx
from mxnet import nd, gluon
from mxnet.gluon import Block

Define a custom layer that does conv-maxpool-concat:

# Custom block: a convolution followed by max pooling and a concatenation.
class ConvMax(Block):
    def __init__(self, _nfilters, **kwargs):
        Block.__init__(self, **kwargs)

        self.nfilters = _nfilters

        with self.name_scope():
            self.conv = gluon.nn.Conv2D(channels=self.nfilters, kernel_size=(3, 3), padding=(1, 1), strides=(1, 1), activation='relu')
            self.pool = gluon.nn.MaxPool2D(pool_size=(2, 2))

    def forward(self, _x):
        conv = self.conv(_x)
        pool = self.pool(conv)
        # For demonstration purposes, upsample the pooled output back to the spatial size of _x
        pool = nd.UpSampling(pool, scale=2, sample_type='nearest')
        # Concatenate with _x along the channel axis (my example)
        x = nd.concat(_x, pool, dim=1)

        return x

Now use the above to define your ConvolutionLayer:

class ConvolutionLayer_mod(Block):
    def __init__(self, _filter_sizes, **kwargs):
        Block.__init__(self, **kwargs)

        self.filter_sizes = _filter_sizes  # List with the number of filters for each ConvMax layer

        with self.name_scope():
            # Sequential is itself a Block, so every layer added to it is registered
            self.net = gluon.nn.Sequential()
            for sz in self.filter_sizes:
                self.net.add(ConvMax(sz))
            self.net.add(gluon.nn.Dense(5))

    def forward(self, _x):
        x = self.net(_x)

        return x
 

Let’s test it:

shape = [5,12,128,128]
xx = nd.random_uniform(shape=shape)
filter_sizes = [16,32,64]
mynet = ConvolutionLayer_mod(filter_sizes)

# initialize
mynet.initialize(mx.initializer.Xavier(),ctx=mx.cpu())

# Run a forward pass
temp = mynet(xx)

The shape of temp is [5, 5]. Let’s visualize it:

%pylab inline
imshow(temp.asnumpy())  # convert the NDArray to a NumPy array for plotting

You should see something like this:

[image: temp]

Hope this helps.


#3

Thank you so much! This is very, very helpful. I’ll try this now!


#4

Putting this here for reference: see also #10101. I just found out that the elements of gluon.nn.Sequential/HybridSequential can be accessed like list elements. That is, Sequential/HybridSequential can also be used purely as containers, without calling their own forward pass. So instead of the list encapsulation, which doesn’t work:

self.convs = []
for i in range(5):
    self.convs += [gluon.nn.Conv2D(...)] # Add parameters of choice

def forward(self, input):
    out = self.convs[0](input)
    for i in range(1, 5):
        out = self.convs[i](out)
    return out

you can use:

self.convs = gluon.nn.Sequential()
for i in range(5):
    self.convs.add(gluon.nn.Conv2D(...)) 

def forward(self, input):
    out = self.convs[0](input)
    # The order of accessing the list elements can be arbitrary
    for i in range(1, 5):
        out = self.convs[i](out)
    return out
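To make the container idea concrete, here is a toy, pure-Python sketch. It is not MXNet code: ToyBlock, ToySequential, ToyScale and Net are hypothetical names, and the registration-on-assignment behaviour is a simplifying assumption. It shows a container that is itself a block, so the layers added to it stay visible to collect_params while still being indexable in a custom forward pass.

```python
class ToyBlock:
    """Toy stand-in for a Gluon-style Block (hypothetical, not MXNet code)."""

    def __init__(self, own_params=()):
        object.__setattr__(self, "_children", {})
        object.__setattr__(self, "_own_params", tuple(own_params))

    def __setattr__(self, name, value):
        # Register child blocks on attribute assignment.
        if isinstance(value, ToyBlock):
            self._children[name] = value
        object.__setattr__(self, name, value)

    def collect_params(self, prefix=""):
        names = [prefix + p for p in self._own_params]
        for child_name, child in self._children.items():
            names += child.collect_params(prefix + child_name + "_")
        return names


class ToySequential(ToyBlock):
    """Container that is a block itself, so its elements are registered."""

    def add(self, block):
        # Name children by their position: "0", "1", ...
        self._children[str(len(self._children))] = block

    def __getitem__(self, index):
        return list(self._children.values())[index]

    def __len__(self):
        return len(self._children)


class ToyScale(ToyBlock):
    """Stand-in for a layer: multiplies its input by a fixed factor."""

    def __init__(self, factor):
        super().__init__(own_params=["weight"])
        self.factor = factor

    def __call__(self, x):
        return x * self.factor


class Net(ToyBlock):
    def __init__(self):
        super().__init__()
        self.convs = ToySequential()
        for _ in range(3):
            self.convs.add(ToyScale(2))

    def __call__(self, x):
        # Index into the container instead of calling it as a whole.
        out = self.convs[0](x)
        for i in range(1, len(self.convs)):
            out = self.convs[i](out)
        return out


net = Net()
seq_params = net.collect_params()
result = net(1)
print(seq_params)  # ['convs_0_weight', 'convs_1_weight', 'convs_2_weight']
print(result)      # 8
```

The design point is that indexing and registration are independent: the container only needs to be a block for its elements to be collected, and how you call them in forward is up to you.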

Cheers

