Get HybridBlock layer shape at runtime


#1

Dear all,

I am trying to build a custom pooling layer and I need to know the input shape at runtime. According to the documentation, HybridBlock has the function “infer_shape”, but I can’t seem to make it work.

For example:


import mxnet as mx
import mxnet.ndarray as nd
from mxnet.gluon import HybridBlock

class runtime_shape(HybridBlock):
    
   
    def __init__(self,  **kwards):
        HybridBlock.__init__(self,**kwards)


    def hybrid_forward(self,F,_input):

        print (self.infer_shape(_input))
        
        return _input

xx = nd.random_uniform(shape=[5,5,16,16])

mynet=runtime_shape()
mynet.hybrid_forward(nd,xx)

This is the error I get:


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-41-3f539a940958> in <module>()
----> 1 mynet.hybrid_forward(nd,xx)

<ipython-input-38-afc9785b716d> in hybrid_forward(self, F, _input)
     17     def hybrid_forward(self,F,_input):
     18 
---> 19         print (self.infer_shape(_input))
     20 
     21         return _input

/home/dia021/anaconda2/lib/python2.7/site-packages/mxnet/gluon/block.pyc in infer_shape(self, *args)
    460     def infer_shape(self, *args):
    461         """Infers shape of Parameters from inputs."""
--> 462         self._infer_attrs('infer_shape', 'shape', *args)
    463 
    464     def infer_type(self, *args):

/home/dia021/anaconda2/lib/python2.7/site-packages/mxnet/gluon/block.pyc in _infer_attrs(self, infer_fn, attr, *args)
    448     def _infer_attrs(self, infer_fn, attr, *args):
    449         """Generic infer attributes."""
--> 450         inputs, out = self._get_graph(*args)
    451         args, _ = _flatten(args)
    452         arg_attrs, _, aux_attrs = getattr(out, infer_fn)(

/home/dia021/anaconda2/lib/python2.7/site-packages/mxnet/gluon/block.pyc in _get_graph(self, *args)
    369             params = {i: j.var() for i, j in self._reg_params.items()}
    370             with self.name_scope():
--> 371                 out = self.hybrid_forward(symbol, *grouped_inputs, **params)  # pylint: disable=no-value-for-parameter
    372             out, self._out_format = _flatten(out)
    373 

/home/dia021/anaconda2/lib/python2.7/site-packages/mxnet/gluon/block.pyc in __exit__(self, ptype, value, trace)
     78         if self._block._empty_prefix:
     79             return
---> 80         self._name_scope.__exit__(ptype, value, trace)
     81         self._name_scope = None
     82         _BlockScope._current = self._old_scope

AttributeError: 'NoneType' object has no attribute '__exit__'


#2

Hi @feevos, did you figure this out?


#3

Hi @OliverColeman, no, I haven’t. I’ve also opened an issue on GitHub, but no luck yet. From the understanding of MXNet I’ve built over the months, it seems to me that some operator that supports automatic shape inference must precede the self.infer_shape(_input) call before the shape can be queried. This is evident (among other things) from this example, where the author replaces the F.dot operation with F.FullyConnected to get automatic shape inference.

The easiest solution so far is to provide the layer shape explicitly. But for the PSP Pooling operator that I am interested in, this means I also need to work out the shape at whatever depth of the convolutions I want to insert the pooling layer. Not really “beautiful”.

Cheers


#4

Hey @feevos. So what you’re trying to do in a HybridBlock isn’t possible. There seems to be a bit of confusion around how hybrid_forward() is called. When a block is hybridized, hybrid_forward() is only called once, to create the symbolic graph. After that, the cached graph is used to perform the forward operation.
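
To make the trace-once behaviour concrete, here is a toy, framework-free sketch. It is not MXNet code: Placeholder and ToyHybrid are hypothetical names that only mimic the idea of calling hybrid_forward() once on a data-less input and then replaying the recorded operations.

```python
class Placeholder:
    """Hypothetical stand-in for a symbolic input: records ops, holds no data and no shape."""

    def __init__(self, ops=None):
        self.ops = list(ops or [])

    def scale(self, k):
        # Record the operation instead of computing anything.
        return Placeholder(self.ops + [("scale", k)])


class ToyHybrid:
    """Hypothetical block that traces hybrid_forward once, then replays the recorded ops."""

    def __init__(self):
        self.trace_calls = 0
        self._cached_ops = None

    def hybrid_forward(self, x):
        self.trace_calls += 1
        return x.scale(2).scale(3)

    def __call__(self, data):
        if self._cached_ops is None:
            # First call only: run hybrid_forward on a placeholder to build the "graph".
            self._cached_ops = self.hybrid_forward(Placeholder()).ops
        out = list(data)
        for _name, k in self._cached_ops:
            out = [v * k for v in out]
        return out


net = ToyHybrid()
first = net([1, 2])
second = net([1, 2, 3])  # different length, same cached ops, no second trace
```

Inside hybrid_forward() the placeholder has nothing resembling a shape to read, and the method is not run again for later inputs, which is roughly why shape-dependent Python logic does not fit there.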

You can use a normal block and that should work perfectly fine. If you insist on having your layers hybridizable, you can split your network into Hybrid and normal sections to maximize the performance.

There is also a shape_op() in the works here that may help you write a fully hybridizable block once it makes it to a release.


#5

Thank you very much @safrooze.

Could you please provide an example/link for this? I’ve never seen it before. I’d be very happy to hybridize (even part of) my network because it’s rather large and it should help performance. And to be honest, if I could find an alternative to PSP Pooling, I’d get rid of it completely.

Cheers


#6

It’s really simple. Let’s say you have a custom pooling layer that is not hybridizable:

from mxnet.gluon import Block, HybridBlock, nn

class CustomPooling(Block):
    def __init__(self, **kwargs):
        super(CustomPooling, self).__init__(**kwargs)
        # Initialize
    def forward(self, x):
        # custom pooling op; x is an NDArray here, so x.shape is available
        y = x  # placeholder for the real op
        return y

Now if everything before this layer is hybridizable, then put it all into a single hybrid block. Same with all layers after this custom pooling layer:

class BeforePooling(HybridBlock):
    def __init__(self, **kwargs):
        super(BeforePooling, self).__init__(**kwargs)
        # Initialize
    def hybrid_forward(self, F, x):
        # perform ops through F
        y = x  # placeholder for the real ops
        return y

class AfterPooling(HybridBlock):
    def __init__(self, **kwargs):
        super(AfterPooling, self).__init__(**kwargs)
        # Initialize
    def hybrid_forward(self, F, x):
        # perform ops through F
        y = x  # placeholder for the real ops
        return y

Now you can either create a custom block class, or use nn.Sequential to chain these three sections up:

net = nn.Sequential()
with net.name_scope():
    # add() takes block instances, not classes
    net.add(BeforePooling())
    net.add(CustomPooling())
    net.add(AfterPooling())
net.hybridize()

What hybridize() does in this case is go through all the children and hybridize every child that is hybridizable. Keep in mind that hybridization here is done one child at a time. So if you use nn.Sequential (or a custom Block) and add two hybridizable layers (e.g. two conv layers) as two separate children (e.g. in two add() calls), then even though they may be back to back, hybridization creates a separate symbolic graph for each child when the parent is a Block. Just make sure you collect all your hybridizable layers under one HybridBlock parent before adding the normal Block layers, and then hybridize the final parent Block.


#7

Thank you very much @safrooze. I don’t know though if it can help in my situation (I sure hope it can!).
I’ve tested what you suggest, like this, and it works:

import mxnet as mx
from mxnet import gluon, nd
from mxnet.gluon import Block, HybridBlock

# This is a simple wrapper for Conv2D + BatchNorm 
from phaino.nn.layers.conv2Dnormed import *


class PSP_Pooling(Block):

    """
    Pyramid Scene Parsing pooling layer, as defined in Zhao et al. 2017 (https://arxiv.org/abs/1612.01105)        
    This is only the pyramid pooling module. 
    INPUT:
        layer of size Nbatch, Nchannel, H, W
    OUTPUT:
        layer of size Nbatch,  Nchannel, H, W. 

    """

    def __init__(self, _nfilters, _norm_type = 'BatchNorm', **kwards):
        Block.__init__(self,**kwards)

        self.nfilters = _nfilters

        # This is used as a container (list) of layers
        self.convs = gluon.nn.HybridSequential()
        with self.name_scope():

            self.convs.add(Conv2DNormed(self.nfilters//4,kernel_size=(3,3),padding=(1,1), prefix="_conv1_"))
            self.convs.add(Conv2DNormed(self.nfilters//4,kernel_size=(3,3),padding=(1,1), prefix="_conv2_"))
            self.convs.add(Conv2DNormed(self.nfilters//4,kernel_size=(3,3),padding=(1,1), prefix="_conv3_"))
            self.convs.add(Conv2DNormed(self.nfilters//4,kernel_size=(3,3),padding=(1,1), prefix="_conv4_"))
            
        self.conv_norm_final = Conv2DNormed(channels = self.nfilters,
                                            kernel_size=(1,1),
                                            padding=(0,0),
                                            _norm_type=_norm_type)



    def forward(self,_input):
        
        layer_size = _input.shape[2]
        
        p = [_input]
        for i in range(4):

            pool_size = layer_size // (2**i) # Need this to be integer 
            x = nd.Pooling(_input,kernel=[pool_size,pool_size],stride=[pool_size,pool_size],pool_type='max')
            x = nd.UpSampling(x,sample_type='nearest',scale=pool_size)
            x = self.convs[i](x)
            p += [x]

        out = nd.concat(p[0],p[1],p[2],p[3],p[4],dim=1)

        out = self.conv_norm_final(out)

        return out


nfilters = 32

net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Conv2D(nfilters,kernel_size=(3,3),padding=(1,1)))
    net.add(PSP_Pooling(nfilters))

net.initialize(mx.initializer.Xavier())
net.hybridize()
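
A small, framework-free check of the pool-size arithmetic used in forward() above. The helper names (psp_pool_sizes, pooled_side, restored_sides) are hypothetical, and the output-size formula assumes non-overlapping pooling (kernel equal to stride, no padding) with floor division, which I believe matches MXNet's default 'valid' pooling convention:

```python
def psp_pool_sizes(layer_size, levels=4):
    """Square kernel/stride per pyramid level: layer_size // 2**i."""
    return [layer_size // (2 ** i) for i in range(levels)]


def pooled_side(layer_size, pool_size):
    """Cells per side after pooling with kernel == stride and no padding (floor)."""
    return (layer_size - pool_size) // pool_size + 1


def restored_sides(layer_size, levels=4):
    """Side length of each level after nearest upsampling by scale=pool_size."""
    return [pooled_side(layer_size, p) * p
            for p in psp_pool_sizes(layer_size, levels)]


sizes_16 = psp_pool_sizes(16)
sides_16 = restored_sides(16)
sides_18 = restored_sides(18)
```

With a 16x16 input the pool sizes are 16, 8, 4, 2 and every level comes back as 16 after upsampling, so the concat along dim=1 lines up. With 18 the third level comes back as 16 under these assumptions, so the spatial sizes would no longer match. That is the practical meaning of the "Need this to be integer" comment: the layer size should be divisible by every pool size.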

However, I need to use the PSP_Pooling layer inside another network that is not sequential in nature (for semantic segmentation it follows the encoder-decoder paradigm, where earlier layers are combined with later ones through addition or concatenation, i.e. skip connections). There I get errors. For example, this fails:

    
#This doesn't work 
class CustomNet (HybridBlock):
    
    def __init__(self,nfilters,**kwards):
        HybridBlock.__init__(self,**kwards)
        
        
        
        with self.name_scope():
            
            self.conv1 = Conv2DNormed(nfilters,kernel_size=3,padding=1)
            self.psp = PSP_Pooling(nfilters)
            
    def hybrid_forward(self,F,x):
        
        out1 = self.conv1(x)
        out1 = F.relu(out1)
        out1 = self.psp(out1)
        
        # Need to combine layers within the network
        # Simple example: addition, can be addition and/or concatenation
        out1 = out1+x
        
        
        return out1

with the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-cf7dc25a350f> in <module>()
      1 nfilters  = 32
----> 2 net = CustomNet(nfilters)

<ipython-input-17-8795e81219a2> in __init__(self, nfilters, **kwards)
    109 
    110             self.conv1 = Conv2DNormed(nfilters,kernel_size=3,padding=1)
--> 111             self.psp = PSP_Pooling(nfilters)
    112 
    113     def hybrid_forward(self,F,x):

~/anaconda3/lib/python3.6/site-packages/mxnet/gluon/block.py in __setattr__(self, name, value)
    404     def __setattr__(self, name, value):
    405         """Registers parameters."""
--> 406         super(HybridBlock, self).__setattr__(name, value)
    407         if isinstance(value, HybridBlock):
    408             self._clear_cached_op()

~/anaconda3/lib/python3.6/site-packages/mxnet/gluon/block.py in __setattr__(self, name, value)
    197                 self.register_child(value)
    198         elif isinstance(value, Block):
--> 199             self.register_child(value)
    200 
    201         super(Block, self).__setattr__(name, value)

~/anaconda3/lib/python3.6/site-packages/mxnet/gluon/block.py in register_child(self, block)
    491                 "but %s has type %s. If you are using Sequential, " \
    492                 "please try HybridSequential instead"%(
--> 493                     str(block), str(type(block))))
    494         super(HybridBlock, self).register_child(block)
    495         self._clear_cached_op()

ValueError: Children of HybridBlock must also be HybridBlock, but PSP_Pooling(
  (convs): HybridSequential(
    (0): Conv2DNormed(
      (conv2d): Conv2D(None -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (norm_layer): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    )
    (1): Conv2DNormed(
      (conv2d): Conv2D(None -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (norm_layer): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    )
    (2): Conv2DNormed(
      (conv2d): Conv2D(None -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (norm_layer): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    )
    (3): Conv2DNormed(
      (conv2d): Conv2D(None -> 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (norm_layer): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    )
  )
  (conv_norm_final): Conv2DNormed(
    (conv2d): Conv2D(None -> 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (norm_layer): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
  )
) has type <class '__main__.PSP_Pooling'>. If you are using Sequential, please try HybridSequential instead

but this works:

class CustomNet (Block):
    
    def __init__(self,nfilters,**kwards):
        Block.__init__(self,**kwards)
        
        
        
        with self.name_scope():
            
            self.conv1 = Conv2DNormed(nfilters,kernel_size=3,padding=1)
            self.psp = PSP_Pooling(nfilters)
            
    def forward(self,x):
        
        out1 = self.conv1(x)
        out1 = nd.relu(out1)
        out1 = self.psp(out1)
        
        
        # Simple addition
        out1 = out1+x
        
        
        return out1

Any ideas on how I can make it work for my case? Thank you very much for all the help.


#8

Every child block of a HybridBlock must also be a HybridBlock. You have to split this block into three sub-blocks: one before PSP, the PSP itself, and one after PSP. You can, of course, chain these in a sequential block. The first HybridBlock would perform conv1 and relu and return both out1 and x. The second (a normal Block) would take out1 and x, perform psp, and return out1 and x. The last HybridBlock would take out1 and x and add them up.
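
The data flow of that three-way split can be sketched without any framework. Everything below is a stand-in: before_psp, psp_stage, after_psp and custom_net are hypothetical names, plain lists replace tensors, and the arithmetic inside each stage is only a placeholder for the real conv, pooling and normalisation ops.

```python
def before_psp(x):
    """Stand-in for the first HybridBlock: conv1 + relu, returns (out1, x)."""
    out1 = [max(v, 0.0) for v in x]  # placeholder for conv1 followed by relu
    return out1, x


def psp_stage(out1, x):
    """Stand-in for the normal Block: transforms out1 and passes x through untouched."""
    out1 = [v * 0.5 for v in out1]   # placeholder for the PSP pooling pyramid
    return out1, x


def after_psp(out1, x):
    """Stand-in for the last HybridBlock: the skip connection (addition)."""
    return [a + b for a, b in zip(out1, x)]


def custom_net(x):
    """Parent that wires the three stages together, unpacking the tuples explicitly."""
    out1, skip = before_psp(x)
    out1, skip = psp_stage(out1, skip)
    return after_psp(out1, skip)


result = custom_net([-2.0, 4.0])
```

The point is only that x has to travel alongside out1 through the non-hybrid stage so that the final stage can still add it. In Gluon the parent would be a normal Block whose forward() does this unpacking; whether a sequential container forwards tuples between children may depend on the MXNet version, so the explicit parent is the safer assumption.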