How to access the output values of a sub-sub custom Block


#1

I have a large network with several custom Blocks within custom Blocks within custom Blocks, etc.

As a simplified example, let’s take:

from mxnet.gluon import HybridBlock, nn

class CustomA(HybridBlock):
    def __init__(self):
        super(CustomA, self).__init__()
        with self.name_scope():
            self.conv = nn.Conv1D(...)

    # hybrid_forward() uses self.conv

class CustomB(HybridBlock):
    def __init__(self):
        super(CustomB, self).__init__()
        with self.name_scope():
            self.a = CustomA(...)

    # hybrid_forward() uses self.a

class MyNet(HybridBlock):
    def __init__(self):
        super(MyNet, self).__init__()
        with self.name_scope():
            self.b = CustomB(...)

    # hybrid_forward() uses self.b

net = MyNet()

Then, when I call out = net(input), how can I directly access the output of one of the sub-sub-blocks (say, the output of net.b.a), for instance to pass it to a loss function in a multi-task context?

At the moment, I do it by adding a long series of return statements, one in each of the hybrid_forward() functions, to pass the values back from the innermost parts of the model to the outside world. This doesn’t scale well.

Ideally I would like something like:

out, h = net(input)  # h would be some kind of handler to access all inner results of the forward pass
out_a = h.get_output_from('mynet0_customb0_customa0')  # passing the namespaced key of the custom sub-sub-block, similar to what is done to access the weights

I could implement my own, but it seems so obviously useful that I’m wondering if I overlooked an existing solution from mxnet or Gluon.

(Yes, I have read “Access the activation of a custom layer during forward pass”, and no, this is not a duplicate: the solution proposed there requires a network built as a single sequence with all parts directly accessible, rather than nested custom blocks.)


#2

In a non-hybridized network it’s pretty simple: in forward(), assign the output of block A to a member attribute such as self.output, and you can then access it as net.b.a.output.
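A minimal, framework-free sketch of that attribute approach follows. The Block base class is a hypothetical stand-in for a non-hybridized Gluon block (calling it simply runs forward()), and the list arithmetic is a placeholder for real layers such as the convolution; only the nesting MyNet > CustomB > CustomA is taken from the question.

```python
class Block:
    """Hypothetical stand-in for a non-hybridized block: calling it runs forward()."""

    def __call__(self, x):
        return self.forward(x)


class CustomA(Block):
    def __init__(self):
        self.output = None  # overwritten on every forward pass

    def forward(self, x):
        # Placeholder computation standing in for self.conv(x).
        self.output = [2 * v for v in x]
        return self.output


class CustomB(Block):
    def __init__(self):
        self.a = CustomA()

    def forward(self, x):
        return [v + 1 for v in self.a(x)]


class MyNet(Block):
    def __init__(self):
        self.b = CustomB()

    def forward(self, x):
        return sum(self.b(x))


net = MyNet()
out = net([1, 2, 3])
out_a = net.b.a.output  # inner result, read back after the forward pass
```

The stored value reflects only the most recent forward pass, so it should be read before the network is called again.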

In a hybridized network, I am not aware of an API that would let you do what you suggest, so I would opt for bubbling up the return values as you described.
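One way to keep the bubbling-up manageable is to have every block return its own output first, followed by the inner outputs it wants to expose, and to keep the matching names in one place. The sketch below is framework-free: Block is a hypothetical stand-in, the arithmetic is a placeholder for real layers, and aux_names is an illustrative attribute, not part of Gluon. It sticks to plain tuples on the assumption that a hybridized block should return only arrays or nested lists/tuples of them.

```python
class Block:
    """Hypothetical stand-in for a Gluon block: calling it runs forward()."""

    def __call__(self, x):
        return self.forward(x)


class CustomA(Block):
    def forward(self, x):
        return [2 * v for v in x]  # placeholder for the convolution


class CustomB(Block):
    def __init__(self):
        self.a = CustomA()

    def forward(self, x):
        out_a = self.a(x)
        out_b = [v + 1 for v in out_a]
        return out_b, out_a  # own output first, inner output after it


class MyNet(Block):
    # Illustrative names for the extra outputs, in the order forward() returns them.
    aux_names = ("b", "b.a")

    def __init__(self):
        self.b = CustomB()

    def forward(self, x):
        out_b, out_a = self.b(x)
        out = sum(out_b)
        return out, out_b, out_a


net = MyNet()
out, *aux = net([1, 2, 3])
named = dict(zip(MyNet.aux_names, aux))  # name -> inner output
out_a = named["b.a"]
```

The name-to-output mapping is built outside the network, so the blocks themselves only ever return tuples.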

One thing I would suggest is to consider whether the deep nesting is really necessary, or whether a flatter structure would make it less cumbersome to return all the symbols that you need.


#3

Thanks Thomas. Yes, the context of the question is a hybridized network. In our use case, the network is quite big and custom components are re-used in various places, so abstraction makes sense, but we’ll look into how much flattening is possible.

Should I submit a feature request on the MXNet GitHub for such an API?


#4

Yes, definitely: if it would be useful to you, it might be useful to others too. I suspect the reason it is not available yet is that allowing arbitrary access to any symbol of the graph means you need to keep the output of every symbol in memory, or have a mechanism to recompute it, which is not trivial once the graph has been optimized or some operators have been fused.