Is it safe to call the same child block more than once in a HybridBlock? Specifically, will gradients still be saved correctly for each usage of the child, and will those gradients be used during the update step?
I’m training a network with triplet loss (to learn an embedding), so I have a single underlying network that is called three times, on three different images, while training on a single example instance (a triplet). The three calls to the underlying network should all use the same parameters, so those need to be shared. Can I just call the same block more than once, as I’ve been doing, or do I need to build separate blocks that share parameters, along the lines of the sketch below?
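For reference, the "separate blocks with shared parameters" alternative I have in mind would look something like this, if I understand Gluon's `params=` sharing mechanism correctly. This is just a minimal sketch; the layer sizes and the `encoder_*` names are placeholders, not my real network:

```python
from mxnet.gluon import nn

# One block owns the parameters; the others borrow them via `params=`.
encoder_a = nn.Dense(128)
encoder_b = nn.Dense(128, params=encoder_a.params)  # shares weights with encoder_a
encoder_c = nn.Dense(128, params=encoder_a.params)  # shares weights with encoder_a
```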
Here is a simplified skeleton of what I have done so far. (In reality the underlying network is a pretrained CNN.)
```python
from mxnet import gluon
from mxnet.gluon import HybridBlock
from mxnet.gluon.nn import Dense

class MyCNNEncoder(HybridBlock):
    def __init__(self, *args, **kwds):
        super().__init__(*args, **kwds)
        with self.name_scope():
            self.loss = gluon.loss.TripletLoss()
            # HSeq and L2Norm are small helpers of mine: a HybridSequential
            # wrapper and an L2-normalization block, respectively.
            self.underlying = HSeq(
                Dense(data.embedding_size),
                L2Norm(),
            )

    def hybrid_forward(self, F, data):
        # Split the (N, 3, ...) batch into anchor/positive/negative images
        # and run each one through the *same* underlying network.
        embeddings = [self.underlying(img)
                      for img in data.split(num_outputs=3, axis=1,
                                            squeeze_axis=True)]
        return (self.loss(*embeddings), F.stack(*embeddings, axis=1))
```
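And this is roughly how I drive it during training, in case it matters for the gradient question. A simplified sketch: `train_data`, the context, and the optimizer settings are placeholders:

```python
import mxnet as mx
from mxnet import autograd, gluon

net = MyCNNEncoder()
net.initialize(ctx=mx.cpu())
net.hybridize()
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 1e-3})

for batch in train_data:                # batch shape: (N, 3, C, H, W)
    with autograd.record():
        loss, embeddings = net(batch)   # self.underlying is called three times inside
    loss.backward()                     # do gradients from all three calls accumulate?
    trainer.step(batch.shape[0])
```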