I trained two models separately; they take the same input data. To minimize processing delay, how should I group the two models for inference deployment? I tried combining them like this:
```python
from mxnet import gluon

class ConcateNetwork(gluon.HybridBlock):
    def __init__(self, net1, net2):
        super(ConcateNetwork, self).__init__()
        self.net1 = net1
        self.net2 = net2

    def hybrid_forward(self, F, x):
        # Run both sub-networks on the same input
        output_char1 = self.net1(x)
        output_char2 = self.net2(x)
        return output_char1, output_char2

net = ConcateNetwork(net1, net2)
```
However, my tests show that this code still executes the two models sequentially. Is there a way to make the two models run in parallel, as if they were a single model?
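One framework-agnostic option is to launch each forward pass in its own thread, so the two models can run concurrently on the same input. The sketch below is a minimal illustration using only the Python standard library; `net1` and `net2` here are hypothetical stand-in functions, not the actual trained Gluon networks, and `parallel_forward` is a helper name I introduce for illustration. (Note also that MXNet's execution engine is asynchronous, so depending on the workload and context, independent subgraphs may already be scheduled concurrently by the engine itself.)

```python
import threading

# Hypothetical stand-ins for the two trained models; in practice these
# would be the actual networks net1 and net2 from the question.
def net1(x):
    return [v * 2 for v in x]

def net2(x):
    return [v + 1 for v in x]

def parallel_forward(nets, x):
    """Run each model's forward pass in its own thread on the same input."""
    results = [None] * len(nets)

    def run(i, net):
        results[i] = net(x)

    threads = [threading.Thread(target=run, args=(i, net))
               for i, net in enumerate(nets)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

out1, out2 = parallel_forward([net1, net2], [1, 2, 3])
```

Whether threads give a real speedup depends on where the time is spent: if the heavy work happens inside the framework's backend (which releases the GIL), the two forward passes can genuinely overlap; pure-Python work will not.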