How to group two models together as one?

I trained two models separately; they take the same input data. To obtain the minimum processing delay, how can I group the two models together for inference deployment? I tried the following code:

from mxnet import gluon

class ConcateNetwork(gluon.HybridBlock):
    def __init__(self, net1, net2):
        super(ConcateNetwork, self).__init__()
        self.net1 = net1
        self.net2 = net2

    def hybrid_forward(self, F, x):
        # Both sub-networks receive the same input
        output_char1 = self.net1(x)
        output_char2 = self.net2(x)
        return output_char1, output_char2

net = ConcateNetwork(net1, net2)

However, tests have shown that this code still executes the two models sequentially. Is there a way to make the two models run in parallel, as if they were a single model?
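One option that does not require extra libraries is to submit both forward passes to a thread pool so they run concurrently instead of back to back. The sketch below uses plain Python functions as stand-ins for your trained `net1` and `net2` (an assumption for illustration; substitute your actual models). Note also that MXNet's engine is itself asynchronous, so enqueued operations may already overlap on the device; threads mainly help when the calls are blocking or CPU-bound.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for the two trained models (illustration only);
# replace these with calls to your actual net1 and net2.
def net1(x):
    return [v * 2 for v in x]

def net2(x):
    return [v + 1 for v in x]

def parallel_infer(x):
    # Submit both forward passes at once so they execute
    # concurrently rather than one after the other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(net1, x)
        f2 = pool.submit(net2, x)
        return f1.result(), f2.result()

out1, out2 = parallel_infer([1, 2, 3])
```

Here `out1` is `[2, 4, 6]` and `out2` is `[2, 3, 4]`, the same pair of outputs the sequential `ConcateNetwork` would produce.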

I use Ray for this kind of task (parallel/distributed inference). You can declare a remote function (wrapping your model inside it) and assign it specific resources (number of GPUs, number of CPU cores, etc.).