Out-of-order execution in symbolic graph

Hi all,

To implement model parallelism in a symbolic loss calculation using Horovod, I wrote a CustomOp that calls horovod.mxnet.allreduce.

The custom op is used when constructing the symbol, like this:

global_a = mx.symbol.Custom(data=local_a, op_type='hvd_allreduce', average=average, op_name='a')
global_b = mx.symbol.Custom(data=local_b, op_type='hvd_allreduce', average=average, op_name='b')
c = global_a / global_b
return mx.symbol.Group([mx.sym.make_loss(c)])

However, global_a and global_b are not executed in the same order on every process, which causes a deadlock in Horovod. So the key is to force global_a and global_b to execute in the same order on all processes.
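To make the failure mode concrete, here is a toy pure-Python simulation with no MXNet or Horovod involved. It assumes each allreduce blocks its caller until every process has issued the collective with the same name, which is my understanding of what happens inside the custom op. NamedRendezvous, run and the timeout are hypothetical names made up for this illustration; the timeout only stands in for what would otherwise be an indefinite hang.

```python
import threading


class NamedRendezvous:
    """Toy stand-in for a blocking, name-matched collective (NOT Horovod's API)."""

    def __init__(self, world_size, names):
        # One barrier per tensor name; all ranks must arrive to release it.
        self._barriers = {n: threading.Barrier(world_size) for n in names}

    def allreduce(self, name, timeout):
        # Blocks until every rank has called allreduce with the same name.
        # Raises BrokenBarrierError if the other ranks never show up in time.
        self._barriers[name].wait(timeout=timeout)


def run(orders, timeout=0.5):
    """orders[rank] is the order in which that rank issues its collectives.

    Returns True if every rank finished, False if any rank timed out.
    """
    names = {n for order in orders for n in order}
    comm = NamedRendezvous(len(orders), names)
    ok = [False] * len(orders)

    def worker(rank):
        try:
            for name in orders[rank]:
                comm.allreduce(name, timeout)
            ok[rank] = True
        except threading.BrokenBarrierError:
            # This rank waited on a name that the other rank had not reached.
            ok[rank] = False

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(len(orders))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(ok)


same_order = run([["a", "b"], ["a", "b"]])   # both ranks: a then b
mixed_order = run([["a", "b"], ["b", "a"]])  # rank 1 runs b first
print(same_order, mixed_order)  # prints: True False
```

With the same order on both ranks, each collective completes. With mixed order, rank 0 waits on 'a' while rank 1 waits on 'b', and neither can make progress.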

My question: is there any way to force global_a and global_b to execute in a fixed order? Any suggestions are highly appreciated.