I’m trying to do some annealing in the training loop. That is, I have an mx.sym function that takes n_epoch (the current epoch id) and produces a scalar. The scalar is applied to some loss terms so that they are weaker at the beginning of training and become stronger later on. What is the best way to get n_epoch into the model? One way is through the data iterator, but I wonder if there are simpler, less messy ways.
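
For concreteness, here is a minimal plain-Python sketch of the kind of schedule described above. It is not the actual mx.sym function: the names `anneal_weight` and `total_loss`, the linear ramp, and the 10-epoch warm-up are all assumptions made up for illustration.

```python
def anneal_weight(n_epoch, warmup_epochs=10, max_weight=1.0):
    """Hypothetical linear warm-up: 0 at epoch 0, max_weight from warmup_epochs on."""
    if warmup_epochs <= 0:
        return max_weight
    # Clamp the ramp to [0, 1] so the weight never exceeds max_weight.
    fraction = min(max(n_epoch / warmup_epochs, 0.0), 1.0)
    return max_weight * fraction


def total_loss(main_loss, aux_loss, n_epoch):
    # The auxiliary term is weak early in training and reaches full strength later.
    return main_loss + anneal_weight(n_epoch) * aux_loss
```

The open question is how to feed the `n_epoch` argument into the symbolic graph, since the schedule itself is simple once the epoch id is available.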