GANs - historical averaging function


Dear all,

I was going through "Improved Techniques for Training GANs"; can anyone suggest how to efficiently implement historical averaging (Section 3.3)? It is an additional loss term of the form

additional_loss = ||\theta - (1/t) \sum_{i=1}^t \theta[i]||^2

where \theta are the generator weights/biases. The first thing that comes to mind is to create an identical copy "network" that stores all parameters (with grad_req='null') and acts as a "history recorder". Then define a smoothing function (e.g. exponential smoothing) that takes the parameters of the two networks as input and returns the smoothed-out version.
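The "history recorder" idea above could be sketched roughly as follows. This is a minimal NumPy illustration, not MXNet code: parameters are represented as a plain dict of arrays (with Gluon you would instead read them out of the network's parameters), and `beta` is a made-up smoothing factor, not a value from the paper.

```python
import numpy as np

def ema_update(history, current, beta=0.999):
    """Exponential smoothing of the frozen history copy toward the live
    weights: history <- beta * history + (1 - beta) * current, per parameter.
    `beta` is an illustrative choice, not from the paper."""
    return {name: beta * history[name] + (1.0 - beta) * current[name]
            for name in history}

def historical_avg_penalty(current, history):
    """Squared L2 distance between the live weights and their smoothed
    history, summed over all parameters -- the shape of the extra loss term."""
    return sum(float(np.sum((current[n] - history[n]) ** 2)) for n in current)
```

Note that exponential smoothing gives a recency-weighted average rather than the exact uniform average (1/t) sum_i theta[i] from the paper; it is just cheaper to maintain.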

Thank you in advance.


Hi @feevos, I don’t think you can avoid holding another copy of the weights here! That said, depending on your model, the weights usually don’t take up the largest share of your memory; the feature maps do, so this hopefully isn’t a concern.

As an alternative to creating another model, you might want to work with the ParameterDict returned by net.collect_params(). Initially take a copy, and then update each of these parameters using (param_avg*iteration+param_cur)/(iteration+1) on each iteration.
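The incremental update above can be sketched like this. Again a hedged NumPy sketch with parameters as a dict of arrays; in Gluon you would pull the arrays out of net.collect_params() (e.g. via each parameter's data()) rather than keep a plain dict.

```python
import numpy as np

def update_running_avg(avg, current, iteration):
    """Exact running mean of the parameters seen so far:
    avg <- (avg * iteration + current) / (iteration + 1).
    After t updates, avg equals (1/t) * sum_i theta[i]."""
    return {name: (avg[name] * iteration + current[name]) / (iteration + 1)
            for name in avg}
```

Unlike exponential smoothing, this recovers the exact uniform average in the paper's loss term, at the cost of one extra division per parameter per iteration.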


Thank you very much @thomelane, I’ll go with your suggestion.