Dear all,

I was going through the paper "Improved Techniques for Training GANs". Can anyone please suggest how to **efficiently** implement historical averaging (Section 3.3)? It is an additional loss term of the form

```
additional_loss = ||\theta - (1/t) \sum_{i=1}^t \theta[i]||^2
```

where `\theta` denotes the generator weights and biases. The first thing that comes to mind is to create an identical copy of the “network” that stores all the parameters (with `grad_req='null'`) and serves as a “history recorder”. I would then define a smoothing function (such as exponential smoothing) that takes the parameters of the two networks as input and returns the smoothed version.
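To make the idea concrete, here is a minimal, framework-agnostic sketch of what I have in mind. It keeps only one extra copy of the parameters and updates it as a running mean, which reproduces the `(1/t) \sum` term exactly without storing the full history (exponential smoothing would be an approximation instead). The class name `HistoricalAverage`, its methods, and the use of plain Python lists in place of weight arrays are all assumptions for illustration, not part of any library.

```python
class HistoricalAverage:
    """Running mean of parameters: avg_t = avg_{t-1} + (theta_t - avg_{t-1}) / t."""

    def __init__(self, params):
        # params: dict mapping a parameter name to a flat list of floats
        # (a hypothetical stand-in for the generator's weight/bias arrays).
        self.t = 1
        self.avg = {name: list(values) for name, values in params.items()}

    def update(self, params):
        # Fold the current parameters into the mean in place;
        # memory stays at one extra copy of the parameters.
        self.t += 1
        for name, values in params.items():
            avg = self.avg[name]
            for k, v in enumerate(values):
                avg[k] += (v - avg[k]) / self.t

    def penalty(self, params):
        # Squared L2 distance between the current parameters and their mean.
        return sum(
            (v - a) ** 2
            for name, values in params.items()
            for v, a in zip(values, self.avg[name])
        )


# Usage: two "iterations" of a two-element weight vector.
history = HistoricalAverage({"w": [0.0, 2.0]})
history.update({"w": [2.0, 4.0]})           # running mean becomes [1.0, 3.0]
loss_term = history.penalty({"w": [2.0, 4.0]})
```

In a real training loop the stored average would presumably have to be treated as a constant when differentiating the penalty, which is what keeping the copy at `grad_req='null'` is meant to achieve.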

Thank you in advance.