Hello, I am quite new to neural networks in general. I am currently participating in a reinforcement learning challenge to train a Bomberman agent, and I am quite happy with its performance so far: during the last training rounds it made quick progress. But now I am at a point where, no matter what I do, every training round ends with the agent producing NaN values, and I am not sure how to diagnose this further.
Just in case you’re interested, the agent code is here: https://github.com/cs224/bomberman_rl/tree/master/agent_code/agent_011_shred
There is also a visualization of the network structure here: https://nbviewer.jupyter.org/github/cs224/bomberman_rl/blob/master/agent_code/agent_011_shred/0000-network-structure-visualization.ipynb?flush_cache=true
And a youtube video of the current state of the agent is here: https://youtu.be/bC2APj4xf_0
I have already tried different learning rates, different optimizers (Adam, SGD, …) and different loss functions (L2, Huber loss). But no matter what, whenever I now run a training round I end up with NaN values in the “output = self.model(x_nf, x_nchw)” step here: https://github.com/cs224/bomberman_rl/blob/master/agent_code/agent_011_shred/model/model_base_mx.py#L783
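One thing I have been considering, to at least narrow down where the NaNs first appear, is scanning the network's parameters (or activations) for non-finite values after each update. This is just a sketch with made-up names; with MXNet Gluon I assume I could feed it something like `{name: p.data().asnumpy() for name, p in net.collect_params().items()}`:

```python
import numpy as np

def find_bad_values(named_arrays):
    """Return the names of all arrays that contain NaN or Inf.

    `named_arrays` is a hypothetical dict mapping parameter (or
    activation) names to numpy arrays.
    """
    bad = []
    for name, arr in named_arrays.items():
        # np.isfinite is False for both NaN and +/-Inf
        if not np.isfinite(arr).all():
            bad.append(name)
    return bad

# toy example: one healthy and one corrupted array
arrays = {
    "dense0_weight": np.ones(4),
    "dense1_weight": np.array([1.0, np.nan]),
}
print(find_bad_values(arrays))  # → ['dense1_weight']
```

Running this after every optimizer step (or every N steps) should tell me which layer goes bad first and whether the parameters blow up gradually (Inf before NaN) or jump to NaN in a single step.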
Any ideas or suggestions on how to go forward? The only remaining idea I have is to replace the ELU activation with ReLU, but that would require retraining the whole network from scratch.
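For reference, the two activations differ only for negative inputs: ELU saturates toward -alpha via an exponential, while ReLU clamps to zero. A minimal numpy sketch of what the swap would change (not the actual framework implementations):

```python
import numpy as np

def elu(x, alpha=1.0):
    # ELU: identity for x > 0, alpha * (exp(x) - 1) for x <= 0
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def relu(x):
    # ReLU: clamp all negative inputs to 0
    return np.maximum(x, 0.0)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(elu(x))   # negative inputs saturate toward -alpha
print(relu(x))  # negative inputs become exactly 0
```

Since both are identical for positive inputs and the exponential in ELU is only evaluated for negative ones (where exp(x) <= 1), I am not even sure ELU itself is the source of the overflow, which is why I would prefer a cheaper diagnosis before committing to a full retrain.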