How to denormalize NN response?

I’m training a sequential NN for regression. The input needs to be normalized and the output needs to be denormalized. Normalizing, in this case, means scaling and translating each input coordinate to the range 0…1, and denormalizing means transforming the NN output back to the original ranges.
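
For reference, this is the per-coordinate transform I have in mind (a minimal numpy sketch; how the factors are computed here is an assumption, the layers below only consume the resulting scale / shift pairs):

import numpy

def norm_factors( data ):
   # data: ( n_samples, n_features ); min-max factors so that data * scale + shift lands in 0…1
   lo, hi = data.min( axis = 0 ), data.max( axis = 0 )
   scale = 1.0 / ( hi - lo )
   shift = -lo * scale
   return scale, shift

def denorm_factors( data ):
   # inverse factors so that output * scale + shift recovers the original range
   lo, hi = data.min( axis = 0 ), data.max( axis = 0 )
   return hi - lo, lo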

My goal is to save the normalization and denormalization as layer operations when the NN is saved to file, instead of keeping the normalization / denormalization transformations in a separate file, which is what I currently do.
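
For context, this is roughly how the net gets saved at the moment (a sketch; the file names and the .npy format are placeholders):

net.export( 'munsell_net', epoch = 0 )                  # writes munsell_net-symbol.json / munsell_net-0000.params
numpy.save( 'munsell_net_factors.npy', NormDenorm )     # the extra file I would like to get rid of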

So far, I have approached the problem through custom layers. For example:

import numpy
import mxnet as mx
from mxnet.gluon import nn

ctx = mx.cpu()   # stand-in context; the real one is set elsewhere in my script

# Dense layer that normalizes its input ( x * scales + shifts ) before the usual dense computation
class NormDense( nn.Dense ):
   def __init__( self, factors, **kwargs ):
      super( NormDense, self ).__init__( **kwargs )
      with self.name_scope():
         self.scales = self.params.get( 'scales',
                                        shape = kwargs[ 'in_units' ],
                                        dtype = kwargs[ 'dtype' ],
                                        init = mx.init.Constant( factors[ 0 ].tolist() ),
                                        differentiable = False )
         self.shifts = self.params.get( 'shifts',
                                        shape = kwargs[ 'in_units' ],
                                        dtype = kwargs[ 'dtype' ],
                                        init = mx.init.Constant( factors[ 1 ].tolist() ),
                                        differentiable = False )
            
   def hybrid_forward( self, F, x, scales, shifts, *args, **kwargs ):
      # normalize the input first, then run the ordinary Dense forward pass
      xnormed = F.broadcast_add( F.broadcast_mul( x, scales ), shifts )
      return super( NormDense, self ).hybrid_forward( F, xnormed, *args, **kwargs )

# Dense layer that denormalizes its output ( dense( x ) * scales + shifts ) back to the original ranges
class DenormDense( nn.Dense ):
   def __init__( self, factors, **kwargs ):
      super( DenormDense, self ).__init__( **kwargs )
      with self.name_scope():
         self.scales = self.params.get( 'scales',
                                        shape = kwargs[ 'units' ],
                                        dtype = kwargs[ 'dtype' ],
                                        init = mx.init.Constant( factors[ 0 ].tolist() ),
                                        differentiable = False )
         self.shifts = self.params.get( 'shifts',
                                        shape = kwargs[ 'units' ],
                                        dtype = kwargs[ 'dtype' ],
                                        init = mx.init.Constant( factors[ 1 ].tolist() ),
                                        differentiable = False )
            
   def hybrid_forward( self, F, x, scales, shifts, *args, **kwargs ):
      # run the ordinary Dense forward pass first, then denormalize the result
      normed = super( DenormDense, self ).hybrid_forward( F, x, *args, **kwargs )
      return F.broadcast_add( F.broadcast_mul( normed, scales ), shifts )

def get_MunsellNet( n1, n2, n3, n4, NormDenorm ):
   # NormDenorm holds one [ scale, shift ] row per coordinate: rows 0-2 for the 3 inputs, rows 3-4 for the 2 outputs
   Norm = numpy.transpose( NormDenorm[ 0:3 ] )
   Denorm = numpy.transpose( NormDenorm[ 3:5 ] )
   net = nn.HybridSequential()
   net.add( NormDense( Norm, units = n1, in_units = 3, activation = 'sigmoid', dtype = 'float64' ),
            nn.Dense( n2, activation = 'sigmoid', dtype = 'float64' ),
            nn.Dense( n3, activation = 'sigmoid', dtype = 'float64' ),
            nn.Dense( n4, activation = 'sigmoid', dtype = 'float64' ),
            DenormDense( Denorm, units = 2, dtype = 'float64' ) )
   net.hybridize()
   net.initialize( mx.init.Uniform(), ctx = ctx )
   return net
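
For completeness, this is roughly how I build the net; the factor values and layer sizes here are made-up placeholders, just to show the layout I use for NormDenorm:

NormDenorm = numpy.array( [ [ 0.01,  0.0 ],     # input 1:  scale, shift
                            [ 0.01,  0.0 ],     # input 2
                            [ 0.01,  0.0 ],     # input 3
                            [ 100.0, 0.0 ],     # output 1
                            [ 100.0, 0.0 ] ],   # output 2
                          dtype = 'float64' )
net = get_MunsellNet( 16, 16, 16, 8, NormDenorm )
print( net( mx.nd.zeros( ( 1, 3 ), dtype = 'float64' ) ) )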

I tested both layers and they correctly compute the forward pass. However, when I train this network it doesn’t learn: neither the RMSE nor the L2 loss makes any progress. If I replace the DenormDense with a plain nn.Dense, the NN trains as expected.
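
For context, the training loop is the usual Gluon pattern, roughly like this (a sketch; n_epochs, train_iter and the optimizer settings are placeholders, not my exact code):

loss_fn = mx.gluon.loss.L2Loss()
trainer = mx.gluon.Trainer( net.collect_params(), 'adam', { 'learning_rate': 1e-3 } )

for epoch in range( n_epochs ):
   for data, label in train_iter:                # float64 batches of shape ( N, 3 ) and ( N, 2 )
      with mx.autograd.record():
         loss = loss_fn( net( data ), label )
      loss.backward()
      trainer.step( data.shape[ 0 ] )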

Is this approach to normalization / denormalization possible in MXNet?

If yes, what am I missing to make it work?

If not, is there another way of achieving this?