Calculating loss during training

I was training my model using code below:

%%time
for epoch in range(10):
    for features, labels in mxtrainloader:
        with autograd.record():
            output = mxnet(features.as_in_context(ctx))
            loss = mxobjective(output, labels.as_in_context(ctx))
        loss.backward()
        mxoptimizer.step(features.shape[0])
    print('Epoch:', epoch)

That took 1 min 42 sec to complete.

Then I changed the code to:

%%time
for epoch in range(10):
    cum_loss = 0.0
    for batches, (features, labels) in enumerate(mxtrainloader):
        with autograd.record():
            output = mxnet(features.as_in_context(ctx))
            loss = mxobjective(output, labels.as_in_context(ctx))
        loss.backward()
        mxoptimizer.step(features.shape[0])
        cum_loss += loss.mean().asscalar()  # accumulate the mean loss of this batch
    print('Epoch:', epoch, 'Loss:', cum_loss / (batches + 1))  # batches is zero-based

Now it takes 2 min 51 sec.

That's roughly a 1.7× increase in training time.

The only change is that I add loss.mean().asscalar() to cum_loss on every iteration. That's it! That single line accounts for the entire slowdown.
Is there a better way to track the training loss?

Thanks for your time

Hi @mouryarishik,

.asscalar() is a blocking call: it forces the asynchronous execution engine to finish the pending computation and copy the result to the CPU, which limits how much work can run in parallel. I think this is the most likely reason for the slowdown. (Blocking like this can still be useful, for example to keep the engine from queuing up so many operations that you run out of memory.) You could postpone calling .asscalar() until the end of the epoch, at print time, rather than calling it at the end of each batch.
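For example, something along these lines. This is just a sketch based on your snippet, reusing your names (mxnet, mxtrainloader, mxobjective, mxoptimizer, ctx); the running loss is kept as an NDArray on the device, so the only blocking call per epoch is the one at print time:

for epoch in range(10):
    cum_loss = 0.0          # becomes an NDArray after the first batch
    num_batches = 0
    for features, labels in mxtrainloader:
        with autograd.record():
            output = mxnet(features.as_in_context(ctx))
            loss = mxobjective(output, labels.as_in_context(ctx))
        loss.backward()
        mxoptimizer.step(features.shape[0])
        # loss.mean() returns an NDArray, so this only queues more work on the
        # async engine instead of forcing a copy to the CPU every batch
        cum_loss += loss.mean()
        num_batches += 1
    # single blocking .asscalar() call per epoch, at print time
    print('Epoch:', epoch, 'Loss:', cum_loss.asscalar() / num_batches)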

Thanks for the suggestion; the slowdown has gone from roughly 1.7× down to about 1.3×.