Sigmoid activation training


Hi, I am new to Gluon. I went through some tutorials and started experimenting with logistic regression but I am confused on a simple example I tested with. I have an array of numbers and I labeled the ones above 3 as 1, below 3 as 0. I wanted to see how training would work with a single layer, sigmoid activation. The code is below:

dataset_train = gluon.data.ArrayDataset(data_mx[0:train_size,:], label_mx[0:train_size,:])
data_iter_train = gluon.data.DataLoader(dataset_train, batch_size, shuffle=False)

net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Dense(1, activation='sigmoid'))
net.collect_params().initialize(mx.init.Normal(sigma=1.0), ctx=model_ctx)
softmax_cross_entropy = gluon.loss.LogisticLoss(label_format='binary')
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': .01})

epochs = 10
for e in range(epochs):
    for i, (data, label) in enumerate(data_iter_train):
        data = data.as_in_context(model_ctx)
        label = label.as_in_context(model_ctx)

        with autograd.record():
            output = net(data)
            loss = softmax_cross_entropy(output, label)
        loss.backward()
        trainer.step(batch_size)

After training, the learned function looks wrong when I evaluate it over a range of inputs.

Can anyone tell what I am doing wrong?


You should use SigmoidBinaryCrossEntropyLoss with from_sigmoid=True in this case, since your network's output already passes through a sigmoid.

softmax_cross_entropy expects one-hot labels instead.