Sigmoid activation training


#1

Hi, I am new to Gluon. I went through some tutorials and started experimenting with logistic regression, but I am confused by a simple example I tested. I have an array of numbers, and I labeled the ones above 3 as 1 and the ones below 3 as 0. I wanted to see how training would work with a single layer with a sigmoid activation. The code is below:
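For reference, data_mx, label_mx, train_size, and model_ctx are created roughly like this (the exact numbers are arbitrary):

import mxnet as mx
from mxnet import nd, gluon, autograd

model_ctx = mx.cpu()

# Hypothetical data setup: random numbers in [0, 6),
# labeled 1 above 3 and 0 below
n_samples = 100
train_size = 80
data_mx = nd.random.uniform(0, 6, shape=(n_samples, 1))
label_mx = (data_mx > 3).astype('float32')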

batch_size = 1
dataset_train = gluon.data.ArrayDataset(data_mx[0:train_size, :], label_mx[0:train_size, :])
data_iter_train = gluon.data.DataLoader(dataset_train, batch_size, shuffle=False)

# Single dense layer with a sigmoid activation
net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Dense(1, activation='sigmoid'))

net.collect_params().initialize(mx.init.Normal(sigma=1.0), ctx=model_ctx)
softmax_cross_entropy = gluon.loss.LogisticLoss(label_format='binary')
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': .01})

epochs = 10
for e in range(epochs):
    for i, (data, label) in enumerate(data_iter_train):
        data = data.as_in_context(model_ctx)
        label = label.as_in_context(model_ctx)

        with autograd.record():
            output = net(data)
            loss = softmax_cross_entropy(output, label)

        loss.backward()
        trainer.step(batch_size)

After I train, the function looks like this when I compute outputs for a range of inputs.
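I generate those outputs roughly like this (the exact range does not matter):

# Evaluate the trained net on a grid of inputs (range chosen arbitrarily)
xs = nd.arange(0, 6, 0.1).reshape((-1, 1)).as_in_context(model_ctx)
ys = net(xs)  # sigmoid outputs in (0, 1), plotted against xs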

Can anyone tell me what I am doing wrong?


#2

You should use SigmoidBinaryCrossEntropyLoss with from_sigmoid=True in this case. Your Dense layer already applies a sigmoid, but LogisticLoss expects raw (pre-activation) scores, so applying a sigmoid first puts the predictions on the wrong scale.
Ref: https://mxnet.incubator.apache.org/api/python/gluon/loss.html?highlight=binary#mxnet.gluon.loss.SigmoidBinaryCrossEntropyLoss

Also note: SoftmaxCrossEntropyLoss (which your variable name hints at) expects one-hot or class-index labels instead, so it would not fit here either.
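Here is a minimal sketch of the change (loss_fn is just an example name; everything else stays as in your snippet):

from mxnet import gluon

# Option A: keep the sigmoid activation on the Dense layer and tell
# the loss that the output is already a probability
loss_fn = gluon.loss.SigmoidBinaryCrossEntropyLoss(from_sigmoid=True)

# Option B (usually more numerically stable): drop the activation and
# let the loss apply the sigmoid internally
# net.add(gluon.nn.Dense(1))                            # no activation
# loss_fn = gluon.loss.SigmoidBinaryCrossEntropyLoss()  # from_sigmoid defaults to False

Then use loss_fn in place of your current loss inside the training loop; the rest of the code stays the same.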