Softmax Regression in Gluon

https://d2l.ai/chapter_linear-networks/softmax-regression-concise.html

My results with 20 epochs (5 is too small for a meaningful comparison):

BS 1000, LR 0.1

  • epoch 10, loss 0.5247, train acc 0.827, test acc 0.832
  • epoch 20, loss 0.4783, train acc 0.839, test acc 0.842

BS 100, LR 0.1

  • epoch 10, loss 0.4271, train acc 0.852, test acc 0.852
  • epoch 20, loss 0.4072, train acc 0.859, test acc 0.857

BS 1000, LR 0.5

  • epoch 10, loss 0.8573, train acc 0.803, test acc 0.772
  • epoch 20, loss 0.7668, train acc 0.813, test acc 0.837

BS 10, LR 0.01

  • epoch 10, loss 0.4221, train acc 0.855, test acc 0.856
  • epoch 20, loss 0.4019, train acc 0.862, test acc 0.852

So in conclusion:

  • BS and LR must be correlated, and LR has an upper bound above which the updates jump around wildly in every direction.
  • The smaller the BS, the slower each epoch runs, but the fewer epochs we need to reach an optimum.
  • The training set accuracy always improves, but at some point the test set accuracy starts to decrease: we’re only getting better at recognizing the training set itself. I guess we can detect this stage with the test set: when the accuracy on this set, which is never used for training, starts to drop, it’s time to stop training.

Can we configure the training without a fixed number of epochs, but with a “stop or continue” function instead?

Hi @SebastienCoste,

Sure, you could write a while loop over the epochs that keeps training as long as the most recent test loss is lower than the test loss from the epoch before. It’s called ‘early stopping’ and is used to prevent overfitting.
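
Here is a minimal sketch of that idea. Note that train_epoch and evaluate_test_loss are hypothetical helper names standing in for one pass of train_epoch_ch3 and an evaluation loop over the test iterator; they are not functions from the d2l package:

    # Early-stopping sketch: train until the test loss stops improving.
    # train_epoch and evaluate_test_loss are hypothetical helpers.
    best_test_loss = float('inf')
    patience, bad_epochs, epoch = 2, 0, 0  # tolerate a couple of flat epochs
    while bad_epochs < patience:
        epoch += 1
        train_epoch(net, train_iter, loss, trainer)
        test_loss = evaluate_test_loss(net, test_iter, loss)
        if test_loss < best_test_loss:
            best_test_loss, bad_epochs = test_loss, 0
        else:
            bad_epochs += 1
    print('stopped after %d epochs, best test loss %.4f' % (epoch, best_test_loss))

The patience counter is a common refinement: the test loss is noisy, so it is safer to stop only after a few consecutive epochs without improvement rather than on the first one.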

I am getting the following error:

TypeError Traceback (most recent call last)
in
1 num_epochs = 10
----> 2 d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)

~/miniconda3/lib/python3.7/site-packages/d2l/train.py in train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, params, lr, trainer)
212 l.backward()
213 if trainer is None:
--> 214 sgd(params, lr, batch_size)
215 else:
216 trainer.step(batch_size)

~/miniconda3/lib/python3.7/site-packages/d2l/train.py in sgd(params, lr, batch_size)
60 def sgd(params, lr, batch_size):
61 """Mini-batch stochastic gradient descent."""
--> 62 for param in params:
63 param[:] = param - lr * param.grad / batch_size
64

TypeError: 'NoneType' object is not iterable

From the above error message, I found that “params” is not passed in the function call. What should be my next step?

Just figured out where I was going wrong.
The correct function call is: d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, trainer). Since a trainer is passed, the branch that needs params and lr (the sgd fallback) is never taken, so None works for both.

Hi everyone, I am having trouble training the model with the function train_ch3().
Calling d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, trainer)
raises the following error:
train_ch3() takes 6 positional arguments but 9 were given.

Can anyone help, please?

This is the error that I am getting. Can you please help?

TypeError Traceback (most recent call last)
in
3
4 d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None,
----> 5 None, trainer)

TypeError: train_ch3() takes 6 positional arguments but 9 were given

Seems like they have updated the function. The new definition is as follows:

def train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
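
With the updated signature, batch_size, params and lr are gone: you pass the Gluon Trainer (or a custom updater function) as the single updater argument, and the current d2l training loop should detect a Gluon Trainer and call its step method with the batch size internally. A sketch, assuming net, train_iter, test_iter and loss are defined as earlier in the chapter:

    import d2l
    from mxnet import gluon

    # The Trainer is passed directly as the single `updater` argument;
    # d2l's training loop calls trainer.step(batch_size) on it.
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})
    num_epochs = 10
    d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)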

I am getting the following error with the current version of the code:


TypeError Traceback (most recent call last)
in
1 num_epochs = 10
----> 2 d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)

C:\Anaconda3\lib\site-packages\d2l\d2l.py in train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
186 legend=['train loss', 'train acc', 'test acc'])
187 for epoch in range(num_epochs):
--> 188 train_metrics = train_epoch_ch3(net, train_iter, loss, updater)
189 test_acc = evaluate_accuracy(net, test_iter)
190 animator.add(epoch+1, train_metrics+(test_acc,))

C:\Anaconda3\lib\site-packages\d2l\d2l.py in train_epoch_ch3(net, train_iter, loss, updater)
127 l = loss(y_hat, y)
128 l.backward()
--> 129 updater()
130 # measure loss and accuracy
131 train_l_sum += l.sum().asscalar()

TypeError: 'Trainer' object is not callable

Can you please help in this regard? Thanks in advance 🙂

I fixed the issue by installing the latest version of “d2l” using the command below:

pip install git+https://github.com/d2l-ai/d2l-en

I think that, since the code is changing frequently, we should always use the installation procedure above.

Two things in 3.7.2.

  1. z_j is the j-th element of the input y_linear variable.
    What does the input y_linear mean?

  2. But instead of passing softmax probabilities into our new loss function, we’ll just pass ŷ…
    ŷ is the value calculated by the softmax function. So I believe that ŷ should be replaced with z.
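
For what it’s worth, y_linear in that section is the output of the linear layer before the softmax, i.e. the vector of logits z, which is consistent with the point in item 2: the numerically stable loss takes the logits, not the post-softmax probabilities. Below is a minimal sketch of the log-sum-exp idea the section describes, written with plain MXNet NDArray ops (the function names are my own, not from the book):

    from mxnet import nd

    def stable_log_softmax(z):
        # log softmax(z)_j = (z_j - max(z)) - log(sum_k exp(z_k - max(z)));
        # subtracting the row-wise max avoids overflow in exp()
        z_max = z.max(axis=1, keepdims=True)
        shifted = z - z_max
        return shifted - shifted.exp().sum(axis=1, keepdims=True).log()

    def cross_entropy_from_logits(z, y):
        # the logits z go straight in, never the softmax probabilities;
        # nd.pick selects the log-probability of the true class per row
        return -nd.pick(stable_log_softmax(z), y)

    z = nd.array([[100., 0., -100.], [0.1, 0.2, 0.3]])  # logits (y_linear)
    y = nd.array([0, 2])                                # true class indices
    print(cross_entropy_from_logits(z, y))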

I am getting the following error:
in
1 num_epochs = 10
----> 2 d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, trainer)

/media/vr/Storage/Python_Scripts/psenv/lib/python3.6/site-packages/d2l/d2l.py in train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
284 legend=['train loss', 'train acc', 'test acc'])
285 for epoch in range(num_epochs):
--> 286 train_metrics = train_epoch_ch3(net, train_iter, loss, updater)
287 test_acc = evaluate_accuracy(net, test_iter)
288 animator.add(epoch+1, train_metrics+(test_acc,))

/media/vr/Storage/Python_Scripts/psenv/lib/python3.6/site-packages/d2l/d2l.py in train_epoch_ch3(net, train_iter, loss, updater)
233 l.backward()
234 updater(X.shape[0])
--> 235 metric.add(float(l.sum()), accuracy(y_hat, y), y.size)
236 # Return training loss and training accuracy
237 return metric[0]/metric[2], metric[1]/metric[2]

TypeError: float() argument must be a string or a number, not 'NDArray'
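
In case anyone else runs into this: it looks like another version mismatch. The call float(l.sum()) assumes an MXNet version whose NDArray supports conversion with float() on size-1 arrays; an older MXNet raises exactly this TypeError. Reinstalling the latest d2l together with a matching MXNet, as suggested earlier in the thread, should fix it. Alternatively, under that assumption, an explicit conversion works:

    # Workaround sketch: asscalar() turns a size-1 NDArray into a Python
    # float, avoiding float(NDArray), which older MXNet versions reject.
    metric.add(l.sum().asscalar(), accuracy(y_hat, y), y.size)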