Terrible classification accuracy of mxnet

julio · November 15, 2017, 8:18pm

I'm new to mxnet. I tried the MNIST handwritten digit recognition tutorial and it worked perfectly. However, when I modify the example to classify 2 simple classes from a hyperspectral image (220 bands as features, 2D data set: samples x bands) using a simple MLP, the network does not learn. Accuracy stays fixed at a poor value (60-70%) on every epoch, and that was the best I could get by varying the number of layers and hidden nodes. I tested the same data set in Weka, which uses a single hidden layer, and got an accuracy of almost 100% (it takes forever, but it learns). If I use the same network layout in mxnet (a single hidden layer with the same number of nodes as in Weka), I get an accuracy of 0% on all epochs. I've also tried another example from the web (1000 random vectors of 100 features, 10 classes) and the network learns beautifully, just not with my dataset. Any help would be really appreciated.

eric-haibin-lin · November 15, 2017, 9:04pm

Do you mind posting your original code? What optimizer and metric are you using? Did you try different hyper-parameters? A random guess would result in an accuracy of 50% for 2-class classification.

julio · November 15, 2017, 9:19pm

Hi Eric, I'm posting the code here, but I cannot upload the images. If you give me an e-mail address, I can send you everything you need to run it. I'm puzzled by the fact that Weka does so well with a very simple network layout (although it takes forever, so it would not be practical for a big image).

import mxnet
import logging
import spectral
import numpy

# Import data
img = spectral.open_image('19920612_AVIRIS_IndianPine_Site3.lan')
no_bands = img.shape[2]
# Training
fraction_training_samples = 0.9
labels_file = spectral.envi.open('IndianPineLabels.hdr')
labels = labels_file.read_band(0)
training_data_list = []
testing_data_list = []
training_label_list = []
testing_label_list = []
no_classes = int(labels.max())
for i in range(1, no_classes + 1):
    (rows, cols) = (labels == i).nonzero()
    samples = len(rows)
    no_training_samples = int(samples * fraction_training_samples)
    for p in range(no_training_samples):
        training_data_list.append(img[rows[p], cols[p], :].flatten())
        training_label_list.append(i)
    for p in range(no_training_samples, samples):
        testing_data_list.append(img[rows[p], cols[p], :].flatten())
        testing_label_list.append(i)
training_data = numpy.asarray(training_data_list, dtype=numpy.float32)
training_labels = numpy.asarray(training_label_list)
testing_data = numpy.asarray(testing_data_list, dtype=numpy.float32)
testing_labels = numpy.asarray(testing_label_list)

# Normalize features
for i in range(no_bands):
    maximum = max(training_data[:, i].max(), testing_data[:, i].max())
    minimum = min(training_data[:, i].min(), testing_data[:, i].min())
    training_data[:, i] = (training_data[:, i] - minimum) / (maximum - minimum)
    testing_data[:, i] = (testing_data[:, i] - minimum) / (maximum - minimum)

batch_size=100
train_iter = mxnet.io.NDArrayIter(training_data, training_labels, batch_size, shuffle=True)
test_iter = mxnet.io.NDArrayIter(testing_data, testing_labels, batch_size)

# Multilayer Perceptron
data = mxnet.sym.var('data')
# The first fully-connected layer and the corresponding activation function
fc1 = mxnet.sym.FullyConnected(data=data, num_hidden=110)
act1 = mxnet.sym.Activation(data=fc1, act_type="relu")
# The second fully-connected layer and the corresponding activation function
fc2 = mxnet.sym.FullyConnected(data=act1, num_hidden=50)
act2 = mxnet.sym.Activation(data=fc2, act_type="relu")
fc3 = mxnet.sym.FullyConnected(data=act2, num_hidden=10)
act3 = mxnet.sym.Activation(data=fc3, act_type="relu")
fc4 = mxnet.sym.FullyConnected(data=act3, num_hidden=2)
# Softmax with cross entropy loss
mlp = mxnet.sym.SoftmaxOutput(data=fc4, name='softmax')

logging.getLogger().setLevel(logging.DEBUG)  # logging to stdout
# create a trainable module on CPU
mlp_model = mxnet.mod.Module(symbol=mlp, context=mxnet.cpu())
mlp_model.fit(train_iter,  # train data
              eval_data=test_iter,  # validation data
              optimizer='sgd',  # use SGD to train
              optimizer_params={'learning_rate': 0.1},  # use fixed learning rate
              eval_metric='acc',  # report accuracy during training
              batch_end_callback=mxnet.callback.Speedometer(batch_size, 100),  # output progress for each 100 data batches
              num_epoch=50)
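
One thing worth checking in the code above, offered as an assumption rather than a confirmed diagnosis: the labels are appended as `i`, which runs from 1 to `no_classes`, while the last layer has 2 outputs and the `acc` metric compares the argmax of those outputs (0 or 1) against the label. Under that reading, a label of 2 can never be matched, which would cap accuracy at the share of samples labeled 1. The sketch below illustrates the idea with two hypothetical helpers, `to_zero_based` and `reachable_accuracy`; neither is part of mxnet, and the toy labels are made up.

```python
def to_zero_based(labels):
    """Remap arbitrary integer class ids to 0..K-1, keeping sorted order.

    Hypothetical helper for illustration; not an mxnet function.
    """
    classes = sorted(set(labels))
    index = {c: k for k, c in enumerate(classes)}
    return [index[c] for c in labels], classes


def reachable_accuracy(labels, num_outputs):
    """Fraction of samples whose label an argmax over num_outputs units could equal."""
    valid = sum(1 for y in labels if 0 <= y < num_outputs)
    return valid / len(labels)


# Toy labels numbered like the post (classes 1 and 2); the values are invented.
raw = [1, 1, 1, 2, 2]
ceiling_before = reachable_accuracy(raw, 2)       # label 2 is out of range
remapped, classes = to_zero_based(raw)            # labels become 0 and 1
ceiling_after = reachable_accuracy(remapped, 2)   # every label is now reachable
print(ceiling_before, remapped, classes, ceiling_after)
```

In the posted script this would amount to appending `i - 1` instead of `i`, or passing `training_label_list` and `testing_label_list` through such a remap before building the `NDArrayIter` objects. Whether this explains the accuracy seen here depends on the label file, which is not shown.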
