Example behaves differently in C++ and Python


#1

Hi,

After compiling the mxnet DLL for C++, I wanted to check that everything was correct by using the lenet network in the example folder. It compiles and runs, but it never converges. The data loading mechanism works perfectly, the network seems to be correctly created and initialized, the weights seem to be updated (or at least, they change), and the optimizer is defined, but it doesn’t even come close to converging. I repeated the operation with googlenet, with no more luck…

I then tried replacing the DLL in my Python package folder with this one, and it works perfectly with the Python example. So I am confused: what could be wrong? The DLL? If that’s the case, why does it work in Python? If the problem is in the example, why doesn’t it work with another example either? And what is wrong in it?

Thanks!


#2

Hi @dmldge,

Could you point me to the specific examples you are using (C++ and Python)? And also provide the logs from the scripts, thanks!


#3

Hi @thomelane,

Thanks for your answer. You are quite active in the community! :slight_smile: I really appreciate your help!
For the Python part, which works perfectly, I run the regular example:


it runs this network:

I get this wonderful output: https://framabin.org/p/?ecac5c7f2c33806b#pbsCv064+53U1e5W5qIuXv37jQ+1eQ6TvGpVFlCJWyo=

For the C++ part, I was using the official lenet example:


Since it was not working properly, and some of the design choices seemed a bit unusual, I tried to modify it and, in the end, to stick as closely as possible to the working Python example:
https://framabin.org/p/?f9ce291d8f418592#ZHSWyRgtB3RSKvqlDZFi7de5OFsnKO1PsscU4XooHk4=
I tried changing the optimizer several times, adjusting the network, etc. No success. But I checked the data loader, and it is working fine. I have exhausted my ideas…
Ahh, and I was forgetting to attach the output of the run that doesn’t work: https://framabin.org/p/?1e435824da47c45a#1Vi7nsLFyDUq4UlzCbGvyskgSTLbJIJx635WdKHrkuA=
In this case, the accuracy doesn’t move. But sometimes, during my changes, I managed to get an improvement that stalled at 30% accuracy, or even once at 60%. But… this is MNIST… :slightly_frowning_face:

And while I am at it: MXNotifyShutdown(); seems to be there to delete all the static elements that were created. Am I right? I noticed that mxnet uses a lot of static elements internally…


#4

Alright, I think I have identified what was going on. In the example, I called SimpleBind to create the Executor. The thing is, afterwards I was overwriting the input images with args_map["X"] = something. Doing it that way had an impact on the Executor mapping, and thus the data wasn’t loaded properly.
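
For anyone hitting the same thing, here is a minimal sketch of the effect described above, written without mxnet so it compiles on its own. Everything in it (`Array`, `Executor`, `Bind`, and this `CopyTo`) is a hypothetical stand-in that I wrote for illustration, not the mxnet API. It only assumes that the array type behaves like a handle (copies share one buffer) and that the executor keeps the handle it received at bind time.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical handle-style array: copying an Array shares the same buffer.
struct Array {
    std::shared_ptr<std::vector<float>> buf;
    Array(std::size_t n = 0, float v = 0.0f)
        : buf(std::make_shared<std::vector<float>>(n, v)) {}
    // Copy the element values into the buffer that `dst` already points at.
    void CopyTo(Array* dst) const { *dst->buf = *buf; }
};

// Hypothetical executor: keeps the input handle captured at bind time.
struct Executor {
    Array input;
    // Stand-in for a forward pass: just sums the input it is bound to.
    float Forward() const {
        float sum = 0.0f;
        for (float x : *input.buf) sum += x;
        return sum;
    }
};

// Hypothetical bind step: the executor takes a copy of the handle stored under "X".
Executor Bind(std::map<std::string, Array>& args) {
    return Executor{args.at("X")};
}

// Reassigning the map entry after binding: the executor still sees the old buffer.
float RunWithReassignment() {
    std::map<std::string, Array> args_map;
    args_map["X"] = Array(4, 0.0f);
    Executor exec = Bind(args_map);
    args_map["X"] = Array(4, 1.0f);  // the map now points at a new buffer
    return exec.Forward();           // computed on the old, all-zero buffer
}

// Copying values into the existing entry: the executor sees the new data.
float RunWithCopy() {
    std::map<std::string, Array> args_map;
    args_map["X"] = Array(4, 0.0f);
    Executor exec = Bind(args_map);
    Array(4, 1.0f).CopyTo(&args_map["X"]);  // same buffer, new contents
    return exec.Forward();
}
```

In this sketch, `RunWithReassignment()` returns 0 because the executor never sees the new batch, while `RunWithCopy()` returns 4. If the real executor behaves the same way, the fix is to copy each batch into the array that was bound rather than assigning a new one to the map entry.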