Getting consistent builds


#1

I am trying to contribute code to MXNet, but whenever I get time, I find myself trying to get the build system to work. Is there a good way to get consistent builds? For example, a Dockerfile that builds my current source tree.


#2

Hi @aidan-plenert-macdon, which part of the build are you having issues with? Are some of the unit tests failing?


#3

No, just running make fails with a variety of errors depending on the day. I just build the master branch and it fails. I know this doesn’t help you debug the build, but I am hoping for a better way to build. Currently I just run make, it fails, and I have to go and fiddle with the various submodules until the build gets a little further.


#4

@aidan-plenert-macdon
You can build your current source tree in a Docker container with all the necessary dependencies. For example, for a GPU cuda9.1-cudnn7 build, run:
ci/build.py --platform ubuntu_build_cuda /work/runtime_functions.sh build_ubuntu_gpu_cuda91_cudnn7

Have a look at the ci folder and its README; several build platforms are available.
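To see which platforms are available without reading through the folder by hand, something like the sketch below may help. It assumes the platform Dockerfiles live in ci/docker and are named Dockerfile.build.<platform>; that layout is inferred from the naming build.py uses when it builds a container, so adjust the path if your checkout differs. The function name list_platforms is made up for this example.

```shell
# Print one platform name per Dockerfile.build.<platform> file in a directory.
# The ci/docker default is an assumption about the repository layout.
list_platforms() {
    dir="${1:-ci/docker}"
    for f in "$dir"/Dockerfile.build.*; do
        [ -e "$f" ] || continue   # the glob matched nothing
        echo "${f##*/Dockerfile.build.}"
    done
}

# Run from the root of the checkout.
list_platforms
```

Each printed name should be usable as the value of the --platform (or -p) option of ci/build.py.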

Also remember to clone MXNet with
git clone https://github.com/apache/incubator-mxnet --recursive
so that the submodules are checked out as well.
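If you are not sure whether an existing checkout has its submodules, a rough check like the one below can tell you. It relies on git submodule status prefixing uninitialized submodules with "-". The helper name has_uninitialized_submodules is only for illustration.

```shell
# Succeeds if any line on stdin starts with "-", the marker that
# `git submodule status` uses for a submodule that is not initialized.
has_uninitialized_submodules() {
    grep -q '^-'
}

# Run from the root of the checkout; skipped quietly outside a git work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    if git submodule status --recursive | has_uninitialized_submodules; then
        echo "Uninitialized submodules found; run: git submodule update --init --recursive"
    else
        echo "All submodules are initialized"
    fi
fi
```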


#5

@ThomasDelteil So that works in that it runs, but once it finishes and I get dropped into the container, neither python2 nor python3 has mxnet installed, and furthermore, make still fails:

$ ci/build.py -p ubuntu_cpu --into-container
into container: True
build.py: 2018-04-23 16:18:27,613 Building container tagged 'mxnet/build.ubuntu_cpu' with docker
build.py: 2018-04-23 16:18:27,614 Running command: 'docker build -f docker/Dockerfile.build.ubuntu_cpu --build-arg USER_ID=514839664 -t mxnet/build.ubuntu_cpu docker'
Sending build context to Docker daemon  112.1kB
 ... 28 steps later ...
Successfully built 43573ca2ee85
Successfully tagged mxnet/build.ubuntu_cpu:latest
groups: cannot find name for group ID 1896053708
jenkins_slave@b8c235a1940a:/work/mxnet$ python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'mxnet'
>>> 
jenkins_slave@b8c235a1940a:/work/mxnet$ make
Makefile:240: WARNING: Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages
make: *** No rule to make target '~/incubator-mxnet/3rdparty/dmlc-core/include/dmlc/omp.h', needed by 'build/src/operator/nn/softmax.o'.  Stop.

#6

For the right make flags, follow the instructions in the build-from-source tab of the install page:
http://mxnet.incubator.apache.org/install/index.html

From what I see, you haven’t cloned the GitHub repo properly: it is missing the submodules.
Try running:
git submodule update --init --recursive
make clean
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas
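The make error in post #5 names a specific header under 3rdparty/dmlc-core. Before kicking off a full rebuild, a quick check along these lines shows whether that file exists in the current tree. This is just a sketch; check_header is a made-up helper, and "." assumes you are in the root of the checkout.

```shell
# Report whether a header that make complained about is present.
# Usage: check_header <source-root> <path-relative-to-root>
check_header() {
    if [ -f "$1/$2" ]; then
        echo "found: $2"
    else
        echo "missing: $2"
        return 1
    fi
}

# The header named in the make error; "." assumes the current directory is the repo root.
check_header . 3rdparty/dmlc-core/include/dmlc/omp.h \
    || echo "the dmlc-core submodule is probably not checked out"
```

If the header is present but make still complains about a path outside the current tree, leftover files under build/ from an earlier build are one possible cause, and that is what make clean should take care of.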

For Python, you can follow the instructions on the install page to add MXNet to your Python installation, or just run
export PYTHONPATH=/work/mxnet/python
before launching your Python interpreter or notebook.
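To confirm the PYTHONPATH route works without opening an interactive interpreter, a small check like this can be run inside the container. It assumes the source tree is mounted at /work/mxnet (as in the prompt shown in post #5) and that the build has already finished, since the Python package needs the compiled library.

```shell
# Prepend the in-tree Python package directory, keeping any existing PYTHONPATH.
export PYTHONPATH="/work/mxnet/python${PYTHONPATH:+:$PYTHONPATH}"

# Try the import and report the result instead of reading a traceback.
if python3 -c 'import mxnet; print("mxnet", mxnet.__version__)' 2>/dev/null; then
    echo "import OK"
else
    echo "import failed; check that the build finished and PYTHONPATH is correct"
fi
```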

Edit: I tested the following:

git clone https://github.com/apache/incubator-mxnet --recursive mxnet
cd mxnet
ci/build.py -p ubuntu_cpu --into-container
make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas

This successfully builds the CPU version of MXNet.