Object Detection Tutorial - #4 train_ssd.py: maximum recursion depth exceeded


#1

I am unable to run the tutorial successfully. My data is set up (the VOC data preparation completed successfully).
Ubuntu 16.04
mxnet.version = 1.3.1
gluoncv.version = 0.3.0
Python 3.6
Steps: freshly booted server, downloaded train_ssd.py, then ran it.

(mxnetProd) jay@xps8100:~/projects/jay$ python train_ssd.py
INFO:root:Namespace(batch_size=32, data_shape=300, dataset='voc', epochs=240, gpus='0', log_interval=100, lr=0.001, lr_decay=0.1, lr_decay_epoch='160,200', momentum=0.9, network='vgg16_atrous', num_workers=4, resume='', save_interval=10, save_prefix='ssd_300_vgg16_atrous_voc', seed=233, start_epoch=0, val_interval=1, wd=0.0005)
INFO:root:Start training from [Epoch 0]
Process Process-1:

Traceback (most recent call last):
  File "/home/jay/anaconda3/envs/mxnetProd/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/jay/anaconda3/envs/mxnetProd/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/jay/anaconda3/envs/mxnetProd/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 178, in worker_loop
    _recursive_fork_recordio(dataset, 0, 1000)
  File "/home/jay/anaconda3/envs/mxnetProd/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 173, in _recursive_fork_recordio
    _recursive_fork_recordio(v, depth + 1, max_depth)
  File "/home/jay/anaconda3/envs/mxnetProd/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 173, in _recursive_fork_recordio
    _recursive_fork_recordio(v, depth + 1, max_depth)
  File "/home/jay/anaconda3/envs/mxnetProd/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 173, in _recursive_fork_recordio
    _recursive_fork_recordio(v, depth + 1, max_depth)
  [Previous line repeated 974 more times]
  File "/home/jay/anaconda3/envs/mxnetProd/lib/python3.6/site-packages/mxnet/gluon/data/dataloader.py", line 166, in _recursive_fork_recordio
    if depth >= max_depth:
RecursionError: maximum recursion depth exceeded in comparison
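
One detail in the traceback stands out: the depth guard is called with max_depth = 1000, which is the same as Python's default recursion limit, so the interpreter may give up before the guard can ever fire. The sketch below only illustrates that interaction. It is not MXNet's code: Node, walk, and try_walk are hypothetical names, and the self-referencing attribute is an assumption made purely to force deep recursion. Whether the real dataset object contains such a cycle is not something the log confirms.

```python
import sys


class Node:
    """Hypothetical stand-in for an object whose attribute refers back to itself."""

    def __init__(self):
        self.child = self  # assumed reference cycle, for illustration only


def walk(obj, depth, max_depth):
    """Recursively visit attributes, stopping once depth reaches max_depth."""
    if depth >= max_depth:
        return depth
    if not hasattr(obj, "__dict__"):
        return depth
    deepest = depth
    for value in obj.__dict__.values():
        deepest = max(deepest, walk(value, depth + 1, max_depth))
    return deepest


def try_walk(max_depth):
    """Return the depth reached, or the string 'RecursionError' if Python aborts first."""
    try:
        return walk(Node(), 0, max_depth)
    except RecursionError:
        return "RecursionError"


limit = sys.getrecursionlimit()  # 1000 by default in CPython

# A guard well below the interpreter limit stops the walk cleanly.
shallow = try_walk(50)

# A guard equal to the interpreter limit is never reached, because the
# frames already on the stack use up part of the budget.
deep = try_walk(limit)

print(shallow, deep)
```

If this is what is happening here, the guard value alone would not prevent the crash; something in the object being walked would still have to be deep or cyclic enough to reach it.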

The same traceback repeats for Process-1, -2, -3, and -4.
The CPU is busy, so something seems to be running,
but nvidia-smi shows no processes.

Thanks for supporting these tutorials, they are a HUGE help. - Jay Duff


#2

Consider this closed. I opened a new case in an effort to make this simpler.
It is the same problem (I can't get the tutorial to run), but I am now trying to run it on the AWS DL-AMI v19 so that it is easily repeatable.