I’m using Amazon SageMaker to build my first model with the Single Shot MultiBox Detector (SSD). I ran through the example that uses the COCO dataset and everything worked fine. I’m now trying to train on my own images.
I have 600 JPEG images of dogs, and for my first ever model I just want to detect whether there is a dog in the picture, that’s it. I’m trying to use im2rec.py to create a RecordIO dataset from those 600 images, all of which are in a single directory. I run the following command to generate the list files:
python im2rec.py --num-thread 8 --list --recursive --test-ratio=0.3 --train-ratio=0.7 collie_lst ~/images/combined/
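For reference, my understanding is that each line in a .lst file produced by im2rec.py with --list is tab-separated: an integer index, one or more float label fields, and finally the image path relative to the root directory. A minimal sketch that parses one line of that format (the sample line is taken from my output below):

```python
def parse_lst_line(line):
    # .lst lines from im2rec.py --list are tab-separated:
    # index \t label(s) \t relative image path
    fields = line.rstrip("\n").split("\t")
    index = int(fields[0])
    labels = [float(f) for f in fields[1:-1]]
    path = fields[-1]
    return index, labels, path

sample = "571\t0.000000\t626.jpg"
print(parse_lst_line(sample))  # (571, [0.0], '626.jpg')
```

So with the default options I only ever get a single label column per image.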
That generates two files, collie_lst_test.lst and collie_lst_train.lst. When I look at the files, however, they don’t look correct. Here are the first few lines:
571 0.000000 626.jpg
63 0.000000 158.jpg
288 0.000000 365.jpg
473 0.000000 533.jpg
614 0.000000 669.jpg
249 0.000000 329.jpg
My understanding is that there should be a lot more data for each image (the object annotations that SSD trains on), but no matter what I try, this is all I get. I’ve been trying different permutations of the command for hours with no luck. If anyone can help me understand what I’m doing wrong, I’d be eternally grateful!
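For comparison, here’s what I believe a detection-format .lst line is supposed to look like (this layout is my assumption from reading about MXNet object-detection records, not something im2rec.py gave me): after the index come a header width (2), a label width (5), then for each object a class id and normalized xmin, ymin, xmax, ymax, and finally the path. A sketch that builds such a line with made-up box coordinates:

```python
def make_detection_lst_line(index, objects, path):
    # Assumed detection .lst layout:
    # index \t header_width \t label_width \t
    #   (class_id, xmin, ymin, xmax, ymax per object) \t path
    # Coordinates are normalized to [0, 1]; values here are placeholders.
    header_width, label_width = 2, 5
    fields = [str(index), str(header_width), str(label_width)]
    for class_id, xmin, ymin, xmax, ymax in objects:
        fields += [f"{class_id:.4f}", f"{xmin:.4f}", f"{ymin:.4f}",
                   f"{xmax:.4f}", f"{ymax:.4f}"]
    fields.append(path)
    return "\t".join(fields)

line = make_detection_lst_line(571, [(0, 0.10, 0.20, 0.80, 0.90)], "626.jpg")
print(line)
```

That is the kind of per-image data I expected to see, which is why the three-column output above looks wrong to me.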