I am using Gluon’s pretrained object detection model. I was wondering if there is a way to gather the data for what objects were detected by the algorithm. I want to write a computer program that will do something based on what objects are detected in the photo, without me seeing the photo. I tried some print statements of class_IDs, scores, etc. but none give much useful information. Interestingly, printing class_IDs prints a nested list composed of numbers 0 or -1 inside individual brackets. Online, it says the class_IDs variable holds the predicted class IDs detected by the model… that doesn’t seem to align. If anyone has any advice please let me know! Thank you!
Yes, this is certainly possible. Can you link to the specific model you’re using here? I presume it’s from GluonCV. One of the most important factors when using pre-trained models (without fine-tuning) is the dataset that was used for pre-training, since that determines the number and type of classes detected by the model. I recommend models pre-trained on COCO.
When you run the network on an image, it returns a tuple of 3 arrays which, as you correctly point out, are #1 class ids, #2 scores, and #3 bounding boxes. You can use the class ids array to determine the class of each detected object. An image with 3 detected objects might return something like…
[5, 14, 3, -1, -1, -1, ...]
Our model predicts objects with class indexes of 5, 14 and 3. The value -1 is just used to pad the array when no more objects have been found. You can use net.classes (i.e. the classes property of your network) to get a list of class labels and then use this to find out what class has been detected by the model: e.g.
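Here is a minimal sketch of that lookup, not a definitive implementation. The helper name detected_labels, the 0.5 score threshold, and the placeholder label names are all assumptions for illustration. It also assumes the class ids and scores have already been flattened to one value per detection slot; with GluonCV that would typically mean converting the outputs with asnumpy() and indexing out the batch and trailing dimensions, as noted in the comments.

```python
def detected_labels(class_ids, scores, classes, threshold=0.5):
    """Return (label, score) pairs for detections at or above `threshold`.

    `class_ids` and `scores` are flat sequences of equal length.
    A class id of -1 marks a padding slot and is skipped.
    """
    results = []
    for class_id, score in zip(class_ids, scores):
        class_id = int(class_id)  # ids usually arrive as floats
        if class_id < 0 or score < threshold:
            continue
        results.append((classes[class_id], float(score)))
    return results


# Hypothetical stand-ins for the real outputs. With GluonCV these would
# come from something like (assumed shapes: batch x N x 1):
#   class_IDs, scores, bounding_boxes = net(x)
#   ids = class_IDs.asnumpy()[0, :, 0]
#   confs = scores.asnumpy()[0, :, 0]
#   classes = net.classes
classes = ["class_%d" % i for i in range(20)]  # placeholder label names
ids = [5.0, 14.0, 3.0, -1.0, -1.0, -1.0]
confs = [0.98, 0.91, 0.34, -1.0, -1.0, -1.0]

found = detected_labels(ids, confs, classes)
print(found)
```

Your program can then branch on the labels in found without ever displaying the image. Filtering on the score matters because a detector usually returns many low-confidence candidates; the right threshold depends on your use case.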