I was playing with TensorFlow's brand new Object Detection API and decided to train it on some other publicly available datasets.
I happened to stumble upon this grocery dataset which consists of images of various brands of cigarette boxes on the supermarket shelf along with a text file which lists out the bounding boxes of each cigarette box in each image. 10 major brands have been labeled in the dataset and all other brands fall into the 11th "miscellaneous" category.
I followed their tutorial and managed to train the model on this dataset. Due to limitations on processing power, I used only a third of the dataset and performed a 70:30 split for training and testing data. I used the faster_rcnn_resnet101 model. All parameters in my config file are the same as the default parameters provided by TF.
After 16491 global steps, I tested the model on some images but I am not too happy with the results -
Failed to detect the Camels in top-shelf whereas it detects the product in other images
Why does it fail to detect the Marlboros in the top row?
Another issue I had is that the model never detected any other label except for label 1
Doesn't detected a crop instance of the product from the training data
It detects cigarette boxes with 99% confidence even in negative images!
Can somebody help me with what is going wrong? What can I do to improve the accuracy? And why does it detect all products to belong in category 1 even though I have mentioned that there are 11 classes in total?
Edit Added my label map:
item {
id: 1
name: '1'
}
item {
id: 2
name: '2'
}
item {
id: 3
name: '3'
}
item {
id: 4
name: '4'
}
item {
id: 5
name: '5'
}
item {
id: 6
name: '6'
}
item {
id: 7
name: '7'
}
item {
id: 8
name: '8'
}
item {
id: 9
name: '9'
}
item {
id: 10
name: '10'
}
item {
id: 11
name: '11'
}
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…