This project builds a predictive model that learns to identify the attributes associated with each animal. It was part of a hackerearth.com challenge, which provides 18,000 images of 50 different categories; 12,600 of them are annotated with animal characteristics. These characteristics are treated as labels to train a predictive model that learns to map animal images to attributes. This set of 12,600 images is the training set.


Metrics

There are 85 different attributes per image in the training set. The characteristics of an image are represented by a vector of 85 positions, where each position contains either 0 or 1. Each position corresponds to a single attribute; for instance, the first eight positions are color attributes, with index 0 being black and index 7 being yellow. The last position, index 84, is domestic. Two metrics monitor the model's performance.
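As a minimal sketch, a single label vector could be built like this (the attribute indices in the comments follow the convention above):

```python
import numpy as np

# One label vector: 85 binary attribute flags per image.
# Index 0 = "black", ..., index 7 = "yellow" (the eight color attributes),
# and index 84 = "domestic" (the last attribute).
label = np.zeros(85, dtype=np.int8)
label[0] = 1   # this animal is black
label[84] = 1  # this animal is domestic
```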

Accuracy

The number of correctly predicted attributes across all examples divided by the total number of attributes across all examples.
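In code, this metric could be computed as in the sketch below; thresholding predictions at 0.5 is my assumption, matching Keras' default binary accuracy:

```python
import numpy as np

def attribute_accuracy(y_true, y_pred):
    """Correctly predicted attributes divided by total attributes.

    y_true: (num_examples, 85) binary ground-truth matrix
    y_pred: (num_examples, 85) predicted probabilities
    """
    predictions = (y_pred >= 0.5).astype(np.int8)  # threshold at 0.5
    return np.mean(predictions == y_true)

# Tiny example with 2 examples x 3 attributes: 5 of 6 positions match.
y_true = np.array([[1, 0, 1], [0, 1, 0]])
y_pred = np.array([[0.9, 0.2, 0.7], [0.4, 0.8, 0.6]])
print(attribute_accuracy(y_true, y_pred))  # 0.833...
```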

Cross-Entropy Loss

Besides conveying to the model how well it is doing so that it can adjust its parameters, the cost function also indicates how confident the model is in its predictions.

$$-\frac{1}{m} \sum_{i=1}^m \Big[ y_{i} \cdot \log(\hat{y}_{i})+(1-y_{i})\cdot \log(1-\hat{y}_{i}) \Big]$$
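A direct NumPy transcription of this formula, clipped to avoid log(0), would look like the following sketch; Keras' binary_crossentropy computes the same quantity:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over all attribute positions."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred)
                    + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([[1, 0, 1], [0, 1, 0]])
y_pred = np.array([[0.9, 0.2, 0.7], [0.4, 0.8, 0.6]])
print(binary_cross_entropy(y_true, y_pred))  # ~0.39
```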

Features and Training

The extraction of features took roughly 5 hours. Using TensorFlow Hub, I extracted features (bottlenecks) from both the train and test images with the Progressive Neural Architecture Search (PNASNet-5) architecture. After passing all pictures through this network, the result was a vector of 4,320 features per image.

python retrain.py \
  --image_dir ./DL3Dataset \
  --tfhub_module https://tfhub.dev/google/imagenet/pnasnet_large/feature_vector/1
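The same bottlenecks can also be extracted directly in Python; the sketch below assumes the TF1-style tensorflow_hub API that this module version was published for:

```python
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

# Load the PNASNet-5 feature-vector module.
module = hub.Module(
    "https://tfhub.dev/google/imagenet/pnasnet_large/feature_vector/1")
height, width = hub.get_expected_image_size(module)  # 331 x 331

images = tf.placeholder(tf.float32, shape=[None, height, width, 3])
features = module(images)  # bottlenecks of shape [batch_size, 4320]

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    # Dummy batch for illustration; real images must be resized to
    # height x width with pixel values scaled to [0, 1].
    batch = np.random.rand(4, height, width, 3).astype(np.float32)
    bottlenecks = sess.run(features, feed_dict={images: batch})
    print(bottlenecks.shape)  # (4, 4320)
```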

The features are fed into a two-layer network model built with Keras; details are here. Training for 30 epochs, where each epoch runs in less than 5 seconds, I was able to get a model with 98.3% accuracy. The accuracy holds in both the train and validation sets. The image below shows how this happens and how the metrics evolve during training.

*(Figure: training and validation accuracy and loss over the 30 training epochs)*
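For illustration, here is a minimal sketch of such a two-layer Keras model; the hidden-layer size and optimizer are my guesses, and the actual configuration is in the linked repository:

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_FEATURES = 4320   # PNASNet-5 bottleneck size
NUM_ATTRIBUTES = 85   # one sigmoid output per attribute

model = keras.Sequential([
    layers.Dense(1024, activation="relu", input_shape=(NUM_FEATURES,)),
    layers.Dense(NUM_ATTRIBUTES, activation="sigmoid"),
])

# Binary cross-entropy is the loss defined above; binary accuracy counts
# correctly predicted attribute positions.
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["binary_accuracy"])

# model.fit(train_features, train_labels, epochs=30,
#           validation_split=0.1, batch_size=64)
```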

Results

The model reports excellent performance on the train and validation sets, reaching up to 98.4% accuracy and a cost of 0.03 on the loss function. Some images sampled from the validation set are shown below with their predicted and actual attributes:

*(Image: sampled validation images with their predicted and actual attributes)*

The test set consists of unseen examples provided by the HackerEarth challenge. After uploading the output predictions for the entire test set, HackerEarth's performance measurement gives 98.4% accuracy on the test set.

Here are two images randomly selected from the test set and their predicted attributes.

*(Image: first random test image with its predicted attributes)*

*(Image: second random test image with its predicted attributes)*

Conclusion

The final model gets excellent results; it shows no sign of overfitting or abnormal bias. The model could be improved further by analyzing how the images were tagged with attributes and exploring whether all attributes are correctly assigned to each image.

All of the details and code are in this GitHub repository.

Medium Post