This project is part of the Udacity Data Scientist Nanodegree Program (Image Classifier Project). The goal was to apply deep learning techniques to train an image classifier to recognize different species of flowers.
Let’s start by following the CRISP-DM process (Cross-Industry Standard Process for Data Mining):
Business Understanding
Data Understanding
Prepare Data
Data Modeling
Evaluate the Results
Deploy
Business Understanding
Image classification is a pretty common task nowadays: it consists of taking an image and a set of classes as input and outputting the probability that the image belongs to each of the given classes. On this topic I recommend this awesome story by Anne Bonner. The goal of this project was to build an application that can be trained on any set of labeled images and then make predictions on a given input. The specific dataset provided by Udacity was about flowers.
Data Understanding
The dataset contains images of flowers belonging to 102 different categories. The images were acquired by searching the web and taking pictures, and they show large variations in scale, pose and lighting. In addition, some categories have large variations within the category, and several categories are very similar to each other. More information can be found in this paper by M. Nilsback and A. Zisserman.
Prepare Data and Data Modeling
The labeling of the images and the mapping between category names and labels were provided by Udacity in a JSON file: cat_to_name.json.
Udacity also provided the whole dataset in an organized directory tree with train, valid and test folders. Each of these contains one subfolder per category label, holding the .jpeg images for that category.
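For illustration, the mapping can be loaded with a few lines like the following (the flowers/ root directory name is an assumption about the local setup):

```python
import json

# Mapping from category label (a string such as "1") to the flower name
with open('cat_to_name.json', 'r') as f:
    cat_to_name = json.load(f)

# The dataset follows this layout, one subfolder per category label:
#   flowers/train/<label>/*.jpeg
#   flowers/valid/<label>/*.jpeg
#   flowers/test/<label>/*.jpeg
data_dir = 'flowers'
train_dir, valid_dir, test_dir = (data_dir + s for s in ('/train', '/valid', '/test'))
```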
The project is broken down into multiple steps:
Load and preprocess the image dataset
Train the image classifier on the dataset
Use the trained classifier to predict image content
Evaluate the Results
The default network used by the application is torchvision.models.vgg16, a convolutional neural network proposed by K. Simonyan and A. Zisserman from the University of Oxford in the paper “Very Deep Convolutional Networks for Large-Scale Image Recognition”.
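For reference, loading this pre-trained network from torchvision looks roughly like this (the exact argument depends on the torchvision version):

```python
from torchvision import models

# Load VGG16 with weights pre-trained on ImageNet
# (older torchvision API: pretrained=True; newer API: weights='IMAGENET1K_V1')
model = models.vgg16(pretrained=True)
print(model.classifier)  # the fully connected head that we will replace later
```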
To use this network the images are resized to 224x224 pixels, because the first convolutional layer expects a fixed-size 224x224 RGB input. Udacity provided guidelines for the development of this project through a Jupyter Notebook, so a lot of steps are pretty straightforward. To help the network generalize and obtain better performance, we apply transformations to the training set such as random scaling, cropping, and flipping.
The pre-trained network was trained on the ImageNet dataset, where each color channel was normalized separately. For all three sets we need to normalize the images with the means and standard deviations the network expects: [0.485, 0.456, 0.406] for the means and [0.229, 0.224, 0.225] for the standard deviations, calculated from the ImageNet images. These values shift each color channel to approximately zero mean and unit standard deviation.
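A minimal sketch of these transforms and of the resulting datasets and loaders, reusing the directory variables from above (the specific augmentation parameters and batch size are illustrative choices):

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# Random augmentations for training; deterministic resize/crop for validation and test
train_transforms = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
])
eval_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
])

# ImageFolder infers the class label from the subfolder name
train_data = datasets.ImageFolder(train_dir, transform=train_transforms)
valid_data = datasets.ImageFolder(valid_dir, transform=eval_transforms)
test_data = datasets.ImageFolder(test_dir, transform=eval_transforms)

train_loader = DataLoader(train_data, batch_size=64, shuffle=True)
valid_loader = DataLoader(valid_data, batch_size=64)
test_loader = DataLoader(test_data, batch_size=64)
```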
Now that we have a pretrained network, we have to do the following (a code sketch follows the list):
Define a new, untrained feed-forward network as a classifier, using ReLU activations and dropout
Train the classifier layers with backpropagation, using the pre-trained network to extract the features
Track the loss and accuracy on the validation set to determine the best hyperparameters
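A minimal sketch of these steps, building on the model and data loaders from the previous snippets (the hidden layer size, dropout rate and learning rate are illustrative choices, not necessarily the exact ones used in the project):

```python
import torch
from torch import nn, optim

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Freeze the pre-trained feature extractor so only the new classifier is trained
for param in model.parameters():
    param.requires_grad = False

# Replace VGG16's classifier with a new, untrained feed-forward network
model.classifier = nn.Sequential(
    nn.Linear(25088, 4096),   # 25088 = 512 * 7 * 7 features from VGG16's conv stack
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(4096, 102),     # 102 flower categories
    nn.LogSoftmax(dim=1),
)
model.to(device)

criterion = nn.NLLLoss()
optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)

epochs = 10
for epoch in range(epochs):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

    # Track accuracy on the validation set to compare hyperparameter choices
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in valid_loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    print(f'Epoch {epoch + 1}: validation accuracy {correct / total:.4f}')
```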
By default, training is done on the GPU if one is available, which is checked with torch.cuda.is_available().
torch.cuda adds support for CUDA tensor types, which implement the same functions as CPU tensors but utilize GPUs for computation. More information can be found here.
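A quick sketch of the device check and of moving tensors to the GPU:

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

# CUDA tensors expose the same API as CPU tensors, but the computation runs on the GPU
x = torch.randn(8, 3, 224, 224).to(device)
print(x.device, x.mean().item())
```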
Using 10 epochs we obtain an accuracy of 0.8944 on the training dataset.
On the validation dataset we obtain an accuracy of 0.8563.
Deploy
A Dash application has been developed as the user interface: it allows uploading an image to be classified. When no image is uploaded, the application shows an overview of some information about the training dataset.
When an image is provided and the Classify button is pressed, the application shows the classification probability distribution for the image and, for comparison, an image of the predicted class from the training dataset.
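The real application is more elaborate, but as a rough sketch (assuming Dash 2.x and reusing model, eval_transforms, train_data, cat_to_name and device from the earlier snippets), an upload-and-classify page could look like this:

```python
import base64
import io

import dash
import torch
from dash import dcc, html
from dash.dependencies import Input, Output, State
from PIL import Image


def predict(image, topk=5):
    """Top-k prediction, reusing the model, transforms and mappings defined above."""
    x = eval_transforms(image.convert('RGB')).unsqueeze(0).to(device)
    model.eval()
    with torch.no_grad():
        probs = torch.exp(model(x)).squeeze(0)  # the model outputs log-probabilities
    top_p, top_idx = probs.topk(topk)
    idx_to_class = {v: k for k, v in train_data.class_to_idx.items()}
    names = [cat_to_name[idx_to_class[i.item()]] for i in top_idx]
    return names, top_p.tolist()


app = dash.Dash(__name__)
app.layout = html.Div([
    dcc.Upload(id='upload-image',
               children=html.Div('Drag and drop or click to select an image')),
    html.Button('Classify', id='classify-button', n_clicks=0),
    html.Div(id='output'),
])


@app.callback(Output('output', 'children'),
              Input('classify-button', 'n_clicks'),
              State('upload-image', 'contents'))
def classify(n_clicks, contents):
    if not n_clicks or contents is None:
        return 'Upload an image and press Classify.'
    # dcc.Upload provides a base64 data URI: "data:image/jpeg;base64,<payload>"
    payload = contents.split(',', 1)[1]
    image = Image.open(io.BytesIO(base64.b64decode(payload)))
    names, probs = predict(image)
    return html.Ul([html.Li(f'{n}: {p:.1%}') for n, p in zip(names, probs)])


if __name__ == '__main__':
    app.run_server(debug=True)
```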
In the example shown I searched Google Images for a pink primrose flower and used it as input to check the output of my classifier.
You can try it out on this page of my website.
Fun Fact
I have tried to classify my profile picture and apparently I am a Sword Lily.
Outro
I hope the post was interesting, and thank you for taking the time to read it. The code for this project can be found in this GitHub repository; on my Medium you can find a more in-depth story, and on my Blogspot you can find the same post in Italian. Let me know if you have any questions, and if you like the content that I create, feel free to buy me a coffee.