Recently, several friends and contacts have expressed an interest in learning about deep learning and how to train a neural network. Although there are many resources available, I usually point them towards the NVIDIA DIGITS application as a learning tool. This application is a convenient tool for training, tuning, and visualizing the performance of a Convolutional Neural Network (CNN). There are several good examples and tutorials available on the DIGITS GitHub site (https://github.com/NVIDIA/DIGITS). Someone trying to learn about deep learning applications and CNNs for the first time might start with the MNIST or CIFAR-10 datasets available online. The Getting Started section for the DIGITS application will guide a user through the MNIST dataset generation and the training for classification. However, there is not a similar tutorial for the CIFAR-10 dataset. The process is simple and straightforward, but a few changes are needed from the MNIST example. This post is meant to share a quick example for those learning how to use DIGITS. When training is completed, the application allows you to test your CNN by uploading your own images.
MNIST and Me
Let's start with a quick reminder about the MNIST example. If you worked through the Getting Started section in the NVIDIA GitHub examples, this will be familiar.
After training the standard LeNet model using the MNIST dataset, I created a test image from one of my handwritten numbers by simply writing the number 3 on a white sheet of paper with a black Sharpie. I took a picture of it using the webcam on my iMac. I converted the image to black-and-white and resized it to 28x28. Here is my test image.
Below, you can see the plots generated by DIGITS showing the results (i.e., loss, accuracy, etc.) while training the LeNet network on the MNIST training dataset.
Once the training was completed, I tested the model by uploading my own handwritten number for classification. As expected (see Fig.3), it does a nice job classifying my test case and returns the correct result with a very high probability. Keep in mind that your own test results will depend on how you prepare the test images. Make sure you crop to the correct size, make the image black-and-white, and adjust to a good contrast.
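The preparation steps above (black-and-white conversion, cropping, resizing, contrast adjustment) can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the procedure I used: it assumes the photo has already been loaded as a list of rows of (r, g, b) tuples, and the function names are made up for this example. In practice an image editor or imaging library would do the same job.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def center_crop_square(image):
    """Crop the largest centered square out of a 2-D list of pixels."""
    height, width = len(image), len(image[0])
    side = min(height, width)
    top, left = (height - side) // 2, (width - side) // 2
    return [row[left:left + side] for row in image[top:top + side]]


def resize_nearest(image, size):
    """Resize a square 2-D image to size x size with nearest-neighbor sampling."""
    side = len(image)
    return [[image[y * side // size][x * side // size] for x in range(size)]
            for y in range(size)]


def stretch_contrast(image):
    """Rescale pixel values so the darkest becomes 0 and the brightest 255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image: nothing to stretch
        return [[0 for _ in row] for row in image]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]


def prepare_digit(rgb_image, size=28):
    """Grayscale, center-crop, resize, and contrast-stretch a test image."""
    gray = to_grayscale(rgb_image)
    square = center_crop_square(gray)
    small = resize_nearest(square, size)
    return stretch_contrast(small)
```

The default size of 28 matches the MNIST images; the cropping and resizing steps apply equally to other input sizes.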
Step up to the CIFAR-10
Once you have mastered the MNIST example, a common progression in learning how to train a CNN is to move on to the CIFAR-10 dataset. If you are using Caffe, there are some very useful scripts provided in the $CAFFE_ROOT/examples/cifar10 directory for both training and testing a CNN using this dataset. However, at the time of this post, a tutorial similar to the MNIST example did not exist for training with the CIFAR-10 dataset using DIGITS. This is not a very difficult task, but some changes need to be made to the example model provided in Caffe in order to get everything working properly in DIGITS. You cannot select one of the provided models in DIGITS for this dataset.
If you do not want to spend time trying to define your own custom network, you can download this custom network description and paste the text into DIGITS as a custom network for training with the CIFAR-10 dataset. Make sure you select the custom network tab in DIGITS.
Before training, set up the parameters you want to use for the number of epochs, batch size, solver, learning rate, etc. Here is a screenshot showing the parameters I used in DIGITS for training.
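When choosing the number of epochs and the batch size, it can help to work out how many training iterations they imply. The sketch below is just that arithmetic. CIFAR-10 ships with 50,000 training images, but the batch size and epoch count here are placeholder values, not the settings from my run, and the exact count depends on how your dataset was split.

```python
import math


def training_iterations(num_train_images, batch_size, epochs):
    """Return (iterations per epoch, total iterations) for a training run."""
    # A final partial batch still counts as one iteration.
    per_epoch = math.ceil(num_train_images / batch_size)
    return per_epoch, per_epoch * epochs


# Placeholder settings for illustration only.
per_epoch, total = training_iterations(50_000, batch_size=100, epochs=30)
print(per_epoch, total)  # 500 15000
```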
The training loss and validation curves are shown below. Note that it is possible to get even better performance on the CIFAR-10 dataset, but this is a quick and simple example to get you started. You might want to check out the Kaggle competition site (https://www.kaggle.com/c/cifar-10) for this dataset to see how competition entries performed on this data.
For a quick test, I downloaded a picture of a cute dog from the interwebs. Similar to the MNIST case, I cropped and resized the image to 32x32 in order to match the training dataset size. Although this did not result in a 100% correct classification, the results are acceptable given a single test case. You may want to try out several test images, including ones from the CIFAR-10 dataset that were set aside for testing.
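To make sense of a result that is not a 100% classification, it helps to remember that a classifier typically reports a probability for each class rather than a single answer. Here is a minimal sketch of how raw network scores become ranked probabilities, assuming the network ends in a softmax layer, as classification networks commonly do. The scores are invented for illustration and are not the output of my trained model.

```python
import math

# The ten CIFAR-10 classes.
CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Invented scores for a single test image (illustration only).
scores = [0.1, 0.3, 1.2, 2.5, 1.0, 3.1, 0.4, 1.5, 0.0, 0.2]
for label, prob in top_k(scores, CIFAR10_CLASSES, k=3):
    print(f"{label}: {prob:.1%}")
```

With scores like these, the top class wins without being anywhere near 100%, which is the kind of outcome to expect from a small network on a single downloaded image.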
Although it is not necessary to use the DIGITS application to train and test a network with the CIFAR-10 dataset, it is very useful for visualizing the performance of the network and running experiments while making changes to important parameters. I find that the difficulty usually lies in setting up the custom network properly in DIGITS. This often requires careful attention to syntax and some understanding of the format.
In an upcoming post, I will go through some examples using DIGITS for object detection. Until then, have fun!