Knowledge Dump

image_classification (Python)

This Python 3.11 script trains a simple convolutional neural network (CNN) on the CIFAR-10 image classification dataset. It is implemented with the PyTorch package and briefly visualizes the network's performance for various hyperparameter settings.
The dataset consists of 60,000 images of 32x32 pixels, each showing an object or animal from exactly one of ten classes: "airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship" and "truck". There are 6,000 images per class, and the data is split into 50,000 training and 10,000 test images.
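As a quick sanity check of these numbers, the dataset layout can be written down in a few lines of plain Python. This is only an illustrative sketch: the constant and function names are made up here and are not taken from the script. It assumes the usual CIFAR-10 convention that labels are the integers 0 to 9 in the class order listed above and that the images are RGB (three color channels).

```python
# Class names in the order listed above; CIFAR-10 labels are their indices 0-9.
CLASSES = (
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
)

IMAGE_SHAPE = (3, 32, 32)   # channels, height, width (RGB images)
IMAGES_PER_CLASS = 6_000
TRAIN_SIZE = 50_000
TEST_SIZE = 10_000


def label_name(index: int) -> str:
    """Map an integer label (0-9) to its class name."""
    return CLASSES[index]


def values_per_image(shape: tuple[int, int, int] = IMAGE_SHAPE) -> int:
    """Number of raw pixel values the network receives for one image."""
    channels, height, width = shape
    return channels * height * width


# The per-class count and the train/test split describe the same 60,000 images.
TOTAL_IMAGES = len(CLASSES) * IMAGES_PER_CLASS
assert TOTAL_IMAGES == TRAIN_SIZE + TEST_SIZE
```

Each image therefore enters the network as 3 x 32 x 32 = 3,072 numbers, which is part of why even a small CNN can be trained on this dataset fairly quickly.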

For more information on how the data was collected, see "Learning Multiple Layers of Features from Tiny Images", Alex Krizhevsky, 2009.

The code is split into five modules: one for preprocessing and loading the data, one defining the CNN class, one for training the CNN on the data, one for visualizing the results, and a main module that imports the others and runs their functions.
Although the CNN is structurally very simple, it performs reasonably well given the small size of the images: with the last hyperparameter settings tested in the code, it reaches a classification accuracy of about 74% on the test set.
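To give an idea of what a "very simple" CNN for 32x32 color images can look like in PyTorch, here is a minimal sketch. It is not the architecture from the downloadable script: the class name SimpleCNN, the layer sizes and the accuracy helper are all assumptions chosen for illustration, and no claim is made that this particular network reaches the 74% mentioned above.

```python
import torch
from torch import nn


class SimpleCNN(nn.Module):
    """Hypothetical small CNN for 3x32x32 inputs and 10 output classes."""

    def __init__(self, num_classes: int = 10) -> None:
        super().__init__()
        # Two conv/pool stages; each pooling step halves height and width.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),   # 3x32x32 -> 32x32x32
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 32x16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # -> 64x16x16
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 64x8x8
        )
        # Fully connected layers that map the feature maps to class scores.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


def accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of samples whose highest-scoring class matches the label."""
    return (logits.argmax(dim=1) == labels).float().mean().item()
```

Calling SimpleCNN() on a batch of shape (N, 3, 32, 32) returns a tensor of shape (N, 10) with one score per class; the accuracy helper then compares the highest-scoring class with the true label, which is how a figure like "74% on the test set" is computed.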

There are several ways to further improve the performance of the CNN: more fine-tuning of the hyperparameters (learning rate, number of epochs, batch size), adjusting the convolutional or fully connected layers, adding techniques such as dropout (randomly dropping nodes during training), making the learning rate dynamic (updating it after every epoch), or stopping the training automatically once the test performance has not improved for several epochs.
However, all of these methods are omitted here to keep the script simple.
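Two of the omitted ideas, automatic stopping and a per-epoch learning rate update, can be sketched in plain Python without touching the model itself. The names EarlyStopping, patience, min_delta and decayed_lr are hypothetical, and the accuracy values in the usage loop are made up purely to show the mechanics. In practice, a separate validation split is usually monitored for this rather than the test set.

```python
class EarlyStopping:
    """Signal a stop once the metric has not improved for `patience` epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, metric: float) -> bool:
        """Record this epoch's accuracy; return True if training should stop."""
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def decayed_lr(initial_lr: float, epoch: int, gamma: float = 0.9) -> float:
    """Exponential decay: multiply the learning rate by `gamma` every epoch."""
    return initial_lr * gamma ** epoch


# Usage with made-up accuracies (illustrative values, not real results).
made_up_accuracies = [0.60, 0.68, 0.71, 0.70, 0.71, 0.69, 0.72]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, acc in enumerate(made_up_accuracies):
    lr = decayed_lr(0.01, epoch)  # learning rate that would be used this epoch
    if stopper.step(acc):
        stopped_at = epoch
        break
```

With a patience of three epochs, the loop above stops at epoch index 5, because the best value of 0.71 from epoch 2 is not exceeded in the three epochs that follow. The later 0.72 is never reached, which illustrates the trade-off in choosing the patience.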
Download (Python 3.11 script as .zip).