Data Augmentation is one of the best ways to quickly improve the performance of a Deep Learning model, especially in Computer Vision. Data Augmentation refers to the practice of artificially inflating the size of a dataset through label-preserving transformations. As an example of why this is useful, imagine a dog versus cat classifier. We want the classifier to be robust to the different angles from which it might view a dog, so one tactic would be to randomly rotate training images before feeding them to the model. This survey covers many different techniques for augmenting datasets so that there is more information available in the training set for a Deep Neural Network model to learn from.
The image below provides an overview of the Data Augmentations discussed in this survey article:
Taxonomy of Data Augmentations discussed, image taken from A Survey on Image Data Augmentation for Deep Learning by Shorten and Khoshgoftaar.
Firstly, we will discuss basic image manipulations for Data Augmentation. This includes the rotation tactic from the dog versus cat classification problem previously mentioned. There are five primary ways of doing this: geometric transformations, color space transformations, kernel filters, random erasing, and mixing images.
Geometric transformations include operations such as rotation, vertical shifting, and flipping images. Color space transformations include things like isolating color channels or applying color filters, such as those used to edit Instagram photos. These are easily the most commonly used Data Augmentation techniques, and they can be easily implemented with Keras's built-in ImageDataGenerator class. These methods alone are often enough to propel your image classification models further; the rest of the techniques in this survey may help you squeeze out the last accuracy percentage needed for top-level performance.
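Keras's ImageDataGenerator packages these transformations for you, but the core idea is simple enough to sketch in plain NumPy. The following is a minimal, illustrative sketch; the function name and the specific choice of flip-plus-rotation are our own assumptions, not any library's API:

```python
import numpy as np

def random_geometric_augment(image, rng):
    """Apply a random horizontal flip and a random 90-degree rotation.

    Both operations preserve the label of a natural image such as a
    photograph of a cat or a dog.
    """
    if rng.random() < 0.5:
        image = np.fliplr(image)   # random horizontal flip
    k = rng.integers(0, 4)         # rotate by 0, 90, 180, or 270 degrees
    return np.rot90(image, k)

rng = np.random.default_rng(0)
img = np.arange(27.0).reshape(3, 3, 3)  # tiny stand-in RGB image
aug = random_geometric_augment(img, rng)
print(aug.shape)  # (3, 3, 3) -- same shape, same pixel values, new layout
```

Because the transformation only rearranges pixels, the augmented image contains exactly the same values as the original, just in a different spatial arrangement.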
Random erasing (also known as cutout regularization) is something we have previously written about on Henry AI Labs. It refers to the practice of randomly erasing patches of images as a regularization technique. Intuitively, this forces the model to look at the entire image, rather than just a subset of highly discriminative features.
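A minimal sketch of the idea might look like the following; the `cutout` helper, its zero-fill, and the patch size are illustrative assumptions rather than a reference implementation:

```python
import numpy as np

def cutout(image, patch_size, rng):
    """Erase a random square patch by setting it to zero (cutout-style).

    Forces the model to rely on context from the whole image rather
    than a single highly discriminative region.
    """
    img = image.copy()
    h, w = img.shape[:2]
    y = rng.integers(0, h - patch_size + 1)  # top-left corner of patch
    x = rng.integers(0, w - patch_size + 1)
    img[y:y + patch_size, x:x + patch_size] = 0.0
    return img

rng = np.random.default_rng(0)
img = np.ones((8, 8, 3))                 # stand-in all-ones image
aug = cutout(img, patch_size=4, rng=rng)
print(int(aug.sum()))  # 144: a 4x4 patch across 3 channels (48 values) zeroed from 192
```

Wherever the patch lands, it always fits inside the image, so the same number of values is erased on every call.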
Kernel filters are a much more rarely applied operation, but they include things like nearest-neighbor blurring or sharpening. These filters work to smooth or strengthen edges, and either characteristic could potentially result in a more robust classifier.
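To make the idea concrete, here is a minimal, illustrative sketch of a box-blur kernel and a sharpen kernel applied with a naive single-channel 2-D convolution. The helper function and the specific kernel values are our own assumptions for illustration:

```python
import numpy as np

def convolve2d(image, kernel):
    """Naive 'same'-size 2-D convolution with zero padding (single channel)."""
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

blur = np.full((3, 3), 1.0 / 9.0)              # box blur: smooths edges
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype=float)  # emphasizes edges

img = np.random.default_rng(0).random((5, 5))  # stand-in grayscale image
blurred = convolve2d(img, blur)
sharpened = convolve2d(img, sharpen)
print(blurred.shape, sharpened.shape)  # (5, 5) (5, 5)
```

Note that both kernels sum to one, so flat regions of the image pass through unchanged; only the edges are smoothed or accentuated.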
Mixing images is one of the more unreasonably effective Data Augmentations available. It involves randomly merging images together through a pixel-by-pixel averaging operation. This is an unconventional regularization technique, but it has been shown to work surprisingly well, another testament to the very high representational capacity of Deep Neural Networks.
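A pixel-by-pixel averaging augmentation can be sketched as follows. Blending the labels with the same coefficient, as mixup-style methods do, is one common design choice; the helper below is an illustrative sketch rather than any paper's exact method:

```python
import numpy as np

def mix_images(img_a, img_b, label_a, label_b, lam=0.5):
    """Blend two training examples pixel-by-pixel.

    With lam=0.5 this is a plain average of the two images; the one-hot
    labels are blended with the same coefficient so the training target
    stays consistent with the mixed input.
    """
    image = lam * img_a + (1.0 - lam) * img_b
    label = lam * label_a + (1.0 - lam) * label_b
    return image, label

cat = np.zeros((4, 4, 3))   # stand-in "cat" image
dog = np.ones((4, 4, 3))    # stand-in "dog" image
mixed_img, mixed_label = mix_images(cat, dog,
                                    np.array([1.0, 0.0]),   # one-hot: cat
                                    np.array([0.0, 1.0]))   # one-hot: dog
print(mixed_img[0, 0, 0], mixed_label)  # 0.5 [0.5 0.5]
```

The mixed example sits halfway between the two classes in both pixel space and label space, which smooths the decision boundary the model learns.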
More interestingly, we will now discuss ways in which Deep Learning methods are used to augment datasets.
The first example of this is the use of Adversarial Training. This is not a reference to the heavily popular Generative Adversarial Networks (although using those for Data Augmentation is discussed later in this survey). Here, Adversarial Training refers to the practice of using one neural network to learn augmentations in such a way that they fool the original classification model. These augmentations could be very innocuous noise perturbations or more dramatic geometric transformations; it depends entirely on how you design the search space of augmentations available to the adversarial augmenter.
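The innocuous-noise end of this spectrum can be illustrated with a fast-gradient-sign-style perturbation on a toy logistic classifier. This sketch is purely illustrative (the weights, function name, and step size are assumptions): it shows how the gradient of the loss with respect to the input points to a small perturbation that pushes the model toward a mistake, which can then be folded back into training:

```python
import numpy as np

def fgsm_perturb(x, w, b, y_true, eps=0.1):
    """Fast-gradient-sign-style perturbation for a logistic classifier.

    Nudges the input a small step in the direction that increases the
    loss, producing an 'adversarial' variant of the training example.
    """
    z = np.dot(w, x) + b
    p = 1.0 / (1.0 + np.exp(-z))   # predicted probability of class 1
    grad_x = (p - y_true) * w      # d(cross-entropy)/dx for this example
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0])          # toy classifier weights (illustrative)
b = 0.0
x = np.array([0.5, 0.5])           # original input
x_adv = fgsm_perturb(x, w, b, y_true=1.0)
print(x_adv)  # [0.4 0.6] -- each coordinate moved by eps along the loss gradient
```

After the perturbation, the classifier's confidence in the true class drops, exactly the failure mode an adversarial augmenter is searching for.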
Another example of using Deep Learning for Data Augmentation is Neural Style Transfer. Imagine the same cat versus dog problem as before. We would want the classifier to be robust to factors such as lighting. Aside from color space augmentations, we could perhaps further strengthen the decision boundary by applying very dramatic style transfer augmentations in the form of novel artistic renderings of the cat and dog images.
Another very interesting idea in Data Augmentation is adding data created by Generative Adversarial Networks. Given a dataset of dogs and a dataset of cats, a GAN could generate new images of dogs and cats, increasing the dataset size and potentially resulting in better models. Another way of using GANs was presented in SimGAN, which uses a GAN to make data generated from a graphics engine more realistic.
In addition to the augmentation methods previously discussed, meta-learning augmentations are another promising avenue for the future of Data Augmentation. One such method previously discussed on Henry AI Labs is AutoAugment. AutoAugment uses Reinforcement Learning to search for augmentation policies on a dataset, as well as to transfer these policies across datasets and tasks, such as transferring from ImageNet to SVHN classification.
Two other interesting ways Meta-Learning has been applied to Data Augmentation are Neural Augmentation, discussed in "The Effectiveness of Data Augmentation in Image Classification using Deep Learning", and Smart Augmentation, discussed in "Smart Augmentation: Learning an Optimal Data Augmentation Strategy". Neural Augmentation uses search methods to learn Neural Style Transfer augmentations. Smart Augmentation uses a prepended network to learn how to combine training images to form new data points.