In a previous post about Supervised racing there was a short mention of data augmentation. That has been one of the most requested topics for us to blog about, so let’s open it up a bit more.
Data augmentation is basically just a method to grow your dataset using the data you already have: you modify copies of the existing data and include them in the training set on top of the original data.
As certain machine learning tasks - such as image classification - benefit from having large amounts of data, data augmentation becomes really useful, as it virtually expands the dataset. With an AI-controlled car, it basically means you need to generate / drive a bit less training data to still get a working model.
How to augment data?
As we’re basically training our AI car to follow the track using images, we’ll concentrate on how to augment image data. There are actually several approaches for that, but before going into them, let’s take an example.
Say we are going to teach our model to recognize if there’s a dog in an image. For example this:
A generic sleeping dog
We can use that image to make more images in order to get a more generic dataset. As the dog can also be seen from its left side, we can simply flip the image to get another view of the dog. And since the head can be in different orientations, we can rotate the image slightly.
Same dog with several applied augmentations
Other common ways are translating the image, adding noise to it, and otherwise modifying it. There are also several libraries and tools that implement different augmentation methods, e.g. https://github.com/mdbloice/Augmentor and https://github.com/aleju/imgaug
Data augmentation with our AI car
The methods of data augmentation should be chosen based on the requirements of the model. For example there’s no real reason for us to flip our images vertically as we don’t aim to teach our car to drive on its roof.
Flipping vertically makes no sense
Also, as in our case the track is flat, there’s no need to prepare for situations where the images would be rotated in any way.
Currently we use three augmentation methods with our AI car: flipping the images horizontally, adjusting brightness, and adding artificial shadows.
When flipping input images horizontally, we also need to augment the corresponding steering value and possibly other inputs (such as the acceleration sensor). When the image is mirrored, the sign of the steering angle needs to be flipped as well. This also means that the steering of the car needs to be well calibrated and centered, as this method won’t work properly if steering the same amount to the left and to the right produces different results.
Both images and data in records are flipped
import cv2

def augment_flip(img, record):
    # mirror the image horizontally
    img = cv2.flip(img, 1)
    # flip the sign of steering (and any other mirrored inputs)
    flippable_keys = [
        'user/angle',
    ]
    for key in flippable_keys:
        record[key] = 0 - record[key]
    return img, record
Flipping helps to teach the model more generic handling of turns.
One of the biggest challenges we’ve encountered has been the changing illumination on our office test track. We have huge windows, and there’s a big difference between sunny and cloudy days (bright/dark) and also across the time of day (angle of sunlight). Our attempt to reduce that effect has been to augment our data by changing the brightness of the images.
Darkened and lightened images
import numpy as np
import cv2

def augment_brightness(img):
    # convert image to hsv for easier brightness modification
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    brightness = np.random.uniform(0.5, 1.5)
    hsv[:, :, 2] = np.clip(hsv[:, :, 2] * brightness, 0, 255)
    img = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    return img
Another way to handle the changing track conditions is to add artificial shadows to the images by darkening part of each image. The method was inspired by the article An augmentation based deep neural network approach to learn human driving behavior.
Shadow applied to images
import numpy as np
import cv2

def augment_shadow(img):
    # pick a random line across the image; one side of it will be shadowed
    top_y = 320 * np.random.uniform()
    top_x = 0
    bot_x = 160
    bot_y = 320 * np.random.uniform()
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    shadow_mask = 0 * hsv[:, :, 1]
    X_m = np.mgrid[0:img.shape[0], 0:img.shape[1]][0]
    Y_m = np.mgrid[0:img.shape[0], 0:img.shape[1]][1]
    shadow_mask[((X_m - top_x) * (bot_y - top_y) - (bot_x - top_x) * (Y_m - top_y) >= 0)] = 1
    shadow_density = .5
    left_side = shadow_mask == 1
    right_side = shadow_mask == 0
    # darken a randomly chosen side of the line
    if np.random.randint(2) == 1:
        hsv[:, :, 2][left_side] = hsv[:, :, 2][left_side] * shadow_density
    else:
        hsv[:, :, 2][right_side] = hsv[:, :, 2][right_side] * shadow_density
    img = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    return img
One option we’ve discussed, but not yet implemented, would address one additional issue with our driving conditions: our track tends to be a bit reflective, since it’s a concrete floor coated with shiny chemicals. Our plan is to test whether adding artificial reflections would help overcome some of that.
Additionally, we may try an offroad version of our car, and there we’d be translating the images a bit (for height variation). Adding more noise and slight rotation could also make sense.
How to augment?
Data augmentation can be done in two different phases: before training, or inline, meaning during training. With Donkeycar it may be easier to start with the before-training approach: run data augmentation on the existing data and store the augmented images and record data in the tub. Inline augmentation removes this extra step and is faster, but it also requires more tinkering.
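As an illustration of the inline approach, a minimal batch generator could pick a random augmentation per sample during training. This is a generic sketch, not Donkeycar’s actual training API; the function and argument names are our own:

```python
import numpy as np

def augmenting_generator(images, records, augmenters, batch_size=32):
    """Yield training batches, applying a random augmenter on the fly.
    `augmenters` is a list of functions (img, record) -> (img, record);
    these names are hypothetical, not from Donkeycar itself."""
    n = len(images)
    while True:
        idx = np.random.randint(0, n, size=batch_size)
        batch_imgs, batch_recs = [], []
        for i in idx:
            img, rec = images[i], dict(records[i])
            # pick one augmentation at random for this sample
            aug = augmenters[np.random.randint(len(augmenters))]
            img, rec = aug(img, rec)
            batch_imgs.append(img)
            batch_recs.append(rec)
        yield np.stack(batch_imgs), batch_recs
```

Because the augmented images are generated on demand, nothing extra is written to disk, which is exactly the trade-off mentioned above: less storage and no separate preprocessing step, but more moving parts in the training loop.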
All in all, data augmentation is a rather simple method to make your dataset larger and thus reduce the time you need to spend gathering training material.
In addition to increasing the dataset size, augmentation can also be used to add variance to the data: we can artificially create images that correspond to e.g. nighttime or cloudy days. At least in our experiments with Donkeycar, it has helped our model become more generic and less affected by environment changes.
As a bonus, here’s a simple augmentation script for augmenting DonkeyCar tubs: augment.py.