Short Introduction to Data Augmentation

In a previous post about Supervised racing there was a short mention of data augmentation. It has since been one of the most requested topics for us to blog about, so let's open it up a bit more.

Data augmentation is a method for growing your dataset using the data you already have: you modify copies of existing samples and include them in the training data alongside the originals.

Since many machine learning tasks - such as image classification - benefit from large amounts of data, data augmentation is really useful: it virtually expands the dataset. For an AI-controlled car, this means you need to drive and record a bit less training data to still get a working model.

How to augment data?

As we're training our AI car to follow the track using images, we'll concentrate on how to augment image data. There are several approaches, but before going into those, let's take an example.

Say we are going to teach our model to recognize whether there's a dog in an image. For example, this one:

pic of a dog

A generic sleeping dog

We can use that image to make more images and get a more generic dataset. Since the dog could just as well be seen from its left side, we can simply flip the image horizontally to get another view of the dog. And since the head can be in different orientations, we can also rotate the image slightly.

pic of the same dog with lot of different augmentations

Same dog with several applied augmentations

Other common ways are translating the image, adding noise to it, and otherwise modifying it. There are also several libraries and tools that implement different augmentation methods, e.g. https://github.com/mdbloice/Augmentor and https://github.com/aleju/imgaug
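To give a rough idea of what translation and noise augmentation look like in code, here is a minimal NumPy sketch. The function names and parameters are our own for illustration, not from any particular library:

```python
import numpy as np

def augment_translate(img, dx, dy):
    # shift the image content by (dx, dy) pixels; vacated pixels become black
    out = np.zeros_like(img)
    h, w = img.shape[:2]
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        img[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def augment_noise(img, sigma=10.0):
    # add Gaussian pixel noise and clip back to the valid 8-bit range
    noise = np.random.normal(0.0, sigma, img.shape)
    return np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)
```

In practice you would draw `dx`, `dy`, and `sigma` from small random ranges so every training pass sees slightly different variants.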

Data augmentation with our AI car

The augmentation methods should be chosen based on the requirements of the model. For example, there's no real reason for us to flip our images vertically, as we don't aim to teach our car to drive on its roof.

vertically flipped image

Flipping vertically makes no sense

Also, since in our case the track is flat, there's no need to prepare for situations where the images would be rotated in any way.

Currently we use three augmentation methods with our AI car: flipping the images horizontally, adjusting brightness, and adding artificial shadows.

Flipping horizontally

When flipping input images horizontally, we also need to augment the corresponding steering value and possibly other inputs (such as acceleration sensor data). When the image is mirrored, the sign of the X-axis value must be flipped as well. This also means that the steering of the car needs to be well calibrated and centered, as this method won't work properly if steering the same amount to the left and to the right produces different results.

images and data from flipped case

flipped data

Both images and data in records are flipped

import cv2

def augment_flip(img, record):
    # mirror the image around its vertical axis
    img = cv2.flip(img, 1)

    # record values whose sign must be negated when the image is mirrored
    flippable_keys = [
        'user/angle',
    ]

    for key in flippable_keys:
        record[key] = -record[key]

    return img, record

Flipping helps the model learn more generic handling of turns.

Brightness

One of the biggest challenges we've encountered has been the changing illumination on our office test track. We have huge windows, and there's a big difference between sunny and cloudy days (bright/dark) as well as between times of day (angle of sunlight). Our attempt to reduce the effect has been to augment the data by changing the brightness of the images.

images with brightness changes

Darkened and lightened images

import numpy as np
import cv2

def augment_brightness(img):
    # convert to HSV so brightness can be adjusted on the V channel alone
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

    # scale brightness randomly between 50% and 150% of the original
    brightness = np.random.uniform(0.5, 1.5)
    hsv[:, :, 2] = np.clip(hsv[:, :, 2] * brightness, 0, 255)

    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

Artificial shadows

Another way to handle the changing track conditions is to add artificial shadows to the images by darkening part of each image. The method was inspired by the article An augmentation based deep neural network approach to learn human driving behavior.

images with artificial shadows

Shadow applied to images

import numpy as np
import cv2

def augment_shadow(img):
    height, width = img.shape[:2]

    # pick a random line crossing the image: note that here x follows the
    # row (vertical) axis and y the column (horizontal) axis
    top_x, top_y = 0, width * np.random.uniform()
    bot_x, bot_y = height, width * np.random.uniform()

    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

    # mark every pixel on one side of the line
    X_m, Y_m = np.mgrid[0:height, 0:width]
    shadow_mask = np.zeros_like(hsv[:, :, 1])
    shadow_mask[(X_m - top_x) * (bot_y - top_y) - (bot_x - top_x) * (Y_m - top_y) >= 0] = 1

    # darken a randomly chosen side of the line
    shadow_density = .5
    if np.random.randint(2) == 1:
        side = shadow_mask == 1
    else:
        side = shadow_mask == 0
    hsv[:, :, 2][side] = hsv[:, :, 2][side] * shadow_density

    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

Further options

One option we've discussed but not yet implemented would address an additional issue with our driving conditions: our track tends to be a bit reflective, since it's a concrete floor with a glossy coating. Our plan is to test whether adding artificial reflections to the training images helps the model cope with the real ones.
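We haven't built this yet, but a first sketch could simply brighten a region instead of darkening it, mirroring the shadow augmentation above. Everything here - the name `augment_reflection`, the band shape, the gain value - is a guess, not a tested method:

```python
import numpy as np

def augment_reflection(img, gain=1.6):
    # brighten a random horizontal band to imitate a glossy-floor highlight
    h = img.shape[0]
    top = np.random.randint(0, h - 1)
    bottom = np.random.randint(top + 1, h)
    out = img.astype(np.float64)
    out[top:bottom] *= gain
    return np.clip(out, 0, 255).astype(np.uint8)
```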

image with bad reflection

Additionally, we may try an off-road version of our car, in which case we'd shift the images vertically a bit (to simulate height variation). Adding more noise and slight rotations could also make sense.
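A slight rotation could be done with OpenCV's warpAffine, but to keep this example self-contained, here is a plain-NumPy nearest-neighbour version. It's our own sketch and only really suitable for the small angles mentioned above:

```python
import numpy as np

def augment_rotate(img, angle_deg):
    # rotate around the image centre using inverse-mapped
    # nearest-neighbour sampling
    theta = np.deg2rad(angle_deg)
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # for every output pixel, compute which source pixel lands there
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.clip(np.round(src_x).astype(int), 0, w - 1)
    src_y = np.clip(np.round(src_y).astype(int), 0, h - 1)
    return img[src_y, src_x]
```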

How to augment?

Data augmentation can be done in two different phases: before training, or inline during training. With Donkeycar it may be easier to start with the former: run the augmentation on existing data and store the augmented images and record data in the tub. Inline augmentation removes this extra step and would be faster, but it also requires more tinkering.
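To make "inline" concrete, here is a hypothetical sketch of a training-time generator. The name `augmenting_generator` and the `(img, record)` calling convention are our invention for illustration, not Donkeycar's actual API:

```python
import numpy as np

def augmenting_generator(images, records, augmenters, p=0.5):
    # yield samples forever, randomly augmenting a fraction p of them
    # on the fly; the stored dataset itself is never modified
    while True:
        for img, record in zip(images, records):
            if augmenters and np.random.rand() < p:
                aug = augmenters[np.random.randint(len(augmenters))]
                # copy the record so the original stays untouched
                img, record = aug(img, dict(record))
            yield img, record
```

The training loop then pulls batches from this generator, so every epoch sees slightly different variants of the same recorded drive.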

Conclusion

All in all, data augmentation is a rather simple way to make your dataset larger and thus reduce the time you need to spend gathering training material.

In addition to increasing the dataset size, augmentation can be used to add variance to the data: we can artificially create images that correspond to e.g. night time or cloudy days. At least in our experiments with Donkeycar, it has helped the model become more generic and less affected by environment changes.

As a bonus, here’s a simple augmentation script for augmenting DonkeyCar tubs: augment.py.