Saliency

Background

So what even is Saliency?

What we call saliency is a shortened form of the more widely used industry term, Salient Object Detection (SOD). At a high level, this is exactly what it sounds like: detecting the most salient objects (or regions) in an image.

Saliency has evolved over the years from a binary segmentation task into one that is increasingly dependent on deep learning. Convolutional neural networks (CNNs) in particular have been adopted for SOD by many researchers and practitioners in industry.

An overview of the progress of SOD can be found here and here.

Overview

For our specific use case, saliency is the stage of our computer vision pipeline where we take the original image from the camera mounted on the bottom of our plane and find the areas where targets exist. A bounding box is found for each target, which lets us crop down the image and send only the crops to the GCS.
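The crop-and-send step can be sketched in a few lines. This is a minimal illustration, not the actual pipeline code: the `(x_min, y_min, x_max, y_max)` box format and the toy image are assumptions for the example.

```python
# Sketch: given detected bounding boxes, crop each target region out of
# the full image so only the small crops need to be sent to the GCS.

def crop_targets(image, boxes):
    """image: 2D list of pixel rows; boxes: list of (x0, y0, x1, y1)."""
    crops = []
    for x0, y0, x1, y1 in boxes:
        # Slice out the rows, then the columns, covered by the box.
        crops.append([row[x0:x1] for row in image[y0:y1]])
    return crops

# Toy 4x4 "image" with a 2x2 target in the lower-right corner.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
crops = crop_targets(image, [(2, 2, 4, 4)])
print(crops[0])  # [[9, 9], [9, 9]]
```

In practice the image is a large array and the boxes come from the detection model, but the cropping logic is the same slicing operation.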

Saliency Demo Image

Current Implementation

We are using the Faster R-CNN detection network, the third generation in a family of networks: the original R-CNN was improved into Fast R-CNN and finally into Faster R-CNN.

The current implementation of saliency is found in garretts-new-lunchbox. It uses the PyTorch machine learning framework to create, train, and test a model that implements the Faster R-CNN design. The structure of the training and testing code is based directly on the PyTorch detection references.

The saliency code must run on the onboard computer (OBC), an Nvidia Jetson TX2. Our camera takes images at roughly 5K resolution, so saliency must take place on the OBC: sending down every full-resolution image could cause latency problems. Since saliency is the first step in our CV pipeline, it is crucial that we detect every single target on the ground. What this means practically is that we would rather have false positives than false negatives. In statistical terms, false positives are type 1 errors, while false negatives are type 2 errors.
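One concrete way this preference shows up is in the score threshold applied to the model's detections. The sketch below is illustrative only; the threshold values and detection format are assumptions, not the project's actual settings.

```python
# Sketch: preferring false positives over false negatives by lowering
# the confidence score a detection must clear to be kept.

def filter_detections(detections, threshold):
    """detections: list of (box, score) pairs; keep those at or above threshold."""
    return [(box, score) for box, score in detections if score >= threshold]

detections = [
    ("box_a", 0.95),  # obvious target
    ("box_b", 0.40),  # borderline: a real target, or a false alarm
]

strict = filter_detections(detections, 0.9)   # drops box_b: risks a type 2 error
lenient = filter_detections(detections, 0.3)  # keeps box_b: accepts type 1 errors
print(len(strict), len(lenient))  # 1 2
```

Because a missed target can never be recovered downstream, a lenient threshold is the safer choice here; spurious crops can still be rejected later in the pipeline.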

Future Work

  1. Verify benchmark for current model implementation
  2. Make necessary improvements to either data pipeline or model itself
  3. Test other models and compare benchmarks against current model