Accurate and efficient corrosion detection is a difficult but important problem with immediate relevance to the maintenance of Naval ships. The current process requires an inspector to physically access the space and perform a largely manual visual inspection. Given the schedules of both the inspector and the ship, coordinating the inspection of hundreds of tanks and voids is not always straightforward. There is a significant body of research on automatic corrosion detection via computer vision algorithms, but pixel-level segmentation introduces added difficulty, for two key reasons: the lack of annotated data and the inherent difficulty of the segmentation problem itself. In this work, we utilized a combination of annotated data from a different domain and a small hand-labeled dataset of panoramic images from our target domain: the inside of empty ship tanks and voids. We trained two High-Resolution Network (HRNet) models for our corrosion detector, the first with a dataset outside our target domain and the second with our hand-annotated panoramic tank images. By ensembling the two models, the F1-score increased by about 120% and the IoU score by about 176% relative to the single baseline corrosion detector. The data collection process via LiDAR scanning allows the inspection to be performed remotely. Additionally, the setup of the detector leads to a natural expansion of the corrosion dataset as panoramas from LiDAR scans are continually fed through the detector and the detections are validated, allowing the corrosion models to be retrained later for potential improvements in accuracy and robustness.
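As an illustration of the ensembling step described above, the sketch below averages per-pixel corrosion probabilities from two (or more) trained segmentation models and thresholds the result into a binary mask. The probability-averaging strategy, the 0.5 threshold, and the single-channel logit output are illustrative assumptions; the abstract does not specify how the two HRNet outputs are combined.

```python
import torch

def ensemble_corrosion_masks(models, image, threshold=0.5):
    """Average per-pixel corrosion probabilities from several segmentation
    models and threshold the mean into a binary corrosion mask."""
    probs = []
    with torch.no_grad():
        for model in models:
            model.eval()
            logits = model(image)            # assumed shape: (1, 1, H, W)
            probs.append(torch.sigmoid(logits))
    mean_prob = torch.stack(probs).mean(dim=0)
    return (mean_prob > threshold).float()   # 1 = corrosion, 0 = background
```

In this setup, adding a model trained on the hand-annotated panoramic tank images alongside the out-of-domain baseline simply means passing both models in the `models` list.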
KEYWORDS: Video, Cameras, Object detection, RGB color model, Education and training, Detection and tracking algorithms, Video surveillance, Image processing, Signal processing, Feature extraction
Video sensors are ubiquitous in the realm of security and defense. Successive image data from these sensors can serve as an integral part of early-warning systems by drawing attention to suspicious anomalies. Using object detection, computer vision, and machine learning to automate some of these detection and classification tasks helps maintain a consistent level of situational awareness in environments with ever-present threats. In particular, the ability to detect small objects in video feeds would help people and systems protect themselves against far-away or small hazards. This work proposes a way to accentuate features in video stills by subtracting pixels from surrounding frames to extract motion information. Features extracted from a sequence of frames can be used on their own, or the motion signal can be concatenated onto the original image to highlight a moving object of interest. Using a two-stage object detector, we explore the impact of frame differencing on Drone vs. Bird videos from both stationary cameras and cameras that pan and zoom. Our experiments demonstrate that this algorithm is capable of detecting objects that move in a scene regardless of the state of the camera.
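A minimal sketch of the frame-differencing idea, assuming OpenCV and NumPy: the current frame is differenced against its immediate neighbors, the two difference maps are combined into a single motion channel, and that channel is stacked onto the original image as the detector input. The choice of neighboring frames, the min-combination, and the grayscale conversion are assumptions for illustration, not the paper's exact pipeline.

```python
import cv2
import numpy as np

def frame_difference_features(prev_frame, frame, next_frame):
    """Extract a motion map by differencing a frame against its neighbors,
    then append it to the original image as an extra channel."""
    to_gray = lambda f: cv2.cvtColor(f, cv2.COLOR_BGR2GRAY).astype(np.int16)
    diff_prev = np.abs(to_gray(frame) - to_gray(prev_frame))
    diff_next = np.abs(to_gray(frame) - to_gray(next_frame))
    # The element-wise minimum suppresses "ghost" responses left behind at
    # the object's previous and future positions.
    motion = np.minimum(diff_prev, diff_next).astype(np.uint8)
    return np.dstack([frame, motion])  # (H, W, 4): original channels + motion
```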
Standard object detectors are trained on a wide array of commonplace objects and work out-of-the-box for numerous everyday applications. Training data for these detectors tends to feature objects of interest that appear prominently in the scene, making them easy to identify. Unfortunately, objects seen by camera sensors in real-world scenarios do not always appear large, in focus, or near the center of an image. In the face of these problems, the performance of many detectors lags behind the thresholds necessary for successful deployment in uncontrolled environments. Specialized applications require additional training data to be reliable in situ, especially when small objects are likely to appear in the scene. In this paper, we present an object detection dataset consisting of videos that depict helicopter exercises recorded in an unconstrained, maritime environment. Special consideration was taken to emphasize small instances of helicopters relative to the field of view, so the dataset provides a more even ratio of small-, medium-, and large-sized object appearances for training more robust detectors in this specific domain. We use the COCO evaluation metric to benchmark multiple detectors on our data as well as the WOSDETC (Drone vs. Bird) dataset, and we compare a variety of augmentation techniques to improve detection accuracy and precision in this setting. These comparisons yield important lessons learned as we adapt standard object detectors to process data with non-iconic views from field-specific applications.
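For reference, COCO-style benchmarking with the standard pycocotools package looks roughly like the sketch below. The JSON file names are placeholders, but the API calls are the ones the toolkit exposes, and the summary reports AP/AR broken down by small, medium, and large object sizes, which is the breakdown relevant to this dataset.

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Hypothetical file names: ground truth and detections in COCO JSON format.
coco_gt = COCO("helicopter_val_annotations.json")
coco_dt = coco_gt.loadRes("detector_results.json")

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP/AR, including small/medium/large splits
```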
Generating imagery using gaming engines has become a popular method to either augment or completely replace the need for real data, largely because gaming engines such as Unity3D and Unreal can produce novel scenes and ground-truth labels quickly and at low cost. However, there is a disparity between rendering imagery in the digital domain and testing in the real domain on a deep learning task. This disparity, commonly known as domain mismatch or domain shift, renders synthetic imagery impractical and ineffective for deep learning tasks if left unaddressed. Recently, Generative Adversarial Networks (GANs) have shown success at generating novel imagery and overcoming this gap between two different distributions by performing cross-domain transfer. In this research, we explore the use of state-of-the-art GANs to perform a domain transfer from a rendered synthetic domain to a real domain. We evaluate the data generated by an image-to-image translation GAN on a classification task as well as through qualitative analysis.
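The cross-domain transfer step amounts to running rendered images through a trained image-to-image generator before they are used for the downstream classification task. The sketch below assumes a PyTorch generator (for example, one half of a CycleGAN-style model) already trained on the synthetic-to-real task; it illustrates only the inference step, not the authors' specific GAN or training procedure.

```python
import torch

@torch.no_grad()
def translate_to_real_domain(generator, synthetic_batch):
    """Run rendered synthetic images through a trained image-to-image
    translation generator so they better match the real-image distribution."""
    generator.eval()
    return generator(synthetic_batch)  # tensor of images in the real domain
```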
Rendering synthetic imagery from gaming engine environments allows us to create data featuring any number of object orientations, conditions, and lighting variations. This capability is particularly useful in classification tasks, where there is an overwhelming lack of the labeled data needed to train state-of-the-art machine learning algorithms. However, the use of synthetic data is not without limits: in the case of imagery, training a deep learning model on purely synthetic data typically yields poor results when the model is applied to real-world imagery. Previous work shows that "domain adaptation," mixing real-world and synthetic data, improves performance on a target dataset. In this paper, we train a deep neural network with synthetic imagery, including ordnance and overhead ship imagery, and investigate a variety of methods to adapt our model to a dataset of real images.
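One simple baseline for the mixing of real-world and synthetic data described above is to train on the union of both image sets. A minimal PyTorch sketch, assuming both datasets are stored as class-labeled image folders (the directory names are hypothetical):

```python
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

# Hypothetical directory layout; both folders share the same class subfolders.
synthetic = datasets.ImageFolder("data/synthetic", transform=tfm)
real = datasets.ImageFolder("data/real", transform=tfm)

# Train on the combined pool of synthetic and real images.
mixed = ConcatDataset([synthetic, real])
loader = DataLoader(mixed, batch_size=32, shuffle=True, num_workers=4)
```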
Explosive Ordnance Disposal (EOD) technicians are on call to respond to a wide variety of military ordnance. As experts in conventional and unconventional ordnance, they are tasked with ensuring the secure disposal of explosive weaponry. Before EOD technicians can render ordnance safe, the ordnance must be positively identified. However, identification of unexploded ordnance (UXO) in the field is made difficult by the massive number of ordnance classes, object occlusion, time constraints, and field conditions. Currently, EOD technicians collect photographs of unidentified ordnance and compare them to a database of archived ordnance. This task is manual and slow, and the success of this identification method depends largely on the expert knowledge of the EOD technician. In this paper, we describe our approach to automatic ordnance recognition using deep learning. Since the domain of ordnance classification is unique, we first describe our data collection and curation efforts to account for real-world conditions such as object occlusion, poor lighting, and non-iconic poses. We then apply a deep learning approach using ResNet to our collected data. While the results of these experiments are quite promising, we also discuss remaining challenges and potential solutions to deploying a real system to assist EOD technicians in their extremely challenging and dangerous role.
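A minimal sketch of the classification setup, assuming a torchvision ResNet-50 fine-tuned on the collected ordnance images. The specific ResNet depth, class count, and optimizer settings are illustrative assumptions, since the abstract only states that a ResNet-based approach was used.

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

NUM_ORDNANCE_CLASSES = 50  # placeholder; the actual class count is not given

# Start from an ImageNet-pretrained ResNet-50 and replace the classifier head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_ORDNANCE_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```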