Autoencoder versus pre-trained CNN networks: deep-features applied to accelerate computationally expensive object detection in real-time video streams

Vasanth Iyer; Alexander Aved; Todd B. Howlett; Jeffrey T. Carlo; Bernard Abayowa

doi:10.1117/12.2326848

9 October 2018

Proceedings Volume 10794, Target and Background Signatures IV; 107940Y (2018) https://doi.org/10.1117/12.2326848
Event: SPIE Security + Defence, 2018, Berlin, Germany

Abstract

Traditional event detection from video frames is based on batch or offline algorithms: a single event is assumed to be present in each video, and videos are typically processed by a pre-processing algorithm that requires enormous amounts of computation and CPU time. While this can be suitable for tasks with separate training and testing phases where time is not critical, it is unacceptable for real-world applications that require prompt, real-time event interpretation. Following the recent success of multi-model feature learning, such as generative adversarial networks (GANs), we propose a two-model approach for real-time detection. A GAN learns a generative model of the dataset and is further optimized by a discriminator, which learns the per-sample difference between generated images. Similarly, the proposed architecture uses a model pre-trained on a large dataset to boost weakly labeled instances, in parallel with deep layers for small aerial targets, achieving high accuracy with a fraction of the computation time for training and detection. We emphasize previous work on unsupervised learning because of the overhead of training on labeled data in the sensor domain.
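The abstract does not describe the autoencoder itself, so the following is only a minimal sketch of the general idea of learning compact features from unlabeled frames. It assumes a single-hidden-layer linear autoencoder trained by plain gradient descent, with synthetic data standing in for flattened video frames. The function name `train_linear_autoencoder`, the layer sizes, and all hyperparameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def train_linear_autoencoder(frames, n_features=8, epochs=300, lr=0.005, seed=0):
    """Fit a linear autoencoder to unlabeled data by gradient descent.

    frames: array of shape (n_samples, n_pixels); no labels are used.
    Returns (w_enc, w_dec, losses), where losses holds the per-epoch
    reconstruction error (squared error summed over pixels, averaged
    over samples).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_pixels = frames.shape
    w_enc = rng.normal(0.0, 0.1, size=(n_pixels, n_features))
    w_dec = rng.normal(0.0, 0.1, size=(n_features, n_pixels))
    losses = []
    for _ in range(epochs):
        codes = frames @ w_enc           # compressed features
        recon = codes @ w_dec            # reconstruction of the input
        err = recon - frames
        losses.append(float(np.sum(err ** 2) / n_samples))
        # Gradients of the reconstruction loss with respect to each weight matrix.
        grad_recon = 2.0 * err / n_samples
        grad_dec = codes.T @ grad_recon
        grad_enc = frames.T @ (grad_recon @ w_dec.T)
        w_dec -= lr * grad_dec
        w_enc -= lr * grad_enc
    return w_enc, w_dec, losses


# Synthetic stand-in for 200 flattened 8x8 frames (assumption: real frames
# would come from a video stream). The data has low-rank structure plus noise.
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 4))
mixing = rng.normal(size=(4, 64))
frames = latent @ mixing + 0.05 * rng.normal(size=(200, 64))
frames = (frames - frames.mean(axis=0)) / frames.std()

w_enc, w_dec, losses = train_linear_autoencoder(frames)
features = frames @ w_enc  # one 8-dimensional feature vector per frame
print(f"reconstruction loss: {losses[0]:.2f} -> {losses[-1]:.2f}")
```

In a setup like the one the abstract outlines, features of this kind would be one input to a detector, alongside features from a pre-trained CNN. How the two are combined is not stated in the abstract and is not shown here.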

Citation

Vasanth Iyer, Alexander Aved, Todd B. Howlett, Jeffrey T. Carlo, and Bernard Abayowa "Autoencoder versus pre-trained CNN networks: deep-features applied to accelerate computationally expensive object detection in real-time video streams", Proc. SPIE 10794, Target and Background Signatures IV, 107940Y (9 October 2018); https://doi.org/10.1117/12.2326848

PROCEEDINGS
11 PAGES

KEYWORDS

Video

Target detection

Data modeling

Detection and tracking algorithms

Video acceleration

Video processing

Feature extraction