Embedded security system for multi-modal surveillance in a railway carriage
21 October 2015
Rhalem Zouaoui, Romaric Audigier, Sébastien Ambellouis, François Capman, Hamid Benhadda, Stéphanie Joudrier, David Sodoyer, Thierry Lamarque
Abstract
Public transport security is one of the main priorities of public authorities in the fight against crime and terrorism. In this context, there is great demand for autonomous systems able to detect abnormal events such as violent acts aboard passenger cars and intrusions when the train is parked at the depot. To this end, we present an innovative approach that aims to provide efficient automatic event detection by fusing video and audio analytics, reducing the false alarm rate compared to classical stand-alone video detection. The multi-modal system is composed of two microphones and one camera and integrates onboard video and audio analytics and fusion capabilities. On the one hand, for intrusion detection, the system fuses the detection of “unusual” audio events with intrusion detections from video processing. The audio analysis consists of modeling the normal ambience and, during testing, detecting deviations from the trained models. This unsupervised approach is based on clustering automatically extracted segments of acoustic features and modeling each cluster with a statistical Gaussian Mixture Model (GMM). The video intrusion detection is based on the three-dimensional (3D) detection and tracking of individuals. On the other hand, for violent event detection, the system fuses unsupervised and supervised audio algorithms with video event detection. The supervised audio technique detects specific events such as shouts; a GMM is used to capture the formant structure of a shout signal. The video analytics use an original approach for detecting aggressive motion by focusing on erratic motion patterns specific to violent events. As data containing violent events are not easily available, a normality model of structured motions is learned from non-violent videos for one-class classification. A fusion algorithm based on Dempster-Shafer theory analyzes the asynchronous detection outputs and computes the degree of belief of each probable event.
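The fusion step mentioned at the end of the abstract can be illustrated with Dempster's rule of combination, the standard way to merge two bodies of evidence in Dempster-Shafer theory. The following is a minimal sketch only, not the authors' implementation: the two-hypothesis frame (`violence`, `normal`), the mass values assigned to the audio and video detectors, and the function names `combine` and `belief` are all illustrative assumptions, and the asynchronous timing of the detector outputs is ignored.

```python
from itertools import product

# Hypothetical frame of discernment (an assumption, not taken from the paper).
VIOLENCE = frozenset({"violence"})
NORMAL = frozenset({"normal"})
EITHER = VIOLENCE | NORMAL  # total ignorance: the detector cannot decide


def combine(m1, m2):
    """Dempster's rule: merge two mass functions defined on the same frame.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1];
    the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


def belief(m, hypothesis):
    """Degree of belief: total mass of all subsets of the hypothesis."""
    return sum(mass for subset, mass in m.items() if subset <= hypothesis)


# Illustrative detector outputs (made-up numbers).
audio = {VIOLENCE: 0.6, EITHER: 0.4}               # shout detector, fairly unsure
video = {VIOLENCE: 0.7, NORMAL: 0.1, EITHER: 0.2}  # erratic-motion detector

fused = combine(audio, video)
print({tuple(sorted(h)): round(m, 3) for h, m in fused.items()})
print("belief in violence:", round(belief(fused, VIOLENCE), 3))
```

In this toy example, two moderately confident detectors that agree yield a fused belief in the violent event that is higher than either detector's own mass, while the small amount of conflicting mass is discarded by the renormalisation.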
© (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Rhalem Zouaoui, Romaric Audigier, Sébastien Ambellouis, François Capman, Hamid Benhadda, Stéphanie Joudrier, David Sodoyer, and Thierry Lamarque "Embedded security system for multi-modal surveillance in a railway carriage", Proc. SPIE 9652, Optics and Photonics for Counterterrorism, Crime Fighting, and Defence XI; and Optical Materials and Biomaterials in Security and Defence Systems Technology XII, 96520C (21 October 2015); https://doi.org/10.1117/12.2194262
CITATIONS
Cited by 8 scholarly publications.
KEYWORDS
Video
Sensors
Analytics
Expectation maximization algorithms
Cameras
Acoustics
Embedded systems